Disclosure of Invention
In order to solve the technical problems that codes lack complexity, offer low security and are easily cracked, the invention aims to provide an internet-based information platform data security transmission method and system. The adopted technical scheme is as follows:
an information platform data security transmission method based on the Internet comprises the following steps:
obtaining platform data and preprocessing the platform data to obtain character data, wherein the platform data comprises historical data and current data of a plurality of days, and the character data comprises all historical character data of the plurality of days based on a time sequence and all current character data;
determining a coding table based on the character data;
compressing and transmitting the current data according to the coding table;
wherein determining the coding table based on the character data includes:
dividing each day into a plurality of time phases based on all the historical character data of the plurality of days based on the time sequence;
analyzing all the current character data to obtain the actual frequency, the distribution density tendency and the security performance of each character data in each time stage;
and determining the coding priority of each character data in each time stage according to the actual frequency, the distribution density tendency and the security performance of each character data in each time stage, thereby forming the coding table.
Preferably, dividing each day into a plurality of time phases based on all the historical character data based on the time series for the plurality of days includes:
acquiring the character data amount and the character data category amount at each moment of each day based on all the historical character data based on the time sequence for a plurality of days;
obtaining the character data complexity of each time of each day based on the character data quantity and the character data type quantity of each time of each day;
determining each minimum value point according to the complexity of the character data at each moment of each day and calculating the time coincidence degree of each minimum value point;
and determining a time stage demarcation point according to the time coincidence degree of each minimum value point, and dividing each day into a plurality of time stages based on the time stage demarcation point.
Preferably, the calculation formula of the character data complexity at each time of each day is as follows:
Wherein the quantities in the formula are: the character data complexity at the j-th minute on the i-th day; the amount of character data at the j-th minute on the i-th day; the character data type amount at the j-th minute on the i-th day; and a normalization function.
Preferably, the calculation formula of the time overlap ratio of each minimum point is as follows:
Wherein the quantities in the formula are: the time coincidence degree of the z-th minimum point; n, the number of days; the character data complexity of the z-th minimum point on the i-th day; and the time difference between the z-th minimum point on the i-th day and the minimum point with the same number on the adjacent day.
Preferably, the analyzing to obtain the distribution density tendency of each character data in each time stage based on the current all character data includes:
numbering all character data in each time stage and obtaining a numbering set of different character data;
and obtaining the distribution density tendency of each character data according to the number set of the different character data.
Preferably, the calculation formula of the distribution density tendency of each character data is:
Wherein the quantities in the formula are: the distribution density tendency of the s-th character data; G, the number of data in the number set corresponding to the current s-th character data; the number difference between the g-th data and its adjacent data in that number set; the number difference between the (g+1)-th data and its adjacent data in that number set; and a normalization function.
Preferably, the analyzing to obtain the security performance of each character data in each time stage based on the current all character data includes:
In all the current character data, selecting U/2 character data on each of the left and right sides of each character data according to the time sequence, to obtain U character contrast data, wherein U is a preset value;
And obtaining the safety expression degree of each character data by analyzing the actual frequency of each character data and the corresponding character comparison data.
Preferably, the calculation formula of the security expression degree of each character data is:
Wherein the quantities in the formula are: U, the amount of character contrast data determined based on the s-th character data; the actual frequency of the s-th character data; the actual frequency of each character contrast data of the s-th character data; and a normalization function.
Preferably, the calculation formula of the coding priority of each character data is as follows:
Wherein the quantities in the formula are: the coding priority of the s-th character data; the actual frequency of the s-th character data in the current time stage; the distribution density tendency of the s-th character data; and the security performance of the s-th character data.
An internet-based information platform data security transmission system, comprising:
The data acquisition module is used for acquiring platform data and preprocessing the platform data to obtain character data, wherein the platform data comprises historical data and current data of a plurality of days, and the character data comprises all historical character data and all current character data of a plurality of days based on a time sequence;
A code table determining module for determining a code table based on the character data;
the compression transmission module is used for compressing and transmitting the current data according to the coding table;
wherein the coding table determining module includes:
a time-phase dividing unit configured to divide each day into a plurality of time phases based on all the history character data based on the time series for the plurality of days;
The data analysis unit is used for analyzing and obtaining the actual frequency, the distribution density tendency and the safety performance of each character data in each time stage based on the current all character data;
And the coding table determining unit is used for determining the coding priority of each character data in each time stage according to the actual frequency, trend expression and safety expression of each character data in each time stage to form the coding table.
The internet-based information platform data security transmission method and system provided by the invention have the following beneficial effects: different time phases are divided by analyzing the historical data, and within each time phase the actual frequency of the character data is comprehensively adjusted in consideration of the data distribution performance and the security, so that a dynamic coding table is constructed, transmission efficiency is improved, and the security of the data is ensured.
The method avoids the security and adaptation problems that a traditional fixed coding table causes for the codes of character data whose behavior varies by stage. The dynamic coding tables increase the complexity of the codes and thereby ensure security, and within each dynamic coding table the coding length of each character data is determined from the actual data performance of the current time stage, so that the security of the data is ensured while the transmission efficiency after compression is improved.
Detailed Description
In order to further describe the technical means adopted by the invention to achieve its intended purpose and the effects thereof, the internet-based information platform data security transmission method and system according to the invention are described in detail below with reference to the accompanying drawings and the preferred embodiments. In the following description, different references to "one embodiment" or "another embodiment" do not necessarily refer to the same embodiment. Furthermore, the particular features, structures, or characteristics of one or more embodiments may be combined in any suitable manner.
Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs.
The invention provides a method and a system for safely transmitting data of an information platform based on the Internet, which are specifically described below with reference to the accompanying drawings.
Example 1
Referring to fig. 1, a method flowchart of an internet-based information platform data security transmission method according to an embodiment of the present invention is shown, including:
Step S1, acquiring platform data, preprocessing to obtain character data, wherein the platform data comprises historical data and current data of a plurality of days, and the character data comprises all historical character data and all current character data of a plurality of days based on a time sequence;
step S2, determining a coding table based on the character data;
and step S3, compressing and transmitting the current data according to the coding table.
As shown in fig. 2, the step S2 includes:
Step S21, dividing each day into a plurality of time phases based on all historical character data based on the time sequence for a plurality of days;
step S22, analyzing and obtaining the actual frequency, the distribution density tendency degree and the safety performance degree of each character data in each time stage based on the current all character data;
And S23, determining the coding priority of each character data in each time stage according to the actual frequency, trend expression and safety expression of each character data in each time stage, and forming the coding table.
According to the internet-based information platform data security transmission method, different time phases are divided by analyzing the historical data, and within each time phase the actual frequency of the character data is comprehensively adjusted in consideration of the data distribution performance and the security, so that a dynamic coding table is constructed, and the security of the data is ensured while transmission efficiency is improved.
The method avoids the security and adaptation problems that a traditional fixed coding table causes for the codes of character data whose behavior varies by stage. The dynamic coding tables increase the complexity of the codes and thereby ensure security, and within each dynamic coding table the coding length of each character data is determined from the actual data performance of the current time stage, so that the security of the data is ensured while the transmission efficiency after compression is improved.
The invention mainly aims to divide data into different time phases through historical data of an Internet information platform, determine a dynamic coding table based on each current time phase, compress the data so as to improve transmission efficiency and ensure the safety of the data.
Aiming at the multiple problems caused by a static coding table when the traditional Huffman coding processes the internet information platform data, the method determines a plurality of dynamic coding tables by dividing different time phases, so that the complexity of coding can be improved while the data is compressed, the efficiency of data transmission is improved, and meanwhile, certain safety is ensured.
The following describes the steps in detail:
In step S1, the acquired platform data is data of an internet information platform. The historical data of a plurality of days may be the data of one week (i.e., seven days), and all the historical character data of the plurality of days based on the time sequence is all the character data of every minute, in time order. Specifically, in this step the time-series historical character data of the last week (i.e., seven days) is acquired from the internet information platform, and the current character data is acquired in real time for subsequent processing.
In step S2, it can be understood that in daily use the internet information platform presents different amounts of information data in different time periods, owing to users' work and rest routines and to hot events and topics. Different time phases can therefore be divided according to the amount of information data in the historical data, and a different coding table can be constructed for each time phase. Meanwhile, for the data of each time stage, the coding can be adjusted on the basis of the actual occurrence frequency of the character data: high-frequency data carries a large amount of information, so the data transmission rate is emphasized; low-frequency data carries a small amount of information, so the security of data transmission is emphasized. The actual character frequency is thus corrected, and on this basis the coding table corresponding to each current time phase is determined in real time.
Thus, as shown in fig. 2, step S2 includes steps S21, S22, and S23.
In step S21, it can be understood that, in the actual operation of the internet information platform, usage differs between time periods according to users' habits such as work and rest, so that the corresponding amounts of data information differ as well. Therefore, the time phases are divided according to the data information recently presented by the internet information platform, and this division is used as the reference for dividing the current data into time phases.
Preferably, as shown in fig. 3, the step S21 includes:
Step S211, acquiring the character data quantity and the character data type quantity at each moment of each day based on all the historical character data based on the time sequence for a plurality of days;
Step S212, obtaining the complexity of the character data at each time of each day based on the character data amount and the character data type amount at each time of each day;
step S213, determining each minimum value point according to the complexity of the character data at each moment of each day and calculating the time coincidence degree of each minimum value point;
Step S214, determining a time phase demarcation point according to the time coincidence degree of each minimum point, and dividing each day into a plurality of time phases based on the time phase demarcation point.
In step S211, the amount of character data contained in the information data of each minute of each day of the last week is counted first, giving the amount of character data at the j-th minute on the i-th day of the last week. This amount does not take repetition into account: even if the same character data appears multiple times, every occurrence is counted. Then, from the character data present in each minute of each day of the last week, the amount of non-repeated character data is counted, giving the character data type amount at the j-th minute on the i-th day of the last week.
In step S212, the two quantities described above are analyzed to determine the character data complexity of each minute of each day in the last week. The complexity describes, with one minute as the unit, how much character data is present per unit time; it is related not only to the amount of character data but also to the character data type amount. The larger the amount of character data and the larger the character data type amount in a minute, the higher the amount of data information that the internet information platform presents in that minute as a result of users' usage.
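The two per-minute counts of step S211 can be illustrated with a minimal sketch. The input format (tuples of day, minute, and text) and the function name `per_minute_counts` are assumptions made for illustration only; the source does not specify how the platform data is stored.

```python
from collections import defaultdict

def per_minute_counts(records):
    """Return {(day, minute): (amount, type_amount)}.

    records: iterable of (day, minute, text) tuples (hypothetical input format).
    amount counts every character occurrence, including repeats;
    type_amount counts distinct characters only.
    """
    amount = defaultdict(int)
    kinds = defaultdict(set)
    for day, minute, text in records:
        amount[(day, minute)] += len(text)
        kinds[(day, minute)].update(text)
    return {key: (amount[key], len(kinds[key])) for key in amount}

records = [(1, 0, "aab"), (1, 0, "bc"), (1, 1, "zzzz")]
counts = per_minute_counts(records)
print(counts[(1, 0)])  # (5, 3)
print(counts[(1, 1)])  # (4, 1)
```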
Preferably, the calculation formula of the character data complexity at each time of each day is as follows:
Wherein the quantities in the formula are: the character data complexity at the j-th minute on the i-th day; the amount of character data at the j-th minute on the i-th day; the character data type amount at the j-th minute on the i-th day; and a normalization function.
In step S213, since the final objective is to determine different time phases, we mark the minimum points of the per-minute character data complexity of each day in the last week (a minimum generally corresponds to the end of one phase and the beginning of the next), and analyze the time difference of each minimum point between adjacent days: the smaller the time difference, the more likely the minimum point is a demarcation point between time phases. Specifically, we traverse the per-minute character data complexity of each day to determine and mark each minimum point, and record the time (in minutes) corresponding to each minimum point.
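The traversal for minimum points can be sketched as follows. The sketch takes an already computed per-minute complexity series as input and treats a point as a minimum only if it is strictly lower than both neighbours; the handling of flat runs and of the first and last minute is an assumption, since the source does not state it.

```python
def local_minima(complexity):
    """Return the indices (minutes) whose value is strictly below both neighbours."""
    return [
        j
        for j in range(1, len(complexity) - 1)
        if complexity[j] < complexity[j - 1] and complexity[j] < complexity[j + 1]
    ]

# Hypothetical complexity values for nine consecutive minutes of one day.
series = [0.8, 0.3, 0.6, 0.9, 0.2, 0.2, 0.7, 0.1, 0.5]
print(local_minima(series))  # [1, 7]
```

Under this strict definition the flat run at minutes 4 and 5 is not marked; a different tie rule would mark one of them.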
Preferably, the calculation formula of the time overlap ratio of each minimum point is as follows:
Wherein the quantities in the formula are: the time coincidence degree of the z-th minimum point; n, the number of days; the character data complexity of the z-th minimum point on the i-th day; and the time difference between the z-th minimum point on the i-th day and the minimum point with the same number on the adjacent day. In the formula, 1 is added to avoid the case where the denominator is 0.
As a whole, the formula determines the time coincidence degree of each minimum point by analyzing the time difference between the minimum points with the same sequence number on each day and on its adjacent day; the higher the time coincidence degree, the more likely the current minimum point is a demarcation point between different time phases.
In step S214, M (e.g., 8) time phases may be preset, so that M-2 minimum point time overlap ratios are selected to determine M-2 time phase boundary points, so that time can be divided into M time phases, wherein each time phase maintains a relatively balanced data information amount.
In step S22, it can be understood that in step S21 each time stage has been divided and determined based on the data information performance of the last week; these time stages are used as the stages of the current data information, and the current data is analyzed accordingly. The character data in each current time stage has different data characteristics. Conventionally only the actual frequency of the character data is considered; here we additionally analyze two associated characteristics, namely the distribution density tendency and the security performance. The former reflects that the character data may exhibit a regular distribution along the time sequence, such as a denser distribution. The latter reflects that the frequency of adjacent data in the time sequence is closely related to the security of high-frequency character data (high frequency here refers to the actual frequency of the character data), because high-frequency data distributed together is more easily cracked than low-frequency data distributed together.
The analysis is performed here for each current time phase:
(1) Statistics of actual frequency of each character data
The character data in the current time stage is traversed to count the number of occurrences of each character data, together with the total number of occurrences of all character data in the current time stage, denoted as C.
The actual frequency of the s-th character data in the current time stage is determined from these two quantities, as the proportion of its number of occurrences in the total C.
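As a minimal sketch of this statistic, the actual frequency can be computed as each character's occurrence count divided by the total count C. The function name `actual_frequencies` is hypothetical, and treating each character of a string as one character data is an assumption about the preprocessing output.

```python
from collections import Counter

def actual_frequencies(chars):
    """Map each character to (its occurrence count) / (total count C)."""
    counts = Counter(chars)
    total = sum(counts.values())  # C in the text
    return {ch: n / total for ch, n in counts.items()}

freq = actual_frequencies("aabac")
print(freq["a"])  # 0.6
```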
(2) Determining distribution-dense trend performance for each character data
The method comprises the steps of obtaining distribution density tendency of each character data in each time stage based on analysis of all the current character data, and preferably comprises the steps of numbering all the character data in each time stage, obtaining numbered sets of different character data, and obtaining the distribution density tendency of each character data according to the numbered sets of the different character data.
That is, all character data in the current time stage are numbered 1, 2, 3, ... in sequence, and for each character data the set of its numbers is collected by traversal. Thus, for each character data we obtain its number set in the current time stage. Then, for the number set corresponding to each character data, we take the difference between adjacent numbers, i.e., the number difference between the g-th data and its adjacent data in the number set of the s-th character data, referred to simply as the "g-th number difference".
For example, suppose there are 10 character data in total, numbered 1, 2, 3, ..., 10, where numbers 1, 3, 5, 7 and 9 are character data a (the first character data) and numbers 2, 4, 6, 8 and 10 are character data b (the second character data). The character data are then divided into two sets: the first character data corresponds to the number set (1, 3, 5, 7, 9), and the second character data corresponds to the number set (2, 4, 6, 8, 10). In this case, every adjacent number difference in both sets equals 2.
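The numbering and differencing can be reproduced with a short sketch using the ten-character example. The function names are hypothetical; only the numbering from 1 and the adjacent differences come from the text.

```python
from collections import defaultdict

def number_sets(chars):
    """Number the characters 1, 2, 3, ... and group the numbers by character."""
    sets = defaultdict(list)
    for number, ch in enumerate(chars, start=1):
        sets[ch].append(number)
    return dict(sets)

def adjacent_differences(numbers):
    """Differences between consecutive numbers in one number set."""
    return [b - a for a, b in zip(numbers, numbers[1:])]

sets = number_sets("ababababab")
print(sets["a"])                        # [1, 3, 5, 7, 9]
print(sets["b"])                        # [2, 4, 6, 8, 10]
print(adjacent_differences(sets["a"]))  # [2, 2, 2, 2]
```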
Preferably, the calculation formula of the distribution density tendency of each character data is:
Wherein the quantities in the formula are: the distribution density tendency of the s-th character data; G, the number of data in the number set corresponding to the current s-th character data; the number difference between the g-th data and its adjacent data in that number set; the number difference between the (g+1)-th data and its adjacent data in that number set; and a normalization function. In this calculation formula, 1 is added to change the final range from [-1,1] to [0,2], and the factor 1/2 changes the final range to [0,1].
As a whole, the formula takes one term as the reference value of the distribution density, and analyzes the trend of adjacent number differences (increasing or decreasing) by traversing the number set step by step; the magnitude of the value represents the degree of increase or decrease, so that the distribution density tendency of the character data is determined on the basis of the reference value.
(3) Determining security performance of individual character data
The security performance of each character data in each time stage is obtained by analyzing all the current character data. Preferably, in all the current character data, U/2 character data are selected on each of the left and right sides of each character data according to the time sequence, to obtain U character contrast data, wherein U is a preset value; the security performance of each character data is then obtained by analyzing the actual frequency of the character data and of its corresponding character contrast data.
For each character data, we select U/2 character data on each of its left and right sides in the time sequence and record them as character contrast data, which are analyzed together with the current character data to obtain the security performance. When the actual frequencies of the current character data and of its character contrast data are both high and the differences between them are small, high-frequency data are distributed adjacent to one another; such a case is generally highly repetitive, and the risk of cracking may be higher. In some specific embodiments, U may be 30.
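The selection of character contrast data can be sketched as a window of U/2 neighbours on each side of the current position. The function name `contrast_data` is hypothetical, and truncating the window at the start and end of the sequence is an assumption; the source does not say how positions near the edges are handled.

```python
def contrast_data(chars, index, U=30):
    """Return up to U/2 neighbours on each side of chars[index], in time order."""
    half = U // 2
    left = chars[max(0, index - half):index]
    right = chars[index + 1:index + 1 + half]
    return list(left) + list(right)

seq = "abcdefghij"
print(contrast_data(seq, 4, U=4))  # ['c', 'd', 'f', 'g']
print(contrast_data(seq, 0, U=4))  # ['b', 'c']
```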
Preferably, the calculation formula of the security expression degree of each character data is:
Wherein the quantities in the formula are: U, the amount of character contrast data determined based on the s-th character data; the actual frequency of the s-th character data; the actual frequency of each character contrast data of the s-th character data; and a normalization function. It will be appreciated that the actual frequency of the character contrast data is calculated in the same way as the actual frequency of the character data described above. In the formula, 0.01 is added to avoid the case where the denominator is 0.
In step S23, the actual frequency, the distribution density tendency and the security performance of the character data have been analyzed for each stage. Since different stages correspond to different data information performance in the actual operation of the internet information platform, the coding priority of the character data is obtained by adjusting the actual frequency of the character data with the distribution density tendency and the security performance: the higher the actual frequency, the more the transmission efficiency is taken into account, and the lower the actual frequency, the more the security is taken into account. In this way, the coding priority of the character data is determined.
Preferably, the calculation formula of the coding priority of each character data is as follows:
Wherein the quantities in the formula are: the coding priority of the s-th character data; the actual frequency of the s-th character data in the current time stage; the distribution density tendency of the s-th character data; and the security performance of the s-th character data.
Thus, the coding priority of each character data in each current time stage is determined. The larger the coding priority, the shorter the corresponding coding length, and the corresponding coding table can be determined for each time stage.
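One way to turn coding priorities into code lengths is sketched below. It assumes a Huffman-style construction that uses the priority in place of the frequency as the merge weight; the source states only that a larger priority corresponds to a shorter code, so this construction, the function name `build_code_table`, and the sample priority values are illustrative assumptions.

```python
import heapq
from itertools import count

def build_code_table(priorities):
    """Build a prefix-free bit-string code from {character: positive priority}."""
    if len(priorities) == 1:
        return {ch: "0" for ch in priorities}
    tie = count()  # unique tie-breaker so heap entries never compare dicts
    heap = [(p, next(tie), {ch: ""}) for ch, p in priorities.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        p0, _, low = heapq.heappop(heap)   # lowest combined priority
        p1, _, high = heapq.heappop(heap)  # second lowest
        merged = {ch: "0" + code for ch, code in low.items()}
        merged.update({ch: "1" + code for ch, code in high.items()})
        heapq.heappush(heap, (p0 + p1, next(tie), merged))
    return heap[0][2]

# Hypothetical priorities for four characters in one time stage.
table = build_code_table({"a": 0.5, "b": 0.25, "c": 0.15, "d": 0.10})
print(sorted((ch, len(code)) for ch, code in table.items()))
# [('a', 1), ('b', 2), ('c', 3), ('d', 3)]
```

Running the same function once per time stage, with that stage's priorities, yields one table per stage.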
In step S3, the data of the internet information platform is compressed and transmitted according to the encoding table determined in S2. It can be understood that by the above method, a plurality of dynamic coding tables can be determined, so that the transmission efficiency is improved and the safety of data in the dynamic coding tables is ensured.
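The use of a per-stage table at transmission time can be sketched as follows. The boundary minutes, the three tiny tables, and the function names are hypothetical; the sketch also assumes that the stage boundaries are stored as sorted minutes of the day and that the receiver holds the same tables.

```python
from bisect import bisect_right

def phase_of(minute, boundaries):
    """Index of the time stage containing the given minute of the day."""
    return bisect_right(boundaries, minute)

def encode(text, minute, boundaries, tables):
    """Encode text with the coding table of the stage that contains minute."""
    table = tables[phase_of(minute, boundaries)]
    return "".join(table[ch] for ch in text)

boundaries = [360, 1080]  # hypothetical demarcation points at 06:00 and 18:00
tables = [
    {"a": "0", "b": "1"},  # stage 0: before 06:00
    {"a": "1", "b": "0"},  # stage 1: 06:00 to 18:00
    {"a": "0", "b": "1"},  # stage 2: after 18:00
]
print(encode("aab", 400, boundaries, tables))  # 110
print(encode("aab", 100, boundaries, tables))  # 001
```

The same text produces different bit strings in different stages, which is the effect the dynamic coding tables are intended to have.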
Example two
The invention also provides an information platform data safety transmission system based on the Internet, as shown in fig. 4, the system comprises:
The data acquisition module 10 is used for acquiring platform data and preprocessing the platform data to obtain character data, wherein the platform data comprises historical data and current data of a plurality of days, and the character data comprises all historical character data and all current character data of a plurality of days based on a time sequence;
A code table determination module 20 for determining a code table based on the character data;
And the compression transmission module 30 is used for compressing and transmitting the current data according to the coding table.
Wherein the coding table determining module 20 includes:
a time-phase dividing unit 21 for dividing each day into a plurality of time phases based on all the history character data based on the time series for the plurality of days;
A data analysis unit 22, configured to analyze the current all character data to obtain an actual frequency, a distribution density tendency, and a security performance of each character data in each time period;
and a coding table determining unit 23, configured to determine the coding priority of each character data in each time period according to the actual frequency, trend performance and safety performance of each character data in each time period, so as to form the coding table.
The embodiment of the internet-based information platform data security transmission system of the present invention includes all technical features of the embodiments of the internet-based information platform data security transmission method, and its description and explanation are basically the same as those of the embodiments of the method, so they are not repeated here.
It should be noted that the sequence of the embodiments of the present invention is only for description, and does not represent the advantages and disadvantages of the embodiments. The processes depicted in the accompanying drawings do not necessarily require the particular order shown, or sequential order, to achieve desirable results. In some embodiments, multitasking and parallel processing are also possible or may be advantageous.
In this specification, each embodiment is described in a progressive manner, and identical and similar parts of each embodiment are all referred to each other, and each embodiment mainly describes differences from other embodiments.