CN117176178B

CN117176178B - Data processing method of photoelectric communication system

Info

Publication number: CN117176178B
Application number: CN202311451245.1A
Authority: CN
Inventors: 张慧阳; 侯立明
Original assignee: Guangdong Huayida Communication Technology Co ltd; Shenzhen Huayida Communication Equipment Co ltd
Current assignee: Guangdong Huayida Communication Technology Co ltd; Shenzhen Huayida Communication Equipment Co ltd
Priority date: 2023-11-03
Filing date: 2023-11-03
Publication date: 2024-04-12
Anticipated expiration: 2043-11-03
Also published as: CN117176178A

Abstract

The invention relates to the technical field of data compression, and in particular to a data processing method of a photoelectric communication system. The method comprises: obtaining data to be processed in a photoelectric communication system; obtaining a length interval of a distribution accumulation table and taking any integer in the interval as a length value to be measured; obtaining a basic variable according to the difference between the length value to be measured and an initial length value; obtaining a deviation degree value according to the proportion of the distribution accumulation table length allocated to each type of character and the frequency of that character, and then obtaining a difference variable; obtaining an optimization degree value of the length value to be measured from the basic variable and the difference variable; and completing the compression of the data to be processed based on RANS entropy coding. By analyzing the trade-off between the increase and the decrease in coding length that follows a change in the distribution accumulation table length, and screening the candidate length values accordingly, the embodiment of the invention obtains the distribution accumulation table length value and thereby improves data compression efficiency.

Description

Data processing method of photoelectric communication system

Technical Field

The invention relates to the technical field of data compression, in particular to a data processing method of an optoelectronic communication system.

Background

Communication technology is maturing, and photoelectric communication systems are widely used. Transmitting digital or analog signals as optical signals achieves a higher data transmission rate; however, as transmission demand grows, the amount of data to be transmitted in a photoelectric communication system increases, so the data needs to be compressed.

The distribution accumulation table is one of the key parts in implementing RANS entropy coding: it records the cumulative distribution of each symbol and provides the basic data for calculating the coding length of the symbol. In the prior art, when RANS entropy coding is used for data compression, the size of the distribution accumulation table is usually determined by the number of data types. However, because the number of data types in a photoelectric communication system is large and the occurrence frequency of each type of data is difficult to predict, a table size determined by the number of data types can cause a large deviation between the length proportion allocated to each type of character in the distribution accumulation table and the frequency of that character in the photoelectric communication system data, so that the final data compression efficiency is low.

Disclosure of Invention

In order to solve the technical problem that the final data compression efficiency is low because the length proportion of various characters in the distribution accumulation table and the frequency of the characters in the photoelectric communication system data are greatly deviated due to the size of the distribution accumulation table determined by the number of data types, the invention aims to provide a data processing method of the photoelectric communication system, which adopts the following technical scheme:

the invention provides a data processing method of an optoelectronic communication system, which comprises the following steps:

acquiring data to be processed in a photoelectric communication system; acquiring the number of character types, the total number of characters and the character frequency of each type of character of the data to be processed;

obtaining a length interval of a distribution accumulation table according to the character type number and the character frequency of each type of characters; obtaining an initial length value of a distribution accumulation table according to the character type number;

taking any integer in the length interval as a distribution accumulation table length value to be measured (the length value to be measured); obtaining a basic variable according to the difference between the length value to be measured and the initial length value;

obtaining an initial deviation degree value according to the initial length value, the total number of characters and the number of character types; obtaining a deviation degree value to be measured according to the length value to be measured, the total number of characters and the number of character types; obtaining a difference variable according to the difference between the initial deviation degree value and the deviation degree value to be measured, and the number of character types;

obtaining an optimization degree value of the length value to be measured according to the basic variable and the difference variable; obtaining an optimal distribution accumulation table length value according to the optimization degree values of all distribution accumulation table length values in the length interval; and completing the compression of the data to be processed according to the length value of the optimal distribution accumulation table.

Further, the method for acquiring the length interval comprises the following steps:

taking the character type number as the lower limit of the length interval;

obtaining the greatest common divisor of the character frequencies (occurrence counts) of all types of characters, and taking the ratio of the total number of characters to the greatest common divisor as the upper limit of the length interval;

and obtaining the length interval of the distribution accumulation table according to the lower limit and the upper limit of the length interval.

Further, the formula model of the basic variable includes:

$$B = \lceil \log_2 L \rceil - \lceil \log_2 L_0 \rceil$$

wherein $B$ represents the basic variable, $L$ represents the length value to be measured, $L_0$ represents the initial length value, $\log_2$ represents a logarithmic function with base 2, and $\lceil \cdot \rceil$ represents a round-up function.

Further, the method for acquiring the initial deviation degree value comprises the following steps:

obtaining the relative frequency of each type of character as the ratio of its character frequency (occurrence count) to the total number of characters;

taking the value obtained by multiplying the initial length value by the relative frequency of each type of character as a first initial allocation value of that type of character; taking the decimal part of the first initial allocation value as a first decimal, and rounding all the first initial allocation values to the nearest integer to obtain first final allocation values;

performing a length adjustment operation on the first final allocation value of each type of character according to the difference between the sum of the first final allocation values of all types of characters and the initial length value, to obtain a first allocation length value;

taking the ratio of the first allocation length value of each type of character to the initial length value as a first length proportion value of that type of character; taking the first length proportion value of each type of character as the base and its relative frequency as the exponent, and taking the resulting power as the initial fitness of that type of character; multiplying together the initial fitness of all types of characters to obtain an initial fitness value; and obtaining an initial deviation degree value according to the initial fitness value, wherein the initial fitness value and the initial deviation degree value are negatively correlated.

Further, performing the length adjustment operation on the first final allocation value of each type of character according to the difference between the sum of the first final allocation values of all types of characters and the initial length value, to obtain a first allocation length value, includes:

when the sum of the first final allocation values of all types of characters is greater than the initial length value, subtracting 1 from the first final allocation value corresponding to the minimum of all first decimals greater than or equal to 0.5; when the sum of the first final allocation values of all types of characters is smaller than the initial length value, adding 1 to the first final allocation value corresponding to the maximum of all first decimals smaller than 0.5;

and repeating this adjustment until the sum of the first final allocation values of all types of characters is equal to the initial length value, at which point the length adjustment operation ends and the first allocation length value of each type of character is obtained.

Further, the method for obtaining the deviation degree value to be measured comprises the following steps:

taking the value obtained by multiplying the length value to be measured by the relative frequency of each type of character (its occurrence count divided by the total number of characters) as a second initial allocation value of that type of character; taking the decimal part of the second initial allocation value as a second decimal, and rounding all the second initial allocation values to the nearest integer to obtain second final allocation values;

performing a length adjustment operation on the second final allocation value of each type of character according to the difference between the sum of the second final allocation values of all types of characters and the length value to be measured, to obtain a second allocation length value;

taking the ratio of the second allocation length value of each type of character to the length value to be measured as a second length proportion value of that type of character; taking the second length proportion value of each type of character as the base and its relative frequency as the exponent, and taking the resulting power as the fitness to be measured of that type of character; multiplying together the fitness to be measured of all types of characters to obtain a fitness value to be measured; and obtaining a deviation degree value to be measured according to the fitness value to be measured, wherein the fitness value to be measured and the deviation degree value to be measured are negatively correlated.

Further, performing the length adjustment operation on the second final allocation value of each type of character according to the difference between the sum of the second final allocation values of all types of characters and the length value to be measured, to obtain a second allocation length value, includes:

when the sum of the second final allocation values of all types of characters is greater than the length value to be measured, subtracting 1 from the second final allocation value corresponding to the minimum of all second decimals greater than or equal to 0.5; when the sum of the second final allocation values of all types of characters is smaller than the length value to be measured, adding 1 to the second final allocation value corresponding to the maximum of all second decimals smaller than 0.5;

and repeating this adjustment until the sum of the second final allocation values of all types of characters is equal to the length value to be measured, at which point the length adjustment operation ends and the second allocation length value of each type of character is obtained.

Further, the formula model of the difference variable includes:

wherein the formula relates the difference variable to the initial deviation degree value, the deviation degree value to be measured and the number of character types, and uses a logarithmic function with base 2.

Further, the method for obtaining the optimization degree value comprises the following steps:

adding the value of the basic variable and the value of the difference variable, and performing negative correlation mapping on the sum to obtain the optimization degree value.
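The combination step can be sketched as follows. The text does not name the negative correlation mapping, so `exp(-x)` is used here purely as an assumption; any strictly decreasing function would select the same length. The function names and the candidate values in the usage example are hypothetical inputs chosen for illustration, not results from the patent.

```python
import math


def optimization_degree(basic_variable: float, difference_variable: float) -> float:
    """Map the sum of the two variables to a score that grows as the sum shrinks.

    exp(-x) is an assumed choice of negative correlation mapping.
    """
    return math.exp(-(basic_variable + difference_variable))


def best_length(candidates: dict) -> int:
    """Pick the table length with the largest optimization degree value.

    candidates maps each length value to be measured to its
    (basic_variable, difference_variable) pair.
    """
    return max(candidates, key=lambda length: optimization_degree(*candidates[length]))


# Hypothetical inputs: three candidate lengths with made-up variable values.
example = {8: (0, 0.0), 12: (1, -1.5), 16: (1, -0.2)}
chosen = best_length(example)
```

Because the mapping is monotonic, choosing the maximum optimization degree value is equivalent to choosing the minimum sum of the basic variable and the difference variable.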

Further, the compressing the data to be processed according to the optimal distribution accumulation table length value includes:

acquiring an optimal distribution accumulation table according to the optimal distribution accumulation table length value;

and completing the compression of the data to be processed according to the optimal distribution accumulation table based on the RANS entropy coding.
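To illustrate how a distribution accumulation table of arbitrary length drives the coder, the following is a minimal textbook-style rANS sketch. It is not the patent's implementation: it keeps the whole state in one Python big integer and omits the renormalization that a practical coder needs, and the function names and the slot allocation in the usage example are assumptions made for illustration.

```python
from itertools import accumulate


def build_cumulative(slots: dict):
    """Return (start offsets, table length M) for a symbol -> allocated length map."""
    symbols = sorted(slots)
    # Prefix sums: the start of each symbol's range in the table.
    offsets = accumulate([0] + [slots[s] for s in symbols[:-1]])
    return dict(zip(symbols, offsets)), sum(slots.values())


def rans_encode(message: str, slots: dict) -> int:
    """Encode the message into a single integer state (no renormalization)."""
    starts, table_length = build_cumulative(slots)
    state = 1
    # rANS decodes in the reverse of the encoding order, so encode back to front.
    for symbol in reversed(message):
        freq, start = slots[symbol], starts[symbol]
        state = (state // freq) * table_length + start + (state % freq)
    return state


def rans_decode(state: int, count: int, slots: dict) -> str:
    """Recover `count` symbols from the integer state."""
    starts, table_length = build_cumulative(slots)
    out = []
    for _ in range(count):
        slot = state % table_length
        # Find the symbol whose range [start, start + length) contains the slot.
        symbol = next(s for s in slots if starts[s] <= slot < starts[s] + slots[s])
        state = slots[symbol] * (state // table_length) + slot - starts[symbol]
        out.append(symbol)
    return "".join(out)


# Illustrative allocation: table length 6 shared by three symbols.
slots = {"a": 3, "b": 2, "c": 1}
encoded = rans_encode("abacab", slots)
decoded = rans_decode(encoded, 6, slots)
```

Every symbol that occurs in the message must have an allocated length of at least 1, otherwise the encoding step divides by zero. `encoded.bit_length()` gives the size of the compressed state in bits.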

The invention has the following beneficial effects:

The invention aims to adjust the length value of the distribution accumulation table so as to obtain the optimal distribution accumulation table length value, thereby improving data compression efficiency. First, the data to be processed in the photoelectric communication system is acquired, together with its number of character types, total number of characters and the character frequency of each type of character. The length interval of the distribution accumulation table is then obtained, and the distribution accumulation table length values in the interval are traversed to obtain the optimization degree value of each one, that is, of each length value to be measured, from which the optimal distribution accumulation table length value is selected. The optimization degree value is analyzed from two aspects. First, because a change in the distribution accumulation table length value affects the coding length of the data, the difference between the length value to be measured and the initial length value is analyzed to obtain a basic variable. Second, because the number of character types contained in the data of a photoelectric communication system is not fixed and the occurrence frequency of each type of character is difficult to predict, a change in the distribution accumulation table length value also affects the deviation between the length proportion of each type of character in the table and its frequency; the deviation degree values corresponding to the length value to be measured and to the initial length value are therefore obtained separately, each reflecting how closely the length proportion of each type of character fits its frequency under that table length, and the difference variable is obtained from the difference between the two deviation degree values. The basic variable and the difference variable are then combined to obtain the optimization degree value of the length value to be measured relative to the initial length value, the optimal distribution accumulation table length value is selected based on the optimization degree values, and the compression of the data to be processed is completed. Increasing the distribution accumulation table length value increases the data coding length, but reduces the deviation between the proportion of each type of character in the table and its frequency in the data; by analyzing the relationship between these two effects and combining them, the invention selects the optimal distribution accumulation table length value and further improves the compression efficiency of the data.

Drawings

In order to more clearly illustrate the embodiments of the invention or the technical solutions and advantages of the prior art, the following description will briefly explain the drawings used in the embodiments or the description of the prior art, and it is obvious that the drawings in the following description are only some embodiments of the invention, and other drawings can be obtained according to the drawings without inventive effort for a person skilled in the art.

Fig. 1 is a flowchart of a data processing method of an optoelectronic communication system according to an embodiment of the present invention.

Detailed Description

In order to further describe the technical means and effects adopted by the present invention to achieve the preset purpose, the following detailed description refers to the specific implementation, structure, characteristics and effects of a data processing method of an optoelectronic communication system according to the present invention with reference to the accompanying drawings and preferred embodiments. In the following description, different "one embodiment" or "another embodiment" means that the embodiments are not necessarily the same. Furthermore, the particular features, structures, or characteristics of one or more embodiments may be combined in any suitable manner.

Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs.

The following specifically describes a specific scheme of a data processing method of an optoelectronic communication system provided by the invention with reference to the accompanying drawings.

Referring to fig. 1, a method flowchart of a data processing method of an optoelectronic communication system according to an embodiment of the present invention is shown, and the method includes the following steps:

Step S1: acquiring data to be processed in a photoelectric communication system; obtaining the number of character types, the total number of characters and the character frequency of each type of character of the data to be processed.

The data of the photoelectric communication system refers to the collected and recorded data that needs to be transmitted in the photoelectric communication system. This data includes, but is not limited to, data frames, control information, application data and data protocols, which may be used for information transfer and network communication between users, as well as to help monitor and maintain the operating state and performance of the system. However, as the amount of data to be transmitted in a photoelectric communication system increases, the data needs to be compressed before it is transmitted. Range Asymmetric Numeral Systems (RANS) entropy coding is typically used for the compression, and the distribution accumulation table, one of the key parts in implementing RANS entropy coding, records the cumulative distribution of each symbol and provides the basic data for calculating the coding length of the symbol.

In the embodiment of the invention, the length value of the distribution accumulation table is evaluated by analyzing the data coding length change caused by the length change of the distribution accumulation table and the deviation relation between the length proportion occupied by various characters and the frequency thereof, and then the length value of the optimal distribution accumulation table is obtained, thereby improving the compression efficiency.

First, the data to be processed in the photoelectric communication system is acquired, for example by using a sensing technology or by reading the data from a memory. Because the data in a photoelectric communication system consists mainly of characters, the acquired data to be processed can be sorted in dictionary order to obtain its character sequence.

Then, according to the character sequence, the number of character types, the total number of characters and the character frequency of each type of character are obtained. The number of character types is denoted $n$, the total number of characters is denoted $N$, and the character frequency (occurrence count) of the $i$-th type of character is denoted $f_i$.

With the data to be processed in the photoelectric communication system, its number of character types, its total number of characters and the character frequency of each type of character obtained, the subsequent analysis can proceed.
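The statistics of step S1 can be gathered in a few lines. This is a minimal sketch, assuming the data to be processed is available as a Python string; the function name `character_statistics` is hypothetical.

```python
from collections import Counter


def character_statistics(data: str):
    """Return (number of character types, total characters, per-type counts)."""
    counts = Counter(data)
    type_count = len(counts)           # number of character types
    total = sum(counts.values())       # total number of characters
    # Sort by character so the result follows dictionary order.
    return type_count, total, dict(sorted(counts.items()))


type_count, total, counts = character_statistics("abracadabra")
```

For the sample string, the per-type counts are the occurrence counts that later steps divide by the total number of characters to obtain relative frequencies.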

Step S2: obtaining a length interval of a distribution accumulation table according to the number of character types and the character frequency of each type of character; and obtaining an initial length value of the distribution accumulation table according to the character category number.

In the conventional case, RANS entropy encoding obtains the distribution accumulation table length value directly from the number of character types. However, because the data distribution in a photoelectric communication system is disordered and the occurrence frequencies of the various characters differ, the conventional table length can cause a large deviation between the length proportion allocated to a character in the distribution accumulation table and the frequency with which that character occurs in the data, so that the coding length exceeds the information entropy and the final data compression efficiency suffers.
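The gap between coding length and information entropy can be made concrete with a small calculation. The sketch below relies on the standard property that an ANS-type coder spends about $-\log_2 r$ bits on a symbol whose share of the table is $r$; the probabilities and slot allocations are illustrative values, not data from the patent, and the function names are hypothetical.

```python
import math


def entropy_bits(probs):
    """Shannon entropy in bits per symbol."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


def expected_code_bits(probs, slots):
    """Average bits per symbol when each symbol owns slots[i] entries of the table."""
    table_length = sum(slots)
    return -sum(p * math.log2(s / table_length) for p, s in zip(probs, slots) if p > 0)


probs = [0.7, 0.2, 0.1]
coarse = expected_code_bits(probs, [2, 1, 1])   # table length 4: shares 0.5, 0.25, 0.25
exact = expected_code_bits(probs, [7, 2, 1])    # table length 10: shares match probs
lower_bound = entropy_bits(probs)
```

With the short table the shares cannot match the frequencies, so the average cost stays above the entropy; with the longer table the shares match and the cost meets the entropy. This is the deviation the method tries to reduce, weighed against the cost of a longer table.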

Therefore, the embodiment of the invention traverses each distribution accumulation table length value in the length section by setting the length section of the distribution accumulation table, analyzes the optimization degree value of each distribution accumulation table length value in the length section to the distribution accumulation table length value under the conventional condition, and further selects the optimal distribution accumulation table length value.

First, the distribution accumulation table length value of the conventional case is obtained from the number of character types obtained in step S1 and recorded as the initial length value. The initial length value takes 2 as the base and the number of character types $n$ as the exponent, i.e. $L_0 = 2^{n}$.

And then acquiring a distribution accumulation table length interval based on the character type number and the character frequency of each type of character.

Preferably, the method for acquiring the length interval in one embodiment of the present invention includes:

Since the distribution accumulation table is designed to cover all possible character combinations, and the frequency of occurrence of each character and its combinations with other characters differ, the length of the distribution accumulation table should be at least equal to the number of character types $n$ in order to reflect each character accurately. The number of character types $n$ is therefore taken as the lower limit of the length interval.

Then, the minimum distribution accumulation table length value that can reproduce the character frequencies exactly is calculated. For a given set of data, if the distribution accumulation table length value equals the total number of characters contained in the data, the table can always match the frequencies of the characters in the original data. In practical applications, however, setting the table length to the total number of characters may make the distribution accumulation table too long, which increases the encoding complexity and the storage requirement. Therefore, when constructing the distribution accumulation table, an appropriate length needs to be determined that still covers all possible character combinations. To this end, the greatest common divisor of the occurrence counts of all types of characters in the data is calculated and denoted $g$; the total number of characters $N$ contained in the data is then divided by $g$, and the ratio is taken as the upper limit of the length interval, denoted $L_{\max}$, i.e. $L_{\max} = N / g$. The length interval is thus obtained from the lower and upper limits, namely $[n, N/g]$.

Thus, the length interval is obtained, and the operation of traversing each distribution accumulation table length value in the length interval in the subsequent step to analyze the optimal distribution accumulation table length value can be completed.
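The interval bounds described above reduce to a count and a greatest common divisor. The following is a minimal sketch, assuming the occurrence counts of the character types are given as a list of positive integers; the function name `length_interval` is hypothetical.

```python
import math
from functools import reduce


def length_interval(counts):
    """Return (lower, upper) bounds for the distribution accumulation table length.

    lower: the number of character types.
    upper: the total number of characters divided by the greatest common
           divisor of the occurrence counts.
    """
    lower = len(counts)
    total = sum(counts)
    divisor = reduce(math.gcd, counts)
    return lower, total // divisor


# Counts 6, 4, 2 share the divisor 2, so the exact ratios 3:2:1 fit in 6 slots.
bounds = length_interval([6, 4, 2])
```

When the counts share no common divisor larger than 1, the upper limit equals the total number of characters, so the interval can be wide and the traversal correspondingly long.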

Step S3: taking any integer distribution accumulation table length value in the length interval as a length value to be measured; and obtaining a basic variable according to the difference between the length value to be measured and the initial length value.

For convenience of explanation, in the embodiment of the present invention any one integer distribution accumulation table length value in the length interval is taken as an example, and the whole process of the invention is explained through the subsequent processing of this length value to be measured.

The distribution accumulation table length value affects the coding length of the data, so a basic variable is obtained based on the difference between the length value to be measured and the initial length value. The basic variable preliminarily represents the influence on the coding length of a distribution accumulation table length value in the length interval, compared with the conventional case, that is, compared with the initial length value.

Preferably, the method for acquiring the basic variable in one embodiment of the present invention includes:

The corresponding basic coding lengths are obtained from the length value to be measured and from the initial length value respectively, and their difference is taken as the basic variable, which represents the influence of the length value to be measured on the basic coding length compared with the conventional initial length value. The formula model of the basic variable is:

$$B = \lceil \log_2 L \rceil - \lceil \log_2 L_0 \rceil$$

wherein $B$ represents the basic variable corresponding to the length value to be measured, $L$ represents the length value to be measured, $L_0$ represents the initial length value, $\log_2$ represents a logarithmic function with base 2, and $\lceil \cdot \rceil$ represents a round-up function.

In the formula model of the basic variable, the larger the distribution accumulation table length value, the larger the corresponding basic coding length, and the longer the final data coding length. A smaller basic variable therefore indicates a smaller length value to be measured. At this point only the influence of the distribution accumulation table length value on the basic coding length is analyzed; with respect to its influence on the final data coding length, the smaller the basic variable the better, and a negative value is best.

Thus, the first index for evaluating the optimization degree of the length value to be measured compared with the initial length value is obtained by analyzing the difference between the length value to be measured and the initial length value, and the subsequent analysis can be continued by the basic variable.
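A small sketch of the basic variable follows. The formula image is not present in this text, so the code assumes the form suggested by the description: the rounded-up base-2 logarithm of the length value to be measured minus that of the initial length value. The function names are hypothetical.

```python
def ceil_log2(value: int) -> int:
    """Exact ceil(log2(value)) for value >= 1, using integer arithmetic only."""
    if value < 1:
        raise ValueError("value must be a positive integer")
    return (value - 1).bit_length()


def basic_variable(length_to_measure: int, initial_length: int) -> int:
    """Difference of the basic coding lengths (assumed form, see lead-in)."""
    return ceil_log2(length_to_measure) - ceil_log2(initial_length)


longer = basic_variable(20, 16)    # 5 - 4
same = basic_variable(10, 16)      # 4 - 4
shorter = basic_variable(6, 16)    # 3 - 4
```

Integer bit operations are used instead of `math.log2` so that exact powers of two are never affected by floating-point rounding. Note that the variable only changes when the candidate crosses a power of two, so many lengths in the interval share the same value.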

Step S4: obtaining an initial deviation degree value according to the initial length value, the total number of characters and the number of character types; obtaining a deviation degree value to be measured according to the length value to be measured, the total number of characters and the number of character types; and obtaining a difference variable according to the difference between the initial deviation degree value and the deviation degree value to be detected and the character type number.

The length value of the distribution accumulation table also affects the length proportion of various characters distributed in the distribution accumulation table and the deviation value between the frequencies of the characters in the data, thereby affecting the coding length of the final data and affecting the compression efficiency; therefore, the initial deviation degree value can be obtained according to the initial length value, the total number of characters and the number of character types, and the to-be-measured deviation degree value can be obtained according to the to-be-measured length value, the total number of characters and the number of character types, so that the subsequent evaluation of the optimization degree of the to-be-measured length value is completed.

Preferably, the method for acquiring the initial deviation degree value in one embodiment of the present invention includes:

The relationship between the length proportion allocated to each type of character and the frequency of that character in the data needs to be analyzed. The frequency of each type of character is therefore first obtained from the total number of characters and the occurrence count of each type of character. The initial length value is then multiplied by the frequency of each type of character to obtain a first initial allocation value for each type of character; since the first initial allocation value may have a fractional part, its decimal part is taken as the first decimal. All first initial allocation values are then rounded to the nearest integer to obtain the first final allocation values.
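The allocation and rounding step can be sketched with exact rational arithmetic. This is an illustration under two assumptions: the occurrence counts are given as a list of integers, and "rounding" means round half up. Python's built-in `round` rounds halves to the nearest even integer, so it is deliberately not used here. The function name is hypothetical.

```python
from fractions import Fraction
import math


def initial_allocation(counts, table_length):
    """Return (exact allocation values, decimal parts, rounded allocation values)."""
    total = sum(counts)
    # table_length * (count / total), kept exact as a fraction.
    exact = [Fraction(c * table_length, total) for c in counts]
    decimals = [x - math.floor(x) for x in exact]
    # Round half up: floor(x + 1/2).
    rounded = [math.floor(x + Fraction(1, 2)) for x in exact]
    return exact, decimals, rounded


exact, decimals, rounded = initial_allocation([5, 2, 2, 1, 1], 8)
```

In this example the rounded values already sum to the table length, so no adjustment would be needed; in general the sum can differ from the table length, which is why a length adjustment operation follows.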

Whether a length adjustment operation needs to be performed on the first final allocation values is judged from the difference between the sum of the first final allocation values of all types of characters and the initial length value, and the first allocation length values are obtained. The ratio of the first allocation length value of each type of character to the initial length value is then taken as the first length proportion value allocated to that type of character. Because of the rounding and the length adjustment operation performed in obtaining the first length proportion value, the first length proportion value of each type of character differs from its frequency; in the embodiment of the invention, the initial deviation degree value is obtained from the character frequency and the first length proportion value of each type of character. The formula model of the initial deviation degree value is as follows:

$$P_0 = \frac{1}{\prod_{i=1}^{n} r_i^{\,p_i} + \varepsilon}$$

wherein $P_0$ represents the initial deviation degree value, $r_i$ represents the first length proportion value of the $i$-th type of character, $p_i$ represents the character frequency of the $i$-th type of character, $n$ represents the number of character types, $\varepsilon$ represents the parameter adjustment factor, which prevents the denominator from being 0, and $\prod$ represents the cumulative product symbol.

In the formula model of the initial deviation degree value, the frequency of each type of character in the data to be processed is taken as the exponent and the first length proportion value allocated to that type of character under the initial length value is taken as the base, and the resulting values of all types of characters are multiplied together. Since the first length proportion values of all types of characters sum to 1 and the character frequencies also sum to 1, the closer the first length proportion value of each type of character is to its frequency, the larger the product, that is, the larger the initial fitness value. A larger initial fitness value means that, under the initial length value, the first length proportion value of each type of character fits its frequency more closely, so the distribution accumulation table length value is more appropriate. The initial deviation degree value should therefore be smaller when the initial fitness value is larger, and negative correlation mapping is performed to correct the logical relationship. It should be noted that, in the embodiment of the present invention, the value of the parameter adjustment factor is 0.01; the specific value may be adjusted according to the implementation scenario and is not limited herein.
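The fitness and deviation computation can be sketched as below. The formula image is not present in this text, so the code assumes the reading suggested by the description: the fitness value is the product of each length proportion raised to the corresponding character frequency, and the deviation degree value is the reciprocal of that product plus the parameter adjustment factor. The function names are hypothetical.

```python
import math


def fitness_value(proportions, frequencies):
    """Product over all character types of proportion ** frequency."""
    return math.prod(r ** p for r, p in zip(proportions, frequencies))


def deviation_degree(proportions, frequencies, epsilon=0.01):
    """Assumed negative correlation mapping: 1 / (fitness + epsilon)."""
    return 1.0 / (fitness_value(proportions, frequencies) + epsilon)


frequencies = [0.5, 0.25, 0.25]
matched = deviation_degree([0.5, 0.25, 0.25], frequencies)      # shares equal frequencies
mismatched = deviation_degree([0.25, 0.5, 0.25], frequencies)   # shares swapped
```

The product equals 2 raised to the negative cross-entropy between the frequencies and the table shares, so it is largest when the shares equal the frequencies. A share of zero for a character that actually occurs drives the product to zero, which is the case the adjustment factor protects against.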

Since the first final allocation value is obtained by rounding the first initial allocation value of each type of character in the above process, the sum of the first final allocation values of all character types may not be equal to the initial length value of the distribution accumulation table, and the length adjustment operation therefore needs to be performed on the first final allocation values.

Preferably, in one embodiment of the present invention, performing the length adjustment operation on the first final allocation value of each type of character according to the difference between the sum of the first final allocation values of all character types and the initial length value to obtain the first allocation length value comprises:

repeating the adjustment process until the sum of the first final allocation values of all character types is equal to the initial length value, then ending the length adjustment operation and obtaining the first allocation length value of each type of character.

Similarly, preferably, the method for obtaining the deviation degree value to be measured in one embodiment of the present invention includes:

multiplying the length value to be measured by the character frequency of each type of character to obtain the second initial allocation value of that type of character; since the second initial allocation value may have a fractional part, taking the fractional part of the second initial allocation value as the second decimal; and then rounding all the second initial allocation values to the nearest integer to obtain the second final allocation values.

Whether a length adjustment operation needs to be performed on the second final allocation value of each type of character is judged according to the difference between the sum of the second final allocation values of all character types and the length value to be measured, and the second allocation length value is thereby obtained. Then, the ratio of the second allocation length value of each type of character to the length value to be measured is taken as the second length proportion value allocated to that type of character. Because rounding and the length adjustment operation are performed in the process of obtaining the second length proportion value, the second length proportion value of each type of character differs from the frequency of that character. The formula model of the deviation degree value to be measured is as follows:

$$D_x = \frac{1}{\prod_{i=1}^{n} \left(q_i\right)^{f_i} + \varepsilon}$$

wherein $D_x$ represents the deviation degree value to be measured, $q_i$ represents the second length proportion value of the $i$-th type of character, $f_i$ represents the character frequency of the $i$-th type of character, $n$ represents the number of character types, $\varepsilon$ represents the parameter adjusting factor, which prevents the denominator from being 0, and $\prod$ represents the cumulative product symbol.

In the formula model of the deviation degree value to be measured, the frequency of each type of character in the data to be processed is taken as the exponent and the second length proportion value allocated to that type of character under the length value to be measured is taken as the base, giving the fitness to be measured of each type of character; the fitness to be measured of all character types is then multiplied together to obtain the fitness value to be measured. Because the second length proportion values and the character frequencies each sum to 1, the closer the second length proportion value of each type of character is to its frequency, the larger the fitness value to be measured. A larger fitness value to be measured indicates that, under the length value to be measured, the second length proportion values fit the character frequencies more closely and the distribution accumulation table length value is more appropriate, so the deviation degree value to be measured should be smaller; negative correlation mapping is therefore applied to the fitness value to be measured to complete the logical relationship correction. It should be noted that, in the embodiment of the present invention, the value of the parameter adjusting factor is 0.01; the specific value may be adjusted according to the implementation scenario and is not limited herein.

Similarly, a length adjustment operation is performed in the process of obtaining the second length proportion value.

Therefore, preferably, in one embodiment of the present invention, performing the length adjustment operation on the second final allocation value of each type of character according to the difference between the sum of the second final allocation values of all character types and the length value to be measured to obtain the second allocation length value comprises:

when the sum of the second final allocation values of all character types is greater than the length value to be measured, subtracting 1 from the second final allocation value corresponding to the minimum among all second decimals greater than or equal to 0.5; when the sum of the second final allocation values of all character types is less than the length value to be measured, adding 1 to the second final allocation value corresponding to the maximum among all second decimals less than 0.5;

repeating the adjustment process until the sum of the second final allocation values of all character types is equal to the length value to be measured, then ending the length adjustment operation and obtaining the second allocation length value of each type of character.
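By way of illustration only, the rounding and length adjustment steps described above may be sketched as follows. The function name `allocate_lengths` is hypothetical, and the sketch assumes that each character type is adjusted at most once during the repetition, a detail the description does not state; the sample frequencies are chosen to be exact binary fractions so that the fractional parts are not disturbed by floating-point error.

```python
import math

def allocate_lengths(frequencies, table_length):
    """Split table_length slots among character types by frequency."""
    initial = [table_length * f for f in frequencies]   # initial allocation values
    decimals = [v - math.floor(v) for v in initial]     # fractional parts
    final = [math.floor(v + 0.5) for v in initial]      # round half up
    adjusted = set()  # assumption: each type is adjusted at most once

    while sum(final) != table_length:
        if sum(final) > table_length:
            # Too many slots: among the values that were rounded up, take
            # back one slot from the smallest fractional part.
            pool = [i for i, d in enumerate(decimals)
                    if d >= 0.5 and i not in adjusted]
            pick, step = min(pool, key=lambda i: decimals[i]), -1
        else:
            # Too few slots: among the values that were rounded down, give
            # one slot to the largest fractional part.
            pool = [i for i, d in enumerate(decimals)
                    if d < 0.5 and i not in adjusted]
            pick, step = max(pool, key=lambda i: decimals[i]), 1
        final[pick] += step
        adjusted.add(pick)
    return final

print(allocate_lengths([0.5, 0.25, 0.25], 6))         # [3, 1, 2]
print(allocate_lengths([0.25, 0.25, 0.25, 0.25], 5))  # [2, 1, 1, 1]
```

In the first call the rounded values 3, 2, 2 sum to 7, so one slot is removed from the first type whose fractional part is 0.5; in the second call the rounded values sum to 4, so one slot is added. Dividing each returned value by the table length gives the length proportion values.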

After the initial deviation degree value and the deviation degree value to be measured are respectively obtained, another index capable of reflecting the optimization degree of the length value to be measured compared with the initial length value, namely the difference variable, can be obtained according to the difference between the two.

Preferably, the method for obtaining the difference variable in one embodiment of the present invention includes:

obtaining the difference variable according to the deviation degree value to be measured, the initial deviation degree value and the number of character types. The formula model of the difference variable is:

$$C_x = n \times \left(\log_2 D_x - \log_2 D_0\right)$$

wherein $C_x$ represents the difference variable corresponding to the length value to be measured, $D_0$ represents the initial deviation degree value, $D_x$ represents the deviation degree value to be measured, $n$ represents the number of character types, and $\log_2$ represents the logarithmic function with base 2.

In the formula model of the difference variable, the smaller the difference obtained from the deviation degree value to be measured and the initial deviation degree value, the better the fit between the length proportion value of each type of character and its frequency under the length value to be measured compared with the fit under the initial length value; this difference is then multiplied by the number of character types to obtain the difference variable.
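The following short Python sketch illustrates one possible reading of the difference variable, assuming that the difference is taken between the base-2 logarithms of the two deviation degree values before multiplication by the number of character types. The function name `difference_variable` and the numeric inputs are hypothetical.

```python
import math

def difference_variable(dev_measured, dev_initial, num_types):
    """num_types * (log2(dev_measured) - log2(dev_initial))  (assumed form)."""
    return num_types * (math.log2(dev_measured) - math.log2(dev_initial))

# A deviation half as large as the initial one gives a negative value,
# i.e. the length value to be measured fits the frequencies better.
print(difference_variable(2.0, 4.0, 3))   # -3.0
print(difference_variable(4.0, 2.0, 3))   # 3.0
```

Under this reading, a negative result means that the length value to be measured reduces the deviation relative to the initial length value, and a result of 0 means that the two lengths fit the character frequencies equally well.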

By analyzing the deviation degree value between the length proportion value of each type of character and the frequency of that character under a given distribution accumulation table length value, another index of the optimization degree of the length value to be measured compared with the initial length value is obtained; in the subsequent process, the difference variable is combined with the basic variable to complete the evaluation of the optimization degree of the length value to be measured.

Step S5: obtaining an optimization degree value of the length value to be measured according to the basic variable and the difference variable; obtaining an optimal distribution accumulation table length value according to the optimization degree values of all distribution accumulation table length values in the length interval; and completing the compression of the data to be processed according to the optimal distribution accumulation table length value.

The basic variable obtained in step S3 is combined with the difference variable obtained in step S4 to obtain the optimization degree value of the distribution accumulation table length value to be measured.

Preferably, the method for obtaining the optimization degree value in one embodiment of the present invention includes:

adding the value of the basic variable corresponding to the length value to be measured and the value of the difference variable, and performing negative correlation mapping on the sum to obtain the optimization degree value. The formula model of the optimization degree value may specifically be, for example:

wherein the optimization degree value of the length value to be measured is expressed in terms of the difference variable corresponding to the length value to be measured and the basic variable corresponding to the length value to be measured.

In the formula model of the optimization degree value, an increase in the distribution accumulation table length value increases the basic coding length and thus the final coding length, so the smaller the basic variable obtained from the difference between the length value to be measured and the initial length value the better, and a negative value is best. At the same time, an increase in the distribution accumulation table length value reduces the deviation degree value between the length proportion value of each type of character and its frequency, so the smaller the difference variable obtained from the difference between the deviation degree value under the length value to be measured and that under the initial length value the better, and a negative value is likewise best. Therefore, in the process of obtaining the optimization degree value, the value of the basic variable and the value of the difference variable are added, and negative correlation mapping is performed on the sum to complete the logical relationship correction, so that the optimization degree value is obtained.

Based on the method, the optimization degree values of all the distribution accumulation table length values in the length interval can be obtained, and the larger the optimization degree value is, the better the distribution accumulation table length is, so that the distribution accumulation table length value corresponding to the largest optimization degree value in all the optimization degree values is taken as the optimal distribution accumulation table length value.
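As a non-limiting sketch, the screening of the length interval can be expressed as a search for the largest optimization degree value. The function name `best_table_length` is hypothetical, plain negation is assumed for the negative correlation mapping, and the two callables in the usage lines are made-up stand-ins for the basic variable and the difference variable rather than the formula models of the embodiment.

```python
def best_table_length(candidates, base_variable, difference_variable):
    """Return the candidate length with the largest optimization degree value.

    candidates: iterable of integer table lengths in the length interval
    base_variable, difference_variable: callables mapping a length to a number
    """
    def optimization_degree(length):
        # Assumption: negation stands in for the negative correlation mapping.
        return -(base_variable(length) + difference_variable(length))
    return max(candidates, key=optimization_degree)

# Hypothetical toy variables, for illustration only.
toy_base = lambda length: 2 * (length - 8)     # grows with table length
toy_diff = lambda length: (length - 12) ** 2   # smallest near length 12
print(best_table_length(range(4, 20), toy_base, toy_diff))  # 11
```

Because only the ordering of the optimization degree values matters for the selection, any strictly decreasing mapping of the sum would select the same length as the negation used here.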

After the length value of the optimal distribution accumulation table is obtained, the compression of the data to be processed can be completed according to the length value of the optimal distribution accumulation table.

Preferably, in one embodiment of the present invention, compressing the data to be processed according to the optimal distribution accumulation table length value comprises:

because the embodiment of the invention improves the selection of the distribution accumulation table length value in the RANS entropy coding algorithm, the optimal distribution accumulation table is first obtained according to the optimal distribution accumulation table length value; the data to be processed is then compressed according to the optimal distribution accumulation table based on RANS entropy coding. It should be noted that RANS entropy coding is a technical means well known to those skilled in the art and is not described herein.
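For orientation only, the following Python sketch shows how a distribution accumulation table built from per-character allocation lengths can drive range-ANS coding. It is a textbook big-integer variant without renormalization, not the implementation of the embodiment; the function names, the use of integer symbol indices and the sample allocation lengths are all assumptions made for illustration.

```python
from itertools import accumulate

def build_table(lengths):
    """Return (cumulative start of each symbol, total table length)."""
    starts = [0] + list(accumulate(lengths))[:-1]
    return starts, sum(lengths)

def rans_encode(symbols, lengths):
    """Encode symbol indices into one integer; every symbol needs length > 0."""
    starts, total = build_table(lengths)
    state = 0
    for s in reversed(symbols):   # reverse order so decoding runs forward
        f, c = lengths[s], starts[s]
        state = (state // f) * total + c + (state % f)
    return state

def rans_decode(state, count, lengths):
    """Recover count symbol indices from the encoded integer."""
    starts, total = build_table(lengths)
    out = []
    for _ in range(count):
        slot = state % total
        # Symbol whose slot range [start, start + length) contains slot.
        s = max(i for i, c in enumerate(starts)
                if c <= slot and lengths[i] > 0)
        state = lengths[s] * (state // total) + slot - starts[s]
        out.append(s)
    return out

lengths = [3, 1, 2]               # illustrative allocation lengths, total 6
message = [0, 1, 2, 0]
code = rans_encode(message, lengths)
print(rans_decode(code, len(message), lengths) == message)  # True
```

Each encoding step grows the state by roughly the table length divided by the symbol's allocation length, which is why allocation lengths that track the character frequencies closely shorten the final code. A practical coder would additionally renormalize the state to a fixed word size.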

By analyzing the influence of changes in the distribution accumulation table length on the final data compression, the embodiment of the invention makes the length proportion value allocated to each type of character in the distribution accumulation table fit the frequency of that character in the data more closely while keeping the distribution accumulation table length value short, thereby improving the compression efficiency of the data.

In summary, when compressing data in the photoelectric communication system based on RANS entropy coding, the embodiment of the invention analyzes the influence of changes in the distribution accumulation table length on the final data compression efficiency and screens out the optimal distribution accumulation table length value, thereby improving the data compression efficiency. First, the data to be processed in the photoelectric communication system is acquired and converted into a character sequence. Since the length of the distribution accumulation table in RANS entropy coding conventionally takes 2 as the base and the number of character types as the exponent, this value is used as the initial length value. A distribution accumulation table length interval is then obtained according to the number of character types and the character frequency of each type of character, each integer length value in the interval is traversed, and the optimization degree value of that length value relative to the initial length value is analyzed. Any integer length value in the interval is taken as the length value to be measured, and its optimization degree value is obtained from two aspects. One is the basic variable, obtained according to the difference between the length value to be measured and the initial length value, which characterizes the influence on the basic coding length when the distribution accumulation table length value changes. The other is the difference variable: the deviation degree value between the length proportion value allocated to each type of character in the distribution accumulation table and the frequency of that character in the data is analyzed, and the difference variable is obtained according to the difference between the deviation degree value under the length value to be measured and the deviation degree value under the initial length value; the difference variable reflects the influence on the fit between the allocated length proportion values and the character frequencies when the distribution accumulation table length value changes. The two are then combined to obtain the optimization degree value of the length value to be measured. The distribution accumulation table length value corresponding to the maximum optimization degree value is taken as the optimal distribution accumulation table length value, and data compression in the photoelectric communication system is finally completed according to the optimal distribution accumulation table length value based on RANS entropy coding. By analyzing the opposing increases and decreases in coding length caused by a change in the distribution accumulation table length and screening the candidate length values, the embodiment of the invention obtains the optimal distribution accumulation table length value and thereby improves the data compression efficiency.

It should be noted that the order of the embodiments of the present invention is for description only and does not indicate their relative merits. The processes depicted in the accompanying drawings do not necessarily require the particular order shown, or sequential order, to achieve desirable results. In some embodiments, multitasking and parallel processing are also possible or may be advantageous.

In this specification, each embodiment is described in a progressive manner, and identical and similar parts of each embodiment are all referred to each other, and each embodiment mainly describes differences from other embodiments.

Claims

1. A method of data processing in an optoelectronic communications system, the method comprising:

obtaining an optimization degree value of the length value to be measured according to the basic variable and the difference variable; obtaining an optimal distribution accumulation table length value according to the optimization degree values of all distribution accumulation table length values in the length interval; compressing the data to be processed according to the length value of the optimal distribution accumulation table;

the method for acquiring the initial deviation degree value comprises the following steps:

taking the ratio of the first allocation length value of each type of character to the initial length value as a first length proportion value of each type of character; taking the first length proportion value of each type of character as the base and the character frequency as the exponent to obtain the initial fitness of each type of character; multiplying together the initial fitness of all types of characters to obtain an initial fitness value; obtaining an initial deviation degree value according to the initial fitness value, wherein the initial fitness value and the initial deviation degree value are in negative correlation;

the method for acquiring the deviation degree value to be detected comprises the following steps:

2. The method for processing data in an optical-electrical communication system according to claim 1, wherein the method for acquiring the length interval comprises:

taking the character type number as the lower limit of the length interval;

3. A method of data processing in an optoelectronic communications system as claimed in claim 1 wherein the formula model of the base variable comprises:

4. The method for processing data in an optical-electrical communication system according to claim 1, wherein the performing a length adjustment operation on the first final allocation value of each type of character according to a difference between a sum value of the first final allocation values of each type of character and the initial length value to obtain the first allocation length value comprises:

5. The method for processing data in an optoelectronic communications system according to claim 1, wherein the performing a length adjustment operation on the second final allocation value of each type of character according to a difference between the sum value of the second final allocation values of each type of character and the length value to be measured, to obtain the second allocation length value, includes:

6. A method of processing data in an optoelectronic communications system as claimed in claim 1 wherein the formula model of the difference variable comprises:

7. The method for processing data in an optical-electrical communication system according to claim 1, wherein the method for obtaining the optimization degree value comprises:

8. The method for processing data in an optical-electrical communication system according to claim 1, wherein said compressing the data to be processed according to the optimal distribution accumulation table length value comprises: