CN111384967A - Data encoding method - Google Patents
Data encoding method Download PDFInfo
- Publication number
- CN111384967A CN111384967A CN201811625654.8A CN201811625654A CN111384967A CN 111384967 A CN111384967 A CN 111384967A CN 201811625654 A CN201811625654 A CN 201811625654A CN 111384967 A CN111384967 A CN 111384967A
- Authority
- CN
- China
- Prior art keywords
- value
- symbol
- code length
- code
- data
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Links
Images
Classifications
-
- H—ELECTRICITY
- H03—ELECTRONIC CIRCUITRY
- H03M—CODING; DECODING; CODE CONVERSION IN GENERAL
- H03M7/00—Conversion of a code where information is represented by a given sequence or number of digits to a code where the same, similar or subset of information is represented by a different sequence or number of digits
- H03M7/30—Compression; Expansion; Suppression of unnecessary data, e.g. redundancy reduction
- H03M7/40—Conversion to or from variable length codes, e.g. Shannon-Fano code, Huffman code, Morse code
- H03M7/4031—Fixed length to variable length coding
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Compression, Expansion, Code Conversion, And Decoders (AREA)
Abstract
The application relates to a data coding method, firstly determining the coding code length of each symbol according to Huffman coding of each symbol, and then obtaining the coding value of each symbol by adding 1 or adding 1 to complement mantissa 0 based on the coding code length. And finally, coding each symbol in the data to be coded by using the coding value of each symbol. When the data coding method is used for coding data to be coded, the coding length of the coding value of each symbol is consistent with Huffman coding, so that the average length of the coding result obtained by the data coded by the data coding method is shorter. The data coding method obtains the coding value of each symbol by adding 1 or adding 1 to complement mantissa 0, the process of determining the coding value corresponding to the symbol is regular and can be repeated, the operation is simple, and the error rate is low.
Description
Technical Field
The present application relates to the field of computer technology, and in particular, to a data encoding method.
Background
Huffman coding is a variable word length coding that constructs the codeword with the shortest average non-prefix length based on the probability of occurrence of a character. The Huffman coding method is generally characterized in that the Huffman coding method comprises the steps of sequencing according to the occurrence frequency of symbols in data, selecting two groups of combined branches with the minimum probability each time, combining the divided characters to calculate a group, constructing a Huffman tree, wherein all characters are leaf nodes, the characters are appointed to start from a root node and start to be 0 at the left and start to be 1 at the right, and bits on paths reaching the leaf nodes jointly form the Huffman coding of the characters.
However, when determining the encoded value of each symbol, Huffman coding must determine the value of each branch on the path from the leaf node where the symbol is located to the root node.
Disclosure of Invention
In view of the above, it is necessary to provide an encoding method and a decoding method that have a simple process of determining a coded value of each symbol and a low error rate, in view of the above technical problems.
A method of encoding data, the method comprising:
taking the Huffman coding length of each symbol in the data to be coded as the coding length of each symbol in the data to be coded;
according to the occurrence frequency of each symbol in the data to be encoded, arranging each symbol in the data to be encoded in a descending order mode to obtain an ordering value of each symbol;
if the code length of the current symbol is the same as that of the symbol of the last sorting value, adding 1 to the code value of the symbol of the last sorting value to obtain the code value of the current symbol; if the code length of the current symbol is different from that of the symbol of the last sorting value, adding 1 to the code value of the symbol of the last sorting value to obtain a numerical value, and supplementing 0 mantissa to obtain the code value of the current symbol;
and coding the data to be coded according to the coding value of the symbol corresponding to each symbol.
As an optional implementation manner, the obtaining a coded value of a symbol corresponding to each symbol in data to be coded, and coding the symbol corresponding to the data to be coded includes:
arranging the symbols in the data to be coded in a descending order according to the occurrence frequency to obtain the ordering value of each symbol, and obtaining a symbol sequence table according to the ordering value of each symbol;
obtaining a code length table according to the code length of each symbol in the data to be coded;
obtaining a code length boundary table and a code length basic value table of the data to be coded according to the code length and the sorting value of each symbol in the data to be coded;
and coding each symbol in the data to be coded by using the symbol sequence table, the code length boundary table and the code length basic value table.
As an optional implementation manner, obtaining a code length boundary table and a code length basic value table of the to-be-encoded data according to the encoding code length and the ordering value of each symbol in the to-be-encoded data, includes:
searching the symbol of the maximum sorting value of each code length in the symbol sequence table, and constructing the code length boundary table by using the maximum sorting value;
and obtaining a basic value of each code length according to each maximum sorting value and the code value corresponding to each maximum sorting value, and obtaining the code length basic value table according to the basic value of each code length.
As an optional implementation manner, constructing the code length boundary table using the maximum sorting value includes:
and sequencing the maximum sequencing values in the code length boundary table in an ascending manner to obtain the code length boundary table.
As an optional implementation, the method further comprises:
arranging various code lengths in the code length table in an ascending order, and sequentially identifying various code lengths in the code length table in a descending order by using serial numbers;
arranging the maximum ordering values in the code length boundary table in an ascending order, and sequentially identifying the maximum ordering values in the code length boundary table in a descending order by using the sequence numbers corresponding to the code length table;
and arranging all the basic values in the code length basic value table in an ascending order, and sequentially identifying all the basic values in the code length basic value table in a descending order by using the sequence numbers corresponding to the code length table.
As an optional embodiment, the initial rank value of the symbol sequence table is 0.
As an optional implementation manner, according to each maximum sorting value and the coded value corresponding to each maximum sorting value, the method includes:
and subtracting the sorting value from the code value corresponding to each maximum sorting value to obtain a basic value of each code length.
As an optional implementation manner, encoding each symbol in the data to be encoded by using the symbol sequence table, the code length boundary table, and the code length basic value table includes:
determining the code length of the current symbol according to the sorting value of the current symbol in the symbol sequence table, the code length boundary table and the code length table;
obtaining a basic value of the current symbol according to the code length of the current symbol and the code length basic value table;
obtaining an initial coding value of the current symbol according to the sorting value of the current symbol in the symbol sequence table and the basic value of the current symbol;
and obtaining the coding value of the current symbol according to the initial coding value of the current symbol and the coding length of the current symbol.
As an optional implementation manner, obtaining an initial coded value of the current symbol according to the sorted value of the current symbol in the symbol sequence table and the base value of the current symbol, includes:
and adding the sorting value of the current symbol in the symbol sequence table and the basic value of the current symbol to obtain the initial coding value of the current symbol.
As an optional implementation, the method further comprises:
and taking the value 0 corresponding to the code length as the code value of the symbol with the minimum sorting value in the symbol sequence list.
A method of data decoding, the method comprising:
acquiring the corresponding relation between each symbol and the coding value;
decoding the data to be decoded according to the corresponding relation between each symbol and the coding value;
wherein, the correspondence between each symbol and the coding value includes:
taking the code length of Huffman coding of each symbol as the coding code length of each symbol;
arranging each symbol in a descending order according to the occurrence frequency of each symbol to obtain an ordering value of each symbol;
if the code length of the current symbol is the same as that of the symbol of the last sorting value, adding 1 to the code value of the symbol of the last sorting value to obtain the code value of the current symbol; and if the coding length of the current symbol is different from the coding length of the symbol of the last sequencing value, adding 1 to the coding value of the symbol of the last order to obtain a numerical value supplementary mantissa 0 to obtain the coding value of the current symbol.
The data coding method firstly determines the coding code length of each symbol according to Huffman coding of each symbol, and then obtains the coding value of each symbol by adding 1 or adding 1 to complement mantissa 0 based on the coding code length. And finally, coding each symbol in the data to be coded by using the coding value of each symbol. When the data coding method is used for coding data to be coded, the coding length of the coding value of each symbol is consistent with Huffman coding, so that the average length of the coding result obtained by the data coded by the data coding method is shorter. The data coding method obtains the coding value of each symbol by adding 1 or adding 1 to complement mantissa 0, the process of determining the coding value corresponding to the symbol is regular and can be repeated, the operation is simple, and the error rate is low.
Drawings
FIG. 1 is a block diagram of an exemplary computing device;
FIG. 2 is a block diagram of an exemplary computing device;
FIG. 3 is a block diagram of an exemplary computing device;
FIG. 4 is a flow diagram illustrating a method for encoding data in one embodiment;
FIG. 5 is a schematic representation of a Huffman tree in one embodiment;
FIG. 6 is a flowchart illustrating encoding of each symbol in data to be encoded according to an embodiment;
FIG. 7 is a schematic illustration of a symbolic sequence table obtained in one embodiment;
FIG. 8 is a flow diagram illustrating additional steps in a data encoding method, according to one embodiment;
FIG. 9 is a flowchart illustrating encoding of each symbol in data to be encoded according to an embodiment;
FIG. 10 is a flow diagram illustrating a method of data processing in one embodiment;
FIG. 11 is a flow diagram illustrating a method for decoding data in one embodiment;
FIG. 12 is a flowchart illustrating decoding data to be decoded according to an embodiment;
FIG. 13 is a flow diagram illustrating additional steps in a data decoding method, according to one embodiment;
FIG. 14 is a flowchart illustrating encoding of each symbol in data to be encoded according to an embodiment;
fig. 15 is a flowchart illustrating a data processing method according to another embodiment.
Detailed Description
In order to make the objects, technical solutions and advantages of the present application more apparent, the present application is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the present application and are not intended to limit the present application.
As shown in fig. 1, a block diagram of an arithmetic device 100 is proposed in one embodiment of the present application, and the arithmetic device includes a master arithmetic unit 110 and a plurality of slave arithmetic units 120 connected to the master arithmetic unit. The master operation unit 110 and each slave operation unit 120 are provided with an encoding circuit 130. Specifically, the master operation unit is used for performing preamble processing on input data and transmitting data with a plurality of slave operation units. Specifically, the plurality of slave arithmetic units 120 are configured to perform intermediate operations in parallel using data transmitted from the master arithmetic unit 110 to obtain a plurality of intermediate results, and to transmit the plurality of intermediate results to the master arithmetic unit 110. The main arithmetic unit 110 is also used for performing subsequent processing on the plurality of intermediate results transmitted from the arithmetic unit 120. Specifically, the encoding circuit 130 may encode (compress) data using the data encoding method proposed in the present application; and decoding (decompressing) the data using the data decoding method proposed in the present application.
In one optional embodiment, as shown in fig. 2, a plurality of slave operation units 120 of the operation device are distributed in an array; each slave arithmetic unit 120 is connected to another adjacent slave arithmetic unit 120, and the master arithmetic unit 110 is connected to k slave arithmetic units 120 among the plurality of slave arithmetic units 120, where the k slave arithmetic units 120 are: n slave arithmetic units 120 in row 1, n slave arithmetic units 120 in row m, and m slave arithmetic units 120 in column 1. As shown in fig. 6, the K slave operation units 120 include only the n slave operation units 120 in the 1 st row, the n slave operation units 120 in the m th row, and the m slave operation units 120 in the 1 st column, that is, the K slave operation units 120 are the slave operation units 120 directly connected to the master operation unit 110 among the plurality of slave operation units 120. Specifically, the K slave arithmetic units 120 are used for forwarding data between the master arithmetic unit 110 and the plurality of slave arithmetic units 120.
Further, the main operation unit 110 further includes an active operation circuit and an addition operation circuit. The activation arithmetic circuit is used for executing activation arithmetic of data in the main arithmetic unit 110; and the addition operation circuit is used for executing addition operation or accumulation operation. Specifically, the slave operation unit 120 further includes a multiplication operation circuit. The multiplication operation circuit is used for executing multiplication operation on the received data block to obtain a product result. Optionally, the slave operation unit 120 may further include an addition operation circuit for performing an addition operation or an accumulation operation.
In another optional embodiment, as shown in fig. 3, the operation device may further include a branch operation unit 140, the main operation unit 110 is connected to one or more branch operation units 140, and the branch operation unit 140 is connected to one or more slave operation units 120.
Specifically, the branch operation unit 140 is used for forwarding data between the master operation unit 110 and the slave operation unit 120. The main operation unit 110 may further include an active operation circuit and an addition operation circuit. The activation arithmetic circuit is used for executing activation arithmetic of data in the main arithmetic unit 110; an addition operation circuit for performing an addition operation or an accumulation operation; the data access circuit is mainly used for transporting the data to be operated to the main operation unit 110 by using the data access method in the above embodiment. Specifically, the slave operation unit 120 further includes a multiplication operation circuit for performing a multiplication operation on the received data block to obtain a product result. Optionally, the slave operation unit 120 may further include an addition operation circuit for performing an addition operation or an accumulation operation.
As shown in fig. 4, a flowchart of steps of a data encoding method proposed in an embodiment of the present application specifically includes:
step S410, using the Huffman code length of each symbol in the data to be encoded as the code length of each symbol in the data to be encoded.
The code length of each symbol in the data coding method is consistent with Huffman coding. Specifically, the encoding circuit 130 takes the code length of Huffman coding of each symbol in the data to be encoded as the code length of encoding of each symbol in the data to be encoded.
Optionally, the encoding circuit 130 may first obtain a corresponding Huffman tree according to the occurrence probability of each symbol in the data to be encoded. And then, according to the path length from the leaf node to the root node corresponding to each symbol in the Huffman tree, obtaining the code length of each symbol in the data to be coded. In one example: symbols included in data to be encoded: μ 1, μ 2, μ 3, μ 4, μ 5, the corresponding probability of occurrence is: 0.4, 0.1, 0.2, 0.1. The Huffman tree obtained by the control circuit 110 according to the probability of occurrence of each symbol in the data to be encoded is shown in fig. 5. As can be seen from fig. 3, the path lengths from the leaf nodes to the root nodes corresponding to μ 1, μ 2, μ 3, μ 4, and μ 5 are 2, 3, 2, and 3, respectively, and the path lengths from the leaf nodes to the root nodes in the Huffman tree correspond to the coding code lengths of the Huffman coding, so the coding code lengths corresponding to μ 1, μ 2, μ 3, μ 4, and μ 5 obtained by the coding circuit 130 in this example are 2, 3, 2, and 3, respectively.
Optionally, the encoding circuit 130 may also directly obtain the encoding code length of each symbol in the data to be encoded, by using the encoding code length of Huffman encoding of each symbol in the data to be encoded. Following the above example, the Huffman codes (the convention Huffman book indicates that the branch pointing to the left sub-tree represents "0" and the branch pointing to the right sub-tree represents "1") obtained from the Huffman tree constructed based on the occurrence probabilities of μ 1, μ 2, μ 3, μ 4, and μ 5 are: 00. 010, 10, 11, 011. Since the encoding code length of each symbol in the data encoding method is consistent with the Huffman encoding, if the encoding circuit 130 can directly wait for the Huffman encoding of each symbol in the data, the encoding can directly obtain the encoding code length of each symbol.
Step S420, according to the occurrence frequency of each symbol in the data to be encoded, arranging each symbol in the data to be encoded in a descending order to obtain the ordering value of each symbol.
Specifically, the encoding circuit 130 arranges each symbol in the data to be encoded in a descending order according to the occurrence frequency of each symbol in the data to be encoded, so as to obtain an ordering value of each symbol. Optionally, the initial value of the ranking value is 0. For example: symbols included in data to be encoded: μ 1, μ 2, μ 3, μ 4, μ 5, the corresponding probability of occurrence is: 0.4, 0.1, 0.2, 0.1. The rank values of the respective symbols correspond to: 0.4, 2, 3 and 5.
Step S430, if the code length of the current symbol is the same as the code length of the symbol of the previous sorting value, adding 1 to the code value of the symbol of the previous sorting value to obtain the code value of the current symbol; and if the code length of the current symbol is different from that of the symbol of the last sorting value, adding 1 to the code value of the symbol of the last sorting value to obtain a numerical value supplementary mantissa 0 to obtain the code value of the current symbol.
Specifically, if the code length of the current symbol is the same as the code length of the symbol of the previous rank value, the coding circuit 130 adds 1 to the code value of the symbol of the previous rank value to obtain the code value of the current symbol. If the code length of the current symbol is different from the code length of the symbol of the previous sorting value, the coding circuit 130 adds 1 to the code value of the symbol of the previous sorting value to obtain a numeric supplementary mantissa 0 to obtain the code value of the current symbol. More specifically, when encoding data, the encoding circuit 130 first compares whether the code length of the current symbol is the same as the code length of the symbol of the previous rank value, and if so, adds 1 to the code length of the symbol of the previous rank value to obtain the code length of the current symbol. If the code length of the former symbol is different from that of the symbol of the last sorting value, 1 is added to the code value of the symbol of the last sorting value, and then a mantissa 0 is supplemented on the basis of the numerical value obtained by adding 1 to the code value of the symbol of the last sorting value to obtain the code value of the current symbol. The encoding circuit 130 obtains an encoded value of each symbol based on the rank value and the encoding code length of each symbol.
Further, the value 0 corresponding to the code length is used as the code value of the symbol with the minimum sorting value in the symbol sequence table. Specifically, the encoding circuit 130 uses a value of 0 corresponding to the code length as the encoded value of the symbol having the smallest ordered value in the symbol sequence table. For example, if the code length of the symbol having the smallest rank value in the symbol sequence table is 2 bits, "00" is used as the code value of the symbol. For example: taking the above example into account, for example, the rank value of μ 3 is 1, the symbol corresponding to the rank value 0 above it is μ 1, the code value of μ 1 is set to "00", and 1 is added to "00", so as to obtain the code value of μ 3 of "01". The sorting value of mu 2 is 3, the symbol corresponding to the previous sorting value 2 is mu 4, the code value of mu 4 is set to be 10, 1 is added on the basis of 10 to obtain the value 11, then the mantissa 0 is complemented on the basis of 11 to obtain the code value of mu 2 to be 110. Setting the code value of μ 1 to "00", the code values of μ 3, μ 4, μ 2 and μ 5 are 01, 10, 110 and 111, respectively.
Step S440, encoding the data to be encoded according to the encoding value of the symbol corresponding to each symbol.
Specifically, the encoding circuit 130 encodes the data to be encoded according to the encoding value of the symbol corresponding to each symbol. For example: taking the above example as an example, assuming that data to be encoded is "μ 3 μ 4 μ 2 μ 5 μ 3 μ 1 μ 2", the encoding circuit 130 encodes each symbol to obtain "0110110110, 1110100110".
In the data encoding method in this embodiment, first, the encoding code length of each symbol is determined according to Huffman encoding of each symbol, and then, based on the encoding code length, the encoding value of each symbol is obtained by adding 1 or adding 1 to complement mantissa 0. And finally, coding each symbol in the data to be coded by using the coding value of each symbol. When the data coding method is used for coding data to be coded, the coding length of the coding value of each symbol is consistent with Huffman coding, so that the average length of the coding result obtained by the data coded by the data coding method is shorter. In the data encoding method of this embodiment, the encoded value of each symbol is obtained by adding 1 or adding 1 to complement mantissa 0, the process of determining the encoded value corresponding to the symbol is regularly repeatable, the operation is simple, and the discrepancy rate is low.
In one embodiment, as shown in fig. 6, step S440 includes:
step S441, arranging the symbols in the data to be encoded in a descending order according to the occurrence frequency to obtain the ordering values of the symbols, and obtaining a symbol sequence table according to the ordering values of the symbols.
Specifically, the encoding circuit 130 arranges the symbols in the data to be encoded in a descending order according to the occurrence frequency to obtain the ordering values of the symbols, and obtains a symbol sequence table according to the ordering values of the symbols. Optionally, the initial rank value of the symbol sequence table is 0. For example: symbols included in data to be encoded: μ 1, μ 2, μ 3, μ 4, μ 5, the corresponding probability of occurrence is: 0.4, 0.1, 0.2, 0.1. In this example, the encoding circuit 130 arranges μ 1, μ 2, μ 3, μ 4, and μ 5 in descending order of the frequency of occurrence, and obtains the ordering value of each symbol as: 0.4, 2, 3, 5; the resulting symbolic sequence table is shown in FIG. 7.
Step S442, obtaining a code length table according to the code length of each symbol in the data to be encoded.
Specifically, the encoding circuit 130 obtains a code length table according to the encoding code length of each symbol in the data to be encoded. The code length table contains at least one code length of the code. For example, taking the above example into account, the code values of μ 1, μ 2, μ 3, μ 4, μ 5 contain two code lengths of 2 bits and 3 bits, and the code length table obtained by the encoding circuit 130 in this example is [2, 3 ]. Optionally, the various code lengths in the code length table are arranged in ascending order. Alternatively, various code lengths in the code length table may also be identified using serial numbers in turn. For example, 0-1 may be used, identifying 2, 3 respectively in the long table.
Step S443, obtaining a code length boundary table and a code length basic value table of the data to be encoded according to the code length and the ordering value of each symbol in the data to be encoded.
The code length boundary table contains the code value of the symbol with the largest rank value in the symbols of each code length. The code length basis value table contains a basis value for each code length. Specifically, the encoding circuit 130 obtains a code length boundary table and a code length basic value table of the data to be encoded according to the code length and the ordering value of each symbol in the data to be encoded. More specifically, the encoding circuit 130 first searches for a symbol of a maximum ordering value of symbols of each code length in the symbol sequence table, and constructs the code length boundary table using the maximum ordering value, where the maximum ordering value is the maximum value of the ordering values of the symbols of the various code lengths. Further, the maximum sorting values in the code length boundary table are sorted in an ascending order to obtain the code length boundary table. The encoding circuit 130 may further obtain a basic value of each encoding code length according to each maximum sorting value and the encoding value corresponding to each maximum sorting value, and obtain the code length basic value table according to the basic value of each encoding code length. Further, subtracting the sorting value from the code value corresponding to each maximum sorting value to obtain a basic value of each code length.
For example: in the above example, the symbol having the largest rank value among symbols having a code length of 2 bits is encoded by μ 4, and the symbol having the largest rank value among symbols having a code length of 3 bits is encoded by μ 5, and the rank value is 4. The encoding circuit 130 arranges 2 and 4 in ascending order, and the obtained code length boundary table is [2, 4 ]. μ 4 in the above example is a symbol of the maximum rank value among symbols having a code length of 2 bits, the rank value thereof is 2, and the code value thereof is 10 (binary, corresponding to a decimal value of 2), and therefore, the code circuit 130 in this example obtains a base value of 0 having a code length of 2 bits; μ 5 is a symbol of the maximum rank value among the symbols having a code length of 3 bits, the rank value thereof is 4, and the code value thereof is 111 (binary, corresponding to a decimal value of 7), so that the code circuit 130 in this example obtains a base value of 3 having a code length of 4 bits. The code length basis value table obtained by the encoding circuit 130 in this example is [0, 3 ].
And step S444, encoding each symbol in the data to be encoded by using the symbol sequence table, the code length boundary table and the code length basic value table.
Specifically, the encoding circuit 130 encodes each symbol in the data to be encoded by using the symbol sequence table, the code length table of the data to be encoded, the code length boundary table, and the code length basic value table. Optionally, the encoding circuit 130 first determines the ordering value of each symbol in the data to be encoded according to the symbol sequence table. Then, the encoding circuit 130 determines a code length of each symbol based on the sorted value of each symbol, and then determines a base value of each symbol based on the code length and the base value. After the encoding circuit 130 determines the basic value of the symbol, an initial encoding value is obtained according to the basic value and the ordering value of the symbol corresponding to the symbol. Finally, the encoding circuit 130 determines the encoding code length of each symbol according to the sorting value of the symbol corresponding to each symbol, and obtains the encoding value of each symbol according to the encoding code length of each symbol and the initial encoding value.
For example, following the above example, the sequence of symbols used is shown in FIG. 7; the code length boundary table used is [2, 4], wherein the sorting value 2 in the code length boundary table is the boundary of the code length of 2 bits, and the sorting value 4 is the boundary of the code length of 3 bits; the code length basic value table is [0, 3], wherein the basic value 0 in the code length basic value table is a basic value of a code length of 2 bits, and the basic value 3 is a basic value of a code length of 3 bits. If the symbol mu 3 in the data to be encoded is encoded, firstly, the ordering value of the mu 3 obtained according to the symbol sequence table, the code length boundary table and the code length basic value table is 2, the encoding code length is 2, the basic value is 0, and then, the initial encoding value obtained according to the basic value 0 and the ordering value 2 is 2. The code length of this symbol μ 3 is 2, and therefore the code circuit obtains a code value of 10 for the symbol μ 3 in this example.
The coding method proposed in this embodiment uses a symbol sequence table, a code length boundary table, and a code length basic value table constructed based on the coding values of the respective symbols, since the code value of each symbol in this embodiment is obtained by adding 1 to the code value of the symbol with the same code length according to the symbol sequence and adding 1 to complement 0 to the code value of the symbol with different code lengths according to the symbol sequence, each symbol in the data to be encoded is encoded, based on the rule, the table lookup (coding value comparison table) operation in the original Huffman coding can be converted into a symbol lookup table combined with simple operation, since the data amount of the symbol sequence table is much smaller than that of the code value comparison table, and the conversion operation only includes comparison and addition operations, therefore, the data encoding method is easily divided into a plurality of parallel operations, and thus the encoding efficiency is high.
In one embodiment, as shown in fig. 8, the encoding method further includes:
and S450, arranging various code lengths in the code length table in an ascending order, and sequentially identifying various code lengths in the code length table in a descending order by using serial numbers.
Specifically, the encoding circuit 130 arranges the various encoding code lengths in the code length table in an ascending order, and sequentially identifies the various encoding code lengths in the code length table arranged in a descending order using the sequence numbers. For example: code length table [2, 3], arranging various code lengths in ascending order; using 0 to identify the code length of 2 bits; the code length is 3 bits using 1 identification code.
And step S460, arranging the sequencing values in the code length boundary table in an ascending order, and sequentially identifying the sequencing values in the code length boundary table in a descending order by using the sequence numbers corresponding to the code length table.
Specifically, the encoding circuit 130 arranges the respective sorting values in the code length boundary table in an ascending order, and sequentially identifies the respective sorting values in the code length boundary table in a descending order using the sequence numbers corresponding to the code length table. For example: the sorting values are sorted in ascending order [2, 4] in the code length boundary table. Rank value 2 is identified with 0; the rank value 4 is identified with 1.
Step S470, arranging each basic value in the code length basic value table in ascending order, and sequentially identifying each basic value in the code length basic value table in descending order by using the sequence number corresponding to the code length table.
Specifically, the encoding circuit 130 arranges the respective basic values in the code length basic value table in ascending order, and sequentially identifies the respective basic values in the code length basic value table in descending order using the sequence numbers corresponding to the code length table. For example: code length basis value table [0, 3], in ascending order.
In the embodiment, the code length table, the code length boundary table and the serial numbers of the numerical values in the code length basic value table are uniformly set, so that other information can be easily determined according to determined information (such as code length) during subsequent data encoding, and the data processing efficiency is improved.
In one embodiment, as shown in fig. 9, step S444 includes,
step S4441, determining the code length of the current symbol according to the sorting value of the current symbol in the symbol sequence table, the code length table and the code length boundary table.
Specifically, the encoding circuit 130 determines the encoding code length of the current symbol according to the sorting value of the current symbol in the symbol sequence table, the code length table, and the code length boundary table. Optionally, the encoding circuit 130 first accesses the code length boundary table, determines that the ordering value in the code length boundary table is not less than the minimum ordering value of the current symbol, and then determines the encoding code length of the current symbol according to the minimum ordering value in the code length boundary table and the code length table. Alternatively, if each code length in the code length table identifies a sequence number, and each ordering value in the code length boundary table also identifies a sequence number, the encoding circuit 130 determines the corresponding code length in the code length table according to the sequence number of the determined minimum ordering value of the current symbol in the code length boundary table, after determining that the ordering value in the code length boundary table is not less than the small ordering value of the current symbol.
For example: the code length boundary table [2, 4], mu 3 has an ordering value of 1 in the symbol sequence table, and the encoding circuit accesses the code length boundary table to determine that the smallest ordering value not less than the ordering value 1 is 2. The serial number of the ordering value 2 in the code length boundary table is 0, and according to the serial number 0, the code length with the serial number 0 is searched in the code length tables [2 and 3], and the code length of mu 3 is determined to be 2.
Step S4442, obtaining a basic value of the current symbol according to the code length of the current symbol and the code length basic value table.
Specifically, the encoding circuit 130 obtains the base value of the current symbol according to the encoding code length of the current symbol and the code length base value table. Optionally, the encoding circuit 130 first accesses the code length boundary table, determines that the ordering value in the code length boundary table is not less than the minimum ordering value of the current symbol, and then determines the encoding code length of the current symbol according to the minimum ordering value in the code length boundary table and the code length table. And finally, determining the basic value of the current symbol according to the corresponding relation between the code length and each basic value in the code length basic values. Alternatively, if the data in the code length table, the code length boundary table, and the code length basic value table identify sequence numbers, the encoding circuit 130 determines the corresponding basic value in the code length basic value table according to the sequence number of the determined minimum sorting value of the current symbol in the code length boundary table after determining that the sorting value in the code length boundary table is not less than the minimum sorting value of the current symbol.
For example: the code length boundary table [2, 4], mu 3 has an ordering value of 1 in the symbol sequence table, and the encoding circuit accesses the code length boundary table to determine that the smallest ordering value not less than the ordering value 1 is 2. The sequence number of the ordering value 2 is 0 in the code length boundary table, and the base value of 0 is searched in the code length base value table [0, 3] according to the sequence number 0, and the base value of mu 3 is determined to be 0.
Step S4443, obtaining an initial code value of the current symbol according to the sorting value of the current symbol in the symbol sequence table and the basic value of the current symbol.
Specifically, the encoding circuit 130 obtains an initial encoded value of the current symbol according to the sorting value of the current symbol in the symbol sequence table and the basic value of the current symbol. Optionally, the encoding circuit 130 adds the sorted value of the current symbol in the symbol sequence table to the base value of the current symbol to obtain an initial encoded value of the current symbol.
For example, taking the above example into account, determining that the base value of μ 3 is 0 and the rank value is 1, then the initial code value of μ 3 is 1.
Step S4444, obtaining the code value of the current symbol according to the initial code value of the current symbol and the code length of the current symbol.
Specifically, the encoding circuit 130 obtains the encoded value of the current symbol according to the initial encoded value of the current symbol and the encoded code length of the current symbol.
For example, taking the above example, the initial code value of μ 3 is 1, the code length is 2, and the code value of μ 3 is 01.
In the encoding method in this embodiment, the symbol sequence table is queried to obtain the ordering value of the symbol, then the code length and the basic value of the current symbol are determined according to the ordering value, the initial code value is obtained according to the basic value and the ordering value, and finally the code value of the symbol is obtained according to the code length and the initial code value.
The following specifically describes the application of the data encoding method in the above embodiment, taking as an example the application of the data encoding method to the operation process executed by the operation device in the above embodiment.
As shown in fig. 10, a data processing method proposed in one embodiment is executed by the computing device 100, and specifically includes:
in step S610, the main operation unit receives input data, and encodes the input data using any one of the data encoding methods in the above embodiments to obtain encoded data.
Specifically, the encoding circuit 130 of the main operation unit 110 encodes the input data using any one of the data encoding methods in the above-described embodiments to obtain encoded data. Further, the encoding circuit 130 firstly accesses the symbol sequence table to obtain the sequence value of the current symbol in the input data; then, determining the code length of the current symbol according to the sorting value of the current symbol in the symbol sequence table, the code length table and the code length boundary table; obtaining a basic value of the current symbol according to the code length of the current symbol and the code length basic value table; obtaining an initial coding value of the current symbol according to the sorting value of the current symbol in the symbol sequence table and the basic value of the current symbol; and obtaining the coding value of the current symbol according to the initial coding value of the current symbol and the coding length of the current symbol. And circularly executing the data coding step to obtain the coding values of other symbols in the input data so as to code the input data.
In step S620, the master operation unit transfers the obtained encoded data to the slave operation unit.
Alternatively, if the arithmetic device has the configuration shown in fig. 2, the master arithmetic unit may transmit the encoded data to the slave arithmetic units via k slave arithmetic units connected to the master arithmetic unit. Alternatively, if the arithmetic device has the configuration shown in fig. 2, the master arithmetic unit may transmit the encoded data to the slave arithmetic unit via the branch arithmetic unit.
In step S630, the encoding circuit of the arithmetic unit receives the encoded data, and decodes the encoded data to obtain decoded data.
In step S640, the slave arithmetic unit performs multiplication using the decoded data to obtain an intermediate result, and transmits the intermediate result to the master arithmetic unit. Optionally, the slave arithmetic unit may also encode the intermediate result by applying the data encoding method according to any of the embodiments, and then transmit the encoded intermediate result to the master arithmetic unit. Alternatively, if the arithmetic device has the configuration shown in fig. 2, the slave arithmetic unit may transmit the intermediate result to the master arithmetic unit through k slave arithmetic units connected to the master arithmetic unit. Alternatively, if the computing device has the configuration shown in fig. 2, the slave computing unit may transmit the intermediate result to the slave computing unit through the branch computing unit.
In step S650, the main operation unit performs an accumulation and activation operation using the intermediate result to obtain an operation result. Optionally, if the encoded intermediate result is transmitted from the slave arithmetic unit, the encoding circuit of the master arithmetic unit needs to decode the encoded intermediate result first, and then perform the accumulation and activation operations to obtain the arithmetic result.
Alternatively, if the operation result is the final operation result, the operation device 100 may terminate the data processing flow. If the operation result is not the final operation result, the operation device 100 can perform the operation of the next stage using the operation result.
The computing device in the above embodiment encodes the input data and transmits the encoded input data to the slave computing unit, so that the bandwidth requirement of data transmission between the computing units can be reduced.
As shown in fig. 11, in another embodiment of the present application, a data decoding method is proposed, which can decode encoded data obtained by using the data encoding method in any of the above embodiments, and the data decoding method specifically includes:
step S510, the code length of Huffman coding of each symbol is used as the code length of each symbol.
It should be noted that the code length of each symbol in the correspondence between each symbol and the code value obtained by the method is consistent with Huffman coding. Specifically, the encoding circuit 130 uses the code length of Huffman encoding of each symbol in the original data as the encoding code length of each symbol; .
Alternatively, the encoding circuit 130 may first obtain a corresponding Huffman tree according to the occurrence probability of each symbol. And then, obtaining the code length of each symbol (mu 1, mu 2, mu 3, mu 4 and mu 5) according to the path length from the leaf node to the root node corresponding to each symbol in the Huffman tree. In one example, the symbol: μ 1, μ 2, μ 3, μ 4, μ 5, the corresponding probability of occurrence is: 0.4, 0.1, 0.2, 0.1. The Huffman tree obtained by the encoding circuit 130 according to the occurrence probability of μ 1, μ 2, μ 3, μ 4, μ 5 is shown in fig. 3. As can be seen from fig. 5, the path lengths from the leaf nodes to the root nodes corresponding to μ 1, μ 2, μ 3, μ 4, and μ 5 are 2, 3, 2, and 3, respectively, and the path lengths from the leaf nodes to the root nodes in the Huffman tree correspond to the code lengths of Huffman coding, so the coding code lengths corresponding to μ 1, μ 2, μ 3, μ 4, and μ 5 obtained by the coding circuit 130 in this example are 2, 3, 2, and 3, respectively.
Alternatively, the encoding circuit 130 may also directly obtain the Huffman-coded code length of each symbol to obtain the code length of each symbol. Following the above example, the Huffman codes (the convention Huffman book indicates that the branch pointing to the left sub-tree represents "0" and the branch pointing to the right sub-tree represents "1") obtained from the Huffman tree constructed based on the occurrence probabilities of μ 1, μ 2, μ 3, μ 4, and μ 5 are: 00. 010, 10, 11, 011. Because the coding code length of each symbol in the method is consistent with the Huffman coding, if the coding circuit 130 can directly wait for the Huffman coding of each symbol in the coded data, the coding code length of each symbol can be directly obtained by the coding.
And S520, arranging the symbols in a descending order according to the occurrence frequency of the symbols to obtain the ordering value of the symbols.
Specifically, the encoding circuit 130 arranges each symbol in descending order according to the occurrence frequency of each symbol to obtain the ordered value of each symbol. Optionally, the initial value of the ranking value is 0. For example: the raw data contains the symbols: μ 1, μ 2, μ 3, μ 4, μ 5, the corresponding probability of occurrence is: 0.4, 0.1, 0.2, 0.1. Arranging the symbols in a descending order according to the occurrence probability, wherein the obtained ordering value of each symbol is as follows: 0. 3, 1, 2 and 4.
Step S530, if the code length of the current symbol is the same as the code length of the symbol of the previous sorting value, adding 1 to the code value of the symbol of the previous sorting value to obtain the code value of the current symbol; and if the coding length of the current symbol is different from the coding length of the symbol of the last sequencing value, adding 1 to the coding value of the symbol of the last order to obtain a numerical value supplementary mantissa 0 to obtain the coding value of the current symbol.
Specifically, if the code length of the current symbol is the same as the code length of the symbol of the previous sorting value, the coding circuit 130 adds 1 to the code value of the symbol of the previous sorting value to obtain the code value of the current symbol. If the code length of the current symbol is different from the code length of the symbol of the previous ordering value, the encoding circuit 130 adds 1 to the encoded value of the symbol of the previous order to obtain a numeric supplementary mantissa 0, so as to obtain the encoded value of the current symbol. More specifically, when encoding data, the encoding circuit 130 first compares whether the code length of the current symbol is the same as the code length of the symbol of the previous rank value, and if so, adds 1 to the code length of the symbol of the previous rank value to obtain the code length of the current symbol. If the code length of the current symbol is different from that of the symbol of the previous sorting value, 1 is added to the code value of the symbol of the previous sorting value, and then a mantissa 0 is supplemented on the basis of the numerical value obtained by adding 1 to the code value of the symbol of the previous bit to obtain the code value of the current symbol. The encoding circuit 130 obtains the encoded value of each symbol by applying the above method, i.e. the corresponding relationship between each symbol and the encoded value is obtained.
For example: taking the above example into account, for example, the rank value of μ 3 is 1, the symbol corresponding to the rank value 0 above it is μ 1, the code value of μ 1 is set to "00", and 1 is added to "00", so as to obtain the code value of μ 3 of "01". The sorting value of mu 2 is 3, the symbol corresponding to the previous sorting value 2 is mu 4, the code value of mu 4 is set to be 10, 1 is added on the basis of 10 to obtain the value 11, then the mantissa 0 is complemented on the basis of 11 to obtain the code value of mu 2 to be 110. Setting the code value of μ 1 to "00", the code values of μ 3, μ 4, μ 2 and μ 5 are 01, 10, 110 and 111, respectively. In this example, the corresponding relationship between each symbol and the encoded value obtained by the encoding circuit 130 is: μ 1 corresponds to 00; μ 2 corresponds to 110; μ 3 corresponds to 01; μ 4 corresponds to 10; μ 5 corresponds to 111.
It should be noted that, since the encoding rule used by the data needs to correspond to the decoding rule, the encoded value corresponding to each symbol obtained by the decoding method in the present embodiment is the same as the encoded value corresponding to each symbol obtained by the data encoding method in the foregoing embodiment.
And step S540, decoding the data to be decoded according to the symbols corresponding to the coding values.
Wherein, the data to be decoded is coded data. Specifically, the encoding circuit 130 decodes the data to be decoded according to the symbol corresponding to each encoded value. Alternatively, if the encoded values corresponding to the symbols obtained in the above steps S510 to S530 are stored in a format of a relationship table, the encoding circuit 130 decodes the data to be decoded by querying the relationship table. Alternatively, if the symbol sequence table, the code length table, the code value range table, and the code length basic value table are obtained based on each symbol and corresponding code value obtained in the above steps S510 to S530, the encoding circuit 130 first obtains the pre-stored symbol sequence table, code length table, code value range table, and code length basic value table, and then determines the corresponding relationship between each symbol and code value in the original data according to the symbol sequence table, code length table, code value range table, and code length basic value table.
In the data decoding method in this embodiment, the coding code length of each symbol is determined according to Huffman coding of each symbol, and then the coding value of each symbol is obtained by adding 1 or adding 1 to complement mantissa 0 based on the coding code length, that is, the corresponding relationship between each symbol and the coding value is obtained. And finally, decoding the data to be decoded by using the corresponding relation between each symbol and the coding value. Since the corresponding relationship between each symbol and the encoded value in this embodiment is obtained by adding 1 or adding 1 to complement 0 according to the rank value of each symbol, the method is regularly and cyclically operable in the process of determining the encoded value corresponding to the symbol, and is simple to operate and low in error rate.
In one embodiment, as shown in fig. 12, step S540 includes:
step S541, arranging the symbols in descending order according to the occurrence frequency to obtain an order value of each symbol, and obtaining a symbol sequence table according to the order value of each symbol.
Specifically, the encoding circuit 130 arranges the symbols in descending order according to the occurrence frequency to obtain the ordered values of the symbols, and obtains a symbol sequence table according to the ordered values of the symbols. Optionally, the initial rank value of the symbol sequence table is 0. For example: symbols included in data to be encoded: μ 1, μ 2, μ 3, μ 4, μ 5, the corresponding probability of occurrence is: 0.4, 0.1, 0.2, 0.1. In this example, the encoding circuit 130 arranges μ 1, μ 2, μ 3, μ 4, and μ 5 in descending order of the frequency of occurrence, and obtains the ordering value of each symbol as: 0.4, 2, 3, 5; the resulting symbolic sequence table is shown in FIG. 7.
And step S542, obtaining a code length table according to the code length of each symbol.
Specifically, the encoding circuit 130 obtains a code length table from the encoding code length of each symbol. The code length table contains at least one code length of the code. For example, taking the above example into account, the code values of μ 1, μ 2, μ 3, μ 4, μ 5 contain two code lengths of 2 bits and 3 bits, and the code length table obtained by the encoding circuit 130 in this example is [2, 3 ]. Optionally, the various code lengths in the code length table are arranged in ascending order. Alternatively, various code lengths in the code length table may also be identified using serial numbers in turn. For example, 0-1 may be used, identifying 2, 3 respectively in the long table.
And step S543, obtaining a code value range table and a code length basic value table according to the code length, the code value and the sorting value of each symbol.
Wherein the code value range table contains the maximum code value of the code values of each code length. The code length basis value table contains a basis value for each code length. Specifically, the encoding circuit 130 obtains an encoding value range table and a code length basic value table according to the encoding code length, the encoding value and the sorting value of each symbol. More specifically, the encoding circuit 130 first searches for the symbol of the maximum ordered value of each encoding code length in the symbol sequence table, and then constructs the encoding value range table using the encoding values of the symbols of the respective maximum ordered values. Further, the encoding values in the encoding value range table are arranged in an ascending order to obtain the encoding value range table. The encoding circuit 130 may further obtain a basic value of each encoding code length according to the encoding value in the encoding value range table and the corresponding sorting value, and obtain the code length basic value table according to the basic value of each encoding code length. Further, a numerical value obtained by subtracting the corresponding sorting value from the code value in the code value range table is used as a basic value of various code lengths.
For example: in the above example, the 10 (binary) code length is the maximum code value among the code values of 2 bits, and the 111 (binary) code length is the maximum code value among the code values of 3 bits. The encoding circuit 130 arranges 10, 111 in ascending order, and obtains an encoding value range table [10, 111 ]. Alternatively, the binary values may be converted into decimal values, and the resulting code value range is represented as [2, 7 ]. μ 4 in the above example is the symbol of the maximum ordered value among the symbols having a code length of 2 bits, the ordered value thereof is 2, and the code value thereof is 10 (binary, corresponding to a decimal value of 2), so that the code circuit 130 in this example obtains a base value of 0 having a code length of 2 bits (operation process: 2-2); μ 5 is a symbol of the largest rank value among the symbols of which code length is 3 bits, the rank value is 4, and the code value is 111 (binary, corresponding to a decimal value of 7), so that the code circuit 130 in this example obtains a base value of 3 of code length of 4 bits (operation process: 7-3). The code length basis value table obtained by the encoding circuit 130 in this example is [0, 3 ].
Step S544 is to decode each encoded value in the data to be decoded by using the symbol sequence table, the code length table, the encoded value range table, and the code length basic value table.
Specifically, the encoding circuit 130 decodes each encoded value in the data to be decoded by using the symbol sequence table, the code length table, the encoded value range table, and the code length basic value table. Optionally, the encoding circuit 130 first determines a code length and a basic value of each encoded value in the data to be decoded according to the encoded value range table, then determines a basic value corresponding to each encoded value according to the code length, obtains an ordering value of each encoded value according to the encoded value and the corresponding basic value, and finally obtains a symbol corresponding to each encoded value according to the ordering value, that is, decoding of each encoded value in the data to be decoded is achieved.
For example, following the above example, the sequence of symbols used is shown in FIG. 7; the coding value range table used is [2, 7], wherein the coding value 2 in the coding value range table is the maximum coding value of the coding code length 2, and the coding value 7 is the maximum coding value of the coding code length 3; the code length basic value table is [0, 3], wherein the basic value 0 in the code length basic value table is a basic value of a code length of 2 bits, and the basic value 3 is a basic value of a code length of 3 bits. And setting to decode the coded value 01 in the data to be decoded. The encoding circuit 130 first determines the minimum encoding value not less than the current encoding value in the encoding value range table according to the encoding value range table, and the obtained result is 0 and the code length is 2. Then, the base value is determined to be 0 according to the code length. The resulting rank value is the encoded value minus the base value: 1. and finally, inquiring the symbol sequence table to obtain the symbol mu 3 with the sequence value of 1.
The coding method proposed in this embodiment uses a symbol sequence table, a coding value range table, and a code length basic value table constructed based on the coding values of the respective symbols, since the code value of each symbol in this embodiment is obtained by adding 1 to the code value of the symbol with the same code length according to the symbol sequence and adding 1 to complement 0 to the code value of the symbol with different code lengths according to the symbol sequence, the data to be decoded is decoded by adding 1 to complement 0 to the code value of the symbol with different code lengths, based on the rule, the table look-up (coding value comparison table) operation in the original Huffman decoding can be converted into a symbol look-up sequence table combined with simple operation, since the data amount of the symbol sequence table is much smaller than that of the code value comparison table, and the conversion operation only includes comparison and addition operations, therefore, the decoding method is easily divided into a plurality of parallel operations, and therefore, the decoding efficiency is high.
In one embodiment, as shown in fig. 13, the decoding method further includes:
and step S550, arranging various code lengths in the code length table in an ascending order, and sequentially identifying the various code lengths arranged in the descending order by using serial numbers.
Specifically, the encoding circuit 130 arranges the various encoding code lengths in the code length table in an ascending order, and sequentially identifies the various encoding code lengths arranged in a descending order using the sequence numbers. For example: code length table [2, 3], arranging various code lengths in ascending order; using 0 to identify the code length of 2 bits; the code length is 3 bits using 1 identification code.
Step S560, arranging each encoded value in the encoded value range table in ascending order, and sequentially identifying each encoded value in the encoded value range table in descending order using the sequence number corresponding to the code length table.
Specifically, the encoding circuit 130 arranges the respective encoded values in the encoded value range table in ascending order, and sequentially identifies the respective encoded values in the encoded value range table arranged in descending order using the sequence numbers corresponding to the code length table. For example: the sorting values are sorted in ascending order [2, 4] in the code length boundary table. Rank value 2 is identified with 0; the rank value 4 is identified with 1.
Step S570, arranging the basic values in the code length basic value table in ascending order, and sequentially identifying the basic values in the code length basic value table in descending order by using the sequence numbers corresponding to the code length table.
Specifically, the encoding circuit 130 arranges the respective basic values in the code length basic value table in ascending order, and sequentially identifies the respective basic values in the code length basic value table in descending order using the sequence numbers corresponding to the code length table. For example: code length basis value table [0, 3], in ascending order.
In the embodiment, the code length table, the basic value range table and the serial numbers of the numerical values in the code length basic value table are uniformly set, so that other information can be easily determined according to determined information (such as code length) during subsequent data encoding, and the data processing efficiency is improved.
In one embodiment, as shown in fig. 15, step S544 includes,
step S5441, find the minimum encoded value not less than the current encoded value in the encoded value range table.
Specifically, the encoding circuit 130 searches for a minimum encoding value in the encoding value range table that is not less than the current encoding value. For example: and a code value range table [2, 7], wherein the code value 01 (corresponding to the decimal value 1) has a minimum code value of 2 which is not less than the code value 01. Alternatively, if each coded value in the coded value range table identifies a sequence number, the sequence number obtained by the coding circuit 130 is 0.
Step S5442, obtaining the code length of the current code value according to the minimum code value not less than the current code value in the code value range table and the code length table.
Specifically, the encoding circuit 130 obtains the encoding code length of the current encoding value according to the minimum encoding value not less than the current encoding value in the encoding value range table and the code length table. Alternatively, if each code length in the code length table identifies a serial number, the encoding circuit 130 may use the serial number of the minimum encoding value not less than the current encoding value in the encoding value range table to search for the corresponding encoding code length in the code length table, and use the encoding code length as the encoding code length of the current encoding value.
Step S5443, obtaining an initial code value of the current code value according to the current code value and the code length.
Specifically, the encoding circuit 130 obtains an initial encoding value of the current encoding value according to the current encoding value and the code length.
For example: if the current code value is 01 and the code length is 2, the obtained initial code value is 1.
Step S5444, determining a base value of the current coded value according to the code length of the current coded value.
Specifically, the encoding circuit 130 determines a base value of the current encoded value according to the code length of the current encoded value. Alternatively, if each basic value in the code length basic value table identifies a sequence number, the encoding circuit 130 may determine the basic value of the current encoded value according to the sequence number of the encoded code length of the current encoded value. Alternatively, the encoding circuit 130 may also determine the base value of the current encoding value according to the sequence number of the smallest encoding value that is not smaller than the current encoding value in the determined encoding value range table.
Step S5445, obtaining the ranking value of the current code value according to the initial code value and the basic value of the current code value.
Specifically, the encoding circuit 130 obtains the ranking value of the current encoded value according to the initial encoded value and the base value of the current encoded value. Optionally, a difference between an initial encoded value of the current encoded value and the corresponding base value is used as the sorting value corresponding to the current encoded value.
For example, taking the above example, the initial code value of 01 is 1, the base value is 0, and the ranking value is 1.
And step S5446, obtaining a symbol corresponding to the current coding value according to the sorting value and the symbol sequence table.
Specifically, the encoding circuit 130 obtains a symbol corresponding to the current encoded value according to the sorted value and the symbol sequence table.
For example, following the above example, the symbol sequence table is shown in fig. 7, and the symbol corresponding to the rank value 1 is μ 3.
In the encoding method in this embodiment, the code length and the basic value of the encoded value are determined by accessing the encoded value range table, then the ordering value corresponding to the encoded value is obtained according to the initial encoded value and the basic value of the encoded value, and finally the symbol corresponding to the encoded value is determined according to the ordering value. The method has small data access amount of table lookup operation and simple operation, thereby improving the data decoding efficiency.
The following specifically describes the application of the data decoding method in the above embodiment, taking as an example the application of the data encoding method to the operation process executed by the operation device in the above embodiment.
As shown in fig. 14, a data processing method proposed in one embodiment is executed by the computing device 100, and specifically includes:
in step S710, the main operation unit receives input data and encodes the input data using any one of the data encoding methods in the above embodiments to obtain encoded data.
Specifically, the encoding circuit 130 of the main operation unit 110 encodes the input data using any one of the data encoding methods in the above-described embodiments to obtain encoded data. Further, the encoding circuit 130 firstly accesses the symbol sequence table to obtain the sequence value of the current symbol; then, determining the code length of the current symbol according to the sorting value of the current symbol in the symbol sequence table, the code length table and the code length boundary table; obtaining a basic value of the current symbol according to the code length of the current symbol and the code length basic value table; obtaining an initial coding value of the current symbol according to the sorting value of the current symbol in the symbol sequence table and the basic value of the current symbol; and obtaining the coding value of the current symbol according to the initial coding value of the current symbol and the coding length of the current symbol. And circularly executing the data coding step to obtain the coding values of other symbols in the input data so as to code the input data.
In step S720, the master operation unit transfers the obtained encoded data to the slave operation unit.
Alternatively, if the arithmetic device has the configuration shown in fig. 2, the master arithmetic unit may transmit the encoded data to the slave arithmetic units via k slave arithmetic units connected to the master arithmetic unit. Alternatively, if the arithmetic device has the configuration shown in fig. 2, the master arithmetic unit may transmit the encoded data to the slave arithmetic unit via the branch arithmetic unit.
In step S730, the encoded data is received from the encoding circuit of the arithmetic unit, and then decoded to obtain decoded data.
Specifically, after receiving the encoded data, the encoding circuit 130 of the slave arithmetic unit first accesses the encoding value range table, and searches for the maximum encoding value not less than the current encoding value in the encoding value range table; then, obtaining the code length of the current code value according to the maximum code value not less than the current code value and the code length table; obtaining an initial coding value of the current coding value according to the current coding value and the code length; determining a basic value of the current coding value according to the code length of the current coding value; then, obtaining the ranking value of the current code value according to the initial code value and the basic value of the current code value; and finally, obtaining the symbol corresponding to the current coding value according to the sorting value and the symbol sequence table. And circularly executing the data decoding step to obtain symbols corresponding to other coded values in the coded data so as to decode the coded data.
In step S740, the slave arithmetic unit performs multiplication using the decoded data to obtain an intermediate result, and transmits the intermediate result to the master arithmetic unit. Optionally, the slave arithmetic unit may also encode the intermediate result by applying the data encoding method according to any of the embodiments, and then transmit the encoded intermediate result to the master arithmetic unit. Alternatively, if the arithmetic device has the configuration shown in fig. 2, the slave arithmetic unit may transmit the intermediate result to the master arithmetic unit through k slave arithmetic units connected to the master arithmetic unit. Alternatively, if the computing device has the configuration shown in fig. 2, the slave computing unit may transmit the intermediate result to the slave computing unit through the branch computing unit.
In step S750, the main operation unit performs an accumulation and activation operation using the intermediate result to obtain an operation result. Optionally, if the encoded intermediate result is transmitted from the slave arithmetic unit, the encoding circuit of the master arithmetic unit needs to decode the encoded intermediate result first, and then perform the accumulation and activation operations to obtain the arithmetic result.
Alternatively, if the operation result is the final operation result, the operation device 100 may terminate the data processing flow. If the operation result is not the final operation result, the operation device 100 can perform the operation of the next stage using the operation result.
The above-mentioned embodiment of the computing device encodes the input data and transmits the encoded input data to the slave computing unit, and the slave computing unit decodes the received data by using the decoding method corresponding to the encoding method and then performs the computation, which can reduce the bandwidth requirement of data transmission between the computing units.
It should be understood that, although the steps in the flowcharts of fig. 4, 6, 8-15 are shown in order as indicated by the arrows, the steps are not necessarily performed in order as indicated by the arrows. The steps are not performed in the exact order shown and described, and may be performed in other orders, unless explicitly stated otherwise. Moreover, at least some of the steps in fig. 4, 6, 8-15 may include multiple sub-steps or multiple stages that are not necessarily performed at the same time, but may be performed at different times, and the order of performing the sub-steps or stages is not necessarily sequential, but may be performed alternately or alternatingly with other steps or at least some of the sub-steps or stages of other steps.
It will be understood by those skilled in the art that all or part of the processes of the methods of the embodiments described above can be implemented by hardware instructions of a computer program, which can be stored in a non-volatile computer-readable storage medium, and when executed, can include the processes of the embodiments of the methods described above. Any reference to memory, storage, database, or other medium used in the embodiments provided herein may include non-volatile and/or volatile memory, among others. Non-volatile memory can include read-only memory (ROM), Programmable ROM (PROM), Electrically Programmable ROM (EPROM), Electrically Erasable Programmable ROM (EEPROM), or flash memory. Volatile memory can include Random Access Memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in a variety of forms such as Static RAM (SRAM), Dynamic RAM (DRAM), Synchronous DRAM (SDRAM), Double Data Rate SDRAM (DDRSDRAM), Enhanced SDRAM (ESDRAM), Synchronous Link DRAM (SLDRAM), Rambus Direct RAM (RDRAM), direct bus dynamic RAM (DRDRAM), and memory bus dynamic RAM (RDRAM).
The technical features of the above embodiments can be arbitrarily combined, and for the sake of brevity, all possible combinations of the technical features in the above embodiments are not described, but should be considered as the scope of the present specification as long as there is no contradiction between the combinations of the technical features.
The above-mentioned embodiments only express several embodiments of the present application, and the description thereof is more specific and detailed, but not construed as limiting the scope of the invention. It should be noted that, for a person skilled in the art, several variations and modifications can be made without departing from the concept of the present application, which falls within the scope of protection of the present application. Therefore, the protection scope of the present patent shall be subject to the appended claims.
Claims (10)
1. A method of encoding data, the method comprising:
taking the Huffman coding length of each symbol in the data to be coded as the coding length of each symbol in the data to be coded;
according to the occurrence frequency of each symbol in the data to be encoded, arranging each symbol in the data to be encoded in a descending order mode to obtain an ordering value of each symbol;
if the code length of the current symbol is the same as that of the symbol of the last sorting value, adding 1 to the code value of the symbol of the last sorting value to obtain the code value of the current symbol; if the code length of the current symbol is different from that of the symbol of the last sorting value, adding 1 to the code value of the symbol of the last sorting value to obtain a numerical value, and supplementing 0 mantissa to obtain the code value of the current symbol;
and coding the data to be coded according to the coding value of the symbol corresponding to each symbol.
2. The method according to claim 1, wherein the encoding the data to be encoded according to the encoding value of the symbol corresponding to each symbol comprises:
arranging the symbols in the data to be coded in a descending order according to the occurrence frequency to obtain the ordering value of each symbol, and obtaining a symbol sequence table according to the ordering value of each symbol;
obtaining a code length table according to the code length of each symbol in the data to be coded;
obtaining a code length boundary table and a code length basic value table of the data to be coded according to the code length and the sorting value of each symbol in the data to be coded;
and coding each symbol in the data to be coded by using the symbol sequence table, the code length boundary table and the code length basic value table.
3. The method according to claim 2, wherein obtaining a code length boundary table and a code length basic value table of the data to be encoded according to the code length and the ordering value of each symbol in the data to be encoded comprises:
searching the symbol of the maximum sorting value of each code length in the symbol sequence table, and constructing the code length boundary table by using the maximum sorting value;
and obtaining a basic value of each code length according to each maximum sorting value and the code value corresponding to each maximum sorting value, and obtaining the code length basic value table according to the basic value of each code length.
4. The method of claim 3, wherein constructing the code length boundary table using the maximum rank values comprises:
and sequencing the maximum sequencing values in the code length boundary table in an ascending manner to obtain the code length boundary table.
5. The method of claim 4, further comprising:
arranging various code lengths in the code length table in an ascending order, and sequentially identifying various code lengths in the code length table in a descending order by using serial numbers;
arranging the maximum ordering values in the code length boundary table in an ascending order, and sequentially identifying the maximum ordering values in the code length boundary table in a descending order by using the sequence numbers corresponding to the code length table;
and arranging all the basic values in the code length basic value table in an ascending order, and sequentially identifying all the basic values in the code length basic value table in a descending order by using the sequence numbers corresponding to the code length table.
6. The method of claim 5, wherein the initial rank value of the symbol sequence table is 0.
7. The method of claim 6, wherein obtaining a base value for each code length according to each of the maximum ordering values and a code value corresponding to each of the maximum ordering values comprises:
and subtracting the sorting value from the code value corresponding to each maximum sorting value to obtain a basic value of each code length.
8. The method according to any one of claims 2-7, wherein encoding each symbol in the data to be encoded using the symbol sequence table, the code length boundary table, and the code length base value table comprises:
determining the code length of the current symbol according to the sorting value of the current symbol in the symbol sequence table, the code length boundary table and the code length table;
obtaining a basic value of the current symbol according to the code length of the current symbol and the code length basic value table;
obtaining an initial coding value of the current symbol according to the sorting value of the current symbol in the symbol sequence table and the basic value of the current symbol;
and obtaining the coding value of the current symbol according to the initial coding value of the current symbol and the coding length of the current symbol.
9. The method of claim 8, wherein obtaining the initial code value of the current symbol according to the ordered value of the current symbol in the symbol sequence table and the base value of the current symbol comprises:
and adding the sorting value of the current symbol in the symbol sequence table and the basic value of the current symbol to obtain the initial coding value of the current symbol.
10. The method of claim 1, further comprising:
and taking the value 0 corresponding to the code length as the code value of the symbol with the minimum sorting value in the symbol sequence list.
Priority Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201811625654.8A CN111384967B (en) | 2018-12-28 | 2018-12-28 | Data encoding method |
PCT/CN2019/121056 WO2020114283A1 (en) | 2018-12-07 | 2019-11-26 | Data processing method and device |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201811625654.8A CN111384967B (en) | 2018-12-28 | 2018-12-28 | Data encoding method |
Publications (2)
Publication Number | Publication Date |
---|---|
CN111384967A true CN111384967A (en) | 2020-07-07 |
CN111384967B CN111384967B (en) | 2022-12-09 |
Family
ID=71222210
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201811625654.8A Active CN111384967B (en) | 2018-12-07 | 2018-12-28 | Data encoding method |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN111384967B (en) |
Citations (10)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US5841381A (en) * | 1993-12-20 | 1998-11-24 | Canon Kabushiki Kaisha | Huffman coding/decoding using an intermediate code number |
JPH10336661A (en) * | 1997-04-01 | 1998-12-18 | Matsushita Electric Ind Co Ltd | Image coding/decoding device, image coding/decoding method, and recording medium recording image coding/ decoding program |
US6040790A (en) * | 1998-05-29 | 2000-03-21 | Xerox Corporation | Method of building an adaptive huffman codeword tree |
US6313767B1 (en) * | 1999-02-19 | 2001-11-06 | Canon Kabushiki Kaisha | Decoding apparatus and method |
CN103905054A (en) * | 2012-12-25 | 2014-07-02 | 展讯通信(上海)有限公司 | Code table establishing method and device, and encoding and decoding method and device |
CN104283568A (en) * | 2013-07-12 | 2015-01-14 | 中国科学院声学研究所 | A Data Compression Coding Method Based on Partial Huffman Tree |
WO2016171472A1 (en) * | 2015-04-23 | 2016-10-27 | 김정훈 | Method and apparatus for encoding and decoding data |
CN106559085A (en) * | 2016-11-30 | 2017-04-05 | 郑州云海信息技术有限公司 | A kind of normal form Hafman decoding method and its device |
CN106849956A (en) * | 2016-12-30 | 2017-06-13 | 华为机器有限公司 | Compression method, decompression method, device and data processing system |
CN107565973A (en) * | 2017-08-01 | 2018-01-09 | 中国人民解放军国防科学技术大学 | The implementation method and circuit structure of a kind of expansible Huffman encoding of node |
-
2018
- 2018-12-28 CN CN201811625654.8A patent/CN111384967B/en active Active
Patent Citations (10)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US5841381A (en) * | 1993-12-20 | 1998-11-24 | Canon Kabushiki Kaisha | Huffman coding/decoding using an intermediate code number |
JPH10336661A (en) * | 1997-04-01 | 1998-12-18 | Matsushita Electric Ind Co Ltd | Image coding/decoding device, image coding/decoding method, and recording medium recording image coding/ decoding program |
US6040790A (en) * | 1998-05-29 | 2000-03-21 | Xerox Corporation | Method of building an adaptive huffman codeword tree |
US6313767B1 (en) * | 1999-02-19 | 2001-11-06 | Canon Kabushiki Kaisha | Decoding apparatus and method |
CN103905054A (en) * | 2012-12-25 | 2014-07-02 | 展讯通信(上海)有限公司 | Code table establishing method and device, and encoding and decoding method and device |
CN104283568A (en) * | 2013-07-12 | 2015-01-14 | 中国科学院声学研究所 | A Data Compression Coding Method Based on Partial Huffman Tree |
WO2016171472A1 (en) * | 2015-04-23 | 2016-10-27 | 김정훈 | Method and apparatus for encoding and decoding data |
CN106559085A (en) * | 2016-11-30 | 2017-04-05 | 郑州云海信息技术有限公司 | A kind of normal form Hafman decoding method and its device |
CN106849956A (en) * | 2016-12-30 | 2017-06-13 | 华为机器有限公司 | Compression method, decompression method, device and data processing system |
CN107565973A (en) * | 2017-08-01 | 2018-01-09 | 中国人民解放军国防科学技术大学 | The implementation method and circuit structure of a kind of expansible Huffman encoding of node |
Non-Patent Citations (1)
Title |
---|
王防修等: "一种哈夫曼编码的改进算法", 《武汉轻工大学学报》 * |
Also Published As
Publication number | Publication date |
---|---|
CN111384967B (en) | 2022-12-09 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Brisaboa et al. | (S, C)-dense coding: An optimized compression code for natural language text databases | |
CN101800556A (en) | Be fit to the method and the setting of data compression | |
JP2002252563A (en) | Method and device for decoding hofmann code, and table for hofmann code decoding and its generating method | |
CN111384966B (en) | Data decoding method | |
CN111384967B (en) | Data encoding method | |
CN1768480B (en) | Encoding device and method, decoding device and method | |
KR20200003185A (en) | Method and apparatus, device, and storage medium for sequence determination | |
US6049633A (en) | Adaptive arithmetic codec method and apparatus | |
US20060018556A1 (en) | Method, apparatus and system for data block rearrangement for LZ data compression | |
CN111384962B (en) | Data compression/decompression device and data compression method | |
CN111384963B (en) | Data compression/decompression device and data decompression method | |
CN114500670B (en) | Encoding compression method, decoding method and device | |
US6778107B2 (en) | Method and apparatus for huffman decoding technique | |
Roy et al. | Sbvrldnacomp: An effective dna sequence compression algorithm | |
US6573847B1 (en) | Multi-table mapping for huffman code decoding | |
US6101281A (en) | Method for improving data encoding and decoding efficiency | |
US5991340A (en) | Method and system for encoding and decoding data using run prediction | |
CN111384964B (en) | Data compression/decompression device and data compression method | |
CN111384968B (en) | Data compression/decompression device and data decompression method | |
CN107797541B (en) | Image file portable decompression algorithm based on ZigBee firmware upgrading in smart home environment | |
US11640265B2 (en) | Apparatus for processing received data | |
CN116032429B (en) | Method and system for determining interleaving length of triangular interleaver | |
CN110958100A (en) | Equipment control method and device | |
Rakitskiy | The efficient approach of calculating the predictors | |
US12034462B2 (en) | Compressing probability tables for entropy coding |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
GR01 | Patent grant |