[go: up one dir, main page]

WO2020232683A1 - 一种基于流水式的硬件压缩的系统及方法 - Google Patents

一种基于流水式的硬件压缩的系统及方法 Download PDF

Info

Publication number
WO2020232683A1
WO2020232683A1 PCT/CN2019/088033 CN2019088033W WO2020232683A1 WO 2020232683 A1 WO2020232683 A1 WO 2020232683A1 CN 2019088033 W CN2019088033 W CN 2019088033W WO 2020232683 A1 WO2020232683 A1 WO 2020232683A1
Authority
WO
WIPO (PCT)
Prior art keywords
character
data
unit
matching
compressed
Prior art date
Application number
PCT/CN2019/088033
Other languages
English (en)
French (fr)
Inventor
蔡晔
罗浩鑫
Original Assignee
深圳大学
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 深圳大学 filed Critical 深圳大学
Priority to PCT/CN2019/088033 priority Critical patent/WO2020232683A1/zh
Publication of WO2020232683A1 publication Critical patent/WO2020232683A1/zh

Links

Images

Classifications

    • HELECTRICITY
    • H03ELECTRONIC CIRCUITRY
    • H03MCODING; DECODING; CODE CONVERSION IN GENERAL
    • H03M7/00Conversion of a code where information is represented by a given sequence or number of digits to a code where the same, similar or subset of information is represented by a different sequence or number of digits
    • H03M7/30Compression; Expansion; Suppression of unnecessary data, e.g. redundancy reduction
    • H03M7/40Conversion to or from variable length codes, e.g. Shannon-Fano code, Huffman code, Morse code

Definitions

  • the present invention relates to the technical field of data compression, in particular to a system and method based on pipeline hardware compression.
  • the embodiment of the present invention provides a pipeline-based hardware compression system and method to increase the data compression rate and reduce the time and memory resources occupied by the central processing unit.
  • the first aspect of the present invention provides a pipeline-based hardware compression system, which includes a management control unit, a pipeline matching unit, a standard format conversion unit, a static Huffman coding unit, and a data stream generation unit;
  • the management control unit is configured to obtain M bits of data to be compressed from an input buffer channel, where M is a positive integer, divide the M bits of data to be compressed into N characters, and divide the N characters into N characters Periodically input the pipeline matching unit, where N is a positive integer less than M, obtain (M/N) bits of compressed data from the data stream generating unit in each period, and output M bits of compressed data after obtaining N periods ;
  • the pipeline matching unit is used to calculate the position where the i-th character matches the previous (i-1) character and the same length of continuous matching, where i is a positive integer not greater than N;
  • the standard format conversion unit is configured to match the i-th character, the i-th character, and the first (i-1) character to the same position, and the i-th character to continuously match the same length Standard data converted into standard data format;
  • the static Huffman coding unit is used to encode the standard data to obtain a Huffman coded stream
  • the data stream generating unit is used to convert the Huffman coded stream into compressed data of every cycle (M/N) bit for output.
  • the management control unit specifically includes:
  • the input state machine unit is configured to obtain the M-bit data to be compressed from the input buffer channel, divide the M-bit data to be compressed into the N characters, and divide the N characters into N periodic input In the pipeline matching unit, each of the N characters is (M/N) bits;
  • a state setting unit configured to set the large and small end conversion of the M-bit data to be compressed and the M-bit compressed data
  • the output state machine unit is configured to obtain the (M/N) bit compressed data from the data stream generating unit every period, and output the M bit compressed data after obtaining N periods.
  • the pipeline matching unit specifically includes:
  • a matching position calculating unit configured to calculate the position where the i-th character matches the first (i-1) character
  • K character matching units are used to match the i-th character with the characters stored in the K character matching units, and if the matches are the same, output matching signals, where K is a positive integer, and the K characters
  • the matching unit has a pipeline structure. After the j-th character matching unit is matched, the character stored in the j-th character matching unit is output to the (j+1)-th character matching unit, and the j-th character is matched to the character stored in the unit Replace with the character input to the j-th character matching unit, where j is a positive integer less than K;
  • the matching length calculation unit is used to calculate that the i-th character continuously matches the same length.
  • the standard format conversion unit specifically includes:
  • the first character buffer unit is configured to store the i-th character, the position where the i-th character matches the first (i-1) character, and the i-th character continuously matches the same length;
  • the second character buffer unit is used to store the (i-1)th character, the position where the (i-1)th character matches the previous (i-2) character, and the (i-1)th character Consecutive characters match the same length;
  • the third character buffer unit is used to store the (i-2)th character, the position where the (i-2)th character matches the same position as the previous (i-3) character, and the (i-2)th character Consecutive characters match the same length;
  • the format conversion state machine unit is used for judging the data stored in the first character buffer unit, the second character buffer unit, and the third character buffer unit to combine the i-th character and the i-th character
  • the character matches the same position with the first (i-1) characters, and the i-th character continuously matches the same length and is converted into the standard data;
  • the standard data is a single character, or the standard data includes a position distance, a matching length, and a single character.
  • the static Huffman coding unit specifically includes:
  • a character encoding unit for converting the single character into a Huffman encoding format to obtain a first Huffman encoding stream
  • a position coding unit for converting the position distance into a Huffman coding format
  • Huffman encoding splicing unit used to splice the position distance, matching length and single character converted into Huffman encoding format into a second Huffman encoding stream;
  • the coded stream selection unit is configured to select whether to output in the first Huffman coded stream or output in the second Huffman coded stream.
  • the second aspect of the present invention provides a pipeline-based hardware compression method, including:
  • the pipeline matching unit is used to perform pipelined character matching, which improves the matching efficiency, thereby increasing the data compression rate.
  • the entire compression process does not require the central processing unit to calculate, reducing the time and memory resources of the central processing unit. .
  • FIG. 1 is a schematic structural diagram of a system based on pipeline hardware compression provided by an embodiment of the present invention
  • FIG. 2 is a schematic structural diagram of a management control unit in a system based on pipeline hardware compression provided by an embodiment of the present invention
  • FIG. 3 is a schematic structural diagram of a pipeline matching unit in a pipeline-based hardware compression system provided by an embodiment of the present invention
  • FIG. 4 is a schematic structural diagram of a standard format conversion unit in a pipeline-based hardware compression system provided by an embodiment of the present invention
  • FIG. 5 is a schematic structural diagram of a static Huffman coding unit in a pipeline-based hardware compression system provided by an embodiment of the present invention
  • FIG. 6 is a flowchart of a method for pipeline-based hardware compression provided by an embodiment of the present invention.
  • FIG. 7 is a flowchart of another pipeline-based hardware compression method provided by an embodiment of the present invention.
  • FIG. 8 is a working flowchart of a data flow generating unit in a pipeline-based hardware compression system provided by an embodiment of the present invention.
  • the embodiment of the present invention provides a pipeline-based hardware compression system and method to increase the data compression rate and reduce the time and memory resources occupied by the central processing unit.
  • the pipeline-based hardware compression system can realize the static Gzip compression function, and the compression format is a standard deflate compression encoding stream format.
  • the data is compressed by the management control unit, the pipeline matching unit, the standard format conversion unit, the static Huffman coding unit, and the data stream generation unit.
  • the pipeline matching unit is used to perform pipeline character matching, which improves the matching efficiency and thus The data compression rate is improved, and the more character matching units used in the pipeline matching unit, the higher the data compression rate.
  • the central processing unit only needs to notify the direct memory access module (DMA, Direct Memory Access) Transmit the data to be compressed to the input buffer channel. Until the compression is completed, the central processing unit informs the direct memory access module again to transmit the compressed data to the memory.
  • DMA Direct Memory Access
  • Figure 1 is a schematic structural diagram of a pipeline-based hardware compression-based system 100 provided by an embodiment of the present invention.
  • the pipeline-based hardware compression-based system 100 provided in an embodiment of the present invention may include: A control unit 101, a pipeline matching unit 102, a standard format conversion unit 103, a static Huffman encoding unit 104, and a data stream generation unit 105;
  • the management control unit 101 is used to obtain M bits of data to be compressed from the input buffer channel, where M is a positive integer, divide the M bits of data to be compressed into N characters, and divide the N characters into N periodic input pipelines.
  • the matching unit 102 where N is a positive integer less than M, and at the same time, each cycle detects whether the data stream generating unit 105 has compressed (M/N) bit data output, and if so, obtains it from the data stream generating unit 105 ( M/N) bit compressed data, output M bit compressed data after obtaining N cycles;
  • the pipeline matching unit 102 is used to obtain a character input by the management control unit 101 every cycle, calculate the position where the i-th character matches the previous (i-1) character and the same length for consecutive matches, and combine the i-th character, The i-th character matches the same position as the previous (i-1) characters, and the i-th character continuously matches the same length and is sent to the standard format conversion unit 103, where i is a positive integer not greater than N;
  • the standard format conversion unit 103 is used to make a logical judgment on the data in the internal buffer unit, and match the i-th character, the i-th character with the previous (i-1) character in the same position, and the i-th character consecutively to match the same position
  • the length of is converted into standard data in a standard data format, and the standard data is sent to the static Huffman coding unit 104;
  • the static Huffman encoding unit 104 is configured to encode standard data through an internal encoding conversion unit to obtain a Huffman encoded stream, and send the Huffman encoded stream to the data stream generating unit 105;
  • the data stream generating unit 105 is used to convert the non-fixed-length Huffman coded stream into a fixed-length coded stream for output, obtain the Huffman coded stream once per cycle, and store the Huffman coded stream in the internal buffer array.
  • the first (M/N) bit data in the internal buffer array is sent to the management control unit 101 every cycle.
  • FIG. 2 is a schematic structural diagram of a management control unit 200 in a pipeline-based hardware compression system provided by an embodiment of the present invention.
  • the management control unit 200 provided by an embodiment of the present invention may include: an input state machine Unit 201, state setting unit 202, and output state machine unit 203;
  • the input state machine unit 201 is used to obtain M bits of data to be compressed from the input buffer channel, where M is a positive integer, divide the M bits of data to be compressed into N characters, and output them in a format of one character per cycle.
  • N is a positive integer less than M, and each of the N characters is (M/N) bits;
  • the state setting unit 202 is used to set the large and small end conversion of the input data and the output data of the state machine;
  • the output state machine unit 203 is configured to obtain compressed (M/N) bit coded stream data every cycle, and output M-bit compressed data after every N cycles.
  • the input state machine unit 201 also outputs an input valid bit, where the input valid bit is used to indicate whether the original data of the current cycle is valid, and when the input valid bit is logically 0, it means the current The original data of the cycle is invalid. When the input valid bit is logically 1, the original data of the current cycle is valid.
  • the big-endian mode means that the high byte of data is stored in the low address of the memory, while the low byte of data is stored in the high address of the memory, the address increases from small to large, and the data is placed from high to low; Little-endian mode means that the high byte of data is stored in the high address of the memory, and the low byte of data is stored in the low address of the memory.
  • the high and low addresses of the address are combined with the data bit weight, and the weight of the high address part is high. , The weight of the lower address part is low.
  • the output state machine unit 203 in addition to acquiring (M/N) bit encoded stream data every cycle, the output state machine unit 203 also acquires an encoding valid bit, where the encoding valid bit is used to indicate whether the current cycle encoded stream data is valid. When the bit is logically 0, it means that the current cycle coded stream data is invalid, and when the coded valid bit is logically 1, it means that the current cycle coded stream data is valid.
  • the input state machine unit 201 obtains 128-bit data to be compressed every 16 cycles, and divides the 128-bit data to be compressed into 16 characters in a format of one character per cycle Output, where each of the 16 characters is 8 bits;
  • the state setting unit 202 sets the state machine input data and the output data of the large and small end conversion
  • the output state machine unit 203 stores the input compressed 8-bit encoded stream data, and after each input of 16 8-bit encoded stream data, they are spliced into 128-bit compressed data for output.
  • FIG. 3 is a schematic structural diagram of a pipeline matching unit 300 in a pipeline-based hardware compression system provided by an embodiment of the present invention.
  • the pipeline matching unit 300 provided in an embodiment of the present invention may include: matching position calculation Unit 301, K character matching unit 302, and matching length calculation unit 303;
  • the matching position calculation unit 301 is used to calculate the position where the current input character matches the previous input character. It is judged by the signal sent by each character matching unit. If the current character matching unit is matched successfully, the output of the character matching unit is output. position;
  • Each character matching unit is used to match the input character with the character stored in the character matching unit. If the matches are the same, Output matching same signal. After each cycle, the character stored in the character matching unit is output to the next character matching unit as the input character of the next character matching unit, and the character input to the character matching unit is stored in Wait for the next cycle to apply in the character matching unit;
  • the matching length calculation unit 303 is used to calculate the current characters that match the same length continuously. If the current characters match the same, the count is increased by one until the character matches are not the same, the count is cleared, and the count is repeated.
  • the matching position calculation unit 301 outputs the same position where the current input character matches the previous input character, the position data is 15-bit data, and the matching length calculation unit 303 outputs the current character continuously matches the same length, and the length data is 8-bit data.
  • FIG. 4 is a schematic structural diagram of a standard format conversion unit 400 in a pipeline-based hardware compression system provided by an embodiment of the present invention.
  • a standard format conversion unit 400 provided by an embodiment of the present invention may include: A character buffer unit 401, a second character buffer unit 402, a third character buffer unit 403, and a format conversion state machine unit 404;
  • the first character buffer unit 401 is used to store the i-th character, the position where the i-th character matches the previous (i-1) character, and the i-th character consecutively match the same length, where i is not A positive integer greater than N;
  • the second character buffer unit 402 is used to store the (i-1)th character, the position where the (i-1)th character matches the previous (i-2) character, and the (i-1)th character consecutive Match the same length;
  • the third character buffer unit 403 is used to store the (i-2)th character, the position where the (i-2)th character matches the previous (i-3) character, and the (i-2)th character consecutive Match the same length;
  • the format conversion state machine unit 404 is used to determine the data stored in the first character buffer unit 401, the second character buffer unit 402, and the third character buffer unit 403 to compare the i-th character and the i-th character with the previous (i- 1) The characters match the same position, and the i-th character matches the same length consecutively and converted into standard data in standard data format;
  • the standard data is a single character, or the standard data includes position distance, matching length and single character.
  • the first character buffer unit 401 stores the i-th character, the position where the i-th character matches the previous (i-1) character, and the i-th character consecutively matches The same length, that is, the character information input in the current cycle of the standard format conversion unit 400;
  • the second character cache unit 402 stores the (i-1)th character, the position where the (i-1)th character matches the same position as the previous (i-2) character, and the (i-1)th character continuously matches the same position. Length, that is, the character information input in the previous cycle by the standard format conversion unit 400;
  • the third character cache unit 403 stores the (i-2)th character, the position where the (i-2)th character matches the same position as the previous (i-3) character, and the (i-2)th character consecutively matches the same position. Length, that is, the character information input in the previous cycle on the standard format conversion unit 400;
  • the format conversion state machine unit 404 by judging the data stored in the three character buffer units, matches the i-th character, the i-th character with the previous (i-1) character in the same position, and the i-th character continuously matches the same
  • the length is converted into standard data in LZ77 data format, that is, if the matching length does not exceed three values, it will be output as a single character. If the matching length exceeds three values, the format of ⁇ location distance, matching length, single character ⁇ will be used instead of repeat String output, for example, if the same character is matched for five consecutive cycles, there will be no valid data output from the first cycle to the fourth cycle, and the fifth cycle will be ⁇ position distance, 5, next character ⁇ Data format output.
  • the format conversion state machine unit 404 outputs position distance, matching length, and single character, where the position distance is 15-bit data, the matching length is 8-bit data, and the single character is 8-bit data.
  • the format conversion state machine unit 404 except In addition to output position distance, matching length and single character, it also outputs a single character flag variable.
  • the single character flag variable is used to indicate whether it is a single character output. If it is not a single character output, the data of position distance and matching length is valid. Single-character output, the data of position distance and matching length is invalid.
  • FIG. 5 is a schematic structural diagram of a static Huffman encoding unit 500 in a pipeline-based hardware compression system provided by an embodiment of the present invention.
  • a static Huffman encoding unit 500 provided by an embodiment of the present invention It may include: a character encoding unit 501, a position encoding unit 502, a length encoding unit 503, a Huffman encoding splicing unit 504, and an encoding stream selection unit 505;
  • the character encoding unit 501 is used to convert a single character into a Huffman encoding format to obtain a first Huffman encoding stream;
  • the position coding unit 502 is used to convert the position distance into a Huffman coding format
  • the length encoding unit 503 is used to convert the matching length into a Huffman encoding format
  • the Huffman coding splicing unit 504 is used to splice the position distance, matching length and single character converted into the Huffman coding format into a second Huffman coding stream, wherein the length of the second Huffman coding stream is not greater than 64 bits;
  • the coded stream selection unit 505 is used to select whether to output as the first Huffman coded stream or output as the second Huffman coded stream, that is, to select whether to output as a single-character Huffman coded stream or as ⁇ position distance, matching length, Single-character Huffman coding stream output. If the data input to the static Huffman coding unit 500 is a single character, the output coding streams of the position coding unit 502 and the length coding unit 503 are invalid, and only the coding of the character coding unit 501 Stream output, if the data input to the static Huffman coding unit 500 is not a single character, output the spliced coded stream.
  • the coded stream selection unit 505 also outputs how many valid data bits are in the Huffman coded stream.
  • FIG. 6 is a flowchart of a pipeline-based hardware compression method provided by an embodiment of the present invention.
  • a pipeline-based hardware compression method provided by an embodiment of the present invention may include:
  • the original data to be compressed is transmitted to the input buffer channel through the direct memory access module.
  • each of the N characters is (M/N) bits.
  • the method of calculating the position where the i-th character matches the previous (i-1) character and the same length of consecutive matches can be:
  • K is a positive integer
  • the K character matching units are in a pipeline structure. After the jth character matching unit is matched, the characters stored in the jth character matching unit are output to the (j+1)th character matching unit, and the The characters stored in the j character matching unit are replaced with the characters input to the j-th character matching unit, and j is a positive integer less than K.
  • the method for converting the i-th character, the i-th character and the previous (i-1) character in the same position, and the i-th character consecutively matching the same length into standard data in standard data format can be:
  • the i-th character, the i-th character and the previous (i-1) character are matched to the same position, and the i-th character is continuously matched to the same length Converted into standard data;
  • the first cache data includes the i-th character, the position where the i-th character matches the previous (i-1) character, and the i-th character continuously matches the same length
  • the second cache data includes storing the (i-th) 1) characters, the (i-1)th character matches the same position as the previous (i-2) character, and the (i-1)th character continuously matches the same length
  • the third buffer data includes the (i-th) 2) Characters, the (i-2)th character matches the same position as the previous (i-3) character, and the (i-2)th character continuously matches the same length;
  • the standard data is a single character, or the standard data includes position distance, matching length and single character.
  • the method of encoding standard data to obtain a Huffman encoded stream may be:
  • the position distance, matching length, and single character converted into the Huffman encoding format are spliced into a second Huffman encoding stream.
  • the compressed data is transferred to the memory through the direct memory access module.
  • FIG. 7 is a flowchart of another pipeline-based hardware compression method provided by an embodiment of the present invention.
  • another pipeline-based hardware compression method provided by an embodiment of the present invention may include:
  • the management control unit obtains 128-bit data to be compressed.
  • the central processing unit transmits the original data to be compressed to the input buffer channel of the hardware compression system through the direct memory access module, and the management control unit obtains 128-bit data to be compressed from the input buffer channel every 16 cycles.
  • the management control unit divides the 128-bit data to be compressed into 16 characters.
  • each of the 16 characters is 8 bits.
  • the management control unit outputs in the format of one character per cycle, and sends 16 characters to the pipeline matching unit in order. Each time a character is output, the count is increased by one.
  • the pipeline matching unit performs pipeline matching on the input characters to obtain matching data, and sends the matching data to the standard format conversion unit.
  • the matching data includes the current input character, the position where the current input character matches the previous input character, and the current input character continuously matches the same length.
  • the standard format conversion unit converts the input matching data into data in the LZ77 format, and sends the data in the LZ77 format to the static Huffman coding unit.
  • the static Huffman encoding unit converts the data in the LZ77 format into a Huffman encoding stream, and sends the Huffman encoding stream to the data stream generating unit.
  • the data stream generating unit obtains a Huffman coded stream of variable length, and sends 8-bit coded stream data to the management control unit every cycle.
  • each 16 8-bit coded stream data is obtained by the management control unit, they are spliced into 128-bit coded stream data for output.
  • step 709. Determine whether the 128-bit data currently input has all been sent, if not, go to step 703, and if yes, go to step 710.
  • step 710 Determine whether there is still data in the input buffer channel, if not, go to step 711, if yes, go to step 701, and obtain the next 128-bit data again in order.
  • FIG. 8 is a working flowchart of a data flow generating unit in a pipeline-based hardware compression system provided by an embodiment of the present invention.
  • the workflow of a data stream generating unit provided by an embodiment of the present invention may include:
  • step 801. Determine whether the encoded stream data currently input to the data stream generating unit is valid, if not, go to step 804, and if yes, go to step 802.
  • the buffer array is used to store the encoded stream data of the input data stream generating unit, and the buffer array is shifted to the left by the effective length bit and is ORed with the current encoded stream data, that is, the current encoded stream data is spliced to the end of the buffer array.
  • step 804. Take out the first (M/N) bit data of the buffer array in the data stream generating unit in the same cycle and output it, and go to step 805.
  • step 805. Determine whether the input data has ended, if not, proceed to step 801, if yes, proceed to step 806.
  • step 804 Determine whether there is still encoded stream data in the buffered data that has not been output; if yes, go to step 804; if not, go to an end state.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)
  • Compression, Expansion, Code Conversion, And Decoders (AREA)

Abstract

本发明提供了一种基于流水式的硬件压缩的系统及方法。一种基于流水式的硬件压缩的系统包括管理控制单元、流水匹配单元、标准格式转换单元、静态霍夫曼编码单元和数据流生成单元,其中,管理控制单元用于控制协调其它各个单元的数据传输,获取待压缩数据,输出已压缩数据;流水匹配单元用于计算当前输入字符与之前输入字符匹配相同的位置,当前输入字符连续匹配相同的长度;标准格式转换单元用于将输入字符转换成标准数据格式;静态霍夫曼编码单元用于将标准数据格式的数据编码成霍夫曼编码流;数据流生成单元用于将非定长的编码流转换成定长输出。本发明实施例的技术方案,提高了数据压缩的速率,并且减少了占用中央处理器的时间和内存资源。

Description

一种基于流水式的硬件压缩的系统及方法 技术领域
本发明涉及数据压缩技术领域,尤其涉及一种基于流水式的硬件压缩的系统及方法。
背景技术
随着大数据与人工智能的快速发展,服务器端需要存储大量的数据,以支撑大数据与人工智能的平台运算。为了减少数据存储,避免数据占用过多存储资源,通常会对数据进行压缩,再进行存储。
目前对数据进行压缩往往是通过软件实现无损压缩,这样虽然可以有效地减少数据存储,但会占用中央处理器(CPU,Central Processing Unit)大量的时间以及大量的内存资源。
发明内容
本发明实施例提供基于流水式的硬件压缩的系统及方法,以提高数据压缩的速率,并且减少了占用中央处理器的时间和内存资源。
本发明第一方面提供一种基于流水式的硬件压缩的系统,所述系统包括管理控制单元、流水匹配单元、标准格式转换单元、静态霍夫曼编码单元和数据流生成单元;
所述管理控制单元,用于从输入缓存通道获取M比特待压缩数据,其中,M为正整数,将所述M比特待压缩数据分为N个字符,将所述N个字符分为N个周期输入所述流水匹配单元,其中,N为小于M的正整数,每个周期从所述数据流生成单元获取(M/N)比特已压缩数据,获取N个周期后输出M比特已压缩数据;
所述流水匹配单元,用于计算第i个字符与前(i-1)个字符匹配相同的位置以及连续匹配相同的长度,其中,i为不大于N的正整数;
所述标准格式转换单元,用于将所述第i个字符、所述第i个字符与所述前(i-1)个字符匹配相同的位置、所述第i个字符连续匹配相同的长度转换成标准数据格式的标准数据;
所述静态霍夫曼编码单元,用于对所述标准数据进行编码以得到霍夫曼编码流;
所述数据流生成单元,用于将所述霍夫曼编码流转换成每个周期(M/N)比 特的已压缩数据输出。
基于本发明第一方面,在第一种可能的实施方式中,所述管理控制单元具体包括:
输入状态机单元,用于从所述输入缓存通道获取所述M比特待压缩数据,将所述M比特待压缩数据分为所述N个字符,将所述N个字符分为N个周期输入所述流水匹配单元,其中,所述N个字符中每个字符为(M/N)比特;
状态设置单元,用于设置所述M比特待压缩数据和所述M比特已压缩数据的大小端转换;
输出状态机单元,用于每个周期从所述数据流生成单元获取所述(M/N)比特已压缩数据,获取N个周期后输出所述M比特已压缩数据。
基于本发明第一方面或者本发明第一方面的第一种可能的实施方式,在第二种可能的实施方式中,所述流水匹配单元具体包括:
匹配位置计算单元,用于计算所述第i个字符与所述前(i-1)个字符匹配相同的位置;
K个字符匹配单元,用于将所述第i个字符与所述K个字符匹配单元保存的字符进行匹配,若匹配相同则输出匹配相同信号,其中,K为正整数,所述K个字符匹配单元呈流水结构,第j个字符匹配单元匹配后,将第j个字符匹配单元保存的字符输出给第(j+1)个字符匹配单元,将所述第j个字符匹配单元保存的字符替换成输入到所述第j个字符匹配单元的字符,j为小于K的正整数;
匹配长度计算单元,用于计算所述第i个字符连续匹配相同的长度。
基于本发明第一方面的第二种可能的实施方式,在第三种可能的实施方式中,所述标准格式转换单元具体包括:
第一字符缓存单元,用于存储所述第i个字符、所述第i个字符与所述前(i-1)个字符匹配相同的位置以及所述第i个字符连续匹配相同的长度;
第二字符缓存单元,用于存储第(i-1)个字符、所述第(i-1)个字符与前(i-2)个字符匹配相同的位置以及所述第(i-1)个字符连续匹配相同的长度;
第三字符缓存单元,用于存储第(i-2)个字符、所述第(i-2)个字符与前(i-3)个字符匹配相同的位置以及所述第(i-2)个字符连续匹配相同的长度;
格式转换状态机单元,用于通过判断所述第一字符缓存单元、所述第二字符缓存单元和所述第三字符缓存单元保存的数据,将所述第i个字符、所述第i个字符与所述前(i-1)个字符匹配相同的位置、所述第i个字符连续匹配相同的长 度转换成所述标准数据;
其中,所述标准数据为单字符,或者所述标准数据包括位置距离、匹配长度和单字符。
基于本发明第一方面的第三种可能的实施方式,在第四种可能的实施方式中,所述静态霍夫曼编码单元具体包括:
字符编码单元,用于将所述单字符转换成霍夫曼编码格式,以得到第一霍夫曼编码流;
位置编码单元,用于将所述位置距离转换成霍夫曼编码格式;
长度编码单元,用于将所述匹配长度转换成霍夫曼编码格式;
霍夫曼编码拼接单元,用于将转换成霍夫曼编码格式的位置距离、匹配长度和单字符拼接成第二霍夫曼编码流;
编码流选择单元,用于选择以所述第一霍夫曼编码流输出还是以所述第二霍夫曼编码流输出。
本发明第二方面提供了一种基于流水式的硬件压缩的方法,包括:
从输入缓存通道获取M比特待压缩数据,其中,M为正整数;
将所述M比特待压缩数据分为N个字符,其中,N为小于M的正整数;
计算第i个字符与前(i-1)个字符匹配相同的位置以及连续匹配相同的长度,其中,i为不大于N的正整数;
将所述第i个字符、所述第i个字符与所述前(i-1)个字符匹配相同的位置、所述第i个字符连续匹配相同的长度转换成标准数据格式的标准数据;
对所述标准数据进行编码以得到霍夫曼编码流;
将所述霍夫曼编码流转换成每个周期(M/N)比特的已压缩数据输出;
每个周期获取(M/N)比特已压缩数据,获取N个周期后输出M比特已压缩数据。
可以看到,通过本发明提供的基于流水式的硬件压缩的系统及方法,通过管理控制单元、流水匹配单元、标准格式转换单元、静态霍夫曼编码单元和数据流生成单元对数据进行压缩,其中,流水匹配单元用于进行流水式的字符匹配,提高了匹配效率,从而提升了数据压缩的速率,同时,整个压缩过程无需中央处理器进行计算,减少了占用中央处理器的时间和内存资源。
附图说明
为了更清楚地说明本发明实施例中的技术方案,下面将对实施例中所需使用的附图作简单地介绍,显而易见地,下面描述中的附图仅仅是本发明的一些实施例,对于本领域普通技术人员来讲,在不付出创造性劳动的前提下,还可以根据这些附图获得其他的附图。
图1为本发明实施例提供的一种基于流水式的硬件压缩的系统的结构示意图;
图2为本发明实施例提供的一种基于流水式的硬件压缩的系统中管理控制单元的结构示意图;
图3为本发明实施例提供的一种基于流水式的硬件压缩的系统中流水匹配单元的结构示意图;
图4为本发明实施例提供的一种基于流水式的硬件压缩的系统中标准格式转换单元的结构示意图;
图5为本发明实施例提供的一种基于流水式的硬件压缩的系统中静态霍夫曼编码单元的结构示意图;
图6为本发明实施例提供的一种基于流水式的硬件压缩的方法的流程图;
图7为本发明实施例提供的另一种基于流水式的硬件压缩的方法的流程图;
图8为本发明实施例提供的一种基于流水式的硬件压缩的系统中数据流生成单元的工作流程图。
具体实施方式
本发明实施例提供基于流水式的硬件压缩的系统及方法,以提高数据压缩的速率,并且减少了占用中央处理器的时间和内存资源。
为了使本技术领域的人员更好地理解本发明方案,下面将结合本发明实施例中的附图,对本发明实施例中的技术方案进行清楚、完整地描述,显然,所描述的实施例仅仅是本发明一部分的实施例,而不是全部的实施例。基于本发明中的实施例,本领域普通技术人员在没有做出创造性劳动前提下所获得的所有其他实施例,都应当属于本发明保护的范围。
以下分别进行详细说明。
本发明的说明书和权利要求书及上述附图中的术语″第一″、″第二″、″第三″、″第四″等是用于区别不同对象,而不是用于描述特定顺序。此外, 术语″包括″和″具有″以及它们任何变形,意图在于覆盖不排他的包含。例如包含了一系列步骤或单元的过程、方法、系统、产品或设备没有限定于已列出的步骤或单元,而是可选地还包括没有列出的步骤或单元,或可选地还包括对于这些过程、方法、产品或设备固有的其它步骤或单元。
本发明实施例提供的基于流水式的硬件压缩的系统,可实现静态Gzip压缩功能,压缩格式为标准的deflate压缩编码流格式。通过管理控制单元、流水匹配单元、标准格式转换单元、静态霍夫曼编码单元和数据流生成单元对数据进行压缩,其中,流水匹配单元用于进行流水式的字符匹配,提高了匹配效率,从而提升了数据压缩的速率,并且,流水匹配单元中应用的字符匹配单元越多,对数据的压缩率越高,同时,在压缩过程中,中央处理器只需通知直接存储器访问模块(DMA,Direct Memory Access)将待压缩数据传输到输入缓存通道,直到压缩完成,中央处理器再次通知直接存储器访问模块将已压缩数据传输到内存,对数据进行压缩时无需中央处理器进行计算,减少了占用中央处理器的时间和内存资源。
首先参见图1,图1为本发明实施例提供的一种基于流水式的硬件压缩的系统100的结构示意图,本发明实施例提供的一种基于流水式的硬件压缩的系统100可以包括:管理控制单元101、流水匹配单元102、标准格式转换单元103、静态霍夫曼编码单元104和数据流生成单元105;
其中,管理控制单元101,用于从输入缓存通道获取M比特待压缩数据,其中,M为正整数,将M比特待压缩数据分为N个字符,将N个字符分为N个周期输入流水匹配单元102,其中,N为小于M的正整数,同时,每个周期检测数据流生成单元105是否有压缩完成的(M/N)比特数据输出,如果有则从数据流生成单元105获取(M/N)比特已压缩数据,获取N个周期后输出M比特已压缩数据;
流水匹配单元102,用于每周期获取一个管理控制单元101输入的字符,计算第i个字符与前(i-1)个字符匹配相同的位置以及连续匹配相同的长度,将第i个字符、第i个字符与前(i-1)个字符匹配相同的位置、第i个字符连续匹配相同的长度发给标准格式转换单元103,其中,i为不大于N的正整数;
标准格式转换单元103,用于对内部缓存单元中的数据进行逻辑判断,将第i个字符、第i个字符与前(i-1)个字符匹配相同的位置、第i个字符连续匹配相同的长度转换成标准数据格式的标准数据,将标准数据发给静态霍夫曼编码 单元104;
静态霍夫曼编码单元104,用于通过内部编码转换单元对标准数据进行编码以得到霍夫曼编码流,将霍夫曼编码流发给数据流生成单元105;
数据流生成单元105,用于将非定长的霍夫曼编码流转换成定长的编码流输出,每周期获取一次霍夫曼编码流,将霍夫曼编码流存储在内部缓存数组,同时每周期向管理控制单元101发送内部缓存数组中的前(M/N)位数据。
参见图2,图2为本发明实施例提供的一种基于流水式的硬件压缩的系统中管理控制单元200的结构示意图,本发明实施例提供的一种管理控制单元200可以包括:输入状态机单元201、状态设置单元202和输出状态机单元203;
其中,输入状态机单元201,用于从输入缓存通道获取M比特待压缩数据,其中,M为正整数,将M比特待压缩数据分为N个字符,以每周期一个字符的格式输出,其中,N为小于M的正整数,N个字符中每个字符为(M/N)比特;
状态设置单元202,用于设置状态机输入数据和输出数据的大小端转换;
输出状态机单元203,用于每个周期获取压缩完成后的(M/N)比特编码流数据,每获取N个周期后输出M比特已压缩数据。
可选的,输入状态机单元201除了每周期输出一个字符外,还输出一个输入有效位,其中,输入有效位用于表示当前周期原始数据是否有效,当输入有效位逻辑上为0时表示当前周期原始数据无效,当输入有效位逻辑上为1时表示当前周期原始数据有效。
其中,大端模式,是指数据的高字节保存在内存的低地址中,而数据的低字节保存在内存的高地址中,地址由小向大增加,而数据从高位往低位放;小端模式,是指数据的高字节保存在内存的高地址中,而数据的低字节保存在内存的低地址中,将地址的高低和数据位权结合起来,高地址部分权值高,低地址部分权值低。
可选的,输出状态机单元203除了每周期获取(M/N)比特编码流数据外,还获取一个编码有效位,其中,编码有效位用于表示当前周期编码流数据是否有效,当编码有效位逻辑上为0时表示当前周期编码流数据无效,当编码有效位逻辑上为1时表示当前周期编码流数据有效。
可选的,当M为128,N为16时,输入状态机单元201每16个周期获取一次128比特待压缩数据,将128比特待压缩数据分为16个字符,以每周期一个字符的格式输出,其中,16个字符中每个字符为8比特;
状态设置单元202设置状态机输入数据和输出数据的大小端转换;
输出状态机单元203将输入的压缩完成后的8比特编码流数据存储起来,每输入16个8比特编码流数据后,拼接成128比特已压缩数据输出。
参见图3,图3为本发明实施例提供的一种基于流水式的硬件压缩的系统中流水匹配单元300的结构示意图,本发明实施例提供的一种流水匹配单元300可以包括:匹配位置计算单元301、K个字符匹配单元302和匹配长度计算单元303;
其中,匹配位置计算单元301,用于计算当前输入字符与之前输入字符匹配相同的位置,通过每个字符匹配单元发送的信号判断,如果当前哪个字符匹配单元匹配成功,则输出该字符匹配单元的位置;
K个字符匹配单元302,其中,K为正整数,K个字符匹配单元呈流水结构,每个字符匹配单元用于将输入字符与保存在该字符匹配单元内的字符进行匹配,若匹配相同则输出匹配相同信号,每个周期结束后,将保存在该字符匹配单元内的字符输出给下一个字符匹配单元,作为下一个字符匹配单元的输入字符,同时将输入该字符匹配单元的字符保存在该字符匹配单元内,等待下一个周期应用;
匹配长度计算单元303,用于计算当前字符连续匹配相同的长度,如果当前字符匹配相同,则计数加一,直到字符匹配不相同,计数清零,重新计数。
可选的,匹配位置计算单元301输出当前输入字符与之前输入字符匹配相同的位置,位置数据为15比特数据,匹配长度计算单元303输出当前字符连续匹配相同的长度,长度数据为8比特数据。
参见图4,图4为本发明实施例提供的一种基于流水式的硬件压缩的系统中标准格式转换单元400的结构示意图,本发明实施例提供的一种标准格式转换单元400可以包括:第一字符缓存单元401、第二字符缓存单元402、第三字符缓存单元403和格式转换状态机单元404;
其中,第一字符缓存单元401,用于存储第i个字符、第i个字符与前(i-1)个字符匹配相同的位置以及第i个字符连续匹配相同的长度,其中,i为不大于N的正整数;
第二字符缓存单元402,用于存储第(i-1)个字符、第(i-1)个字符与前(i-2)个字符匹配相同的位置以及第(i-1)个字符连续匹配相同的长度;
第三字符缓存单元403,用于存储第(i-2)个字符、第(i-2)个字符与前 (i-3)个字符匹配相同的位置以及第(i-2)个字符连续匹配相同的长度;
格式转换状态机单元404,用于通过判断第一字符缓存单元401、第二字符缓存单元402和第三字符缓存单元403保存的数据,将第i个字符、第i个字符与前(i-1)个字符匹配相同的位置、第i个字符连续匹配相同的长度转换成标准数据格式的标准数据;
其中,标准数据为单字符,或者标准数据包括位置距离、匹配长度和单字符。
可选的,当标准数据格式为LZ77数据格式时,第一字符缓存单元401存储第i个字符、第i个字符与前(i-1)个字符匹配相同的位置以及第i个字符连续匹配相同的长度,即标准格式转换单元400当前周期输入的字符信息;
第二字符缓存单元402存储第(i-1)个字符、第(i-1)个字符与前(i-2)个字符匹配相同的位置以及第(i-1)个字符连续匹配相同的长度,即标准格式转换单元400上一个周期输入的字符信息;
第三字符缓存单元403存储第(i-2)个字符、第(i-2)个字符与前(i-3)个字符匹配相同的位置以及第(i-2)个字符连续匹配相同的长度,即标准格式转换单元400上上一个周期输入的字符信息;
格式转换状态机单元404,通过判断三个字符缓存单元保存的数据,将第i个字符、第i个字符与前(i-1)个字符匹配相同的位置、第i个字符连续匹配相同的长度转换成LZ77数据格式的标准数据,即如果匹配长度不超过三个值,则以单字符输出,如果匹配长度超过三个值,则以{位置距离、匹配长度、单字符}的格式代替重复字符串输出,例如,连续五个周期匹配到相同的字符,第一个周期到第四个周期都不会有有效数据输出,到第五个周期会以{位置距离、5、下一字符}的数据格式输出。
可选的,格式转换状态机单元404输出位置距离、匹配长度和单字符,其中,位置距离为15比特数据,匹配长度为8比特数据,单字符为8比特数据,格式转换状态机单元404除了输出位置距离、匹配长度和单字符外,还输出一个单字符标志变量,单字符标志变量用于表示是否为单字符输出,如果不是单字符输出,则位置距离和匹配长度的数据有效,如果是单字符输出,则位置距离和匹配长度的数据无效。
参见图5,图5为本发明实施例提供的一种基于流水式的硬件压缩的系统中静态霍夫曼编码单元500的结构示意图,本发明实施例提供的一种静态霍夫曼 编码单元500可以包括:字符编码单元501、位置编码单元502、长度编码单元503、霍夫曼编码拼接单元504和编码流选择单元505;
其中,字符编码单元501,用于将单字符转换成霍夫曼编码格式,以得到第一霍夫曼编码流;
位置编码单元502,用于将位置距离转换成霍夫曼编码格式;
长度编码单元503,用于将匹配长度转换成霍夫曼编码格式;
霍夫曼编码拼接单元504,用于将转换成霍夫曼编码格式的位置距离、匹配长度和单字符拼接成第二霍夫曼编码流,其中,第二霍夫曼编码流的长度不大于64比特;
编码流选择单元505,用于选择以第一霍夫曼编码流输出还是以第二霍夫曼编码流输出,即选择以单字符的霍夫曼编码流输出还是以{位置距离、匹配长度、单字符}的霍夫曼编码流输出,如果输入静态霍夫曼编码单元500的数据是单字符,则位置编码单元502和长度编码单元503的输出编码流无效,只以字符编码单元501的编码流输出,如果输入静态霍夫曼编码单元500的数据不是单字符,则输出拼接后的编码流。
可选的,编码流选择单元505除了输出霍夫曼编码流外,还输出霍夫曼编码流中有多少个有效的数据位。
参见图6,图6为本发明实施例提供的一种基于流水式的硬件压缩的方法的流程图。其中,如图6所示,本发明实施例提供的一种基于流水式的硬件压缩的方法可以包括:
601、从输入缓存通道获取M比特待压缩数据,其中,M为正整数。
可选的,从输入缓存通道获取M比特待压缩数据之前,通过直接存储器访问模块将原始待压缩数据传输到该输入缓存通道。
602、将M比特待压缩数据分为N个字符,其中,N为小于M的正整数。
其中,该N个字符中每个字符为(M/N)比特。
603、计算第i个字符与前(i-1)个字符匹配相同的位置以及连续匹配相同的长度,其中,i为不大于N的正整数。
可选的,计算第i个字符与前(i-1)个字符匹配相同的位置以及连续匹配相同的长度的方法可以是:
将第i个字符与K个字符匹配单元保存的字符进行匹配,若匹配相同则输出匹配相同信号,根据匹配相同信号计算第i个字符与前(i-1)个字符匹配相同 的位置,计算第i个字符连续匹配相同的长度;
其中,K为正整数,K个字符匹配单元呈流水结构,第j个字符匹配单元匹配后,将第j个字符匹配单元保存的字符输出给第(j+1)个字符匹配单元,将第j个字符匹配单元保存的字符替换成输入到第j个字符匹配单元的字符,j为小于K的正整数。
604、将第i个字符、第i个字符与前(i-1)个字符匹配相同的位置、第i个字符连续匹配相同的长度转换成标准数据格式的标准数据。
可选的,将第i个字符、第i个字符与前(i-1)个字符匹配相同的位置、第i个字符连续匹配相同的长度转换成标准数据格式的标准数据的方法可以是:
通过判断第一缓存数据、第二缓存数据和第三缓存数据,将第i个字符、第i个字符与前(i-1)个字符匹配相同的位置、第i个字符连续匹配相同的长度转换成标准数据;
其中,第一缓存数据包括第i个字符、第i个字符与前(i-1)个字符匹配相同的位置以及第i个字符连续匹配相同的长度,第二缓存数据包括存储第(i-1)个字符、第(i-1)个字符与前(i-2)个字符匹配相同的位置以及第(i-1)个字符连续匹配相同的长度,第三缓存数据包括第(i-2)个字符、第(i-2)个字符与前(i-3)个字符匹配相同的位置以及第(i-2)个字符连续匹配相同的长度;
其中,标准数据为单字符,或者标准数据包括位置距离、匹配长度和单字符。
605、对标准数据进行编码以得到霍夫曼编码流。
可选的,对标准数据进行编码以得到霍夫曼编码流的方法可以是:
当标准数据为单字符时,将单字符转换成霍夫曼编码格式,以得到第一霍夫曼编码流;
当标准数据包括位置距离、匹配长度和单字符时,将单字符转换成霍夫曼编码格式;
将位置距离转换成霍夫曼编码格式;
将匹配长度转换成霍夫曼编码格式;
将转换成霍夫曼编码格式的位置距离、匹配长度和单字符拼接成第二霍夫曼编码流。
606、将霍夫曼编码流转换成每个周期(M/N)比特的已压缩数据输出。
607、每个周期获取(M/N)比特已压缩数据,获取N个周期后输出M比 特已压缩数据。
可选的,设置M比特待压缩数据和M比特已压缩数据的大小端转换。
可选的,原始待压缩数据压缩完成后,通过直接存储器访问模块将已压缩数据传输到内存。
参见图7,图7为本发明实施例提供的另一种基于流水式的硬件压缩的方法的流程图。其中,如图7所示,本发明实施例提供的另一种基于流水式的硬件压缩的方法可以包括:
701、管理控制单元获取128比特待压缩数据。
中央处理器通过直接存储器访问模块,将原始待压缩数据传输到硬件压缩系统的输入缓存通道,管理控制单元每16个周期从输入缓存通道获取一次128比特待压缩数据。
702、管理控制单元将128比特待压缩数据分为16个字符。
其中,16个字符中每个字符为8比特。
703、管理控制单元以每周期一个字符的格式输出,按顺序将16个字符发送给流水匹配单元,每输出一个字符,计数加一。
704、流水匹配单元对输入字符进行流水匹配,以得到匹配数据,将匹配数据发给标准格式转换单元。
其中,匹配数据包括当前输入字符、当前输入字符与之前输入字符匹配相同的位置和当前输入字符连续匹配相同的长度。
705、标准格式转换单元将输入的匹配数据转换成LZ77格式的数据,将LZ77格式的数据发给静态霍夫曼编码单元。
706、静态霍夫曼编码单元将LZ77格式的数据转换成霍夫曼编码流,将霍夫曼编码流发给数据流生成单元。
707、数据流生成单元获取非定长的霍夫曼编码流,每周期将8比特编码流数据发给管理控制单元。
708、管理控制单元每获取16个8比特编码流数据后,拼接成128比特编码流数据输出。
709、判断当前输入的128比特数据是否已经全部发送完毕,否,则进入步骤703,是,则进入步骤710。
710、判断输入缓存通道是否还有数据,否,则进入步骤711,是,则进入步骤701,按顺序再次获取下一个128比特数据。
711、输入数据结束。
712、判断编码流数据是否已经全部输出,否,则继续等待,直到压缩完成的编码流数据全部输出,是,则进入结束状态。
参见图8,图8为本发明实施例提供的一种基于流水式的硬件压缩的系统中数据流生成单元的工作流程图。其中,如图8所示,本发明实施例提供的一种数据流生成单元的工作流程可以包括:
801、判断当前输入数据流生成单元的编码流数据是否有效,否,则进入步骤804,是,则进入步骤802。
802、获取当前编码流数据以及当前编码流数据的有效长度数据。
803、将数据流生成单元中的缓存数组左移当前编码流数据的有效长度位后,与当前编码流数据做或运算。
其中,缓存数组用于保存输入数据流生成单元的编码流数据,将缓存数组左移有效长度位后与当前编码流数据做或运算,即把当前编码流数据拼接到缓存数组的尾部。
804、同周期内取出数据流生成单元内缓存数组的前(M/N)位数据输出,进入步骤805。
805、判断输入数据是否已经结束,否,则进入步骤801,是,则进入步骤806。
806、判断缓存数据里是否还有编码流数据没有输出,是,则进入步骤804,否,则进入结束状态。
需要说明的是,对于前述的各方法实施例,为了简单描述,故将其都表述为一系列的动作组合,但是本领域技术人员应该知悉,本发明并不受所描述的动作顺序的限制,因为依据本发明,某些步骤可以采用其他顺序或者同时进行。其次,本领域技术人员也应该知悉,说明书中所描述的实施例均属于优选实施例,所涉及的动作和模块并不一定是本发明所必须的。在上述实施例中,对各个实施例的描述都各有侧重,某个实施例中没有详述的部分,可以参见其他实施例的相关描述。
以上所述,以上实施例仅用以说明本发明的技术方案,而非对其限制;尽管参照前述实施例对本发明进行了详细的说明,本领域的普通技术人员应当理解:其依然可以对前述各实施例记载的技术方案进行修改,或者对其中部分技术特征进行等同替换;而这些修改或者替换,并不使相应技术方案的本质脱离 本发明各实施例技术方案的范围。

Claims (12)

  1. 一种基于流水式的硬件压缩的系统,其特征在于,所述系统包括管理控制单元、流水匹配单元、标准格式转换单元、静态霍夫曼编码单元和数据流生成单元;
    所述管理控制单元,用于从输入缓存通道获取M比特待压缩数据,其中,M为正整数,将所述M比特待压缩数据分为N个字符,将所述N个字符分为N个周期输入所述流水匹配单元,其中,N为小于M的正整数,每个周期从所述数据流生成单元获取(M/N)比特已压缩数据,获取N个周期后输出M比特已压缩数据;
    所述流水匹配单元,用于计算第i个字符与前(i-1)个字符匹配相同的位置以及连续匹配相同的长度,其中,i为不大于N的正整数;
    所述标准格式转换单元,用于将所述第i个字符、所述第i个字符与所述前(i-1)个字符匹配相同的位置、所述第i个字符连续匹配相同的长度转换成标准数据格式的标准数据;
    所述静态霍夫曼编码单元,用于对所述标准数据进行编码以得到霍夫曼编码流;
    所述数据流生成单元,用于将所述霍夫曼编码流转换成每个周期(M/N)比特的已压缩数据输出。
  2. 根据权利要求1所述的系统,其特征在于,所述管理控制单元具体包括:
    输入状态机单元,用于从所述输入缓存通道获取所述M比特待压缩数据,将所述M比特待压缩数据分为所述N个字符,将所述N个字符分为N个周期输入所述流水匹配单元,其中,所述N个字符中每个字符为(M/N)比特;
    状态设置单元,用于设置所述M比特待压缩数据和所述M比特已压缩数据的大小端转换;
    输出状态机单元,用于每个周期从所述数据流生成单元获取所述(M/N)比特已压缩数据,获取N个周期后输出所述M比特已压缩数据。
  3. 根据权利要求1或2所述的系统,其特征在于,所述流水匹配单元具体包括:
    匹配位置计算单元,用于计算所述第i个字符与所述前(i-1)个字符匹配相 同的位置;
    K个字符匹配单元,用于将所述第i个字符与所述K个字符匹配单元保存的字符进行匹配,若匹配相同则输出匹配相同信号,其中,K为正整数,所述K个字符匹配单元呈流水结构,第j个字符匹配单元匹配后,将第j个字符匹配单元保存的字符输出给第(j+1)个字符匹配单元,将所述第j个字符匹配单元保存的字符替换成输入到所述第j个字符匹配单元的字符,j为小于K的正整数;
    匹配长度计算单元,用于计算所述第i个字符连续匹配相同的长度。
  4. 根据权利要求3所述的系统,其特征在于,所述标准格式转换单元具体包括:
    第一字符缓存单元,用于存储所述第i个字符、所述第i个字符与所述前(i-1)个字符匹配相同的位置以及所述第i个字符连续匹配相同的长度;
    第二字符缓存单元,用于存储第(i-1)个字符、所述第(i-1)个字符与前(i-2)个字符匹配相同的位置以及所述第(i-1)个字符连续匹配相同的长度;
    第三字符缓存单元,用于存储第(i-2)个字符、所述第(i-2)个字符与前(i-3)个字符匹配相同的位置以及所述第(i-2)个字符连续匹配相同的长度;
    格式转换状态机单元,用于通过判断所述第一字符缓存单元、所述第二字符缓存单元和所述第三字符缓存单元保存的数据,将所述第i个字符、所述第i个字符与所述前(i-1)个字符匹配相同的位置、所述第i个字符连续匹配相同的长度转换成所述标准数据;
    其中,所述标准数据为单字符,或者所述标准数据包括位置距离、匹配长度和单字符。
  5. 根据权利要求4所述的系统,其特征在于,所述静态霍夫曼编码单元具体包括:
    字符编码单元,用于将所述单字符转换成霍夫曼编码格式,以得到第一霍夫曼编码流;
    位置编码单元,用于将所述位置距离转换成霍夫曼编码格式;
    长度编码单元,用于将所述匹配长度转换成霍夫曼编码格式;
    霍夫曼编码拼接单元,用于将转换成霍夫曼编码格式的位置距离、匹配长度和单字符拼接成第二霍夫曼编码流;
    编码流选择单元,用于选择以所述第一霍夫曼编码流输出还是以所述第二 霍夫曼编码流输出。
  6. 一种基于流水式的硬件压缩的方法,其特征在于,包括:
    从输入缓存通道获取M比特待压缩数据,其中,M为正整数;
    将所述M比特待压缩数据分为N个字符,其中,N为小于M的正整数;
    计算第i个字符与前(i-1)个字符匹配相同的位置以及连续匹配相同的长度,其中,i为不大于N的正整数;
    将所述第i个字符、所述第i个字符与所述前(i-1)个字符匹配相同的位置、所述第i个字符连续匹配相同的长度转换成标准数据格式的标准数据;
    对所述标准数据进行编码以得到霍夫曼编码流;
    将所述霍夫曼编码流转换成每个周期(M/N)比特的已压缩数据输出;
    每个周期获取(M/N)比特已压缩数据,获取N个周期后输出M比特已压缩数据。
  7. 根据权利要求6所述的方法,其特征在于,所述从输入缓存通道获取M比特待压缩数据之前,包括:
    通过直接存储器访问模块将原始待压缩数据传输到所述输入缓存通道。
  8. 根据权利要求6所述的方法,其特征在于,所述N个字符中每个字符为(M/N)比特,所述方法还包括:
    设置所述M比特待压缩数据和所述M比特已压缩数据的大小端转换。
  9. 根据权利要求6至8任一项所述的方法,其特征在于,所述计算第i个字符与前(i-1)个字符匹配相同的位置以及连续匹配相同的长度包括:
    将所述第i个字符与K个字符匹配单元保存的字符进行匹配,若匹配相同则输出匹配相同信号,根据所述匹配相同信号计算所述第i个字符与前(i-1)个字符匹配相同的位置,计算所述第i个字符连续匹配相同的长度;
    其中,K为正整数,所述K个字符匹配单元呈流水结构,第j个字符匹配单元匹配后,将第j个字符匹配单元保存的字符输出给第(j+1)个字符匹配单元,将所述第j个字符匹配单元保存的字符替换成输入到所述第j个字符匹配单元的字符,j为小于K的正整数。
  10. 根据权利要求9所述的方法,其特征在于,所述将所述第i个字符、所述第i个字符与所述前(i-1)个字符匹配相同的位置、所述第i个字符连续匹配相同的长度转换成标准数据格式的标准数据包括:
    通过判断第一缓存数据、第二缓存数据和第三缓存数据,将所述第i个字符、所述第i个字符与所述前(i-1)个字符匹配相同的位置、所述第i个字符连续匹配相同的长度转换成所述标准数据;
    其中,所述第一缓存数据包括所述第i个字符、所述第i个字符与所述前(i-1)个字符匹配相同的位置以及所述第i个字符连续匹配相同的长度,第二缓存数据包括存储第(i-1)个字符、所述第(i-1)个字符与前(i-2)个字符匹配相同的位置以及所述第(i-1)个字符连续匹配相同的长度,第三缓存数据包括第(i-2)个字符、所述第(i-2)个字符与前(i-3)个字符匹配相同的位置以及所述第(i-2)个字符连续匹配相同的长度,所述标准数据为单字符,或者所述标准数据包括位置距离、匹配长度和单字符。
  11. 根据权利要求10所述的方法,其特征在于,所述对所述标准数据进行编码以得到霍夫曼编码流包括:
    当所述标准数据为单字符时,将所述单字符转换成霍夫曼编码格式,以得到第一霍夫曼编码流;
    当所述标准数据包括位置距离、匹配长度和单字符时,将所述单字符转换成霍夫曼编码格式;
    将所述位置距离转换成霍夫曼编码格式;
    将所述匹配长度转换成霍夫曼编码格式;
    将转换成霍夫曼编码格式的位置距离、匹配长度和单字符拼接成第二霍夫曼编码流。
  12. 根据权利要求7所述的方法,其特征在于,还包括:
    所述原始待压缩数据压缩完成后,通过所述直接存储器访问模块将已压缩数据传输到内存。
PCT/CN2019/088033 2019-05-22 2019-05-22 一种基于流水式的硬件压缩的系统及方法 WO2020232683A1 (zh)

Priority Applications (1)

Application Number Priority Date Filing Date Title
PCT/CN2019/088033 WO2020232683A1 (zh) 2019-05-22 2019-05-22 一种基于流水式的硬件压缩的系统及方法

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2019/088033 WO2020232683A1 (zh) 2019-05-22 2019-05-22 一种基于流水式的硬件压缩的系统及方法

Publications (1)

Publication Number Publication Date
WO2020232683A1 true WO2020232683A1 (zh) 2020-11-26

Family

ID=73459259

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2019/088033 WO2020232683A1 (zh) 2019-05-22 2019-05-22 一种基于流水式的硬件压缩的系统及方法

Country Status (1)

Country Link
WO (1) WO2020232683A1 (zh)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN118713680A (zh) * 2024-08-27 2024-09-27 湖南师范大学 一种基于区块链技术的可信计量数据存储方法

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104202054A (zh) * 2014-09-16 2014-12-10 东南大学 一种硬件lzma压缩实现系统及方法
CN107135003A (zh) * 2017-04-19 2017-09-05 西安电子科技大学 基于Gzip硬件实现文本压缩方法
US10171103B1 (en) * 2018-01-12 2019-01-01 Mellanox Technologies, Ltd. Hardware data compression architecture including shift register and method thereof

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104202054A (zh) * 2014-09-16 2014-12-10 东南大学 一种硬件lzma压缩实现系统及方法
CN107135003A (zh) * 2017-04-19 2017-09-05 西安电子科技大学 基于Gzip硬件实现文本压缩方法
US10171103B1 (en) * 2018-01-12 2019-01-01 Mellanox Technologies, Ltd. Hardware data compression architecture including shift register and method thereof

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
MING-BO LIN ET AL.: "A Lossless Data Compression and Decompression Algorithm and Its Hardware Architecture", IEEE TRANSACTIONS ON VERY LARGE SCALE INTEGRATION (VLSI) SYSTEMS, vol. 14, no. 9, 30 September 2006 (2006-09-30), XP011149588, DOI: 20200218153419 *

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN118713680A (zh) * 2024-08-27 2024-09-27 湖南师范大学 一种基于区块链技术的可信计量数据存储方法

Similar Documents

Publication Publication Date Title
US11431351B2 (en) Selection of data compression technique based on input characteristics
CN105207678B (zh) 一种改进型lz4压缩算法的硬件实现系统
CN104202054A (zh) 一种硬件lzma压缩实现系统及方法
US20110078222A1 (en) Enhanced multi-processor waveform data exchange using compression and decompression
US9176977B2 (en) Compression/decompression accelerator protocol for software/hardware integration
CN114567331B (zh) 一种基于lz77的压缩方法、装置及其介质
US20100324914A1 (en) Adaptive Encoding of a Digital Signal with One or More Missing Values
CN103427844A (zh) 一种基于gpu和cpu混合平台的高速无损数据压缩方法
WO2020211000A1 (zh) 数据解压缩的装置与方法
WO2020232683A1 (zh) 一种基于流水式的硬件压缩的系统及方法
US8406538B2 (en) Image processing apparatus and image processing method
CN110233627B (zh) 一种基于流水式的硬件压缩的系统及方法
CN105005464A (zh) 一种Burrows Wheeler变换硬件处理装置
US7071854B1 (en) Hardware-implemented LZW data decompression
Erdeljan et al. IP core for efficient zero-run length compression of CNN feature maps
CN114025024B (zh) 一种数据传输方法及装置
CN110247666B (zh) 一种硬件并行压缩的系统及方法
CN110473264B (zh) 基于霍夫曼编码的深度图压缩方法、解压缩方法及编码器
TW201440045A (zh) 解壓縮電路與相關的解壓縮方法
CN118353966B (zh) 一种数据传输方法、数据传输设备和相关设备
WO2020232682A1 (zh) 一种硬件并行压缩的系统及方法
US10491241B1 (en) Data compression scheme utilizing a repetitive value within the data stream
JP2008203959A (ja) データ転送におけるシリアライズ方法、データフォーマット及びデータ転送装置
TWI883487B (zh) 資料處理方法及裝置、電子設備、儲存介質
CN116170115B (zh) 基于码本的数字喷泉编解码方法、装置以及系统

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 19930104

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 19930104

Country of ref document: EP

Kind code of ref document: A1

32PN Ep: public notification in the ep bulletin as address of the adressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 21.03.2022)

122 Ep: pct application non-entry in european phase

Ref document number: 19930104

Country of ref document: EP

Kind code of ref document: A1