CN108702160B

CN108702160B - Method, apparatus and system for compressing and decompressing data

Info

Publication number: CN108702160B
Application number: CN201780015262.7A
Authority: CN
Inventors: 安耶洛斯·阿雷拉基斯; 佩尔·斯滕斯特伦
Original assignee: Zeropoint Technologies AB
Current assignee: Zeropoint Technologies AB
Priority date: 2016-01-29
Filing date: 2017-01-27
Publication date: 2022-05-17
Anticipated expiration: 2037-01-27
Also published as: CN108702160A; WO2017131579A1; WO2017131578A1; CN108886367A; CN108886367B

Abstract

Methods, devices and systems expand compressors and decompressors for encoding and decoding data in a cache/memory/data transfer subsystem of a computer system or in a communication network. The example variable length compressor is extended to be able to compress blocks of data values and the compressed blocks comprise a mixture of compressed and uncompressed values, wherein metadata in the form of unique special meaning codewords (UUICs) indicate the uncompressed values. An example variable length decompressor is extended to be able to decompress the compressed data block comprising a mixture of compressed and uncompressed data values. The compressor and decompressor are extended to support compression and decompression of common compression scenarios used in conjunction with variable length compression to improve compressibility in a cache/memory/data transfer subsystem of a computer system or in a communication network.

Description

Method, apparatus and system for compressing and decompressing data

Technical Field

The present disclosure relates generally to the field of data compression and decompression, such as data compression and decompression in a cache/memory subsystem and/or a data transmission subsystem of a computer system or in a data communication system.

Background

Data compression is a relatively mature technique for reducing the size of data. Which is applied to data stored in a memory subsystem of a computer system to increase storage capacity. Data compression is also used when data is transferred between different subsystems within a computer system, or generally when the transfer is between two points in a data communication system including a communication network.

Data compression requires two basic operations: 1) compression (also called encoding), which takes uncompressed data as input and converts the uncompressed data into compressed data by replacing data values with corresponding code words (also called encoding, word codes or codes in the literature); and 2) decompression (also called decoding), which takes compressed data as input and converts it to uncompressed by replacing the code words with the corresponding data values. Data compression may be lossless or lossy depending on whether the actual data value after decompression is identical to the original data value before compression (lossless), or whether the data value after decompression is different from the original data value and the original value is not available (lossy). Compression and decompression may be implemented in software, or hardware, or a combination of software and hardware, to implement corresponding methods, apparatus and systems.

FIG. 1 depicts an example of a computer system 100. Computer system 100 includes one or more processing units P1 … Pn connected to a memory hierarchy 110 using a communication means such as an interconnection network. Each processing unit includes a processor (or core), and each processing unit may be a CPU (central processing unit), a GPU (graphics processing unit), or generally a block that performs computations. Memory hierarchy 110, on the other hand, constitutes a storage subsystem of computer system 100 and includes cache memory 120, which may be organized in one or several levels (levels, tiers), L1-L3, and memory 130 (a.k.a. primary memory). The memory 130 may also be connected to a secondary storage device (e.g., a hard disk drive, solid state drive, or flash memory). Memory 130 may be organized into levels, such as fast main memory (e.g., DDR) and flash memory. Cache memory 120 in the present example includes three levels, where L1 and L2 are private caches in that each processing unit P1-Pn is connected to a designated L1/L2 cache, and L3 is shared among all processing units P1-Pn. With different numbers of processing units and, in general, different combinations between processing units and memory subsystems, alternative examples may implement different cache hierarchies with more, fewer, or even no cache levels, with or without dedicated or shared designated caches, implementing various memory levels, all as readily implemented by the skilled person.

Data compression may be applied to a computer system in different ways. FIG. 2 depicts an example 200 of a computer system, such as system 100 of FIG. 1, in which data is compressed in a memory, such as main memory, of such a computer system. This means that the data is compressed before being saved in memory and the data is decompressed when leaving memory by the corresponding compression operation as described above.

In the alternative example 300 of the computer system shown in FIG. 3, data compression may be applied to the L3 cache of the cache system. Similar to the previous example, the data needs to be compressed before it is stored in the cache, and the data needs to be decompressed before it leaves the cache (e.g., to other cache levels (L2) or memory 330 where the data is uncompressed). In an alternative example, the data may be stored in compressed form in any level of the cache hierarchy.

Data may also be compressed only when it is being transferred between different subsystems in a computer system. In the alternative example 400 of the computer system shown in fig. 4, data is compressed when transferred between the L3 cache and the memory 430 using the corresponding communication device. Similar to the previous example, there needs to be compression and decompression at the end of the communication device in order to compress the data before transmitting it, and to decompress the data when it is received at the other end.

In an alternate example of a computer system 500, data compression may be applied in a combination of subsystems as depicted in FIG. 5. In this example, the data is compressed while it is held in memory 530 and while it is being transferred between memory 530 and cache hierarchy 520. Thus, when data is moved from the cache hierarchy 520 to the memory 530, it may only be necessary to compress the data before it is transferred from the L3 cache. Alternatively, compressed data leaving memory 530 destined for cache hierarchy 520 may only need to be decompressed when it is received to the other end of the communication device connecting memory 530 with cache hierarchy 520. Any example is possible with respect to applying compression to a combination of different subsystems in a computer system, and may be implemented by one skilled in the art.

Data transmission may also be between two arbitrary points within a communication network. Fig. 6 depicts an example of a data communication system 600 including a communication network 605 between two points, where data is transmitted by a transmitter 610 and received by a receiver 620. In such instances, the points may be two intermediate nodes in the network or a source node and a destination node of a communication link or a combination of these. Data compression may be applied to a data communication system such as the example system 700 depicted in fig. 7. Compression needs to be applied before the data is transmitted over the communication network 705 by the transmitter 710 and decompression needs to be applied after the data is received by the receiver 720.

There are various different algorithms for implementing data compression. One family of data compression algorithms is the statistical compression algorithm, which is data dependent and can provide compression efficiencies close to entropy, because it assigns variable length (also called variable width) codes based on the statistical properties of the data values: short codewords are used to encode data values that occur frequently, while longer codewords encode data values that occur less frequently. Huffman coding is a known statistical compression algorithm.

A known variant of huffman coding for accelerated decompression is canonical huffman coding. Based on this, the code words have a numerical sequence property, which means that code words of the same length are consecutive integers.

Examples of canonical huffman-based compression and decompression mechanisms are proposed in the prior art. Such compression and decompression mechanisms may be used in the foregoing examples to achieve huffman-based compression and decompression.

Fig. 9 shows an example of a compressor 900 from the prior art implementing huffman coding, e.g. canonical huffman coding. The compressor takes as input an uncompressed block, which is a string of data values and includes one or more data values generally denoted v1, v2, …, vn throughout this disclosure. Unit 910, which may be a storage unit or an extractor for extracting data values from uncompressed blocks, provides the data values to variable length coding unit 920. The variable length coding unit 920 includes a Code Table (CT)922 and a Codeword (CW) selector 928. CT 922 is a table that may be implemented as a look-up table (LUT) or computer cache memory (with any arbitrary dependencies) and contains one or more entries; each entry includes a value 923 that may be compressed using a codeword, CW 925, and a codeword length (cL) 927. Since the various sets of codewords used by the statistical compression algorithm are of variable length, when stored in CT 922 with each entry having a fixed-size width (codeword 925), these sets of codewords must be padded with zeros. The codeword length 927 holds the actual length (e.g., in bits) of the variable length code. CW selector 928 uses cL to identify the actual CW and discard the padded zeros. The encoded values are then concatenated to the rest of the compressed values, together forming a compressed block. An exemplary flow chart of a compression method following the previously described compression steps is depicted in fig. 25.

An example of a prior art decompressor 1000 is shown in fig. 10. The canonical huffman decompression can be divided into two steps: codeword detection and value retrieval (retrieve ). Each of these steps is carried out by one of the following units: (1) a Codeword Detection Unit (CDU)1020 and a (2) Value Retrieval Unit (VRU) 1030. The purpose of CDU1020 is to find valid codewords within the compressed sequence (i.e., the sequence of codewords of compressed data values). The CDU1020 includes a set of comparators 1022 and a priority encoder 1024. Each

comparator

1022a, 1022b, 1022c compares each potential bit sequence (bit sequence) to a known codeword, which in this example is a first assigned (at code generation time) canonical huffman codeword (FCW) of a particular length. In an alternative implementation, the last assigned canonical huffman codeword could also be used, but in this case the exact comparison made would be different. The bit sequence to be compared may be stored in a memory unit 1010 (e.g. implemented as a FIFO or flip-flop) and determines the number of comparators and the maximum width of the widest of the bit sequences, the maximum size of the bit sequence to be compared depending on the maximum length of the valid huffman code word determined at code generation (mCL). However, depending on the chosen implementation of such a decompressor (e.g. in software or in hardware), the maximum length may be limited to a specific value at design, compilation, configuration or runtime. The output of comparator 1022 is inserted into a priority encoder such as structure 1024, which outputs the length of the matched codeword (referred to as "matched length" in fig. 10). Based on this, the detected valid codeword (matched codeword) is extracted from the bit sequence stored in the storage unit 1010; the bit sequence is shifted as many positions as defined by the "matched length" and the empty portion is loaded into subsequent bits of the compressed sequence so that the CDU1020 can determine the next valid codeword.

On the other hand, the Value Retrieval Unit (VRU)1030 includes an offset table 1034, a subtractor unit 1036, and a decompression look-up table (DeLUT) 1038. The "matched length" from the previous step is used to determine the offset value (stored in offset table 1034) that must be subtracted (1036) from the arithmetic value of the matched codeword also determined in the previous step to obtain the address of the DeLUT 1038 from which the original data value corresponding to the detected codeword can be retrieved and attached to the remaining decompressed values stored in decompression block 1040. The operation of the decompressor is repeated until all values held in compressed form in the input compressed sequence (referred to as compressed blocks in fig. 10) are restored to the uncompressed data values v1, v2, …, vn.

Fig. 26 depicts an exemplary flow chart of a decompression method following the decompression steps previously described.

The above-described compressor and decompressor may quickly and efficiently compress data blocks using variable length canonical huffman coding and decompress data blocks compressed using variable length canonical huffman coding. However, they are not able to compress and decompress data blocks comprising a mixture of compressed and uncompressed values, which is a common situation when applying statistical compression to the computer systems or communication networks of the above examples. The present inventors have recognized that there is room for improvement in the art of data compression and decompression.

Disclosure of Invention

It is an object of the present invention to provide improvements in the field of data compression and decompression techniques.

The present disclosure generally discloses methods, devices and systems for compressing and decompressing blocks of data values when compression is applied, for example, in a cache subsystem and/or a memory subsystem and/or a data transmission subsystem of a computer system and/or in a data communication system. There are various methods to efficiently compress data in the subsystem using entropy-based variable length coding, and one such way is by using huffman coding. A current compressor may be used to compress a block of data values using huffman coding, while a current decompressor may be used to decompress the block of data compressed using huffman coding. However, when entropy-based compression is applied in the system, some data is not compressible or is selected to be uncompressed; for example, they occur only once, so compressing them requires more metadata than keeping them uncompressed; or no encoding for certain data values, as they do not occur during statistics collection, but occur during compression. Thus, the compressors lack important features that enable them to create a mix of compressed and uncompressed data values within the same block; and when mixed together in blocks, the decompressor cannot distinguish between compressed data and uncompressed data. The method, apparatus and system disclosed in this disclosure enhances existing compressors and decompressors that utilize variable length coding with the following new features: in the common case when compression is applied to a computer system or a communication network, a block of data is compressed, including a mixture of compressed and uncompressed data; and when a mix of compressed and uncompressed data is included in such a system, decompressing the compressed data block. Furthermore, the proposed method, device and system even further enhance the compressor and decompressor by combining them with other aggressive compressors and decompressors, respectively, for common compression scenarios in the computer system and communication network.

A first aspect of the present invention is a data compression apparatus for compressing an uncompressed data block comprising one or more data values into a compressed data block, the data compression apparatus comprising:

a compressor configured to compress data values of uncompressed data blocks into corresponding variable length codewords;

a detector configured to detect data values in uncompressed data blocks that cannot be compressed by the compressor; and

a compressed data block generator configured to generate a compressed data block by combining:

compressed data values in the form of variable length code words corresponding to data values in uncompressed data blocks compressed by the compressor;

uncompressed data values in the form of detected data values in uncompressed data blocks that cannot be compressed by the compressor; and

metadata indicating uncompressed data values, wherein the metadata is a unique special meaning codeword.

Advantageously, the only special meaning codeword is the codeword generated at the time of code generation together with the variable length codeword of the compressor.

More specifically, the unique special meaning codeword may be generated at code generation by: this means that the frequency of occurrence of all data values not present in the value-frequency table is calculated or estimated at the time of code generation, wherein their frequency of occurrence compared to the total number of data values present will influence the width of the unique special meaning code word. Thus, by using the frequency of occurrence of data values, compression efficiency can be improved over prior art methods by distinguishing uncompressed data values from compressed data values within a compressed data block with a unique special meaning codeword. The more data values that are not captured by the value-frequency tracker at the time of code generation, the correspondingly more data values that are kept uncompressed compared to the total number of data values that occur, the narrower the unique special meaning codeword used to indicate the uncompressed data values in the generated compressed data block. On the other hand, the lower the frequency of occurrence of the uncompressed data values, the wider the unique special meaning codeword used to indicate the uncompressed data values in the generated compressed data block.

A second aspect of the present invention is a data decompression apparatus for decompressing a compressed data block into a decompressed data block including one or more data values, the data decompression apparatus comprising:

a decompressor configured to decompress a variable length codeword of a compressed data block into a corresponding decompressed data value; and

a decompressed data block generator configured to:

detecting metadata in the compressed data block, the metadata being a unique special meaning codeword indicating uncompressed data values included in the compressed data block; and

based on the detected metadata, generating a decompressed block of data by: the way is to combine the decompressed data values from the decompressor and the uncompressed data values from the compressed data block such that the order of the data values of the generated decompressed data block is the same as the order in which the data values appear in the uncompressed data block prior to the data compression that produced the compressed data block.

A third aspect of the present invention is a data compression method for compressing an uncompressed data block comprising one or more data values into a compressed data block, the data compression method comprising:

compressing data values of the uncompressed data blocks into corresponding variable length codewords;

detecting data values in uncompressed data blocks that cannot be compressed by the compressor; and

generating a compressed data block by combining:

compressed data values in the form of variable length code words corresponding to the data values in the uncompressed data block that are compressed by the compressor;

A fourth aspect of the present invention is a data decompression method for decompressing a compressed data block into a decompressed data block including one or more data values, the data decompression method comprising:

decompressing the variable length codewords of the compressed data block into corresponding decompressed data values;

A fifth aspect of the invention is a system comprising one or more memories, a data compression device according to the above first aspect and a data decompression device according to the above second aspect.

A sixth aspect of the present invention is a computer program product comprising code instructions which, when loaded and executed by a processing device, cause performance of the method according to the third aspect above.

A seventh aspect of the present invention is a computer program product comprising code instructions which, when loaded and executed by a processing device, cause performance of the method according to the fourth aspect above.

Other aspects, objects, features and advantages of the embodiments of the present disclosure will appear from the following detailed disclosure, from the attached dependent claims as well as from the drawings. In general, all terms used in the claims are to be interpreted according to their ordinary meaning in the technical field, unless explicitly stated otherwise.

All references to "a)/an/the [ element, device, component, means, step, etc ]" are to be interpreted openly as referring to at least one instance of the element, device, component, means, step, etc., unless explicitly stated otherwise. The steps of any method disclosed herein do not have to be performed in the exact order disclosed, unless explicitly stated.

Drawings

Examples of the background art and embodiments of aspects of the present invention are described with reference to the following drawings:

FIG. 1 illustrates a block diagram of a computer system including n processing cores, each coupled to three levels of a cache hierarchy and a main memory.

FIG. 2 illustrates the block diagram of FIG. 1, wherein the main memory holds data in compressed form.

FIG. 3 illustrates the block diagram of FIG. 1, wherein an L3 cache holds data in compressed form. Other cache levels may also store data in compressed form.

Fig. 4 illustrates the block diagram of fig. 1, wherein data is compressed in the communication device, for example when transferred between the memory and cache hierarchy.

FIG. 5 illustrates the block diagram of FIG. 1, where compression may be applied to the main memory and the links connecting the memory to the cache hierarchy. In general, compression may be applied to any combination of portions like a cache hierarchy, a transport (e.g., a link connecting a memory to a cache subsystem), and main memory.

Fig. 6 illustrates a block diagram of a data transfer link connecting two points in a communication network. These points may be two intermediate nodes in the network or the source node and the destination node of a communication link or a combination of these.

Fig. 7 illustrates a block diagram of the data transfer link of fig. 6, wherein the data being transmitted is in compressed form, so they may need to be compressed in the transmitter and decompressed in the receiver.

Fig. 8 illustrates on the left side a block of uncompressed data values and on the right side the same block in compressed form using a variable length code that has been generated using huffman coding. All data values of the uncompressed block are replaced by corresponding huffman code words.

Fig. 9 illustrates a compressor for compressing (or encoding) a block as illustrated in fig. 8 using huffman coding.

Fig. 10 illustrates a decompressor for decoding (or decompressing) a block compressed using canonical huffman coding.

FIG. 11 illustrates uncompressed blocks on the left side and the same blocks in compressed form on the right side in an alternative manner, including: a bit mask indicating which values are compressed and which values are uncompressed; and variable length coding (i.e., a sequence of bits encoded by variable length coding) comprising a mixture of compressed and uncompressed values, according to related but not presently claimed designs.

Fig. 12 illustrates an uncompressed block on the left side and the same block in a second alternative on the right side in a compressed form, comprising a mixture of compressed and uncompressed values, with each uncompressed value preceded by a unique variable length (e.g., huffman) codeword corresponding only to all uncompressed values.

Fig. 13 illustrates an uncompressed block on the left side and the same block in compressed form in a third alternative on the right side, comprising a mixture of compressed and uncompressed values, where each uncompressed value is replaced in the block by a unique variable length (e.g., huffman) codeword in the original sequence of data values, which corresponds only to all uncompressed values, while the actual uncompressed values are saved at the end of the block in reverse order of occurrence (i.e., the last saved uncompressed value is the uncompressed value that first appears in the block in the original order of data values).

FIG. 14 illustrates a data compression device based on the compressor of FIG. 9 but modified and expanded to be able to compress the blocks of FIG. 11 in the following manner, according to a related but currently non-claimed design: the approach detects both compressible and incompressible block data values and generates a mask that is placed before a variable length encoded bit sequence comprising compressed and uncompressed values to indicate uncompressed values included in the compressed block.

Fig. 15 illustrates a data decompression apparatus based on the decompressor of fig. 10 but modified and expanded to be able to decompress the compressed chunks of fig. 11, including a mask placed before the variable length encoded bit sequence to indicate uncompressed values, according to a related but not presently claimed design.

Fig. 16 illustrates a data compression device that is based on the compressor of fig. 9 but is modified and expanded to be able to compress the blocks of fig. 12 by: the approach is to detect both compressible and incompressible block data values and encode the uncompressed block data values within a variable length encoded bit sequence comprising compressed values and uncompressed values with a unique codeword corresponding only to the uncompressed values and appended before each uncompressed value.

Fig. 17a illustrates a data decompression apparatus which is based on the decompressor of fig. 10 but is modified and expanded to be able to decompress the compressed blocks of fig. 12 which comprise compressed and uncompressed values, wherein each uncompressed value is preceded by a unique codeword corresponding only to the uncompressed value.

Fig. 17b shows an alternative implementation of the data decompression apparatus of fig. 17 a.

Fig. 18 illustrates a data compression device that is based on the compressor of fig. 9 but is modified and expanded to be able to compress the blocks of fig. 13 by: the approach is to detect the compressible and incompressible block data values, replace the uncompressed block data values within the variable length encoded bit sequence, which includes both compressed and uncompressed values, with only unique code words corresponding to all uncompressed values, and place the actual uncompressed values at the end of the compressed block in reverse order of occurrence (last to first occurring values).

Fig. 19 shows a data decompression apparatus that is based on the decompressor of fig. 10, but is modified and expanded to be able to decompress the compressed blocks of fig. 13, which comprise a mixture of compressed and uncompressed values, wherein each uncompressed value is replaced with a unique codeword corresponding to only all uncompressed values, while the uncompressed values are saved at the end of the block in reverse order of occurrence.

Fig. 20 shows on the left an empty uncompressed block, which is a block comprising only zero data values, and on the right the same block compressed with variable length coding using the compressor of fig. 9, assuming that each zero value is replaced by a codeword of the smallest possible width (1 bit).

Fig. 21 shows on the left an empty uncompressed block, which is a block comprising only zero data values, and on the right the same block compressed using 1-bit encoding.

Fig. 22 shows on the left side the uncompressed blocks, which are identical to the blocks in fig. 11, 12 and 13, and on the right side the identical blocks in compressed form, including the following, in a fourth alternative: a one bit indicator indicating whether the compressed block is empty; and a variable length coded bit sequence comprising a mixture of compressed and uncompressed values, wherein each uncompressed value is preceded by a unique variable length (e.g., huffman) codeword corresponding only to all uncompressed values.

Fig. 23 illustrates a data compression apparatus capable of compressing the uncompressed block (compressed empty block) of fig. 21 and the uncompressed block (compressed block including a mixture of compressed values and uncompressed values) of fig. 22, the data compression apparatus including the data compression apparatus of fig. 16 and an empty block detection unit that checks whether all block data values are zero values. In this case, it compresses it using 1-bit encoding as in fig. 21; otherwise, it compresses it as in fig. 22.

Fig. 24a illustrates a data decompression apparatus capable of decompressing the compressed blocks of fig. 21 (compressed empty block) and fig. 22 (compressed block comprising a mixture of compressed and uncompressed values), comprising the data decompression apparatus of fig. 17a and additional logic (at the bottom of fig. 24) capable of detecting whether a block is compressed as an empty block by checking whether the first bit of the block is zero. In this case, all data values of the decompressed block are assigned a value of 0.

Fig. 24b illustrates an alternative example of the data decompression apparatus of fig. 24a capable of decompressing the compressed blocks of fig. 21 (compressed empty blocks) and fig. 22 (compressed blocks comprising a mixture of compressed and uncompressed values), which includes the data decompression apparatus of fig. 17a and additional logic (at the bottom of fig. 24) capable of detecting whether a block is compressed as an empty block by checking whether the first bit of the block is zero. In this case, all data values of the decompressed block are reset to a value of 0.

Fig. 25 illustrates an exemplary flow diagram of a compression method for compressing a block using variable length coding (e.g., huffman).

Fig. 26 illustrates an exemplary flow diagram of a decompression method for decompressing a compressed block compressed using variable length coding (e.g., canonical huffman).

FIG. 27 illustrates an exemplary flow chart of a new method of building on top of the compression method of FIG. 25 and capable of compressing the blocks of FIG. 11 by detecting compressible and incompressible block data values and generating a mask placed before a variable length encoded bit sequence including compressed and uncompressed values to indicate the uncompressed values included in the compressed blocks.

FIG. 28 illustrates an exemplary flow chart of a new method that builds on the method of FIG. 26 and that is capable of decompressing the compressed chunk of FIG. 11 that includes a mask placed before the variable length encoded bit sequence to indicate uncompressed values.

FIG. 29 illustrates an exemplary flow chart of a new method of building on top of the compression method of FIG. 25 to enable compression of the blocks of FIG. 12 by detecting both compressible and incompressible block data values and encoding the uncompressed block data values within a variable length encoded bit sequence comprising compressed and uncompressed values using a unique codeword corresponding only to the uncompressed values and appended before each uncompressed value.

Fig. 30 illustrates an exemplary flow chart of a new method of constructing a block capable of decompressing the compressed block of fig. 12, which includes compressed values and uncompressed values, above the method of fig. 26, where each uncompressed value is preceded by a unique codeword corresponding only to the uncompressed value.

Fig. 31 illustrates an exemplary flow chart of a new method of building on top of the compression method of fig. 25 to enable compression of the blocks of fig. 13 by detecting compressible and incompressible block data values and encoding the uncompressed block data values within a variable length encoded bit sequence comprising compressed values and uncompressed values in reverse order of occurrence with only unique code words corresponding to all uncompressed values and placing the actual uncompressed values at the end of the compressed blocks (the last to the first occurring values).

Fig. 32 illustrates an exemplary flow chart of a new method that builds on the method of fig. 26 to enable decompression of the compressed block of fig. 13 that includes a mixture of compressed and uncompressed values, where each uncompressed value is replaced by a unique code word that corresponds only to all uncompressed values, while the uncompressed values are saved at the end of the block in reverse order of occurrence.

FIG. 33 illustrates an exemplary flow chart of a new method that is built on top of the compression method of FIG. 29 to also check if all block data values are zero values and set the null block indicator to "1" accordingly, in which case the block is compressed using 1-bit encoding as in FIG. 21; otherwise, it compresses the block as in FIG. 22, setting the zero indicator to 0.

Fig. 34 illustrates an exemplary flowchart of a new method built on top of the method of fig. 30 to be able to detect whether a block is compressed to an empty block by checking whether the first bit of the block is equal to 1. In this case, all data values of the decompressed block are assigned a value of 0; otherwise, the block is decompressed by the method illustrated in fig. 30.

Detailed Description

The present disclosure discloses methods, devices and systems for compressing one or more blocks of data values and decompressing one or more compressed blocks of data values when the compression is applied to a cache subsystem and/or a memory subsystem and/or a data transmission subsystem and/or a communication network in a computer system. The disclosed methods, devices and systems extend and optimize baseline compression methods, devices and systems and decompression methods, devices and systems to accommodate data compression scenarios common in the systems of the above-mentioned applications and also to have better compressibility.

A data block includes one or more data values and may be of any size. In the embodiment of the computer system as depicted in fig. 1, the block of data values may alternatively be referred to as: 1) a cache line, cache set, or cache sector when a data block is stored in a cache hierarchy within such a computer system; 2) a cache line, a memory page, or a memory sector, when a block of data is stored in a memory within such a computer system or transmitted in a communication device within such a computer system. On the other hand, in an embodiment of a transport link within a communication network as depicted in fig. 6, a data block may also be referred to as a data packet, flit, payload, header, etc.

Entropy-based variable length compression, such as huffman compression, may be applied to the blocks of data values shown on the left side of fig. 8 in the context of the cache/memory/data transmission subsystem of the example computer system as depicted in fig. 2, 3, 4, 5 or the example communication link as depicted in fig. 7. The block comprises 8 data values, however it may be of any size as described previously (section 0069). Using an example set of canonical huffman codewords and a prior art huffman compressor such as the example embodiment of fig. 9, all data values in the block are compressed (or encoded) forming a compressed block as depicted on the right side of fig. 8. Further, the example variable length compressed block of data values depicted on the right side of fig. 8 may be decompressed by a prior art canonical huffman decompressor such as the example embodiment of fig. 10.

Compressing all possible data values of one or more blocks using huffman coding requires the presence of huffman codewords for all data values that may be present in a computer system or transmitted in a network. Reducing the number of possible data values by reducing the value granularity is one way to deal with this problem. For example, using 1 byte data values for huffman coding compression requires up to 256 huffman codewords. However, when finer grain values are used, compression efficiency is reduced because the generated codewords are not significantly denser than the replacement data values. Increasing the compression efficiency requires huffman compression of coarse-grained data values. This has the disadvantage that codewords must be generated in advance for all possible values that may be accessed, in such a way that the resources and metadata required for storing the huffman code are increased to prohibitive sizes. When compression is applied to the buffer/memory subsystem, the transmission subsystem and the communication network, an alternative solution to generating codewords based only on the current set of data values used and regenerating codes based on new values when they occur (e.g. when they are introduced, created, accessed or transmitted) is not a viable solution because of the overhead imposed by sampling and code regeneration. Especially when compression is applied to cache and memory subsystems, it also means that all previously compressed data values stored must be decompressed and recompressed using a new encoding, potentially introducing significant overhead to the system.

A viable solution to this problem that allows the use of variable length huffman coding to compress coarse grain values without preserving a large amount of metadata and without requiring regeneration of the huffman coding is generally to allow some data values to be left uncompressed. Another motivation for keeping some data values uncompressed is that when data values occur several times (the frequency of occurrence is small), more metadata is needed to compress them than to keep them uncompressed, eventually resulting in more area overhead and more time overhead (due to compression and decompression). A value-frequency table may be used to track the most frequent values and their frequency of occurrence.

When a block of data values includes uncompressed values, it may remain uncompressed. The inventors have conceived two possible solutions to allow for a mix of compressed and uncompressed data values within a data block, provided that the sequence of data values remains unchanged, as does the sequence in the original uncompressed block. Furthermore, the third solution contemplated by the present inventors may alternatively be used to allow for the rearrangement of the original sequence of data values without requiring any additional metadata beyond that used by the first two solutions in forming the compressed block.

An example of a compressed data block according to the first solution is depicted on the right side of fig. 11, while the same data block is shown in uncompressed form on the left side thereof. The compressed data block comprises a variable length huffman coded bit sequence and metadata in the form of a bit mask (C-state mask) occurring before the variable length huffman coding. The variable length encoded bit sequence includes compressed and uncompressed values. The mask includes as many bits as the number of data values contained in a block. Each bit in the mask defines whether the corresponding data value is compressed (mask bit is e.g. 1) or not (mask bit is e.g. 0). The positions of the mask bits in the mask are used to locate (or count) the corresponding values in the variable length sequence of compressed and uncompressed values.

A block diagram of an example data compression apparatus 1400 capable of forming the compressed block of fig. 11 is depicted in fig. 14. The example data compression device includes: a compressor in the form of a variable length coding unit 1420; a detector in the form of a compression indication unit 1430; a compression status register 1440 for storing the C-state mask; and

other logic

1440, 1450, and 1460 that form the compressed data chunk generator. As in the compressor embodiment of fig. 9, the data compression device takes as input an uncompressed data chunk 1410, which is a stream of data values and comprises one or more data values v1, v2, …, vn, and which may be retrieved from the storage unit 1405 or from an extractor having data values out of the uncompressed chunk. However, the data value of the uncompressed data block 1410 is provided not only to the variable length coding unit 1420 but also to the compression indication unit 1430 and the selector 1450. The variable length coding unit 1420 is similar to the variable length coding unit 920 of the compressor 900 of fig. 9, except that the Code Table (CT)1422 also includes a valid bit (v)1424 in each table entry. The value 1423, Codeword (CW)1425, and code length (cL)1427 of each CT entry are similar to the values 923, CW 925, and cL 927 of the compressor 900 of fig. 9. v bit 1424 indicates whether the entry includes a valid CW for the stored value.

The compression indication unit 1430 checks whether each incoming data value (candidate for compression) is present in the CT 1422 by comparing (comparator 1434a) it to the value 1423 of the matching entry and whether it is valid by checking (comparator 1434b) the valid bit 1424 of the entry. If both comparisons are true (indicated by element 1438), then there is a valid codeword for the candidate value for compression; otherwise, the value will be saved as uncompressed. This information, which is generated by the compression indication unit or detector 1430 as a detection result 1439 (e.g., 1 for values compressed using codewords in CT, 0 for uncompressed values), is marked in the appropriate location in the C-state mask in the compression status register 1440, while the selector 1450 also makes the appropriate selection according to this information.

Thus, compressed data chunk generators 1440-1460 are configured to: the variable length encoded bit sequence 1455 (i.e. the bit sequence generated by the variable length encoding) is iteratively constructed by adding compressed or uncompressed data values v1-vn in the form of variable length Code Words (CW)1425 in the order in which they appear in the uncompressed data block 1410, the addition of which depends on the detection result 1439 of the detector 1430, while the compression status mask C-status mask at the corresponding position in the compression status register 1440) is updated accordingly.

When the chunk compression is complete (meaning that all chunk values have been attempted to be compressed), the C-state mask is taken from the compression status register 1440 and prepended to the variable length encoded bit sequence 1455, which includes compressed values and uncompressed values, using the concatenate unit 1460. The result of this concatenation is a compressed block 1490. Thus, compressed data chunk generators 1440-1460 are configured to: when all data values of the uncompressed data chunk 1410 have been processed, the compression state mask C state mask is concatenated by concatenator 1460) and the variable length encoded bit sequence 1455 to generate the compressed data chunk 1490.

A block diagram of an example data decompression apparatus 1500 capable of decompressing the compressed blocks of fig. 11 is depicted in fig. 15. The data decompression apparatus 1500 is constructed based on the decompressor 1000 of fig. 10, and includes: a storage unit 1505 that holds a partially compressed data block 1510 (the size of the storage unit 1510 is at least the largest of the uncompressed value length and the maximum codeword length); a codeword detection unit 1520 (similar to the codeword detection unit 1020 of the decompressor 1000 of fig. 10); a value retrieving unit 1530 (similar to the value retrieving unit 1030 of the decompressor 1000 of fig. 10); and additional logic 1540-1570 that forms a decompressed data block generator. Thus, the codeword detection unit 1520 and the value retrieval unit 1530 form a decompressor.

The decompressed data block generator includes: a register 1550 for storing a compression status mask, C-status mask, such as a compression status mask derived from an additional portion of the compressed data block 1510;

selectors

1540 and 1570; and a storage unit 1560. The decompressed data block generator reads the C-state mask in each value decompression step to decide whether the current value of the compressed data block 1510 is compressed or uncompressed in order to decide the correct data path. If the C-state mask bit is 1 (i.e., the current value is compressed), the amount to be shifted is selected by the selector 1540 as the matching length of the codeword, which is the output of the codeword detection unit 1520, and the value attached to the remaining decompressed value is selected (by the selector 1570) as the decoded value output by the value retrieval unit 1530. On the other hand, if the C-state mask bit is 0 (i.e., the value is uncompressed), the selector 1540 will select the length of the uncompressed value, which has a fixed length and is typically determined based on the value granularity used (e.g., 32 bits if the symbol granularity is 4 bytes). The uncompressed value is read from the storage unit 1505 and selected by the selector 1570 using the C-state mask bits as control signals. Decompression continues until all values of compressed data chunk 1510 have been processed (i.e., decompressed or read depending on the C-state mask bits) and a decompressed data chunk 1590 of data value v1 … vn is formed.

Thus, the decompressed data chunk generators 1540-1570 are configured to: metadata (i.e., C-state masks) in compressed data block 1510 is detected, wherein the metadata indicates uncompressed data values included in the compressed data block, and based on the detected metadata, decompressed data block 1590 is generated by combining decompressed data values from

decompressors

1520, 1530 with uncompressed data values from compressed data block 1510 such that the order of data values v1 … Vn of generated decompressed data block 1590 is the same as the order in which the data values appear in an uncompressed data block (such as block 1410 of fig. 14) prior to data compression resulting in a compressed data block (such as block 1490 of fig. 14).

More specifically, the decompressed data chunk generators 1540-1570 are thus configured to retrieve from the compressed data chunk 1510 a compression state mask C-state mask that indicates the location of the compressed data value and the uncompressed data value in the compressed data chunk 1510. The decompressed data chunk generators 1540-1570 are further configured to generate decompressed data chunk 1590 by: that is, for each data value in the compressed data block 1510, based on the bit value at the corresponding location in the compression state mask C-state mask, the control selector 1570 selects either the decompressed data value from the

decompressors

1520, 1530 or the uncompressed data value from the compressed data block.

An exemplary flowchart of a compression method constructed based on the compression method of fig. 25, which can compress the blocks of fig. 11, is illustrated in fig. 27. As in the data compression apparatus of fig. 14, the compression method detects compressible and incompressible block data values by looking up the block data values in the CT and whether they are associated with valid codewords, and encodes this information in a C-state mask that precedes the storage of the encoded values (i.e., the variable length encoded bit sequence).

On the other hand, fig. 28 illustrates an exemplary flowchart of a decompression method constructed based on the decompression method of fig. 26, which is capable of decompressing the compressed block of fig. 11, which includes a mixture of uncompressed and compressed values and a mask at the beginning of the compressed block indicating which values are uncompressed and which values are not uncompressed. The decompression method checks the corresponding C-state entry at each decompression step. If it is 0, then the current value being decompressed is effectively uncompressed and can be read directly from the storage unit holding the bit sequence (or a portion thereof) of compressed and uncompressed values. The minimum size of the storage unit is defined by the maximum between the length of the largest variable length codeword (mCL) and the length of the uncompressed value (unc _ val _ length); however, it is still the length of the largest variable length codeword that still determines the maximum size of the comparison. As shown in the left path of fig. 28, the shift amount ("length") to shift the current bit sequence to discard the read values is the length of the uncompressed values, and it is assigned to the variable "length". Otherwise, the decompression of the compressed value follows another path in fig. 28, and the length of the matching codeword is instead assigned to "length".

While the first example solution of using a fixed size mask before a variable length compressed block of data values may of course have its benefits, it may have the disadvantage that it may reduce the compression efficiency. This is so because it always adds a fixed number of bits. If no uncompressed values are present in the compressed block, then no masking is actually needed. Therefore, if the number of uncompressed values is small, a fixed-size mask will inevitably increase area overhead.

An alternative example of the compressed block of FIG. 11 is depicted in FIG. 12. In this example, if instead a unique special meaning codeword is placed before each uncompressed value (i.e., the second solution, which implements a mix of compressed and uncompressed data values within a data block, assuming that the sequence of data values remains unchanged, as in the original uncompressed block), then masking can be avoided altogether. Therefore, compression efficiency can be improved. By calculating or estimating the frequency of occurrence of all these values which do not occur in the value-frequency table, the unique code word may be generated at the time of code generation together with the remaining variable length codes, e.g. the code words in the form of huffman codes, e.g. code words in the form of Code Words (CW)1625 in the Code Table (CT)1622 of the variable length coding unit 1620 in fig. 16. This may be calculated by incrementing an extra counter, for example, when one or more values are not captured by or evicted from such a value-frequency table to create space for other more frequent values. Other possible solutions to this can be implemented by those skilled in the art. Distinguishing uncompressed values from compressed values within a compressed block in this alternative way of also using the frequency of occurrence of the values may result in a more efficient solution in terms of compression efficiency compared to previous implementations using fixed size masks: the more values that are not captured by the value-frequency tracker, the more uncompressed values compared to the total number of values that occur, the narrower the unique code words attached to the uncompressed values. On the other hand, the smaller the frequency of occurrence of uncompressed values, the wider the unique code word attached to each uncompressed value. Thus, at code generation, the frequency of occurrence of all data values not present in the value-frequency table compared to the total number of data values present will affect the width of the unique code word. This unique codeword, which is attached to all uncompressed values, is from now on referred to as the "unique uncompressed identifier codeword" (UUIC).

A block diagram of an example data compression apparatus 1600 capable of forming the compressed data block of fig. 12 is depicted in fig. 16. It includes: element 1605, which may be a store or extractor of data values from uncompressed chunks of data 1610 as used by the data compression device embodiments described above; a compressor in the form of a variable length coding unit 1620; a detector in the form of a compression indication unit 1630; and a compressed data block generator consisting of UUIC attach unit 1640 and other logic 1650. The

units

1620 and 1630 are similar to the variable length coding unit 1420 and the compression indication unit 1430 of the data compression apparatus 1400 of fig. 14. The data compression apparatus 1600 operates as follows. Data values are extracted from the uncompressed data blocks 1610 and checked by the compression indication unit 1630 whether they can be compressed using the variable length coding unit 1620. If not (i.e., the data value is to be stored in uncompressed form), then the UUIC is prepended to the data value by element 1640, otherwise the data value is encoded using element 1620. Element 1640 also contains a storage element (not shown in FIG. 16 for clarity) wherein the storage element holds a UUIC. The storage location is updated when a new UUIC is generated (e.g., when a new huffman code is generated). The selector 1650, which is controlled by the detection result 1639 by the compression indication unit 1630, makes the correct selection and further attaches the made correct selection to the remaining compressed data values that will form at the end of the compressed data block 1690. As already explained above, the data compression apparatus 1600 can improve compression efficiency as compared to the data compression apparatus 1400 in fig. 14. A second advantage of the data compression apparatus 1600 over the data compression apparatus 1400 is that the linker 1460 can be omitted. Accordingly, separate hardware for concatenating metadata (see C-state mask in fig. 14) to the compressed data blocks will not be required, and the data compression apparatus 1600 may thus provide further improvements in reduced hardware costs and increased compression speed.

Thus, the compressed data chunk generators 1640-1650 are configured to generate the above-described metadata as unique special meaning codeword UUIC, indicating detected data values in uncompressed data chunks 1610 that cannot be compressed by the compressor 1620. In addition, the compressed data block generators 1640-1650 are configured to attach a unique special meaning codeword UUIC to each uncompressed data value when generating the compressed data block 1690.

Additionally, as is clear from the above, the compressed data block generators 1640-1650 are configured to generate the compressed data blocks 1690 by: that is, a variable length encoded bit sequence is iteratively constructed by adding compressed data values in the form of variable length codewords or uncompressed data values with unique special meaning codewords attached thereto in the order in which they appear in uncompressed data block 1610, depending on the detection result 1639 of detector 1630.

A block diagram of an example data decompression apparatus 1700 capable of decompressing the compressed data blocks of fig. 12 is depicted in fig. 17 a. The data decompression apparatus 1700 includes: a storage unit 1705 that holds a partially compressed data block 1710 (the partial size being at least the largest of the total length of the UUIC attached to the uncompressed value and the maximum codeword length); a codeword detection unit 1720 (similar to the codeword detection unit 1020 of the decompressor 1000 of fig. 10); a value retrieval unit 1730 (similar to the value retrieval unit 1030 of the decompressor 1000 of fig. 10); UUIC detection unit 1740; shift amount calculation unit 1770; a shift unit 1750; a comparator unit 1760 and a selector 1780. The codeword detection unit 1720 and the value retrieval unit 1730 thus form a decompressor, while the units 1740-1780 form a decompressed data block generator.

In each data value decompression, UUIC detection unit 1740 takes as input one or more bit subsequences of all possible widths starting from the first bit of the bit sequence stored in unit 1705. Each of the widths may be 1 bit, 2 bits, 3 bits, etc., up to a maximum width equal to the maximum UUIC width, which, like mCL, may be limited to a particular value depending on the implementation (e.g., in software or in hardware) of such data decompression apparatus selected at design, compilation, configuration, or runtime, as described in paragraph [0015 ]. Codeword detection unit 1720 uses the same bit subsequences or a superset of them. UUIC detection unit 1740 attempts to detect uncompressed data values by matching one of these bit sequences with a candidate

UUIC using comparators

1744a, 1744b, 1744c, etc., and generating "length of matching UUIC" using priority encoder 1748. Although there is only one UUIC of a particular length per huffman-coded version generated and used, its length is not known in advance; therefore, each bit sub-sequence needs to be compared in

equal comparators

1744a, 1744b, 1744 c. However, only one comparison will be valid. The rest are deactivated by using a corresponding deactivation signal (not shown in fig. 17a for clarity) that is also updated when the new code is generated.

The "length of matching UUIC" output by unit 1740 indicates whether UUIC is detected or not, using comparator 1760. The output of the comparator is a "UUIC detect flag". The "length of the matched UUIC" is used to: a) shift amount calculation unit 1770, where it is added (using adder 1774) to the length of the uncompressed value stored in storage unit (e.g., flip-flop) 1776 (i.e., fixed length and typically determined based on the value granularity used during code generation, e.g., 32 bits if the symbol granularity is 4 bytes); and b) a shift unit 1750, where the UUIC is removed from the bit sequence "matched UUIC with uncompressed value" (extracted from unit 1710) so that only uncompressed data values remain. The "UUIC detect flag" serves as a control signal 1762 to the

selectors

1772 and 1780. Selector 1772 of shift amount calculation unit 1770 determines the shift amount of the bit sequence stored in 1710, such that the matching portion (either the uncompressed data value followed by the UUIC or the codeword detected by unit 1720) is removed and the empty portion is padded by the next bits of the compressed sequence. On the other hand, the selector 1780 selects between the uncompressed data value and the decoded value output by the value retrieval unit 1730. The selected value is concatenated with the remaining decompressed data values. Decompression continues until all data values of compressed data block 1710 have been decompressed and form decompressed data block 1790 of data values v1 … vn.

Thus, decompressed data chunk generators 1740-1780 are configured to detect the unique special meaning codeword UUIC included in compressed data chunk 1710 and generate resultant control signal 1762. The decompressed data block generators 1740-1780 are also configured to remove the detected unique special meaning code word from the associated uncompressed data value to which it is attached.

In addition, the decompressed data block generators 1740-1780 are configured to generate the decompressed data block 1790 by: that is, for each data value in compressed data block 1710, the selector 1780 is controlled based on the control signal 1762 to select either the decompressed data value from the

decompressors

1720, 1730 or the uncompressed data value for which the detected unique special meaning codeword has been removed.

A block diagram of an alternative embodiment of the example data decompression apparatus of fig. 17a is depicted in fig. 17 b. In this data decompression apparatus, comparator 1760 is omitted because UUIC detection unit 1740 is implemented using comparators 1744a-c and or gate 1748 instead of priority encoder 1748 of fig. 17 a. As a result, UUIC detection unit 1740 directly generates the UUIC detection flag, instead of "length of matching UUIC". Instead, the "length of the matched UUIC" is output by storage unit 1776a, e.g., a flip-flop, which is updated each time a new huffman code is generated, so that it corresponds to the correct length of the UUIC attached to the uncompressed value. The remaining elements and logic are similar to those in the data decompression apparatus of fig. 17 a. Other alternative implementations of the data decompression apparatus embodiment may be implemented by those skilled in the art.

An exemplary flowchart of a compression method constructed based on the compression method of fig. 25 and capable of compressing the blocks of fig. 12 is illustrated in fig. 29. The compression method uses the same detection method as the compression method of fig. 27 to detect compressible and incompressible data values of a block. However, it also encodes the uncompressed data values of the blocks within the variable length encoded bit sequence comprising the compressed values and the uncompressed values by attaching a UUIC before the uncompressed values, as described in the data compression apparatus of fig. 16.

An exemplary flow diagram of a decompression method constructed based on the method of fig. 26 and capable of decompressing the compressed block of fig. 12, including compressed and uncompressed values, with a UUIC preceded each uncompressed value, is illustrated in fig. 30. The bit sub-sequence aligned with the first bit position of the bit sequence stored in the storage unit is numerically compared with the First Codeword (FCW) and also compared with a unique codeword (UUIC) attached to the uncompressed value. If the latter comparison yields a match, the uncompressed value can be read directly from the storage unit, while the shift amount ("length") used to discard the current bit sequence of read values is the UUIC length plus the length of the uncompressed value. Otherwise, the decompression of the compressed value follows another path in fig. 30, and the length of the matching codeword is assigned to "length". The minimum size of the storage unit is defined by the largest of the two, one being the length of the maximum variable length codeword (mCL) and the other being the length of the uncompressed value (unc _ val _ length) and the aggregate length of the UUIC length (UUIC _ length).

The amount of shift of the compressed bit sequence held in the storage unit 1010 (fig. 10) or 1710 (fig. 17) has an arbitrary size after matching out a codeword (a normal codeword obtained by the codeword detection unit 1020 (or 1720) or a UUIC obtained by the UUIC detection unit 1740). Arbitrary shift amounts generally increase the complexity of shifter implementation. In addition, the maximum shift amount may be large due to the sum of the UUIC length and the fixed size length of the uncompressed value (after UUIC detection). One way to reduce the cost of shifting and possibly speed up decompression is to migrate the uncompressed values to the end of the compressed block during compression and save them in the reverse order of what occurred in the original uncompressed block. Fig. 13 illustrates another embodiment of a compressed block compressed in the new method (i.e., a third solution to achieve a mix of compressed and uncompressed data values within a data block by rearranging the original sequence of data values when forming the compressed block). On the right side of fig. 13, the uncompressed value 500 has been moved to the end of the compressed block; to be able to reconstruct the original order of values when decompressing a block, it now encodes the uncompressed values for UUICs that were previously attached before them. Thus, UUIC replaces the value 500 in the original value order, so that when it is detected by the UUIC detection unit, uncompressed values can be taken at the end of the block.

A block diagram of an example data compression apparatus 1800 capable of forming the compressed data blocks of fig. 13 is depicted in fig. 18. It includes: element 1805, which may be storage as used by the data compression device embodiments described above or an extractor of data values from uncompressed chunks 1810; a compressor in the form of a variable length coding unit 1820; a detector in the form of a compression indication unit 1830; and a compressed data chunk generator consisting of UUIC store unit 1840, selector 1850, store unit 1860 for uncompressed data values, and concatenate unit 1870. The

units

1820 and 1830 are similar to the variable length coding unit and the compression instructing unit of the data compression apparatuses 1400 and 1600, respectively, described above. However, when a data value is identified as incompressible by the compression indication unit 1830, the UUIC read from the UUIC store 1840 is selected by the selector 1850 and appended to the variable length encoded bit sequence. The data values that remain uncompressed are instead saved in the storage unit 1860 for the uncompressed values. During compression, the storage unit holds the uncompressed data values in the reverse order in which they appear in uncompressed data block 1810. For example, the first occurrence of an uncompressed data value (32 bits, assuming a value granularity of 4 bytes) is stored in the last location (right-most location in FIG. 18) of the storage unit 1860. The Write Enable (WE) of the storage unit 1860 is connected to the output of the compaction indication unit 1830. When all data values have been processed, a sequence of uncompressed data values (if it contains valid uncompressed data values) is appended to the end of the variable length encoded bit sequence and together form a compressed data block 1890.

Thus, like the compressed data block generators 1640-1650 described above with reference to FIG. 16, the compressed data block generators 1840-1870 in FIG. 18 are configured to generate the metadata as a unique special meaning codeword UUIC indicating the data values in the detected uncompressed data block 1810 that cannot be compressed by the compressor 1820. In addition, compressed data block generators 1840-1870 are configured to: for all detected data values in the uncompressed data blocks 1810 that cannot be compressed by the compressor 1820, the unique special meaning codeword is included in the generated compressed data blocks 1890 in the order in which the data values appear in the uncompressed data blocks 1810 instead of the corresponding detected data values. The compressed data block generators 1840-1870 are also configured to: detected data values in the uncompressed data blocks 1810 that cannot be compressed by the compressor 1820 are appended at the end (i.e., tail end or front end) of the generated compressed data blocks 1890 in an order that is opposite to the order in which the detected data values occur in the uncompressed data blocks 1810.

Additionally, as is clear from the above, the compressed data block generators 1840-1870 are configured to generate the compressed data block 1890 by: that is, a variable length encoded bit sequence is iteratively constructed by adding each compressed data value from compressor 1820 in the order in which it appears in uncompressed data block 1810. Whenever a detector 1830 detects a data value in the uncompressed data block 1810 that cannot be compressed by the compressor 1820, a unique special meaning codeword UUIC is added to the variable length encoded bit sequence, while the detected data value is stored in the storage unit 1860. The detected data values are stored in storage unit 1860 in a reverse order compared to the order in which the data values appear in uncompressed data block 1810. When all of the data values of the uncompressed data block 1810 have been processed, a compressed data block 1890 is generated by having the concatenator 1870 concatenate the variable length encoded bit sequence and the data values saved in the storage unit 1860.

A block diagram of an example implementation of a data decompression apparatus 1900 capable of decompressing the compressed blocks of fig. 13 is depicted in fig. 19. The data decompression apparatus 1900 includes: a storage unit 1905 that holds a portion of the compressed data block (the size of the storage unit 1905 is at least the largest of the total length of the UUIC and the maximum codeword length); a codeword detection unit 1920 (similar to the codeword detection unit 1020 of the decompressor 1000 of fig. 10); a value retrieving unit 1930 (similar to the value retrieving unit 1030 of the decompressor 1000 of fig. 10); UUIC detection unit 1940 (similar to UUIC detection unit 1740 of data decompression apparatus 1700 of fig. 17 a); an uncompressed value extraction unit 1960 and additional logic. Additional logic includes comparator unit 1970, arithmetic unit 1980, and

selectors

1950 and 1985. The codeword detection unit 1920 and the value retrieval unit 1930 thus form a decompressor, while the units 1940-1985 form a decompressed data block generator.

The uncompressed value extraction unit 1960 includes a selector unit 1968 and a storage unit 1964 that holds the compressed blocks resulting from the data compression apparatus of fig. 18 in an order (block end to block beginning) opposite to the order in which the uncompressed data values are placed, in the end of the data blocks in the order opposite to the order in which they occur. Based on the currently detected uncompressed value sequence number measured by the arithmetic unit 1980 (e.g., incrementer) for each non-zero (verified by the comparator unit 1970) "matching UUIC length" output from the UUIC detection unit 1940, the uncompressed value extraction unit 1960 selects the corresponding uncompressed data value from the storage unit 1964 and sends it to the remaining decompressed data values through the selector 1985. Thus, the uncompressed data value is no longer read from storage unit 1910, as with data decompression apparatus 1700 of fig. 17 or 1500 of fig. 15, while the maximum shift amount of the compressed bit sequence is reduced to the largest of the UUIC length and the maximum codeword length. If there are no uncompressed data values at the end of the compressed data block, the output of uncompressed value extraction unit 1960 will not be selected because no UUIC will be detected.

Thus, decompressed data block generator 1940-1985 is configured to detect the unique special meaning codeword UUIC included in compressed data block 1910 and generate resultant control signal 1962. The decompressed data block generators 1940-1985 are also configured to generate decompressed data blocks 1990 by: that is, for each data value of compressed data block 1910 and based on control signal 1962, control selector 1985 selects the decompressed data values from

decompressors

1920, 1930 in the order in which the compressed data values appear in compressed data block 1910, or selects the uncompressed data values from compressed data block 1910 in the reverse order compared to the order in which the uncompressed data values appear in compressed data block 1910. Additionally, decompressed data block generators 1940-1985 are configured to store copies of at least a portion of compressed data block 1910 in reverse order in storage unit 1964, with the tail end of the compressed data block being stored at the beginning of storage unit 1964. The decompressed data block generators 1940-1985 are further configured to: each time the unique special meaning codeword UUIC is detected to be included in the compressed data block 1910, the uncompressed data value counter 1980 is incremented and used as a pointer to a storage location in storage unit 1964 to provide the uncompressed data value to selector 1985.

An exemplary flowchart of a compression method constructed based on the compression method of fig. 25 and capable of compressing the blocks of fig. 13 is illustrated in fig. 31. The compression method detects compressible data values and incompressible data values of a block using the same detection method as the compression method of fig. 27. However, it uses UUIC to encode the uncompressed data values of a block within a variable length encoded bit sequence containing compressed and uncompressed values in the original data value order and places the actual uncompressed values at the end of the compressed block in the reverse order of occurrence (last to first occurring values).

Fig. 32 illustrates an exemplary flowchart of a decompression method constructed based on the decompression method of fig. 26, which is capable of decompressing the compressed block of fig. 13 including a mixture of compressed and uncompressed values, in which each uncompressed value is replaced by a UUIC in the original data value order and the actual uncompressed value is saved at the end of the block in the reverse appearance order. This method is similar to the decompression method of fig. 30, detecting uncompressed values by attempting to match UUICs. The difference is that when a UUIC is detected, the uncompressed value is read directly from the end of the compressed block held in the second storage location (the "compressed" array). The read uncompressed value is then discarded from this said second storage unit (or the array index is incremented). Further, the UUIC length is assigned to "length", which is the shift amount of the bit sequence stored in the storage unit. Otherwise, decompression of the compressed values follows another path in fig. 32, and the length of the matching codeword is assigned to "length". Unlike the previous two embodiments, the minimum size of the storage unit is defined by the maximum between the length of the maximum variable length codeword (mCL) and the UUIC length.

Another situation that often occurs when data compression is applied in a cache/memory subsystem or a transmission network subsystem of a computer system or in a communication network is a block of data values that is populated with the same common data values. The most common situation is that the same common data value is the value 0, and therefore such a block is called an empty block. Variable length coding like huffman coding is limited to the maximum compression ratio, since one codeword can only replace a data value in the best case. In another embodiment of an uncompressed block of 8 data values, as depicted on the left side of fig. 20, each value is a value of 0, the compressed block comprises 8 bits (1 bit per value) as depicted on the right side of fig. 20, in the case where huffman coding is used and it is assumed that the value of 0 is so frequent that it can be coded with 1 bit. Another alternative method of compressing such a block is to replace the entire block of values with one bit, e.g., bit 1 as depicted in fig. 21. However, in the compression case, if a block includes a value other than 0, it will be compressed by one of the above-described data compression apparatuses. However, the extra information needs to be encoded so that the block in compressed form can be distinguished from the empty compressed block by the data decompression device. An example approach is to place a bit, e.g., bit 0, before the other variable length coded bit sequence to indicate that it is a non-empty block. Other methods may be found by those skilled in the art.

A block diagram of an example data compression device capable of forming the compressed blocks of fig. 21 and 22 is depicted in fig. 23. It is based on the data compression device 1600 of fig. 16, however all of the above-described data compression devices disclosed in this patent, such as

devices

1400, 1600 and 1800, may be used as a basis by those skilled in the art. Compared to the data compression apparatus 1600 of fig. 16, a new unit in the data compression apparatus 2300 of fig. 23 is an empty block detection unit 2360 (on the left side of fig. 23) which compares whether all block data values are equal to the same common data value 0 in a plurality of

comparators

2364a, 2364b, and the like. If true, the empty block indication bit (i.e., output by and gate 2368) is set to "1" and selector 2370 is controlled accordingly to cause the empty block indication bit to constitute a compressed data block 2390. Otherwise, the empty block indication bit is set to "0" and the selector 2370 is controlled accordingly to have the empty block indication bit precede the rest of the variable length encoded bit sequence, together forming a compressed data block 2390 as described above for the different embodiments. The empty block detection unit 2360 may be modified by those skilled in the art to compress other kinds of uncompressed data blocks that include the same common data value that is not a value of 0.

More generally, a data compression apparatus (such as 2300) may comprise any of the

data compression apparatuses

1400, 1600, or 1800 as described above, and additionally comprise a common value detection unit (such as unit 2360) configured to detect when all data values of an uncompressed data block (e.g. 2310) have the same common block data value (e.g. value 0). Such a data compression device is configured to: when it is detected that all data values of the uncompressed data blocks have the same common block data value, generating a compressed data block (e.g. 2390) comprising only the special meaning data value indicative of the detected common block data value and not comprising a combination of the compressed data value, the uncompressed data value and the metadata as described above, whereas the data compression apparatus is configured to: when it is not detected that all data values of the uncompressed data blocks have the same common block data value, a compressed data block is generated that includes a value different from the special meaning data value followed by a combination of the compressed data value, the uncompressed data value, and the metadata.

A block diagram of an example embodiment of a data decompression device 2400 capable of decompressing said compressed data blocks as depicted in fig. 21 and 22 is illustrated in fig. 24a and is built on the basis of the data decompression device 1700 of fig. 17 a. However, all of the above-described data decompression devices disclosed in this disclosure, such as

devices

1500, 1700, and 1900, may be used as a basis by those skilled in the art. In contrast to the data decompression apparatus 1700 of fig. 17a in which the storage unit 1710 holds part of the compressed data blocks 1710, in the new data decompression apparatus 2400 of fig. 24a, the storage unit includes: a store subunit 2418 that holds only the first bit of the compressed data block 2410; and a store subunit 2414 that holds the partially compressed data block 2410 as before. In this data decompression apparatus and only at the beginning of decompressing the compressed block, it is checked whether the leading bit (stored in the storage sub-unit 2418) is non-zero (or "1"). If true, all values in decompressed block 2490 are assigned a value of 0 by a plurality of multiplexers 2495 (on the right side of FIG. 24); otherwise (if the leading bit of compressed data block 2410 is "0"), the remaining steps as described for data decompression apparatus 1700 of fig. 17a are followed to decompress the data block.

More generally, a data decompression apparatus (such as 2400) may comprise any of the

data decompression apparatuses

1500, 1700 or 1900 as described above, and additionally a special meaning data value detector (such as 2450) configured to detect a special meaning data value at the beginning of a compressed data block (such as 2410), the special meaning data value indicating a common block data value (such as value 0), such data decompression apparatus being configured to generate the decompressed data block (such as 2490) by populating the decompressed data block with the common block data value when the special meaning data value is detected, whereas the data decompression apparatus is configured to generate the decompressed data block as described for any of the above embodiments when the special meaning data value is not detected.

An alternative block diagram of an embodiment of the data decompression apparatus is depicted in fig. 24b, wherein a decompressed block of data values is reset to 0 if the leading bit of the compressed block is found to be non-zero, assuming that the decompressed block is reconstructed in the storage unit as a flip-flop array. Other alternative implementations of the data decompression apparatus may be implemented by those skilled in the art. Alternative embodiments of the data decompression apparatus may be implemented when any other value is very commonly present within the same block of data values.

Fig. 33 illustrates an exemplary flowchart of a compression method constructed based on the compression method of fig. 29, and the compression method also checks whether all block data values are zero values, so that it compresses the blocks (referred to as empty blocks) as depicted in fig. 21 in a more efficient manner; otherwise, it compresses as depicted in fig. 22. This is accomplished by: the way that at the start of compression it is checked whether all values are equal to the value 0 and accordingly the empty block indicator is set to "1" (the empty block indicator constitutes the compressed block itself); otherwise, it sets the empty block indicator to "0" and compresses the block using an alternative path that includes the compression method of fig. 29, but places the empty block indicator at the beginning of the compressed block. In an alternative embodiment of the compression method, each value may be compared to a value of 0 while compressing each value using an alternative path that includes the compression method of fig. 29. If they are both equal to the value 0, the block is compressed into a Null block. Other embodiments of the compression method may attempt to compress other values that occur more commonly than the value 0, either statically or dynamically.

Fig. 34 illustrates an exemplary flowchart of a decompression method constructed based on the decompression method of fig. 30, and which can detect whether a block is compressed as an empty block by checking whether the first bit of the block is equal to "1". In this case, all data values of the decompressed block are assigned a value of 0; otherwise, the block is decompressed using an alternative path that includes the method illustrated in fig. 30. Other embodiments of the decompression method may include other values that occur more commonly than 0, either statically or dynamically.

In all the above embodiments of the data compression device and/or the data decompression device, a person skilled in the art may insert delay elements such as flip-flops, so that the compression of a block of data values or/and the decompression of a compressed block of values may be pipelined in a plurality of stages to reduce the clock cycle time and increase the processing (compression or/and decompression) throughput.

Furthermore, alternative embodiments of the data compression apparatus and/or the data decompression apparatus disclosed in the present disclosure may be parallelized by simultaneously compressing data values of a plurality of blocks and/or decompressing data values of a plurality of compressed blocks by a person skilled in the art and according to techniques known per se. In this case, the decompressor design needs to be modified accordingly by those skilled in the art.

The respective

data compression devices

1400, 1600, 1800, 2300 in fig. 14, 16, 18 and 23 may be implemented, for example, in hardware, for example, as digital circuits in an integrated circuit, as a special purpose device (e.g., a memory controller), as a programmable processing device (e.g., a Central Processing Unit (CPU) or a Digital Signal Processor (DSP), a Field Programmable Gate Array (FPGA), etc. the functions of the respective data compression methods described in this disclosure may be performed, for example, by a suitably configured respective

data compression device

1400, 1600, 1800, 2300, or as a respective computer program product comprising code instructions that, when loaded and executed by a general purpose processing device such as a CPU or DSP, cause the respective methods to be performed.

The respective

data decompression devices

1500, 1700, 1900, 2400 in fig. 15, 17, 19, 24a and 24b may be implemented, for example, in hardware, e.g., as digital circuits in an integrated circuit, a dedicated device (e.g., a memory controller), a programmable processing device (e.g., a Central Processing Unit (CPU) or a Digital Signal Processor (DSP), a Field Programmable Gate Array (FPGA), etc. the functions of the respective data decompression methods described in the present disclosure may be performed, for example, by a suitably configured respective

data decompression device

1500, 1700, 1900, 2400, or as a respective computer program product comprising code instructions that, when loaded and executed by a general purpose processing device such as a CPU or a DSP (e.g., any of the processing units P1 … Pn of fig. 1-5), cause the respective methods to be performed.

Example embodiments disclosed herein propose methods, devices and systems for data block compression and decompression, i.e., for storing or transmitting information more compactly: data block compression and decompression in or for a cache/memory subsystem of a computer system, data block compression and decompression in or for a data transmission subsystem of a computer system, or data block compression and decompression in or for a communication network.

FIG. 35 illustrates a general system 3500 in accordance with the present invention. The system includes one or more memories 3510, a data compression device 3520 (such as, for example, any of

data compression devices

1400, 1600, 1800, 2300), and a data decompression device 3530 (such as, for example, any of

data decompression devices

1500, 1700, 1900, 2400). Advantageously, system 3500 is a computer system (such as any of computer systems 100-500 of FIGS. 1-5) and the one or more memories 3510 are one or more cache memories (such as any of cache memories L1-L3 of FIGS. 1-5), one or more random access memories (such as any of memories 130-530 of FIGS. 1-5), or one or more secondary storages. Alternatively, system 3500 is a data communication system (such as

communication networks

600, 700 of fig. 6-7), wherein the one or more memories 3510 can be data buffers (such as

transmitters

610, 710 and

receivers

620, 720 of fig. 6-7) associated with transmitting and receiving nodes in the data communication system.

While aspects of the invention have been described using example embodiments, they are not limited to the disclosed embodiments, but rather, they cover alternative embodiments that may be implemented by those skilled in the art.

It should be noted that alternative inventive aspects are defined in the following numbered clauses, particularly directed to the related but presently non-claimed designs shown in and described with respect to fig. 11, 14 and 15.

I. A data compression apparatus for compressing an uncompressed data block comprising one or more data values into a compressed data block, the data compression apparatus comprising:

metadata, which indicates uncompressed data values.

The data compression device of clause I, wherein the compressed data block generator comprises at least one selector responsive to the detection result of the detector and configured to generate the compressed data block by combining the variable length codeword, the detected data value and the metadata in one or more steps.

The data compression device of any one of the preceding clauses wherein the detector is configured to detect data values in uncompressed data blocks that cannot be compressed by the compressor as one or more of:

data values not present in the compressor's code table,

the data values of the code words that are present in the code table but absent in the code table of the compressor,

a data value that exists in the code table, has a codeword in the code table, but is indicated as invalid in the code table of the compressor.

The data compression device of any preceding clause,

wherein the compressed data chunk generator comprises a compression status register to store the metadata in the form of a compression status mask to indicate the location of compressed and uncompressed data values in the compressed data chunk; and

wherein the compressed data chunk generator is configured to include the compression status mask from the compression status register in the generated data chunk.

V. the data compression apparatus according to clause IV, the compressed data block generator comprising a linker, wherein the compressed data block generator is configured to:

iteratively constructing a variable length encoded bit sequence by adding compressed data values or uncompressed data values in the form of variable length code words in the order in which they appear in the uncompressed data block, while updating the compression status mask at the corresponding position in the compression status register accordingly, said adding of compressed data values or uncompressed data values in the form of variable length code words depending on the detection result of the detector; and

when all data values of the uncompressed data block have been processed, a compressed data block is generated by concatenating the compression state mask and the variable length encoded bit sequence by a concatenator.

The data compression device according to any preceding clause, further comprising a common value detection unit configured to detect when all data values of an uncompressed data block have the same common block data value,

wherein the data compression device is configured to: when it is detected that all data values of the uncompressed data blocks have the same common block data value, generating a compressed data block including only the special meaning data value indicating the detected common block data value and not including a combination of the compressed data value, the uncompressed data value, and the metadata described above; and

wherein the data compression device is configured to: when it is not detected that all data values of an uncompressed data block have the same common block data value, a compressed data block is generated that includes a value different from the special meaning data value followed by a combination of the compressed data value, uncompressed data value, and metadata described above.

A data decompression apparatus for decompressing a compressed data block into a decompressed data block comprising one or more data values, the data decompression apparatus comprising:

a decompressed data block generator configured to:

detecting metadata in the compressed data block, the metadata indicating uncompressed data values included in the compressed data block; and

generating the decompressed data block based on the detected metadata by: the way is to combine the decompressed data values from the decompressor and the uncompressed data values from the compressed data block such that the order of the data values of the generated decompressed data block is the same as the order in which the data values appear in the uncompressed data block prior to the data compression that produced the compressed data block.

The data decompression device according to clause VII, wherein the decompressed data block generator is configured to:

retrieving the metadata from the compressed data chunk in the form of a compression status mask indicating the location of compressed and uncompressed data values in the compressed data chunk; and

generating a decompressed data block by: that is, for each data value in the compressed data block, the selector is controlled to select either the decompressed data value from the decompressor or the uncompressed data value from the compressed data block based on the bit value at the corresponding position in the compression state mask.

IX., the data decompression apparatus according to any one of clauses VII-VIII, further comprising a special meaning data value detector configured to detect a special meaning data value indicative of a common block data value at the beginning of a compressed data block,

wherein the data decompression device is configured to: generating a decompressed data block by populating the decompressed data block with common block data values when a special meaning data value is detected; and

wherein the data decompression device is configured to: when no special meaning data value is detected, a decompressed data block is generated as described in any of clauses VII-VIII.

A data compression method for compressing an uncompressed data block comprising one or more data values into a compressed data block, the data compression method comprising:

generating a compressed data block by combining:

metadata, which indicates uncompressed data values.

The data compression method may comprise any of the functional features of the data compression device according to clauses I-VI.

A data decompression method for decompressing a compressed data block into a decompressed data block comprising one or more data values, the data decompression method comprising:

The data decompression method may comprise any of the functional features of the data compression device according to clauses VII-IX.

A system comprising one or more memories, a data compression device according to any of clauses I-VI and a data decompression device according to any of clauses VII-IX.

The system of clause XII, wherein the system is a computer system, and wherein the one or more memories are from the group consisting of:

the memory is cached and the data is stored in the cache,

a random access memory, and

and a secondary storage device.

The system of clause XII, wherein the system is a data communication system, and wherein the one or more memories are data buffers.

XV. a computer program product comprising code instructions which, when loaded and executed by a processing device, cause performance of the method according to clause X.

A computer program product comprising code instructions which, when loaded and executed by a processing apparatus, cause performance of the method according to clause XI.

Claims

1. A data compression apparatus (1600; 1800; 2300) for compressing uncompressed data blocks (1610; 1810; 2310) comprising one or more data values (v1-vn) into compressed data blocks (1690; 1890; 2390), the data compression apparatus comprising:

a compressor (1620; 1820; 2320) configured to compress data values of the uncompressed data blocks into corresponding variable-length codewords (1625; 1825; 2325);

a detector (1630; 1830; 2330) configured to detect data values in the uncompressed data block that cannot be compressed by the compressor; and

a compressed data block generator (1640-1650; 1840-1870; 2340-2370) configured to generate the compressed data block by combining:

compressed data values in the form of variable length code words corresponding to the data values in the uncompressed data block that were compressed by the compressor;

uncompressed data values in the form of detected data values in the uncompressed data blocks that cannot be compressed by the compressor; and

metadata (UUIC) indicating the uncompressed data value, wherein the metadata is a unique special meaning codeword (UUIC).

2. The data compression device (1600; 1800; 2300) of claim 1, wherein the unique special meaning codeword (UUIC) is a codeword generated at code generation together with the variable length codeword (1625; 1825; 2325) of the compressor (1620; 1820; 2320).

3. The data compression device (1600; 1800; 2300) of claim 2, wherein the unique special meaning codeword (UUIC) is generated by: the frequency of occurrence of all data values not present in the value-frequency table is calculated or estimated at the time of code generation, and their frequency of occurrence compared to the total number of data values present influences the width of the unique special meaning code word (UUIC).

4. The data compression device (1600; 1800; 2300) of any one of claims 1-3, wherein the compressed data block generator (1640-1650; 1840-1870; 2340-2370) is configured to:

when generating the compressed data block (1690; 1890; 2390), the unique special meaning codeword (UUIC) is attached to each uncompressed data value.

5. The data compression device (1600) of claim 4, wherein the compressed data chunk generator (1640-1650) is configured to generate the compressed data chunk (1690) by: the way that a variable length encoded bit sequence is iteratively constructed by adding compressed data values in the form of variable length code words or uncompressed data values with a unique special meaning code word attached thereto in the order in which they appear in the uncompressed data block (1610), the addition of compressed data values in the form of variable length code words or uncompressed data values with a unique special meaning code word attached thereto depending on the detection result (1639) of the detector (1630).

6. The data compression device (1800) according to any one of claims 1-3, wherein the compressed data block generator (1840-1870) is configured to:

for all detected data values in the uncompressed data blocks (1810) that cannot be compressed by the compressor, including the unique special meaning codeword in the generated compressed data blocks (1890) in the order in which the data values occur in the uncompressed data blocks (1610) instead of the corresponding detected data values; and

appending the detected data values in the uncompressed data blocks (1810) that cannot be compressed by the compressor to the end of the generated compressed data blocks (1890) in an order that is opposite to the order in which the detected data values appear in the uncompressed data blocks (1810).

7. The data compression device (1800) according to claim 6, the compressed data block generator (1840-1870) comprising a storage unit (1860) and a linker (1870), wherein the compressed data block generator is configured to generate the compressed data block (1890) by:

iteratively constructing a variable length encoded bit sequence by adding each compressed data value from the compressor (1820) in the order in which it appears in the uncompressed data block (1810);

adding the unique special meaning codeword (UUIC) to the variable length encoded bit sequence and saving the detected data values in the storage unit (1860) whenever the detector (1830) detects data values in the uncompressed data block (1810) that cannot be compressed by the compressor (1820), wherein the detected data values are saved in the storage unit (1860) in a reverse order compared to an order in which the data values appear in the uncompressed data block (1810); and

generating the compressed data block (1890) by concatenating the variable length encoded bit sequence and the data values stored in the storage unit (1860) by the concatenator (1870) when all data values of the uncompressed data block (1810) have been processed.

8. The data compression device (2300) of any of claims 1-3, further comprising a common value detection unit (2360) configured to detect when all data values of the uncompressed data block (2310) have the same common block data value,

wherein the data compression device is configured to: when all data values of the uncompressed data block (2310) are detected to have the same common block data value, generating the compressed data block (2390) that includes only special meaning data values indicative of the detected common block data values and does not include combinations of compressed data values, uncompressed data values, and metadata as described above; and

wherein the data compression device is configured to: when all data values of the uncompressed data block (2310) are not detected to have the same common block data value, generating the compressed data block (2390) comprising a value different from the special meaning data value followed by a combination of the compressed data value, uncompressed data value, and metadata described above.

9. The data compression device (1600; 1800; 2300) of any one of claims 1-3, wherein the compressed data block generator (1640-1650; 1840-1870; 2340-2370) comprises at least one selector (1650; 1850-1870; 2350) that is responsive to a detection result (1639; 1839; 2339) of the detector (1630; 1830; 2330) and configured to generate a compressed data block (1690; 1890; 2390) by combining the variable length codeword, the detected data value and the metadata in one or more steps.

10. The data compression device (1600; 1800; 2300) of any one of claims 1-3, wherein the detector (1630; 1830; 2330) is configured to detect data values in the uncompressed data block that cannot be compressed by the compressor as one or more of:

data values not present in the code table (1622; 1822; 2322) of the compressor (1620; 1820; 2320),

data values of a codeword (1625; 1825; 2325) that are present in the code table but are absent in the code table of the compressor,

data values present in the code table having a codeword (1625; 1825; 2325) in the code table but indicated as invalid (1624; 1824; 2324) in the code table of the compressor.

11. A data decompression apparatus (1700; 1900; 2400) for decompressing a compressed data block (1710; 1910; 2410) into a decompressed data block (1790; 1990; 2490) comprising one or more data values (v1-vn), the data decompression apparatus comprising:

a decompressor (1720-1730; 1920-1930; 2420-2430) configured to decompress a variable length codeword (1625; 1825; 2325) of the compressed data block into a corresponding decompressed data value; and

a decompressed data block generator (1740-1780; 1940-1985; 2440-2487) configured to:

detecting metadata (UUIC) in the compressed data block (1710), the metadata being a unique special meaning codeword (UUIC) indicating uncompressed data values included in the compressed data block; and

generating the decompressed data block based on the detected metadata by: the manner in which decompressed data values from the decompressor and uncompressed data values from the compressed data blocks are combined is such that the order of the data values of the generated decompressed data blocks is the same as the order in which the data values occurred in uncompressed data blocks (1610; 1810; 2310) prior to the data compression that produced the compressed data blocks.

12. The data decompression apparatus (1700) of claim 11 wherein the decompressed data block generator (1740-1780) is configured to:

detecting a unique special meaning codeword (UUIC) included in the compressed data block (1710) and generating a resultant control signal (1762);

removing the detected unique special meaning codeword from the associated uncompressed data value to which it is attached; and

generating the decompressed data block (1790) by: the way is that for each data value in the compressed data block (1710), a selector (1780) is controlled based on the control signal (1762) to select a decompressed data value from the decompressor (1720, 1730) or an uncompressed data value for which the detected unique special meaning codeword has been removed.

13. The data decompression device (1900) of claim 11, wherein the decompressed data block generator (1940-1985) is configured to:

detecting the unique special meaning codeword (UUIC) included in the compressed data block (1910) and generating a resultant control signal (1962); and

generating the decompressed data block (1990) by: the manner is that for each data value in the compressed data block (1910) and based on the control signal (1962), a control selector (1985) selects decompressed data values from the decompressors (1920, 1930) in the order in which they appear in the compressed data block (1910) or selects uncompressed data values from the compressed data block (1910) in the reverse order compared to the order in which they appear in the compressed data block (1910).

14. The data decompression device (1900) of claim 13, wherein the decompressed data block generator (1940-1985) is configured to:

storing copies of at least a portion of the compressed data block (1910) in a reverse order in a storage unit (1964), the tail end of the compressed data block being stored at the beginning of the storage unit;

incrementing an uncompressed data value counter (1980) whenever a unique special meaning codeword (UUIC) is detected to be included in the compressed data block (1910); and

providing the uncompressed data value to the selector (1985) using the uncompressed data value counter (1980) as a pointer to a storage location in the storage unit (1964).

15. The data decompression device (2400) as defined in any one of claims 11-14, further comprising a special meaning data value detector (2450) configured to detect a special meaning data value indicative of a common block data value at a beginning of the compressed data block (2410),

wherein the data decompression device is configured to: generating a decompressed data block by populating the decompressed data block with the common block data values (2490) when the special meaning data values are detected; and

wherein the data decompression device is configured to: generating the decompressed data block according to any of claims 11-14 when the special meaning data value is not detected.

16. A data compression method for compressing uncompressed data blocks (1610; 1810; 2310) comprising one or more data values (v1-vn) into compressed data blocks (1690; 1890; 2390), the data compression method comprising:

compressing data values of the uncompressed data blocks into corresponding variable length codewords (1625; 1825; 2325);

detecting data values in the uncompressed data block that cannot be compressed; and

generating the compressed data block by combining:

17. A method of data compression as claimed in claim 16 in which the unique special meaning codeword (UUIC) is caused to be generated at code generation time together with the variable length codeword (1625; 1825; 2325).

18. A method of data compression as claimed in claim 17 in which the unique special meaning codeword (UUIC) is generated by: the frequency of occurrence of all data values not present in the value-frequency table is calculated or estimated at the time of code generation, and their frequency of occurrence compared to the total number of data values present influences the width of the unique special meaning code word (UUIC).

19. A method of data compression as claimed in any one of claims 16 to 18 further comprising:

20. The data compression method of claim 19, further comprising:

generating the compressed data block (1690) by: the manner in which the variable length encoded bit sequence is iteratively constructed by adding the compressed data values in the form of variable length codewords or uncompressed data values with unique special meaning codewords attached thereto in the order in which they appear in the uncompressed data block (1610), the addition of the compressed data values in the form of variable length codewords or uncompressed data values with unique special meaning codewords attached thereto depending on the detection result of the detecting step (1639).

21. A method of data compression as claimed in any one of claims 16 to 18 further comprising:

22. The data compression method as claimed in claim 21, further comprising generating the compressed data block (1890) by:

iteratively constructing a variable length encoded bit sequence by adding each compressed data value in the form of its variable length codeword in the order in which it appears in said uncompressed data block (1810);

adding the unique special meaning codeword (UUIC) to the variable length encoded bit sequence and saving detected data values whenever an incompressible data value is detected in the uncompressed data block (1810), wherein the detected data values are saved in a reverse order compared to an order in which the data values appear in the uncompressed data block (1810); and

generating the compressed data block (1890) by concatenating the variable length encoded bit sequence and the saved data values when all data values of the uncompressed data block (1810) have been processed.

23. The data compression method as claimed in any one of claims 16-18, further comprising detecting when all data values of the uncompressed data block (2310) have the same common block data value,

wherein, when all data values of the uncompressed data block (2310) are detected to have the same common block data value, generating the compressed data block (2390) comprising only special meaning data values indicative of the detected common block data values and not a combination of compressed data values, uncompressed data values and metadata as described above; and

wherein, when all data values of the uncompressed data block (2310) are not detected to have the same common block data value, a compressed data block (2390) is generated that includes a value different from the special meaning data value followed by a combination of the compressed data value, uncompressed data value, and metadata.

24. A method of data compression as claimed in any one of claims 16 to 18 in which data values in the uncompressed data block which cannot be compressed are detected as one or more of:

data values not present in the code table (1622; 1822; 2322),

a data value of a codeword (1625; 1825; 2325) present in the code table but absent from the code table,

data values present in the code table having a codeword (1625; 1825; 2325) in the code table but indicated as invalid (1424; 1624; 1824; 2324) in the code table.

25. A data decompression method for decompressing a compressed data block (1710; 1910; 2410) into a decompressed data block (1790; 1990; 2490) comprising one or more data values (v1-vn), the data decompression method comprising:

decompressing variable length codewords (1625; 1825; 2325) of the compressed data block into corresponding decompressed data values;

detecting metadata (UUIC) in the compressed data block, the metadata being a unique special meaning codeword (UUIC) indicating uncompressed data values included in the compressed data block; and

generating the decompressed data block based on the detected metadata by: the manner is to combine decompressed data values from a decompressor and uncompressed data values from the compressed data blocks such that the order of the data values of the generated decompressed data blocks is the same as the order in which the data values occurred in an uncompressed data block (1610; 1810; 2310) of uncompressed data blocks before the data compression of the compressed data blocks was generated.

26. A method of data decompression as claimed in claim 25, further comprising:

detecting the unique special meaning codeword (UUIC) included in the compressed data block (1710) and generating a resultant control signal (1762);

generating the decompressed data block (1790) by: the manner is to select, for each data value in the compressed data block (1710) and based on the control signal (1762), either a decompressed data value or an uncompressed data value for which the detected unique special meaning codeword has been removed.

27. A method of data decompression as claimed in claim 25, further comprising:

generating the decompressed data block (1990) by: the manner is that for each data value in the compressed data block (1910) and based on the control signal (1962), decompressed data values are selected in the order in which they appear in the compressed data block (1910) or uncompressed data values from the compressed data block (1910) are selected in the reverse order compared to the order in which they appear in the compressed data block (1910).

28. The data decompression method according to claim 27, further comprising:

incrementing an uncompressed data value counter (1980) each time the unique special meaning codeword (UUIC) is detected as being included in the compressed data block (1910); and

the uncompressed data value is provided to the selecting step using the uncompressed data value counter (1980) as a pointer to a storage location in the storage unit (1964).

29. A method of data decompression according to any one of claims 25-28, further comprising:

detecting a special meaning data value indicative of a common block data value at the beginning of the compressed data block (2410),

wherein, when the special meaning data value is detected, the decompressed data block is generated by populating the decompressed data block with the common block data values (2490); and

wherein, when the special meaning data value is not detected, the decompressed data block is generated according to any of claims 25-28.

30. A data compression and decompression system (3500), comprising: one or more memories (3510); the data compression device (3520; 1600; 1800; 2300) of any one of claims 1-10; and a data decompression device (3530; 1700; 1900; 2400) according to any one of claims 11-15.

31. The compression and decompression system (3500) according to claim 30, wherein said system is a computer system (100; 200; 300; 400; 500), and wherein said one or more memories (3510) are from the group consisting of:

a cache memory (L1-L3),

a random access memory (130; 230; 330; 430; 530), and

an auxiliary storage device.

32. The compression and decompression system (1900) according to claim 30, wherein the system is a data communication system (600; 700) and wherein the one or more memories (3510) are data buffers.

33. A computer-readable medium comprising code instructions that when loaded and executed by a processing device cause performance of the method according to any one of claims 16-24.

34. A computer-readable medium comprising code instructions that when loaded and executed by a processing device cause performance of the method according to any one of claims 25-29.