
CN112585589B - Device and method for compacting compressed data blocks and uncompressed data blocks - Google Patents


Info

Publication number
CN112585589B
CN112585589B (application CN201880096600.9A)
Authority
CN
China
Prior art keywords
compressed data
block
output buffer
data blocks
data block
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201880096600.9A
Other languages
Chinese (zh)
Other versions
CN112585589A (en)
Inventor
阿列克谢·瓦伦蒂诺维奇·罗曼诺夫斯基
埃莱雅·亚历山德罗维奇·帕皮耶夫
牛进保
薛强
全绍晖
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Huawei Technologies Co Ltd
Original Assignee
Huawei Technologies Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Huawei Technologies Co Ltd filed Critical Huawei Technologies Co Ltd
Publication of CN112585589A
Application granted
Publication of CN112585589B
Status: Active
Anticipated expiration


Classifications

    • HELECTRICITY
    • H03ELECTRONIC CIRCUITRY
    • H03MCODING; DECODING; CODE CONVERSION IN GENERAL
    • H03M7/00Conversion of a code where information is represented by a given sequence or number of digits to a code where the same, similar or subset of information is represented by a different sequence or number of digits
    • H03M7/30Compression; Expansion; Suppression of unnecessary data, e.g. redundancy reduction
    • H03M7/40Conversion to or from variable length codes, e.g. Shannon-Fano code, Huffman code, Morse code
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F12/00Accessing, addressing or allocating within memory systems or architectures
    • G06F12/02Addressing or allocation; Relocation
    • G06F12/0223User address space allocation, e.g. contiguous or non contiguous base addressing
    • G06F12/023Free address space management
    • G06F12/0238Memory management in non-volatile memory, e.g. resistive RAM or ferroelectric memory
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F12/00Accessing, addressing or allocating within memory systems or architectures
    • G06F12/02Addressing or allocation; Relocation
    • G06F12/0223User address space allocation, e.g. contiguous or non contiguous base addressing
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F12/00Accessing, addressing or allocating within memory systems or architectures
    • G06F12/02Addressing or allocation; Relocation
    • G06F12/04Addressing variable-length words or parts of words
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601Interfaces specially adapted for storage systems
    • G06F3/0602Interfaces specially adapted for storage systems specifically adapted to achieve a particular effect
    • G06F3/0608Saving storage space on storage systems
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601Interfaces specially adapted for storage systems
    • G06F3/0628Interfaces specially adapted for storage systems making use of a particular technique
    • G06F3/0638Organizing or formatting or addressing of data
    • G06F3/064Management of blocks
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601Interfaces specially adapted for storage systems
    • G06F3/0668Interfaces specially adapted for storage systems adopting a particular infrastructure
    • G06F3/0671In-line storage system
    • G06F3/0673Single storage device
    • G06F3/0679Non-volatile semiconductor memory device, e.g. flash memory, one time programmable memory [OTP]
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601Interfaces specially adapted for storage systems
    • G06F3/0668Interfaces specially adapted for storage systems adopting a particular infrastructure
    • G06F3/0671In-line storage system
    • G06F3/0683Plurality of storage devices
    • G06F3/0688Non-volatile semiconductor memory arrays
    • HELECTRICITY
    • H03ELECTRONIC CIRCUITRY
    • H03MCODING; DECODING; CODE CONVERSION IN GENERAL
    • H03M7/00Conversion of a code where information is represented by a given sequence or number of digits to a code where the same, similar or subset of information is represented by a different sequence or number of digits
    • H03M7/30Compression; Expansion; Suppression of unnecessary data, e.g. redundancy reduction
    • H03M7/46Conversion to or from run-length codes, i.e. by representing the number of consecutive digits, or groups of digits, of the same kind by a code word and a digit indicative of that kind
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F2212/00Indexing scheme relating to accessing, addressing or allocation within memory systems or architectures
    • G06F2212/10Providing a specific technical effect
    • G06F2212/1032Reliability improvement, data loss prevention, degraded operation etc
    • G06F2212/1036Life time enhancement
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F2212/00Indexing scheme relating to accessing, addressing or allocation within memory systems or architectures
    • G06F2212/10Providing a specific technical effect
    • G06F2212/1041Resource optimization
    • G06F2212/1044Space efficiency improvement
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F2212/00Indexing scheme relating to accessing, addressing or allocation within memory systems or architectures
    • G06F2212/40Specific encoding of data in memory or cache
    • G06F2212/401Compressed data
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F2212/00Indexing scheme relating to accessing, addressing or allocation within memory systems or architectures
    • G06F2212/72Details relating to flash memory management
    • G06F2212/7203Temporary buffering, e.g. using volatile buffer or dedicated buffer blocks
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F2212/00Indexing scheme relating to accessing, addressing or allocation within memory systems or architectures
    • G06F2212/72Details relating to flash memory management
    • G06F2212/7205Cleaning, compaction, garbage collection, erase control

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Human Computer Interaction (AREA)
  • Compression, Expansion, Code Conversion, And Decoders (AREA)

Abstract


The present invention relates to the field of data compression and data compaction. Specifically, the present invention provides a device for compacting compressed data blocks and uncompressed data blocks into an output buffer in an improved manner. Thus, wasted space is reduced or eliminated. The device is used to: obtain an input data block set, wherein the set includes at least one of a compressed data block and an uncompressed data block; compact the compressed data blocks into the output buffer starting from a first predefined area in the output buffer, so that the compressed data blocks are compacted in sequence; and compact the uncompressed data blocks into the output buffer starting from a second predefined area in the output buffer, so that the uncompressed data blocks are compacted in sequence.

Description

Apparatus and method for compacting compressed data blocks and uncompressed data blocks
Technical Field
The present invention relates generally to the field of data compression and data compaction. More particularly, the present invention relates to an apparatus and method for compacting compressed and uncompressed data blocks into an output buffer. The invention also relates to a device and a method capable of reducing wasted space of a storage medium in the case of storing compressed data blocks or the like.
Background
Lossless data compression is one of the algorithms commonly used in all-flash-array (AFA) storage devices. Data compression reduces the size of the stored data as it is written to the storage device. Conversely, when compressed data blocks are read from the storage device, they are decompressed back to their original content and size. In a primary storage device, data compression and decompression are performed inline ("on-the-fly"), transparently to the applications writing and reading the data.
Furthermore, lossless data compression can increase the effective storage capacity of the storage device, so that more data can be stored in it. In addition, because less data is written to storage devices such as solid-state drives (SSDs), data compression reduces SSD wear and improves SSD endurance.
Conventional devices (e.g., from storage vendors) provide a compression ratio (CR) of 4:1; in other words, the algorithm used compresses the data to 1/4 of its original size. In addition, some computer programs and/or applications write data (e.g., input data blocks) to a storage device in fixed-size blocks, and compressing these blocks may result in compressed data blocks of different sizes.
In addition, if an input data block is incompressible, compressing it may produce an output block larger than the input block, e.g., due to compression metadata overhead. In this case, the compressed data block may be discarded by the storage pipeline, and the original input data block may be written to the storage device instead.
Some computer programs and/or applications may read data blocks from a storage device in an arbitrary manner, e.g., non-sequentially. Because compressed data blocks may vary in size, additional data structures for address indirection (e.g., address translation) may be used to speed up reading compressed data blocks from arbitrary addresses on the storage device.
Conventionally, in order to perform inline data compression, compressed data blocks are written using granularity units of a fixed 1 kilobyte (KB) size. For example, if an 8KB input data block is compressed to a 3.5KB compressed data block, 4 granularity units of 1KB are required to write the compressed data block. This is the case when compressed data blocks are written to the storage medium in granularity units. In addition, the size and offset of a written compressed data block may be expressed in granularity units.
Then, the granularity units holding compressed data may be packed (hereinafter also referred to as "pack") into chunks (hereinafter also referred to as "output buffers") using some other algorithm, e.g., into chunks of a fixed 1 megabyte (MB) size.
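As a minimal sketch of the granularity-unit arithmetic described above (the function name is illustrative, not from the patent), the number of 1KB units needed for a compressed block is simply its size rounded up to whole units:

```python
# Illustrative sketch, not the patent's code: how many fixed 1KB granularity
# units are needed to write a compressed block in the conventional scheme.
GRANULARITY = 1024  # 1KB granularity unit, as described above

def units_needed(compressed_size_bytes: int) -> int:
    """Round the compressed size up to whole granularity units."""
    return -(-compressed_size_bytes // GRANULARITY)  # ceiling division

# An 8KB input compressed to 3.5KB (3584 bytes) needs 4 granularity units.
assert units_needed(3584) == 4
```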
Fig. 14 is a diagram of the conventional compression of an 8KB input data block and the further packing of the compressed data block into chunks.
First, an 8KB input data block is compressed, and then the compressed data block is packed into a chunk. Fig. 14 schematically shows two chunks of fixed 1MB size (i.e., chunk 1 and chunk 2). Both chunks include compressed data blocks and uncompressed data blocks, packed with 1KB granularity-unit alignment.
Further, the block address (granularity address) includes a chunk identifier (chunk ID), a data offset, and a data size (hereinafter also referred to as length). The block address may be used for address indirection (i.e., address translation) in random-read scenarios.
Fig. 15 is a schematic diagram of the prior-art manner in which data blocks packed with 1KB granularity alignment are unpacked and decompressed from chunks.
Fig. 15 schematically shows the two chunks of fig. 14, i.e., chunk 1 and chunk 2. These two chunks have a fixed size of 1MB and include compressed data blocks and uncompressed data blocks that were aligned to 1KB granularity units at packing time.
As described above, the block address includes a chunk ID, an offset, and a length. First, using the block address, the data (i.e., the packed compressed data) is unpacked. The compressed data block is then decompressed into an 8KB output data block.
As shown, the block address comprises the chunk ID, offset, and length used for reading, unpacking, and decompressing data from the chunk.
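The address indirection above can be pictured with a small sketch (the function and data layout are assumptions for illustration, not the patent's format): a block address (chunk ID, offset, length) selects the packed bytes inside a chunk before decompression.

```python
# Hypothetical sketch of block-address indirection: a (chunk ID, offset,
# length) triple locates packed compressed data inside a chunk.
def read_packed(chunks: dict, chunk_id: int, offset: int, length: int) -> bytes:
    """Return the packed bytes addressed by (chunk_id, offset, length)."""
    return chunks[chunk_id][offset:offset + length]

# A toy chunk with some packed data at byte offset 2048:
chunks = {1: b"\x00" * 2048 + b"COMPRESSED" + b"\x00" * 100}
assert read_packed(chunks, 1, 2048, 10) == b"COMPRESSED"
```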
In addition, the methods described in figs. 14 and 15 use 1 bit per granularity unit to flag the data as in use (e.g., active) or deleted. The garbage-collector algorithm that performs defragmentation relies on this per-granularity-unit bit.
However, a disadvantage of the above conventional approach is that the last granularity unit of a compressed data block typically holds less than 1KB of compressed data, so the unused space of that granularity unit is wasted.
For example, if an 8KB input data block is compressed to 2KB+1B, then 3 granularity units of 1KB are required for packing. Since only 1 byte of the third granularity unit is used, 1023 bytes of the third granularity unit are wasted.
Accordingly, the above conventional method has the disadvantage of generating wasted space, e.g., the wasted space formed when compressed data blocks are packed into chunks.
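The waste in the last granularity unit can be computed directly; this is a sketch of the arithmetic in the example above (function name is illustrative):

```python
# Illustrative sketch: bytes wasted in the last 1KB granularity unit of a
# compressed block under the conventional granularity-aligned scheme.
GRANULARITY = 1024

def wasted_bytes(compressed_size: int) -> int:
    """Unused bytes in the last granularity unit occupied by the block."""
    remainder = compressed_size % GRANULARITY
    return 0 if remainder == 0 else GRANULARITY - remainder

# 2KB + 1 byte occupies 3 units; only 1 byte of the third unit is used,
# so 1023 bytes are wasted.
assert wasted_bytes(2 * 1024 + 1) == 1023
```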
FIG. 16 is a diagram of the wasted space formed in granularity units when compressed data blocks are packed with 1KB granularity-unit alignment.
In fig. 16, 4 compressed data blocks are packed with 1KB granularity-unit alignment. The first compressed data block occupies 2 granularity units, each of a fixed 1KB size. Since the size of the first block is less than 2KB, part of the second granularity unit is wasted. Similarly, the second compressed data block occupies the 3rd, 4th, and 5th granularity units, wasting part of the fifth granularity unit. The third compressed data block occupies the 6th and 7th granularity units, wasting part of the 7th granularity unit, and so on.
FIG. 17 is a schematic diagram of the wasted space formed in granularity units when compressed and uncompressed data blocks are packed with 1KB granularity-unit alignment.
In fig. 17, several blocks, including compressed data blocks B0, B1, B3, B4, B5, and Bn and uncompressed data blocks B2 and B6, are packed into a chunk with 1KB granularity-unit alignment.
First, the two compressed data blocks B0 and B1 are packed into the chunk. As can be seen, two wasted spaces are formed: the first between B0 and B1, and the second in the last granularity unit occupied by B1 (the fifth granularity unit of the chunk). The uncompressed data block B2 is then packed into the chunk, followed by the compressed data blocks B3, B4, B5, and so on. In this way, several wasted spaces are formed between the compressed data blocks, namely between the packed data blocks B3 and B4, between B4 and B5, and so on.
The disadvantage of the above approach is that it creates wasted space when, as is typical, the last granularity unit of a compressed data block holds less than 1KB, leaving unused space in that granularity unit. For example, with an average compression ratio of 2:1, a 1MB chunk may include 256 compressed data blocks. If the average wasted space in the last 1KB granularity unit of a compressed data block is 512 bytes, a 1MB chunk accumulates an average of 128KB of wasted space (512 × 256 = 128KB), i.e., the chunk has 12.5% wasted space.
For example, when 8KB input data blocks are compressed to 3KB+1B on average, an even larger amount of wasted space accumulates: 256KB, or 25% of the chunk.
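The 12.5% figure above follows directly from the stated assumptions, as this small check shows:

```python
# Arithmetic from the text: a 2:1 average compression ratio gives a 1MB chunk
# holding 256 compressed blocks; at an average of 512 wasted bytes per block,
# the waste accumulates to 128KB, i.e. 12.5% of the chunk.
chunk_size = 1 * 1024 * 1024   # 1MB chunk
blocks_per_chunk = 256         # at 2:1 average compression of 8KB blocks
avg_waste_per_block = 512      # bytes wasted in the last granularity unit

total_waste = blocks_per_chunk * avg_waste_per_block
assert total_waste == 128 * 1024            # 128KB of waste per chunk
assert total_waste / chunk_size == 0.125    # 12.5% of the chunk
```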
In addition to the wasted space in the storage medium, there is also waste in the central processing unit, called CPU waste. For example, CPU waste occurs when a compressed data block is discarded because it does not fit into 6 granularity units or the like. The size of the compressed data block is limited in the above method by the low number of bits available in the block address. The above conventional method uses 3 bits for the compressed-data-block size, expressed in 1KB granularity units: the bit pattern "111" denotes an uncompressed data block, the bit pattern "000" is invalid, and the remaining 6 bit patterns encode compressed data blocks with sizes of 1 to 6 granularity units.
In such conventional devices and methods, any compressed data block that requires more than 6KB is discarded, wasting the corresponding CPU cycles. Therefore, the above method has the further disadvantage of causing CPU waste in addition to wasted space in the storage medium.
Fig. 18 illustrates a prior-art method of compacting compressed and uncompressed data blocks into solid-state drive (SSD) storage.
For example, the input data corresponding to the input files is first compressed. The compressed data blocks are 0.5KB, 1.0KB, 2.0KB, 1.3KB, 1.0KB, 0.5KB, 1.0KB, and 2.0KB in size, respectively, and are ready to be written to the SSD. Without data compaction, one 4KB SSD block is allocated to each file (i.e., each compressed data block), using 36KB of storage in total.
If the compressed data blocks are compacted, multiple files can be written into each 4KB block, using only 12KB of storage.
However, in the above method, the sum of the sizes of the blocks packed into one 4KB block may be smaller than 4KB (e.g., 3KB, 3.5KB, etc.), and thus wasted space is formed in each 4KB block.
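The 12KB figure can be reproduced with a simple next-fit sketch of the compaction implied above (an assumption about the packing order, not the patent's algorithm): each compressed file is appended to the current 4KB SSD block, and a new block is opened when the next file does not fit.

```python
# Illustrative next-fit packing sketch (assumed packing policy): compressed
# files are appended to the current 4KB SSD block until one does not fit.
SSD_BLOCK = 4.0  # SSD block size in KB

def blocks_used(sizes_kb):
    blocks, fill = 0, SSD_BLOCK  # start "full" to force the first allocation
    for s in sizes_kb:
        if fill + s > SSD_BLOCK:
            blocks += 1          # open a new 4KB block
            fill = 0.0
        fill += s
    return blocks

sizes = [0.5, 1.0, 2.0, 1.3, 1.0, 0.5, 1.0, 2.0]
# 3 blocks x 4KB = 12KB, versus one 4KB block per file without compaction.
assert blocks_used(sizes) == 3
```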
Thus, the above method has the disadvantage that, for example, compressed data blocks form wasted space when compacted into the SSD. Furthermore, another disadvantage of the above approach is that read amplification may occur when the "wasted bytes" are read.
Fig. 19 illustrates another method of compressing and compacting an input data file into a storage device, described in US 2012/0290798 A1.
The HOST column in fig. 19 shows a number of data regions grouped into logically addressable units (LAUs), each LAU having 4 data regions. There are 5 LAUs in total, labeled LAA0 through LAA4.
The data regions shown in the host column are uncompressed. After compression, a data region occupies less space. For example, the data region of LAA0 stored in DDR is compressed by 50%, so LAA0 occupies 50% of a memory block in DDR. LAUs written into DDR do not employ this compaction scheme, leaving part of the space in a DDR memory block empty. The data region in LAA1 is likewise compressed by 50%, and the data region in LAA2 is compressed by 25%. The data region in LAA3 cannot be compressed uniformly, but the overall compression ratio of LAA3 is 50%. The compression ratio of LAA4 is 100%.
After the LAUs are compacted, they are grouped together into hardware addressable units (HAUs). When the LAUs stored in double data rate (DDR) memory have been packed into a single HAU, the firmware causes these LAUs to be written to the storage device (SSD).
However, the above method also creates wasted space: for example, when the compressed data regions are rounded up, wasted space may be formed between the compressed data blocks. In addition, wasted space is also created during compaction into the SSD.
While prior-art techniques exist for compressing and compacting input data files into storage devices, such as compacting each input data file into a single block of 4KB size, it remains generally desirable to improve the apparatuses and methods for compacting compressed and uncompressed data blocks into output buffers and/or storage media.
Disclosure of Invention
In view of the above problems and disadvantages, the present invention aims to improve conventional apparatuses and methods for data compression and compaction. It is therefore an object of the present invention to provide an apparatus and method for compacting compressed and uncompressed data blocks into an output buffer. In particular, there is a need to reduce the wasted space when data is compressed and packed into a buffer.
The object of the invention is achieved by the solution provided in the attached independent claims. Advantageous implementations of the invention are further defined in the dependent claims.
A first aspect of the invention provides an apparatus for compacting compressed and uncompressed data blocks into an output buffer. The apparatus is configured to: obtain a set of input data blocks, wherein the set includes at least one of a compressed data block and an uncompressed data block; compact the compressed data blocks into the output buffer starting from a first predefined area in the output buffer, so that the compressed data blocks are compacted in sequence; and compact the uncompressed data blocks into the output buffer starting from a second predefined area in the output buffer, so that the uncompressed data blocks are compacted in sequence.
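The first-aspect scheme can be sketched as a buffer filled from both ends; the class and method names below are assumptions for illustration, not the patent's implementation, and the choice of the buffer's start and end as the two predefined areas follows the implementation forms described later:

```python
# Minimal sketch (assumed names and layout): compressed blocks are compacted
# byte-contiguously from the start of the output buffer, while uncompressed
# blocks are compacted granularity-aligned from the end, growing toward each
# other.
GRANULARITY = 1024  # 1KB granularity unit

class OutputBuffer:
    def __init__(self, size: int):
        self.buf = bytearray(size)
        self.front = 0      # next free byte for compressed data
        self.back = size    # start of the uncompressed region

    def compact_compressed(self, data: bytes) -> int:
        """Append a compressed block gaplessly; return its byte offset."""
        if self.front + len(data) > self.back:
            raise ValueError("output buffer full")
        offset = self.front
        self.buf[offset:offset + len(data)] = data
        self.front += len(data)
        return offset

    def compact_uncompressed(self, data: bytes) -> int:
        """Prepend an uncompressed block, aligned to granularity units."""
        units = -(-len(data) // GRANULARITY)  # ceiling division
        start = self.back - units * GRANULARITY
        if start < self.front:
            raise ValueError("output buffer full")
        self.buf[start:start + len(data)] = data
        self.back = start
        return start

buf = OutputBuffer(8 * GRANULARITY)
assert buf.compact_compressed(b"a" * 1500) == 0
assert buf.compact_compressed(b"b" * 700) == 1500      # no gap between blocks
assert buf.compact_uncompressed(b"c" * 4096) == 4 * GRANULARITY
```

Note how the two compressed blocks land back to back regardless of granularity units, while the uncompressed block is placed on a unit boundary at the buffer's end.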
According to the first aspect, the formation of wasted space between compressed data blocks is substantially reduced or even eliminated. For example, using lossless data compression with the LZ4 algorithm, the elimination of waste achieved by the apparatus of the first aspect may be close to ideal. In other words, the compressed data blocks may be packed (e.g., sequentially) such that no wasted space is formed between two adjacent compressed data blocks. Furthermore, a better data-reduction ratio may be obtained; for example, the apparatus of the first aspect may achieve a data-reduction gain of 13% to 27%. Accordingly, write and read speeds can also be increased by 6% to 42% compared with the typical inline compression-packing schemes of conventional devices.
Another advantage of the apparatus of the first aspect is that, compared with typical inline compression-packing schemes, its packing scheme takes the actual compressed data size into account and minimizes the waste of SSD storage. For example, wasted space may be reduced to 0.1% of the storage capacity.
In an implementation form of the first aspect, the apparatus is further configured to determine an upper byte limit of the compressed data block according to the size of the output buffer, the size of the compressed data blocks to be compacted into the output buffer, and the header size of the compressed data block.
By determining an upper byte limit for the compressed data blocks, the device is able to allocate (e.g., based on the determined upper limit) a predetermined space in the output buffer for compressed data blocks to be packed into the output buffer.
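One plausible reading of this computation is that the limit is whatever remains of the output buffer after the already-packed compressed bytes and the block's header are accounted for; the exact formula below is an assumption, not taken from the patent:

```python
# Hedged sketch (assumed formula): the upper byte limit for the next
# compressed block is the output buffer size minus the bytes already packed
# and the block's header size.
def upper_byte_limit(buffer_size: int, bytes_already_packed: int,
                     header_size: int) -> int:
    """Maximum bytes the next compressed block may occupy."""
    return buffer_size - bytes_already_packed - header_size

# With a 1MB buffer, 300KB already packed, and a 2-byte header:
assert upper_byte_limit(1024 * 1024, 300 * 1024, 2) == 741374
```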
In an implementation form of the first aspect, the apparatus is further configured to obtain an input data block, compress the obtained input data block according to a predefined compression ratio, and determine a size of the compressed data block.
The advantage of this implementation is that by compressing the acquired input data blocks, the device reduces the data size when the data is written to the storage device.
In one implementation of the first aspect, upon determining that the size of the compressed data block is smaller than a granularity unit of the output buffer, the apparatus copies and appends the compressed data to a separate output buffer, wherein the separate output buffer is associated with compressed data blocks smaller than the granularity unit, the granularity unit representing a granularity of the memory.
By allocating a separate output buffer for compressed data blocks smaller than the granularity unit, the formation of wasted space in granularity units containing small data blocks can be avoided. Thus, not only is wasted space in granularity units avoided, but several small data blocks are also copied together into the separate output buffer, making efficient use of the memory.
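The routing decision can be sketched as follows (the function and buffer representation are illustrative assumptions; the patent does not specify this structure):

```python
# Illustrative sketch: compressed blocks smaller than one granularity unit
# are routed to a separate output buffer; larger blocks go to the main one.
GRANULARITY = 1024

def route_block(compressed: bytes, main_buffer: list, small_buffer: list) -> str:
    """Append the block to the appropriate buffer and report the choice."""
    if len(compressed) < GRANULARITY:
        small_buffer.append(compressed)  # collect sub-granularity blocks together
        return "small"
    main_buffer.append(compressed)
    return "main"

main, small = [], []
assert route_block(b"x" * 100, main, small) == "small"
assert route_block(b"y" * 2048, main, small) == "main"
```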
In an implementation form of the first aspect, the first predefined area is the start of the output buffer, and the compressed data blocks are packed sequentially from the start of the output buffer towards the end of the output buffer.
An advantage of this implementation is that the compressed data blocks are packed next to each other, and the formation of wasted space between the compressed data blocks may be reduced and/or eliminated. In addition, since each compressed data block is compacted according to its actual size, waste in the storage medium can be minimized.
In one implementation of the first aspect, the compressed data blocks are packed into the output buffer without gaps and/or independently of the granularity units of the output buffer.
By compacting the compressed data blocks gaplessly and/or independently of the granularity unit, the apparatus can reduce and/or eliminate the wasted space that would otherwise arise when compressed data blocks are stored in granularity units. For example, the apparatus can use the actual, variable sizes of the compressed data blocks. The apparatus thus solves the problem of conventional devices in which wasted space remains in each compressed data block, and correspondingly in the storage medium, because the rounded size of the compressed data block is used instead of its actual size.
In an implementation manner of the first aspect, the second predefined area is an end of the output buffer, and the uncompressed data blocks are sequentially packed from the end of the output buffer towards a start of the output buffer.
An advantage of this implementation is that the uncompressed data blocks are packed next to each other and that wasted space formed between the uncompressed data blocks can be reduced and/or eliminated.
In one implementation of the first aspect, the uncompressed data blocks are compacted into the output buffer according to the granularity units of the output buffer.
An advantage of this implementation is that the size of an uncompressed data block matches the granularity unit of the output buffer. Thus, no wasted space is formed between two or more uncompressed data blocks, since these uncompressed data blocks are arranged according to the granularity of the output buffer.
In an implementation form of the first aspect, the apparatus is further configured to calculate a block address of the compressed data block by determining an offset of the compressed data block in the output buffer and estimating a length of the compressed data block in the output buffer.
The advantage of this implementation is that the block address, offset and length are calculated, so the device can allocate temporary buffers, etc. based on the calculated block address, determined offset and estimated length. Therefore, the compressed data block can be more efficiently read, written, stored, and the like.
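Since compressed blocks are packed at arbitrary byte positions while block addresses are expressed in granularity units, one way to derive the address fields is the following sketch (field names and encoding are assumptions, not the patent's format):

```python
# Illustrative sketch: derive a block address (chunk ID, offset, length)
# in granularity units for a compressed block packed at a byte position.
GRANULARITY = 1024

def block_address(chunk_id: int, byte_offset: int, size: int) -> dict:
    """Granularity-unit address of a block packed at byte_offset with size bytes."""
    return {
        "chunk_id": chunk_id,
        "offset": byte_offset // GRANULARITY,  # first unit the block touches
        # number of units the block spans, accounting for its in-unit start:
        "length": -(-(byte_offset % GRANULARITY + size) // GRANULARITY),
    }

# A 2KB+1B block starting at byte 1500 touches granularity units 1, 2, and 3:
addr = block_address(7, 1500, 2049)
assert addr == {"chunk_id": 7, "offset": 1, "length": 3}
```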
In one implementation of the first aspect, a block address of the compressed data block smaller than the granularity unit corresponds to an index of the granularity unit.
The block address of a compressed data block smaller than the granularity unit can be calculated using the index of the granularity unit. Thus, the block address of a small compressed data block can be calculated.
In an implementation form of the first aspect, the apparatus is further configured to calculate a block address of the uncompressed data block by determining an offset of the uncompressed data block in the output buffer and estimating a length of the uncompressed data block in the output buffer.
An advantage of this implementation is that calculating the block address of the uncompressed data block enables allocating space for the uncompressed data block, e.g., at the end of a block (chunk). Therefore, the uncompressed data blocks can be read, written, stored, and the like more efficiently.
In an implementation form of the first aspect, the apparatus is further configured to generate a block preamble for each compressed data block, wherein the block preamble represents an offset of the compressed data block relative to a start of the granularity unit.
Generating the block preamble may determine a position of the compressed data block relative to a start of the granularity unit. Further, a byte offset of the compressed data block may be determined, which may be stored at the beginning of a first granularity unit of the compressed data block in the block (chunk).
In an implementation form of the first aspect, the apparatus is further configured to generate a block tail header for each compressed data block, wherein the block tail header represents an offset of a last byte of the compressed data block relative to a start of a last granularity unit of the compressed data block.
Generating the block tail header can determine an offset of a last byte of the compressed data block relative to a start of a last granularity unit of the compressed data block. Furthermore, the tail header may be placed at the beginning of the last granularity unit of the compressed data block.
For example, in some embodiments, a compressed data block may occupy 3KB+1 bytes. Assuming that the first 3KB (which includes the preamble) fills three granularity units, 1 byte spills into the last granularity unit of the compressed data block. Furthermore, the device may insert the tail header at the beginning of that last granularity unit (e.g., assuming the tail header occupies 2 bytes), so the last granularity unit of the compressed data block holds 3 bytes: 2 bytes for the tail header and 1 byte for the compressed data. In this case, the tail header may record an offset of 3 bytes.
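The arithmetic of this example can be sketched as follows; the 1KB granularity unit and 2-byte tail header are the assumptions stated above, and the function name is illustrative, not part of the device:

```python
# Sketch of the tail-header example above (illustrative, not the actual
# device logic): 1KB granularity units and a 2-byte tail header assumed.
GRANULARITY = 1024  # bytes per granularity unit

def last_unit_usage(block_bytes: int, tail_header_bytes: int) -> int:
    """Bytes occupied in the last granularity unit of a compressed block:
    the tail header plus the data bytes that spill into that unit. This
    value is also the offset the tail header records."""
    spill = block_bytes % GRANULARITY  # data bytes in the last unit
    return tail_header_bytes + spill

# A 3KB+1 byte block with a 2-byte tail header uses 3 bytes of its last unit:
offset = last_unit_usage(3 * GRANULARITY + 1, 2)
```

Here the tail header would record an offset of 3 bytes, matching the example.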
In an implementation manner of the first aspect, the apparatus further includes a writing module configured to write the compressed data block and the uncompressed data block to the memory.
An advantage of this implementation is that the device comprises a writing module and can write data, i.e. the compressed data blocks and the uncompressed data blocks, to the memory.
In one implementation of the first aspect, the compressed data blocks and the uncompressed data blocks packed into the output buffer (104) are written to the memory.
An advantage of this implementation is that all advantages provided to the output buffer for the compressed data blocks and the uncompressed data blocks can be transferred to the memory accordingly. In other words, the device is able to reduce the size of the stored data when the data is written to the storage device. Furthermore, when a compressed data block is read from the storage device, the compressed data block is decompressed again to the original content and size. In addition, the device increases the effective storage capacity of the memory. The wear of the memory can be reduced, and also the durability can be improved, etc.
In an implementation manner of the first aspect, the apparatus further includes a reading module configured to read the compressed data block and the uncompressed data block from the memory.
An advantage of this implementation is that the compressed data blocks and the uncompressed data blocks can be read from the memory.
In an implementation manner of the first aspect, the reading module is further configured to read the compressed data block and the uncompressed data block according to an identification number of the output buffer, a size of a corresponding block, and an offset of the corresponding block relative to a start of the output buffer.
Based on the block (chunk) ID (hereinafter also referred to as the chunk identifier, i.e., the identification number of the output buffer), the size of the compressed data block and the offset of the compressed data block relative to the start of the block (chunk), data can be read faster and more efficiently.
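A sketch of how such a read address could be resolved follows; the 1MB chunk size and 1KB granularity unit are assumptions carried over from the embodiments, and the function is illustrative, not the device's actual read path:

```python
CHUNK_SIZE = 1024 * 1024  # assumed 1MB block (chunk)
GRANULARITY = 1024        # assumed 1KB granularity unit

def block_byte_range(chunk_id: int, offset_units: int, length_units: int) -> tuple[int, int]:
    """Map a block address (chunk ID, offset and length in granularity
    units) to the absolute byte range to read from the medium."""
    start = chunk_id * CHUNK_SIZE + offset_units * GRANULARITY
    return start, start + length_units * GRANULARITY

# Block in chunk 2, starting at granularity unit 5 and spanning 3 units:
lo, hi = block_byte_range(2, 5, 3)
```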
In one implementation of the first aspect, the memory includes volatile memory or nonvolatile memory.
The invention may be applied to memories based on volatile and/or non-volatile systems, the invention not being limited to a particular type of memory.
A second aspect of the invention provides a method of compacting compressed and uncompressed data blocks into an output buffer. The method comprises the steps of obtaining an input data block set, wherein the set comprises at least one of compressed data blocks and uncompressed data blocks, compacting the compressed data blocks into the output buffer starting from a first predefined area in the output buffer so that the compressed data blocks are packed in sequence, and compacting the uncompressed data blocks into the output buffer starting from a second predefined area in the output buffer so that the uncompressed data blocks are packed in sequence.
In one implementation manner of the second aspect, the method further includes determining an upper byte limit of the compressed data block according to a size of the output buffer, a size of the compressed data block to be packed into the output buffer, and a header size of the compressed data block.
In one implementation of the second aspect, the method further includes obtaining an input data block, compressing the obtained input data block according to a predefined compression ratio, and determining a size of the compressed data block.
In one implementation manner of the second aspect, when the size of the compressed data block is determined to be smaller than a granularity unit of the output buffer, the method further includes copying and adding the compressed data to a separate output buffer, wherein the separate output buffer is associated with the compressed data block smaller than the granularity unit, and the granularity unit represents granularity of a memory.
In one implementation of the second aspect, the first predefined area is a start of the output buffer, and the compressed data blocks are sequentially packed from the start of the output buffer towards an end of the output buffer.
In one implementation of the second aspect, the compressed data blocks are packed into the output buffer without gaps and/or independently of granularity units of the output buffer.
In an implementation manner of the second aspect, the second predefined area is an end of the output buffer, and the uncompressed data blocks are sequentially packed from the end of the output buffer towards a start of the output buffer.
In one implementation manner of the second aspect, the uncompressed data block is compacted into the output buffer according to granularity units of the output buffer.
In one implementation form of the second aspect, the method further comprises calculating a block address of the compressed data block by determining an offset of the compressed data block in the output buffer and estimating a length of the compressed data block in the output buffer.
In one implementation of the second aspect, the block address of the compressed data block smaller than the granularity unit corresponds to an index of the granularity unit.
In one implementation of the second aspect, the method further comprises calculating a block address of the uncompressed data block by determining an offset of the uncompressed data block in the output buffer and estimating a length of the uncompressed data block in the output buffer.
In one implementation of the second aspect, the method further comprises generating a block preamble for each compressed data block, wherein the block preamble represents an offset of the compressed data block relative to a start of the granularity unit.
In one implementation of the second aspect, the method further comprises generating a block tail header for each compressed data block, wherein the block tail header represents an offset of a last byte of the compressed data block relative to a start of a last granularity unit of the compressed data block.
In one implementation of the second aspect, the method further includes writing the compressed data block and the uncompressed data block to the memory.
In one implementation of the second aspect, the compressed data blocks and the uncompressed data blocks packed into the output buffer are written to the memory.
In one implementation of the second aspect, the method further includes reading the compressed data block and the uncompressed data block from the memory.
In one implementation of the second aspect, the method further comprises reading the compressed data block and the uncompressed data block according to an identification number of the output buffer, a size of a corresponding block, and an offset of the corresponding block relative to a start of the output buffer.
In one implementation of the second aspect, the memory includes volatile memory or nonvolatile memory.
It should be noted that all devices, elements, units and components described in the present application may be implemented in software or hardware elements or any type of combination thereof. All steps performed by the various entities described in this application and the functions described to be performed by the various entities are intended to indicate that the various entities are adapted to or for performing the respective steps and functions. Although in the following description of specific embodiments, a particular function or step performed by an external entity is not reflected in the description of a specific detailed element of the entity performing the particular step or function, it should be clear to a skilled person that the methods and functions may be implemented in corresponding hardware or software elements or any combination thereof.
Drawings
The following description of specific embodiments will set forth aspects of the invention and its implementation in conjunction with the accompanying drawings.
Fig. 1 is a schematic diagram of an apparatus for compacting compressed data blocks and uncompressed data blocks according to one embodiment of the present invention.
Fig. 2 is a detailed schematic diagram of an apparatus for compacting compressed data blocks and uncompressed data blocks according to one embodiment of the present invention.
Fig. 3 is a schematic diagram of compaction of compressed and uncompressed data blocks into 1MB blocks (chunk) provided by various embodiments.
FIG. 4 is a schematic diagram of a calculated block address and block header provided by various embodiments.
Fig. 5 is a schematic diagram of one example of a block header provided by various embodiments.
FIG. 6 is a schematic diagram of a calculated block length provided by various embodiments.
FIG. 7 is a schematic diagram of wasted space formed between 2 compressed data blocks provided by various embodiments.
FIG. 8 is a schematic diagram of two wasted spaces formed between 3 compressed data blocks provided by various embodiments.
FIG. 9 is a schematic diagram of wasted space formed between 2 compressed data blocks provided by various embodiments.
Fig. 10 is a schematic diagram of a header for placing a number of small compressed data blocks provided by various embodiments.
FIG. 11 is a schematic diagram of a method for compacting compressed data blocks and uncompressed data blocks into an output buffer according to one embodiment of the present invention.
FIG. 12 is a schematic diagram of a method of writing an input data block into memory provided by various embodiments.
FIG. 13 is a schematic diagram of a method of reading compressed data blocks and uncompressed data blocks from memory provided by various embodiments.
FIG. 14 is a diagram of a conventional compression and packing of data blocks into blocks (chunk) by 1KB granularity unit alignment.
FIG. 15 is a schematic diagram of a conventional manner of unpacking and decompressing data blocks packed by 1KB granularity unit alignment from a block (chunk).
FIG. 16 is a diagram of wasted space in granularity units when compressed data blocks are packed with 1KB granularity unit alignment.
FIG. 17 is a diagram of wasted space in granularity units when compressed and uncompressed data blocks are packed with 1KB granularity unit alignment.
Fig. 18 is a schematic diagram of a conventional method of compacting compressed and uncompressed data blocks into a solid state disk memory.
FIG. 19 is a schematic diagram of a prior art method of compressing and compacting an input data file into a memory device.
Detailed Description
Fig. 1 is a schematic diagram of an apparatus 100 for compacting compressed data blocks and uncompressed data blocks according to one embodiment of the present invention. Device 100 is specifically configured to obtain a set of input data blocks 101. The set of input data blocks 101 may include at least one of compressed data blocks 102 and uncompressed data blocks 103. In the embodiment of fig. 1, device 100 obtains a set of input data blocks 101, which illustratively includes 4 compressed data blocks 102 and 4 uncompressed data blocks 103. In some embodiments, the apparatus 100 may also be used to compress uncompressed data blocks, the acquired set of input data blocks, and so on.
The apparatus 100 is configured to compact the compressed data blocks 102 into the output buffer 104 starting from a first predefined area in the output buffer 104 such that the compressed data blocks 102 are packed in sequence.
The device 100 is further arranged for compacting the uncompressed data blocks 103 into the output buffer 104 starting from a second predefined area in the output buffer 104 such that the uncompressed data blocks 103 are compacted in sequence.
For example, device 100 may obtain a set of input data blocks 101. Furthermore, the packing of compressed data blocks 102 and uncompressed data blocks 103 may be different. The starting region for compacting compressed data blocks 102 into output buffer 104 (also referred to as a block (chunk)) may be different from the starting region for compacting uncompressed data blocks 103 into output buffer 104.
Compressed data blocks 102 may be packed using granularity units, for example, fixed-size granularity units of 1KB and fixed-size blocks (chunk) of 1MB. The invention is not limited to a particular size of granularity units and/or blocks (chunk). In addition, a block address including a block (chunk) ID, a granularity offset, and the length of the compressed data block may be used.
For example, the compressed data block 102 may be packed starting from the first predefined area (which may be the beginning of the output buffer 104), may continue toward the end of the output buffer 104, and may be packed sequentially. Furthermore, the compressed data blocks 102 may be packed without granularity unit alignment of the output buffer. Packing the compressed data blocks 102 in this manner may eliminate wasted space. However, in some embodiments, for example, when 8KB is compressed to less than 1KB, some wasted space may still be created.
Furthermore, the uncompressed data block 103 may be packed into the output buffer 104 starting from a second predefined area in the output buffer 104. The second predefined area may be the end of the output buffer 104. Uncompressed data block 103 may also be packed from the end of the output buffer towards the beginning of output buffer 104. The invention is not limited to a particular region and/or orientation. Moreover, uncompressed data blocks 103 may be packed sequentially.
In addition, compacting the compressed data blocks 102 into the output buffer 104 may effectively eliminate wasted space between adjacent compressed data blocks 102. Similarly, compacting the uncompressed data blocks 103 may also eliminate wasted space between adjacent uncompressed data blocks 103. However, for example, when compressed data block 102 and uncompressed data block 103 are compressed next to each other (e.g., as adjacent data blocks), some wasted space may be created.
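The two packing directions described above can be sketched as follows. This is a minimal illustration assuming a 1MB chunk, byte strings for blocks, and uncompressed blocks that are whole multiples of the granularity unit; it is not the device's actual implementation:

```python
CHUNK_SIZE = 1024 * 1024  # assumed 1MB output buffer (chunk)

def pack_chunk(compressed: list[bytes], uncompressed: list[bytes]):
    """Pack compressed blocks forward from the start of the chunk without
    alignment, and uncompressed blocks backward from the end."""
    chunk = bytearray(CHUNK_SIZE)
    head = 0           # next free byte for compressed data
    tail = CHUNK_SIZE  # start of the region already used by uncompressed data
    for blk in compressed:
        if head + len(blk) > tail:
            raise ValueError("chunk full")
        chunk[head:head + len(blk)] = blk  # no gap between compressed blocks
        head += len(blk)
    for blk in uncompressed:  # assumed whole multiples of the granularity unit
        if tail - len(blk) < head:
            raise ValueError("chunk full")
        tail -= len(blk)
        chunk[tail:tail + len(blk)] = blk
    return chunk, head, tail  # bytes in [head, tail) are unused

chunk, head, tail = pack_chunk([b"x" * 1500, b"y" * 700], [b"z" * 8192])
```

Because the compressed blocks ignore granularity alignment, any unused space is confined to the single region between `head` and `tail`.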
Fig. 2 is a detailed schematic diagram of an apparatus 100 for compacting compressed data blocks and uncompressed data blocks according to one embodiment of the present invention. The device 100 in fig. 2 is based on the device 100 in fig. 1 and thus includes all the functions and features of the device 100. For this reason, like features are labeled with like reference numerals. Other features described with reference to fig. 2 are optional features of the device 100.
Device 100 also includes an interface 201 in the form of an adapter for communicatively coupling with other devices, such as a host computer, for obtaining a set of input data blocks 101, wherein the set includes at least one of compressed data blocks 102 and uncompressed data blocks 103. The interface 201 may include a computer bus interface, such as a serial AT attachment (SATA) interface, a parallel advanced technology attachment (PATA) interface, or the like.
The device 100 further comprises a data compressor 202 for compressing the input data block 101 acquired by the interface 201 according to a predefined compression ratio and for determining the size of the compressed data block 102. For example, the data compressor 202 may employ various compression schemes to remove redundant data, metadata, etc. and/or reduce the size of the acquired input data block 101.
Furthermore, the apparatus 100, such as the data compressor 202 therein, may also be configured to determine an upper byte bound for the compressed data block 102 based on the size of the output buffer, the size of the compressed data block 102 to be packed into the output buffer 104, and the header size of the compressed data block 102.
In addition, the apparatus 100 may optionally further comprise a data compactor 203. The data compactor 203 may acquire information about the size of the compressed data blocks, the header size, the output buffer size, etc., and may compact the compressed data blocks 102 and the uncompressed data blocks 103 into the output buffer 104.
The device 100 may also optionally include a block address calculator 204. The block address calculator 204 may calculate the block address of the compressed data block 102 by determining an offset of the compressed data block 102 in the output buffer 104 and estimating a length of the compressed data block 102 in the output buffer 104.
As described above, the data compressor 202 may determine the size of the compressed data block 102. Further, when determining that the size of the compressed data block 102 is smaller than the granularity unit of the output buffer 104, the block address calculator 204 may allocate an index of the granularity unit as a block address of the compressed data block smaller than the granularity unit.
The block address calculator 204 may also calculate the block address of the uncompressed data block 103 by determining an offset of the uncompressed data block 103 in the output buffer 104 and estimating a length of the uncompressed data block 103 in the output buffer 104.
As described above, the length bits in the block address can be estimated. For example, device 100 (e.g., block address calculator 204 therein) may assign bit pattern "111" to uncompressed data occupying 8 granularity units. In addition, the length of the compressed data block may use other bit patterns "000" to "110" (coding numbers of 0 to 6). The length of the compressed data block may be calculated by adding 2 to the value represented by the bit pattern. For example, bit pattern "000" indicates a value of 0. Then, by adding a value of 0 to 2 (i.e., 0+2=2), the present invention encodes the length of 2 granularity units for a compressed data block. Thus, the length of the compressed data block may be represented using granularity units, ranging from 2 to 8 granularity units.
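A sketch of this 3-bit length encoding follows; the helper names are illustrative, but the bit patterns are those described above:

```python
def encode_length(units: int, is_compressed: bool) -> int:
    """Encode a block length, in granularity units, into the 3-bit field:
    "111" marks uncompressed data (8 units); "000".."110" encode
    compressed lengths of 2..8 units as (units - 2)."""
    if not is_compressed:
        return 0b111
    if not 2 <= units <= 8:
        raise ValueError("compressed length must be 2..8 granularity units")
    return units - 2

def decode_length(bits: int) -> tuple[int, bool]:
    """Return (length in granularity units, is_compressed)."""
    if bits == 0b111:
        return 8, False
    return bits + 2, True
```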
In addition, the compressed data length of 8 granularity units encoded by the bit pattern "110" represents the actual length of a compressed data block of less than 8KB (i.e., <8KB), rounded up to 8 granularity units. Thus, a compressed data block may be up to 8KB-256 bytes. This value is taken as an upper limit for the compressed data size because, for example, if all blocks are compressed to 8KB-256 bytes and are packed into blocks (chunk) without alignment, 129 compressed data blocks may be included in one block (chunk), which is 1 more block than the 128 uncompressed data blocks. In some embodiments, the present invention may require 1 bit per block for garbage collection, rather than 1 bit per granularity unit as in conventional devices.
The device 100 may also optionally include a block header generator 205. The block header generator 205 may also be configured to generate a block header for each compressed data block 102, wherein the block header represents an offset of the compressed data block relative to a start of a granularity unit. The block header generator 205 may also generate a block tail header for each compressed data block 102, where the tail header represents an offset of a last byte of a compressed data block relative to a start of a last granularity unit of the compressed data block.
As described above, the data compactor 203 may compact the compressed data blocks 102 and the uncompressed data blocks 103. The data compactor may pack the compressed data blocks 102 into the output buffer 104 starting from a first predefined area in the output buffer 104. The first predefined area may be the beginning of the output buffer 104, and the compressed data blocks 102 may be packed sequentially from the beginning of the output buffer 104 towards the end of the output buffer 104. Furthermore, compressed data blocks 102 may be packed into the output buffer 104 without gaps and/or independently of the granularity units of the output buffer 104. In other words, packing may be performed without granularity unit alignment of the output buffer 104. Such packing may eliminate wasted space, etc.
The data compactor may also pack the uncompressed data blocks 103 into the output buffer 104 starting from the second predefined area. The second predefined area may be the end of the output buffer 104, and the uncompressed data blocks 103 may be packed sequentially from the end of the output buffer 104 towards the beginning of the output buffer 104. Furthermore, uncompressed data blocks 103 may be packed into output buffer 104 according to granularity units of output buffer 104. For example, uncompressed data blocks 103 may be packed into the output buffer such that these uncompressed data blocks 103 are naturally aligned to 1KB (i.e., a total size of 8KB).
The device 100 may also optionally include a separate output buffer 206. The separate output buffer 206 may be used for compaction of small compressed data blocks, e.g., compressed data blocks smaller than the granularity unit of the output buffer 104. As described above, the data compressor 202 may determine the size of the compressed data block. In addition, when it is determined that the size of the compressed data block is smaller than the granularity unit of the output buffer 104, the device (e.g., the data compactor 203) may copy and add the compressed data to the separate output buffer 206. The individual output buffers 206 are associated with compressed data blocks that are smaller than granularity units that represent the granularity of the memory 208 and/or the output buffer 104.
The device 100 may also optionally include a memory 208. The memory 208 may include volatile memory, nonvolatile memory, and the like.
The device 100 may also optionally include a write module 207. The write module may be used to write the compressed data blocks 102 and the uncompressed data blocks 103 to the memory 208. In addition, the compressed data blocks 102 and uncompressed data blocks 103 packed into the output buffer 104 (and/or the separate output buffer 206) are written to the memory 208.
The device 100 may also optionally include a reading module 209. The read module 209 may be used to read the compressed data blocks 102 and the uncompressed data blocks 103 from the memory 208. In addition, the reading module 209 may be configured to read the compressed data block 102 and the uncompressed data block 103 according to an identification number of the output buffer, a size of a corresponding block, and an offset of the corresponding block relative to a start of the output buffer.
Fig. 3 is a schematic diagram of compaction of compressed data blocks 102 and uncompressed data blocks 103 into a 1MB block (chunk) 104 (output buffer) provided by various embodiments. Without limiting the invention, the compressed data blocks 102 and the uncompressed data blocks 103 are packed into the 1MB block (chunk) 104 as follows.
The device 100 obtains a set of input data blocks, wherein the set comprises compressed data blocks 102, such as B0, B1, B3, B4, B5 and Bn, and uncompressed data blocks 103, such as B2 and B6, to be packed into blocks (chunk) 104.
Compressed data blocks 102 (i.e., B0, B1, B3, B4, B5, and Bn) are grouped together at the beginning of block (chunk) 104. The compressed data block 102 is packed from the left (i.e., the beginning of the block (chunk)) toward the end of the block (chunk) 104. The compressed data blocks 102 are packed without any alignment, so there is no wasted space between any two adjacent compressed data blocks 102.
Further, uncompressed data blocks 103 of 8KB, 16KB (and/or 32 KB), etc. are grouped together at the end of block (chunk) 104. Uncompressed data block 103 is packed from the right (i.e., at the end of block (chunk)) toward the beginning of block (chunk) 104. Uncompressed data blocks 103 are packed with natural alignment of block sizes of 8KB or the like.
In the case where the block (chunk) 104 is full, the compressed data block 102 is adjacent to the uncompressed data block 103, etc., there may be (a small amount of) wasted space 301 between the compressed data block 102 and the uncompressed data block 103.
In the embodiment of fig. 3, a wasted space 301 is formed between Bn and B6. When another compressed or uncompressed data block cannot be added, wasted space occurs. In this case, the wasted space size may be in the range of 1 byte to 8KB-1 bytes.
FIG. 4 is a schematic diagram of a compute block address and block header 401 provided by various embodiments.
As described above, one or more of several types of metadata may be used, including the block address of the compressed data block, the leading header 401, and the trailing header 402 (shown as white rectangles).
For example, the block address may include a block (chunk) ID, a block offset (e.g., an offset relative to the start of the block (chunk)), and a block length. Furthermore, the block offset and the block length may be represented using granularity units.
The block address may occupy 64 bits. Furthermore, the block address may be stored in random access memory (RAM) to more quickly access blocks written to a storage medium, such as a solid state drive (SSD).
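One possible layout of such a 64-bit block address is sketched below. The patent only states that the address holds a chunk ID, an offset and a length within 64 bits; the field widths here (10 offset bits for 1024 granularity units per assumed 1MB chunk, 3 length bits, the remainder for the chunk ID) are illustrative assumptions:

```python
OFFSET_BITS = 10  # 1024 granularity units per assumed 1MB chunk
LENGTH_BITS = 3   # the 3-bit length field described earlier

def pack_address(chunk_id: int, offset_units: int, length_field: int) -> int:
    """Pack (chunk ID, offset, length) into one 64-bit integer."""
    return (chunk_id << (OFFSET_BITS + LENGTH_BITS)) \
        | (offset_units << LENGTH_BITS) | length_field

def unpack_address(addr: int) -> tuple[int, int, int]:
    """Recover (chunk ID, offset in units, length field) from the address."""
    length_field = addr & ((1 << LENGTH_BITS) - 1)
    offset_units = (addr >> LENGTH_BITS) & ((1 << OFFSET_BITS) - 1)
    chunk_id = addr >> (OFFSET_BITS + LENGTH_BITS)
    return chunk_id, offset_units, length_field
```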
The preamble 401 of the compressed data block may include a block byte offset of the compressed data block relative to the start of the granularity unit, etc. The preamble 401 of a compressed data block may be stored at the beginning of the first granularity unit of the compressed data block 102 in the block (chunk) 104. That first granularity unit is identified by the offset in the corresponding block address. The relationship of the block address and the block header is shown in fig. 4, where the position of the block header is marked with a white rectangle. Further, 403 represents the actual start position indicated by the header.
When the compressed data block 102 is the last compressed data block, the tail header 402 of the compressed data block may include a (e.g., suitable) tail or the like. However, if there is another compressed data block immediately after the compressed data block 102 is compressed, e.g., there is a next compressed data block, then the tail header 402 of the previous compressed data block may be and/or may represent the leading header of the next compressed data block.
Each compressed data block 102 may include a leading header 401 and a trailing header 402. The preamble header 401 may determine the offset of the first byte of the compressed data block 102 and the trailer header 402 may determine the byte offset following the last byte of the compressed data block.
In fig. 4, the leading head 401 of block B0 is located in the first granularity unit and the trailing head 402 is located in the second granularity unit.
The uncompressed data block may include only the block address. Furthermore, the uncompressed data blocks may not include a header, e.g., a leading header and/or a trailing header.
Fig. 5 is a schematic diagram of one example of a block header provided by various embodiments. Only compressed data uses the block header.
In fig. 5, H0 and H1 are the leading head 401 and the trailing head 402 of the block B0, respectively. H1 and H2 are the leading header 401 and trailing header 402, respectively, of block B1.
The H0 offset field 501 contains the value 4 bytes, indicating the actual starting position of block B0 within its granularity unit. The H0 size field 502 contains the value 1532 bytes, i.e., the compressed size of block B0 as reported by the data compressor or the like.
The H1 offset field contains the value 516 bytes, indicating the starting position of block B1 within its granularity unit. The H1 size field contains the value 2296 bytes, i.e., the compressed size of block B1.
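The H1 offset of 516 follows arithmetically from the H0 fields: block B0 starts at byte 4 of its first granularity unit (after the 4-byte H0), 1020 of its 1532 bytes fill the rest of that unit, and the remaining 512 bytes follow the 4-byte H1 at the start of the next unit, so block B1 starts at in-unit offset 4 + 512 = 516. A sketch of this calculation, assuming 1KB granularity units, 4-byte headers, and a block that extends past its first granularity unit:

```python
GRANULARITY = 1024
HEADER_BYTES = 4  # assumed header size, matching the 4-byte offset in Fig. 5

def next_block_offset(start_offset: int, size: int) -> int:
    """In-unit start offset of the next block, given this block's in-unit
    start offset and compressed size. Every granularity unit after the
    first begins with a header that the block's data flows around."""
    remaining = size - (GRANULARITY - start_offset)  # data left after the first unit
    while remaining > GRANULARITY - HEADER_BYTES:    # skip fully filled middle units
        remaining -= GRANULARITY - HEADER_BYTES
    return HEADER_BYTES + remaining

# B0: start offset 4, size 1532 -> B1 starts at offset 516, as in Fig. 5.
```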
FIG. 6 is a schematic diagram of a calculated block length provided by various embodiments.
The size (also referred to as length) of compressed block data may be measured using byte and/or granularity units.
In the present invention, the compressed data blocks 102 do not need to be aligned, so the compressed data blocks 102 may start at any position in the granularity unit 601, cross one or more granularity units 601, and end at any position in the granularity unit 601.
Fig. 6 shows 3 compressed blocks 102, including blocks B0, B1, and B2, with corresponding headers 401 and 402 (also referred to as H0, H1, and H2). Block B0 spans granularity units 0 and 1. Block B1 spans granularity units 1,2, 3, and 4 (the last granularity unit includes the tail header of B1). Block B2 spans granularity units 4 through 9 (the last granularity unit (not shown) includes the tail header of B2).
The present invention limits the size of the compressed data block to a range of 2 to 8 granularity units. Furthermore, the present invention may achieve a compressed data block 102 size of up to 8KB-256 bytes, which may also be represented using 8 granularity units.
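A sketch of how many granularity units a block touches, given its in-unit start offset and byte length; the displacement caused by headers inside middle units is ignored here for simplicity, and the names are illustrative:

```python
GRANULARITY = 1024  # assumed 1KB granularity unit

def units_spanned(start_offset: int, size: int) -> int:
    """Number of granularity units a block crosses, from its byte offset
    within its first unit and its byte length (interior headers ignored)."""
    return -(-(start_offset + size) // GRANULARITY)  # ceiling division

# A 1532-byte block starting at byte 4 of its first unit spans 2 units,
# like block B0 in Fig. 6.
```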
Fig. 7 is a schematic diagram of the formation of wasted space 301 between 2 compressed data blocks 102 provided by various embodiments.
For example, when the size of (e.g., at least one) compressed data block 102 is less than one granularity unit (e.g., less than 1 KB), a wasted space 301 may form between 2 compressed data blocks 102. In the embodiment of fig. 7, a waste space 301 is formed between block B1 and block B2.
Further, for example, the wasted space 301 may be formed in the following 3 cases:
1. The last part of the compressed data block B0 is smaller than 1KB;
2. The compression ratio (CR) is very high, so that the next data block B1 is compressed to less than 1KB;
3. The last part of B0 is put into the same granularity unit together with the whole block B1, so that some wasted space may remain in this granularity unit (block B2 with its corresponding header will start from the next granularity unit).
Fig. 8 is a schematic diagram of 2 wasted spaces 301 formed between 3 compressed data blocks 102 provided by various embodiments.
Fig. 8 shows that wasted space 301 is formed when the compression ratio (CR) is very high (e.g., CR=9:1). In some embodiments, 2 consecutive blocks may be compressed to less than 1KB, thus creating wasted space 301. In the embodiment of fig. 8, block B1 with its corresponding header starts from the next granularity unit, and thus a wasted space 301 is formed between 2 consecutive granularity units.
Fig. 9 is a schematic diagram of the formation of wasted space 301 between 2 compressed data blocks 102 provided by various embodiments.
Fig. 9 shows that wasted space 301 is formed in the case that there are not enough bytes in the last granularity unit of a compressed data block for the tail head, etc.
For example, assume that the block header occupies 4 bytes, and that the last portion of the compressed data block B0 is greater than 1020 bytes and less than 1024 bytes in size. Thus, a wasted space 301 in the range of 1 to 3 bytes is formed. In addition, the tail header is placed into the next granularity unit, so the size of the compressed data block 102 is increased by one granularity unit. Furthermore, a next compressed data block B1 may not exist, so that last granularity unit includes only the tail header. In this case, the wasted space formed is 1020 bytes plus the 1 to 3 bytes of the previous granularity unit. This may be the case for each block (chunk) 104.
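The arithmetic of this corner case can be sketched as follows, with the 4-byte header and 1KB granularity unit assumed as in the example:

```python
GRANULARITY = 1024
HEADER_BYTES = 4  # assumed block header size

def tail_waste(last_part_bytes: int) -> int:
    """Wasted bytes when the last portion of a block occupies
    last_part_bytes of its final granularity unit and no further
    compressed block follows (so the displaced tail header sits alone
    in the next unit)."""
    free = GRANULARITY - last_part_bytes
    if free >= HEADER_BYTES:
        return 0  # the tail header still fits in the same unit
    # 'free' (1..3) bytes wasted here, plus the rest of the next unit
    # after the 4-byte tail header
    return free + (GRANULARITY - HEADER_BYTES)

# Last portion of 1021 bytes leaves 3 free bytes -> 3 + 1020 wasted bytes.
```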
Fig. 10 is a schematic diagram of a header 401, 402 for placing a number of small compressed data blocks 102, as provided by various embodiments.
In some embodiments, several (e.g., two or more) headers may be grouped together and may also be placed before the corresponding blocks. Furthermore, this makes it possible to further eliminate and/or reduce wasted space.
In fig. 10, the headers 401, 402 of several small compressed data blocks 102 are placed together at the beginning of the output buffer 104. In addition, 1001 denotes the starting unit indicated by the block (chunk) ID and offset, from which the block header is located by its index, and 1002 denotes the actual starting position, within that unit, of the block indicated by the header, as described above.
FIG. 11 is a schematic diagram of a method 1100 for compacting compressed data blocks 102 and uncompressed data blocks 103 into an output buffer 104 according to one embodiment of the present invention.
Method 1100 includes a step 1 of obtaining (1101) a set of input data blocks 101, wherein the set includes at least one of compressed data blocks 102 and uncompressed data blocks 103.
The method 1100 further comprises a step 2 of compacting (1102) the compressed data blocks 102 into the output buffer 104, starting from a first predefined area in the output buffer 104, such that the compressed data blocks 102 are compacted in sequence.
The method 1100 further comprises a step 3 of compacting (1103) the uncompressed data blocks 103 into the output buffer 104, starting from a second predefined area in the output buffer 104, such that the uncompressed data blocks 103 are compacted in sequence.
FIG. 12 is a schematic diagram of a method 1200 of writing an input data block to memory provided by various embodiments.
At 1201, the device 100 retrieves an input data block and begins writing the retrieved input data block to memory.
At 1202, the apparatus 100 compresses the acquired input data block according to a predefined compression ratio.
At 1203, device 100 determines whether the input data block is compressible. When it is determined that the input data block is compressible, the apparatus 100 further determines the size of the compressed data block, and then performs step 1204. However, upon determining that the input data block is not compressible (i.e., compression fails), device 100 performs step 1208.
At 1204, device 100 calculates a block length from the compression size and offset in the current unit of the block (chunk).
Further, the apparatus determines an upper limit for the compressed block size as [output buffer size / (1 + output buffer size / input block size) - 2 × header size]. For example, when the output buffer is 1 MB, the input block is 8 KB, and the header size is 4 bytes, the apparatus determines the upper limit to be 8120 bytes. In the above formula, the header size is multiplied by 2 because each compressed block includes a leading header and a trailing header.
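A minimal sketch of this upper-limit formula (the function name is hypothetical):

```python
def compressed_size_upper_limit(out_buf: int, in_block: int, header: int) -> int:
    """Upper limit on the compressed block size:
    output buffer size / (1 + output buffer size / input block size) - 2 * header size.
    The header size counts twice because each compressed block carries
    a leading header and a trailing header."""
    return out_buf // (1 + out_buf // in_block) - 2 * header

# 1 MB output buffer, 8 KB input blocks, 4-byte headers -> 8120 bytes
assert compressed_size_upper_limit(1 << 20, 8 << 10, 4) == 8120
```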
In addition, the device 100 adjusts the upper limit by the number of bytes occupied by the previous compressed data block in its last granularity unit, such that the length of the subsequent compressed data block, expressed in granularity units, is less than or equal to (1 << length bits). For 3 length bits, a compressed data block may therefore be at most 8 granularity units long (i.e., 1 << 3 = 8).
For example, assume that 15 bytes of the last granularity unit of the previous compressed data block (together with its tail header) are occupied. The apparatus 100 adjusts the upper limit by subtracting 15, so that the total size of the compressed data block, expressed in granularity units, remains less than or equal to (1 << length bits) granularity units.
In addition, the device 100 compares the adjusted upper limit with the available space remaining in the output buffer. When there is sufficient available space, the device 100 keeps the adjusted upper limit. Otherwise, the apparatus 100 replaces the upper limit with the size of the remaining available space in the output buffer, and then performs step 1205.
At 1205, the device 100 places the first portion of the block into a block (chunk) from the offset of the current unit to the last unit of the block.
The device 100 stores the compressed data block, with its leading and trailing headers, in the output buffer starting from the start position. For example, the device 100 advances the current pointer for compressed data blocks by the size of the compressed data block plus the leading and trailing headers, in preparation for writing the next data block.
Furthermore, when the size of the compressed data block is smaller than one granularity unit, the apparatus copies and adds the compressed data block to the separate cache for small compressed data blocks, as described above.
In addition, when the separate cache is full, for example, when it includes two or more small compressed data blocks, the apparatus 100 flushes its contents to the output buffer, and performs step 1209 to calculate the block address of the compressed data block.
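The small-block cache behavior described above can be sketched as follows; the class, its fullness rule, and all names are hypothetical illustrations, not the patented implementation:

```python
class SmallBlockCache:
    """Collects compressed blocks smaller than one granularity unit
    so that several small blocks can share granularity units when
    flushed together to the output buffer."""
    def __init__(self, granule: int = 1024):
        self.granule = granule
        self.blocks = []
        self.used = 0

    def add(self, block: bytes) -> bool:
        """Add one small compressed block; returns True when the cache
        is full and should be flushed (simplistic fullness rule)."""
        assert len(block) < self.granule
        self.blocks.append(block)
        self.used += len(block)
        return self.used >= self.granule

    def flush(self) -> bytes:
        """Concatenate the cached small blocks for the output buffer."""
        out = b"".join(self.blocks)
        self.blocks, self.used = [], 0
        return out
```

For example, two 600-byte blocks fill a 1024-byte cache, and the second `add` signals that a flush is due.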
At 1206, the device 100 generates a header in the first and last units of the block.
Further, for example, when the first granularity unit for storing the compressed data block is empty, the apparatus 100 writes the leading header of the compressed data block. The leading header includes the actual starting position of the compressed data block, as described above. For example, for a header size of 4 bytes, the actual starting position of the compressed data block is 4.
As described above, the device 100 writes the compressed data block to the output buffer in all granularity units except the last. In the last granularity unit, the device 100 first writes the tail header. The tail header includes the actual starting position of the available space after the compressed data block, and may serve as the leading header of the next compressed data block.
Additionally, the last granularity unit of the compressed data block may include the tail header and zero or more bytes of the compressed data block. Furthermore, the size of the compressed data remaining for the last granularity unit may not leave room for the tail header. For example, when the granularity unit size is 1024 bytes, the header size is 4 bytes, and the compressed data to be written to the last granularity unit is 1021, 1022, or 1023 bytes, the compressed data block may be zero-padded up to the size of the granularity unit, and the tail header is then written at the beginning of the next granularity unit.
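The padding decision above can be sketched as follows, assuming the 1024-byte granularity unit and 4-byte header of the example (the function name is hypothetical):

```python
GRANULE = 1024  # granularity unit size in bytes (assumed)
HEADER = 4      # tail header size in bytes (assumed)

def place_tail(last_part: bytes):
    """Decide where the tail header goes for the final granularity unit.
    Returns the (possibly zero-padded) last-unit payload and the offset
    of the tail header: offset 0 means 'start of the next unit'."""
    if len(last_part) + HEADER > GRANULE:
        # no room for the tail header: pad the block to a full unit
        padded = last_part + b"\x00" * (GRANULE - len(last_part))
        return padded, 0               # tail header starts the next unit
    return last_part, len(last_part)   # tail header follows the data

payload, off = place_tail(b"\xaa" * 1022)
assert len(payload) == 1024 and off == 0    # padded; header spills over
payload, off = place_tail(b"\xaa" * 1000)
assert len(payload) == 1000 and off == 1000 # header fits in this unit
```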
At 1207, the device 100 places the second portion of the block into the block (chunk), the device 100 placing the remaining bytes starting from the last unit offset of the block.
In addition, device 100 may also write zero or more bytes of the compressed data block to the last granularity unit.
At 1208, the device 100 places the uncompressed data block at the end of the block (chunk), immediately before any previously packed uncompressed data blocks.
For example, the device 100 determines that compression failed, e.g., the block size is greater than or equal to the adjusted upper limit. The device 100 then stores the original block (i.e., the uncompressed data block) in the output buffer, from the end toward the beginning. In addition, the apparatus 100 subtracts the original block size from the current pointer for uncompressed data blocks in preparation for writing the uncompressed data block.
In addition, when there is insufficient space available in the output buffer to store the uncompressed data block, device 100 flushes the output buffer, reinitializes all data, and then stores the uncompressed data block in the newly initialized output buffer.
Alternatively, if the output buffer has to be flushed, for example because the upper limit of available space is reached, as described above, the device 100 flushes all data from the output buffer, reinitializes it, and recompresses the input data block.
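The two packing directions described above (compressed blocks growing from the start of the output buffer, uncompressed blocks growing from the end, with a flush once the two pointers would meet) can be sketched as follows; the class and method names are hypothetical, and the real scheme additionally rounds positions to granularity units and writes per-block headers:

```python
class OutputBuffer:
    """Two-pointer packing sketch: compressed data grows from the
    front, raw (uncompressed) blocks grow from the back."""
    def __init__(self, size: int = 1 << 20):
        self.buf = bytearray(size)
        self.front = 0      # next free byte for compressed data
        self.back = size    # one past the last free byte for raw blocks

    def put_compressed(self, data: bytes) -> bool:
        """Returns False when the regions would collide; the caller
        must then flush the buffer and reinitialize."""
        if self.front + len(data) > self.back:
            return False
        self.buf[self.front:self.front + len(data)] = data
        self.front += len(data)
        return True

    def put_uncompressed(self, block: bytes) -> bool:
        start = self.back - len(block)
        if start < self.front:
            return False
        self.buf[start:self.back] = block
        self.back = start
        return True
```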
At 1209, the device 100 generates a block address.
At 1210, the apparatus 100 writes the block and ends writing the block to the memory.
As described above, the apparatus 100 calculates the block address of the compressed data block and/or the uncompressed data block.
The block address includes bit fields of offset, length, etc., calculated as follows.
The device 100 calculates an offset of the compressed data block or uncompressed data block in the output buffer. For example, if the output buffer size is 1 MB and the uncompressed data block size is 8 KB, device 100 can store 128 uncompressed data blocks in the output buffer. In addition, the apparatus 100 allocates at least two granularity units to each compressed data block, because each compressed data block requires 2 headers. Thus, device 100 can store at most 512 compressed data blocks in the output buffer, and the apparatus 100 encodes the offset of any one of the 512 compressed data blocks using 9 bits. With a 1 KB granularity unit as the smallest addressable object, offset addressing of all 1024 units of the buffer would take 10 bits; 9 bits is 1 bit less than that.
Further, the apparatus 100 calculates the length of the compressed data blocks and/or uncompressed data blocks in the output buffer. For example, when the output buffer is 1 MB, the data block is 8 KB, and the granularity unit is 1 KB, the apparatus 100 needs a length field of 3 bits. As described above, the bit value 111 represents an uncompressed data block, equal in length to 8 granularity units or 8 KB. Further, bit values in the range 000 to 110 represent compressed data blocks with lengths in the range of 2 to 8 granularity units, or 2 KB to 8 KB.
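The offset and length bit fields described above can be sketched as follows, under the example parameters (1 MB output buffer, 1 KB granularity unit, 3 length bits); the function name is hypothetical:

```python
OUT_BUF = 1 << 20    # 1 MB output buffer
GRANULE = 1 << 10    # 1 KB granularity unit
LENGTH_BITS = 3

# each compressed block needs at least 2 granularity units (two headers),
# so at most 1024 / 2 = 512 compressed blocks fit -> 9 offset bits
max_blocks = (OUT_BUF // GRANULE) // 2
offset_bits = max_blocks.bit_length() - 1   # 512 -> 9
assert offset_bits == 9

def encode_length(units: int, compressed: bool) -> int:
    """Length bit field: all-ones marks an uncompressed block;
    a compressed block stores its granularity-unit count minus 2."""
    if not compressed:
        return (1 << LENGTH_BITS) - 1    # 0b111 -> 8 units, uncompressed
    assert 2 <= units <= (1 << LENGTH_BITS)
    return units - 2                      # 0b000..0b110 -> 2..8 units

assert encode_length(8, compressed=False) == 0b111
assert encode_length(2, compressed=True) == 0b000
assert encode_length(8, compressed=True) == 0b110
```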
In the case of a separate output buffer, the device 100 uses an index of the small compressed data blocks within one granularity unit. For example, when the output buffer size is 1 MB, the data block size is 8 KB, the granularity unit size is 1 KB, and each granularity unit holds up to 3 small blocks, the device 100 requires a 2-bit index. An index value of 0 (bit field value 00) indicates that there is no small compressed data block.
Furthermore, the block address of an uncompressed data block comprises the actual offset of the original block in the block (chunk), with the length field equal to [(1 << length bits) - 1]. For example, for 3 length bits, the block address length field of an uncompressed data block is 7 (i.e., (1 << 3) - 1 = 7), while its actual length in granularity units is 8 (i.e., 1 << 3 = 8). Furthermore, the block address may include an index of the small compressed data blocks within one granularity unit, as described above; index values in the range 1 to 3 are used for the 1st to 3rd small compressed data blocks, respectively.
In addition, the offset in the block address of a compressed data block is the offset of the first granularity unit occupied by that compressed data block. When the compressed data block occupies one granularity unit or less, the apparatus 100 still requires a length of at least 2 units because of the leading and trailing headers.
Moreover, the length in the block address of a compressed data block is the total number of granularity units occupied by the compressed data block minus 2, which is consistent with the adjusted upper limit of the compressed data block size. When the apparatus 100 determines that the difference between the current pointers for uncompressed and compressed data blocks is at least two granularity units, the apparatus 100 proceeds with the next input data block. Otherwise, the device 100 flushes the output buffer and reinitializes all data.
FIG. 13 is a schematic diagram of a method of reading compressed data blocks and uncompressed data blocks from memory provided by various embodiments.
At 1301, the device 100 starts to perform step 1, reading a data block from memory. The device 100 initiates a block read using the block address.
At 1302, device 100 parses the block address and derives the block (chunk) ID, offset, and length, as described above.
The device 100 uses the chunk identifier (chunk ID) to locate the chunk start location on the storage medium. The device 100 then uses the block address offset to find the location of the data block within the chunk.
At 1303, device 100 reads data from the block (chunk) according to the chunk (chunk) ID, offset, and length.
The device 100 derives the length of the data block (which may be a compressed data block or an uncompressed data block) and reads the data block as a number of consecutive granularity units.
For example, when the length encoded in the corresponding bit field of the block address is equal to [(1 << length bits) - 1], the data block is an uncompressed data block, and the device 100 reads (1 << length bits) granularity units. However, when the length encoded in the corresponding bit field of the block address is in the range 0 to [(1 << length bits) - 2], the data block is a compressed data block, and the apparatus 100 reads (length + 2) granularity units.
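The length-field decoding rule above can be sketched as (the function name is hypothetical):

```python
LENGTH_BITS = 3  # length bit-field width from the example

def units_to_read(length_field: int):
    """Decode the block-address length field: an all-ones field means
    an uncompressed block of (1 << LENGTH_BITS) granularity units; any
    other value is a compressed block of (length_field + 2) units.
    Returns (granularity units to read, is_compressed)."""
    if length_field == (1 << LENGTH_BITS) - 1:
        return (1 << LENGTH_BITS), False
    return length_field + 2, True

assert units_to_read(0b111) == (8, False)  # uncompressed: read 8 units
assert units_to_read(0b000) == (2, True)   # smallest compressed block
assert units_to_read(0b110) == (8, True)   # largest compressed block
```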
At 1304, the apparatus 100 determines whether the block of data is an uncompressed block. Further, when it is determined that the data block is an uncompressed data block, the apparatus 100 performs step 1311. However, upon determining that the data block is not an uncompressed data block, the device 100 performs step 1305.
At 1305, device 100 parses the block's leading header and derives the size and byte offset of the compressed data block within the first granularity unit.
The device 100 parses the leading and trailing heads of the compressed data block to determine a starting position of the compressed data block in a first read granularity unit and an ending position of the compressed data block in a last granularity unit.
At 1306, device 100 allocates a temporary cache. The temporary cache corresponds to the derived block size.
At 1307, the apparatus 100 copies the first portion of the block from the offset to the last unit to the output buffer.
At 1308, device 100 skips the tail header in the last unit of the block.
At 1309, the apparatus 100 copies the second portion of the block to the output buffer starting from the last granularity unit.
Furthermore, when the last granularity unit includes compressed data, the apparatus 100 moves that data to its proper position (i.e., the beginning of the granularity unit), overwriting the tail header so that the compressed data block becomes contiguous.
At 1310, the device 100 decompresses the unpacked blocks.
The apparatus 100 decompresses the continuous compressed data and restores the compressed data to the original block content.
At 1311, device 100 reads the decompressed data block and ends the read process.
Table 1 shows the benchmark results of compressing and packing two data sets based on the method disclosed in the prior art, and table 2 shows the benchmark results of compressing and packing two data sets based on the apparatus (or the method operated by the apparatus) disclosed in the present invention.
The benchmark verification results for the prior art solution (a prototype implementation of the scheme proposed in the prior art) and a standalone prototype of the present invention were obtained under the following conditions:
1. Block (chunk) packing and block address calculation implemented
2. Data read from memory, i.e., IO (disk) overhead not considered
3. (Packing, compression) and (unpacking, decompression) repeated 100 times; speeds averaged
4. Block compression/decompression performed in a single thread
5. An LZ-class algorithm used as the default compressor in both packing schemes
In addition, the environment for obtaining the benchmark verification results is an Intel(R) Xeon(R) CPU E5-2670 0 @ 2.60GHz, with the benchmarks compiled using GCC 6.3 (-O3) on Linux kernel 3.19.
TABLE 1 benchmark verification results for compression and packing of two data sets based on methods disclosed in the prior art
TABLE 2 benchmark verification results for compression and packaging of two data sets based on the disclosed apparatus (or method operated by the apparatus)
Compared with the typical online compression packing scheme of the prior art, the benchmark results show a data reduction ratio improved by 13 to 27 percent and a speed improved by 6 to 42 percent. In addition, the invention enables the selected LZ4-class algorithm to achieve near-ideal waste elimination.
The invention takes the actual compressed data size into account in the packing scheme and minimizes SSD memory waste compared with typical online compression packing schemes of the prior art. For example, for the Oracle DB dataset, memory waste (% of input) falls from 9.4% to 0.1%, while for the Standard dataset, memory waste falls from 6.1% to 0.1%.
The invention has been described in connection with different embodiments and implementations as examples. However, other variations can be understood and effected by those skilled in the art in practicing the claimed invention, from a study of the drawings, the invention, and the appended claims. In the claims and in the description, the word "comprising" does not exclude other elements or steps, and the indefinite article "a" or "an" does not exclude a plurality. A single element or other unit may fulfill the functions of several entities or items recited in the claims. The mere fact that certain measures are recited in mutually different dependent claims does not indicate that a combination of these measures cannot be used to advantage.

Claims (18)

1. An apparatus (100) for compacting compressed data blocks (102) and uncompressed data blocks (103) into an output buffer (104), characterized in that the apparatus (100) is adapted to:
Acquiring a set of input data blocks (101), wherein the set comprises at least one of compressed data blocks (102) and uncompressed data blocks (103);
-compacting the compressed data blocks (102) into the output buffer (104) without gaps starting from a first predefined area in the output buffer (104), such that the compressed data blocks (102) are compacted in sequence;
-compacting the uncompressed data blocks (103) into the output buffer (104) starting from a second predefined area in the output buffer (104), such that the uncompressed data blocks (103) are compacted in sequence;
wherein compacting the compressed data blocks (102) into the output buffer (104) without gaps comprises adjusting the upper length limit of the compressed data blocks (102) by the number of bytes occupied by the previous compressed data block (102) in the last granularity unit (601) of the output buffer (104), such that the length, in granularity units (601), of the subsequent compressed data block (102) is smaller than or equal to the upper length limit.
2. The apparatus (100) of claim 1, further configured to determine an upper byte limit for the compressed data block based on a size of the output buffer (104), a size of the compressed data block (102) to be packed into the output buffer (104), and a header size of the compressed data block (102).
3. The device (100) according to claim 1 or 2, characterized in that the device is further adapted to acquire an input data block (101), to compress the acquired input data block (101) according to a predefined compression ratio, and to determine the size of the compressed data block (102).
4. A device (100) according to claim 3, characterized in that when the size of the compressed data block (102) is determined to be smaller than a granularity unit (601) of the output buffer (104), the compressed data is copied and added to a separate output buffer (206), wherein the separate output buffer (206) is associated with compressed data blocks smaller than the granularity unit (601), the granularity unit (601) representing the granularity of the memory (208).
5. The device (100) according to any of claims 1-4, wherein the first predefined area is a start of the output buffer (104), the compressed data blocks (102) being packed sequentially from the start of the output buffer (104) towards an end of the output buffer (104).
6. The device (100) according to any of claims 1-5, wherein the second predefined area is an end of the output buffer (104), the uncompressed data blocks (103) being packed sequentially from the end of the output buffer (104) towards a beginning of the output buffer (104).
7. The apparatus (100) of claim 6, wherein the uncompressed data blocks (103) are packed into the output buffer (104) according to granularity units (601) of the output buffer (104).
8. The apparatus (100) according to any of claims 1-7, wherein the apparatus (100) is further configured to calculate a block address of the compressed data block (102) by determining an offset (501) of the compressed data block (102) in the output buffer (104) and estimating a length (502) of the compressed data block (102) in the output buffer (104).
9. The apparatus (100) of claim 8, wherein a block address of a compressed data block (102) smaller than the granularity unit (601) corresponds to an index of the granularity unit (601).
10. The apparatus (100) according to any of claims 1-9, wherein the apparatus (100) is further configured to calculate a block address of the uncompressed data block (103) by determining an offset of the uncompressed data block (103) in the output buffer (104) and estimating a length of the uncompressed data block (103) in the output buffer (104).
11. The apparatus (100) according to any one of claims 1-10, wherein the apparatus (100) is further configured to generate a block preamble (401) for each compressed data block, wherein the block preamble represents an offset of the compressed data block with respect to a start of the granularity unit (601).
12. The apparatus (100) according to any one of claims 1-11, wherein the apparatus (100) is further configured to generate a block tail header (402) for each compressed data block, wherein the block tail header represents an offset of a last byte of the compressed data block relative to a start of a last granularity unit of the compressed data block.
13. The device (100) according to any one of claims 1-11, wherein the device (100) further comprises a writing module (207) for writing the compressed data block (102) and the uncompressed data block (103) to a memory (208).
14. The apparatus (100) of claim 13, wherein the compressed data blocks (102) and the uncompressed data blocks (103) packed into the output buffer (104) are written to the memory (208).
15. The device (100) according to any one of claims 1-14, wherein the device (100) further comprises a reading module (209) for reading the compressed data blocks (102) and the uncompressed data blocks (103) from a memory (208).
16. The device (100) of claim 15, wherein the reading module (209) is further configured to read the compressed data block (102) and the uncompressed data block (103) based on an identification number of the output buffer (104), a size of a corresponding block, and an offset of the corresponding block relative to a start of the output buffer (104).
17. The device (100) of any of claims 1-16, wherein the memory (208) comprises volatile memory or non-volatile memory.
18. A method (1100) of compacting compressed data blocks (102) and uncompressed data blocks (103) into an output buffer (104), the method (1100) comprising the steps of:
-obtaining (1101) a set of input data blocks (101), wherein the set comprises at least one of compressed data blocks (102) and uncompressed data blocks (103);
-compacting (1102) the compressed data blocks (102) into the output buffer (104) without gaps starting from a first predefined area in the output buffer (104), such that the compressed data blocks (102) are compacted in sequence;
-compacting (1103) the uncompressed data blocks (103) into the output buffer (104) starting from a second predefined area in the output buffer (104), such that the uncompressed data blocks (103) are compacted in sequence;
wherein compacting the compressed data blocks (102) into the output buffer (104) without gaps comprises adjusting the upper length limit of the compressed data blocks (102) by the number of bytes occupied by the previous compressed data block (102) in the last granularity unit (601) of the output buffer (104), such that the length, in granularity units (601), of the subsequent compressed data block (102) is smaller than or equal to the upper length limit.
CN201880096600.9A 2018-08-09 2018-08-09 Device and method for compacting compressed data blocks and uncompressed data blocks Active CN112585589B (en)

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/RU2018/000523 WO2020032816A1 (en) 2018-08-09 2018-08-09 Device and method for compacting compressed and uncompressed data blocks

Publications (2)

Publication Number Publication Date
CN112585589A CN112585589A (en) 2021-03-30
CN112585589B true CN112585589B (en) 2025-01-24

Family

ID=64083128

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201880096600.9A Active CN112585589B (en) 2018-08-09 2018-08-09 Device and method for compacting compressed data blocks and uncompressed data blocks

Country Status (4)

Country Link
US (1) US11177825B2 (en)
EP (1) EP3821346B1 (en)
CN (1) CN112585589B (en)
WO (1) WO2020032816A1 (en)

Families Citing this family (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
GB2603459B (en) * 2021-01-22 2023-05-10 Advanced Risc Mach Ltd Data processing systems
CN115149958A (en) * 2021-03-30 2022-10-04 华为技术有限公司 Data compression method and device
US11139829B1 (en) * 2021-04-21 2021-10-05 Beijing Tenafe Electronic Technology Co., Ltd. Data compression techniques using partitions and extraneous bit elimination
CN115480692B (en) * 2021-06-16 2025-11-07 华为技术有限公司 Data compression method and device
US12219386B2 (en) * 2021-07-09 2025-02-04 Qualcomm Incorporated Transmission of previously compressed packets to avoid throughput drop
CN113568573B (en) * 2021-07-14 2023-12-22 锐掣(杭州)科技有限公司 Data storage method, data storage device, storage medium and product
CN114003169B (en) * 2021-08-02 2024-04-16 固存芯控半导体科技(苏州)有限公司 Data compression method for SSD
US12438556B2 (en) * 2022-09-30 2025-10-07 Qualcomm Incorporated Single instruction multiple data (SIMD) sparse decompression with variable density
US12388655B2 (en) 2023-07-28 2025-08-12 Arm Limited Data processing systems

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8619866B2 (en) * 2009-10-02 2013-12-31 Texas Instruments Incorporated Reducing memory bandwidth for processing digital image data

Family Cites Families (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6606328B1 (en) * 1999-12-15 2003-08-12 Intel Corporation Look ahead encoder/decoder architecture
US7840774B2 (en) * 2005-09-09 2010-11-23 International Business Machines Corporation Compressibility checking avoidance
US9553604B2 (en) * 2009-03-30 2017-01-24 Nec Corporation Information processing system, information compression device, information decompression device, information processing method, and program
US8520958B2 (en) * 2009-12-21 2013-08-27 Stmicroelectronics International N.V. Parallelization of variable length decoding
KR20120090194A (en) * 2011-02-07 2012-08-17 삼성전자주식회사 Data processing device and system having the same
US8949513B2 (en) 2011-05-10 2015-02-03 Marvell World Trade Ltd. Data compression and compacting for memory devices
US8497788B1 (en) * 2012-04-25 2013-07-30 Pure Storage Inc. Efficient techniques for aligned fixed-length compression
US10565099B2 (en) * 2012-12-28 2020-02-18 Apple Inc. Methods and apparatus for compressed and compacted virtual memory
US9495288B2 (en) * 2013-01-22 2016-11-15 Seagate Technology Llc Variable-size flash translation layer
CN103136109B (en) * 2013-02-07 2016-06-15 中国科学院苏州纳米技术与纳米仿生研究所 A kind of solid-state memory system FTL write with compression function and read method
US9720617B2 (en) * 2015-06-02 2017-08-01 Apple Inc. System and method for compaction of compressed and uncompressed virtual memory
US10742231B2 (en) * 2016-05-24 2020-08-11 Sony Corporation Compression/encoding apparatus and method, decoding apparatus and method, and program
US10795836B2 (en) * 2017-04-17 2020-10-06 Microsoft Technology Licensing, Llc Data processing performance enhancement for neural networks using a virtualized data iterator
US20200042500A1 (en) * 2018-08-02 2020-02-06 Alibaba Group Holding Limited Collaborative compression in a distributed storage system

Patent Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8619866B2 (en) * 2009-10-02 2013-12-31 Texas Instruments Incorporated Reducing memory bandwidth for processing digital image data

Also Published As

Publication number Publication date
US20200366314A1 (en) 2020-11-19
CN112585589A (en) 2021-03-30
WO2020032816A8 (en) 2021-02-25
WO2020032816A1 (en) 2020-02-13
EP3821346B1 (en) 2025-11-12
US11177825B2 (en) 2021-11-16
EP3821346A1 (en) 2021-05-19

Similar Documents

Publication Publication Date Title
CN112585589B (en) Device and method for compacting compressed data blocks and uncompressed data blocks
US9696910B2 (en) Data compression and management
US9645750B2 (en) System, method, and computer program product for increasing spare space in memory to extend a lifetime of the memory
US9838045B1 (en) Apparatus and method for accessing compressed data
CN105849706B (en) Storage module and method for managing logical-to-physical address mapping
US11494115B2 (en) System method for facilitating memory media as file storage device based on real-time hashing by performing integrity check with a cyclical redundancy check (CRC)
US10481797B2 (en) Data storage device for compressing input data
CN110795272B (en) Method and system for atomic and latency guarantees facilitated on variable-size I/O
US9720821B2 (en) Adaptive compression data storing method for non-volatile memories and system using the same
WO2015127462A1 (en) Systems and methods for storage compression
CN105103137A (en) Compression and formatting of data for data storage systems
KR20110113422A (en) A method of storing data on a storage medium, a data storage device using the same, and a system including the same
JP2020149195A (en) Memory system
US10817417B1 (en) Data storage efficiency using storage devices with variable-size internal data mapping
US9170757B1 (en) Optimization of raid group storage
CN106844229B (en) Organization method, system and device of solid state disk firmware mapping table
KR101348255B1 (en) Storage of transformed units of data in a memory system having fixed sized storage blocks
US20030097523A1 (en) External storage device within a computer network
US11461173B1 (en) Method and system for facilitating efficient data compression based on error correction code and reorganization of data placement
US20240311003A1 (en) Memory system and method
KR20070031647A (en) A method of writing compressed data to a flash memory device, a method of reading the recorded data, and a flash memory device using the method
KR20220036887A (en) Storage device and operation method thereof, for automatic data separation and placement for compressed data
CN113050873A (en) Method and storage device for generating data with multiple protection levels

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant