Background
In recent years, with the rapid development of the computer field, the processing capacity of a Central Processing Unit (CPU) is greatly improved, the performance of a storage system is relatively slightly improved, and a huge performance gap exists between the traditional main memory and auxiliary memory structure, which becomes the performance bottleneck of the computer system.
On one hand, in a traditional main Memory-auxiliary Memory system, a main Memory (generally, a Dynamic Random Access Memory (DRAM)) generally has the characteristics of small capacity, high speed, volatility (power-off data loss) and the like; while the auxiliary memory (magnetic disk, magnetic tape, optical disk, flash memory) usually has the characteristics of large capacity, slow speed, nonvolatility (no loss of power-off data) and the like. Such architectures are largely determined by the hardware characteristics of the memory devices, but they have become increasingly unable to meet today's memory requirements. For example, a memory database system resides all mass data in a main memory, and an auxiliary memory only performs backup, so that performance can be effectively improved, but due to the volatility of a DRAM, data loss may be caused once the power is off or the system crashes, so various fault-tolerant means need to be adopted to timely store data and maintain the consistency of the data, a common method is to store the data to an auxiliary memory medium periodically, and frequent data backup (from a main memory to the auxiliary memory) causes little performance loss.
On the other hand, as the technology of DRAM is approaching its limit, it is increasingly difficult to increase its speed and capacity. Therefore, the industry has been actively exploring and researching new storage materials, devices and even new storage systems. NVM (Non-Volatile Memory) is a new type of Memory device that has attracted attention in recent years.
NVM typically has the characteristics of byte-granular addressing, large capacity density, fast access speed, non-volatility (no loss of data when power is off), etc. NVM has both primary and secondary Memory characteristics, also known as Persistent Memory (PM) or Storage Class Memory (SCM). Among many NVM technologies, Phase Change Memory (PCM) is the most developed and representative, and its hardware characteristics are shown in table 1.
Table 1, PCM, DRAM hardware characteristic comparison table.
The emergence and rapid development of NVM has brought new opportunities and changes to memory systems. Based on the existing architecture shown in fig. 1(a), NVM can replace disk, flash memory, etc. in the conventional storage structure to become fast auxiliary memory as shown in fig. 1 (b); can also be used as a main memory instead of DRAM as shown in FIG. 1 (d); it is even possible to coexist with DRAM to form a hybrid main memory as shown in fig. 1 (c). Among them, the hybrid main memory structure as shown in fig. 1(c) is a hot spot of current industry research.
At the software level, NVM-based data persistence also promises a number of new technical approaches. To ensure data consistency, these technical approaches use more or less transaction-based logging mechanisms (transaction logging). The transaction type persistent Log is mainly divided into a Redo Log (Redo-Log) and an Undo Log (Undo-Log), and currently, persistent storage libraries Mnemosyn and NVML in an x86 system are respectively application representatives of the transaction type persistent Log. Both log mechanisms can implement data write operation transacting, that is, it is ensured that write data is either successful or rolled back to the state before transaction commit if there is a failure. Fig. 2 shows these two logging mechanisms. As shown in fig. 2(a), when a transaction is committed, the Redo-Log writes addresses and new values of all write operations in the transaction into a persistent Log, then updates the values of the addresses, and discards the Log after the update is successful; as shown in FIG. 2(b), when a transaction is committed, Undo-Log first copies the address and old value of each write operation into a persistent Log, then updates the value of the address, and discards the Log after the update is successful. As can be seen from FIG. 2, whatever the logging mechanism, there is some overhead in persisting: two write operations (write to log, write to persistent heap) must be performed, each write operation requires two time-consuming operations, namely data update (flush) and operation synchronization (fence), to ensure completion of write persistence, and the write operation of NVM is delayed. On the other hand, such log persistence simply writes address and value information to NVM, and does not address redundancy that exists locally in information during program execution.
Disclosure of Invention
The technical problems to be solved by the invention are as follows: aiming at the problems in the prior art, the invention provides a method and a device for compressing redundant information in a fault-tolerant NVM (non-volatile memory) persistence process.
In order to solve the technical problems, the invention adopts the technical scheme that:
a compression method for redundant information of a fault-tolerant NVM (non-volatile memory) persistence process comprises the following implementation steps:
1) distributing a write set WriteSet, an address queue and a plurality of value queues vqueue _ X in a thread supporting transaction persistence, wherein the value queues vqueue _ X are used for recording values of which old and new only have different low X bits, the low X bits of all recorded values are written into a persistence log when the values are compressed, and the X values of different value queues vqueue _ X are different;
2) allocating a persistent log save area in the NVM;
3) writing all persistent write operation information to the write set WriteSet during the transaction;
4) scanning a write set WriteSet, filtering write set elements with completely the same new and old values, respectively adding values of the write set elements with different values of only low X bits into corresponding value queues vqueue _ X for the write set elements with completely the same new and old values, and adding addresses of the write set elements with incompletely the same new and old values into address queues queue according to the sequence of the values of X in the value queues vqueue _ X from large to small;
5) compressing the address information in the address queue so that the same part of a group of adjacent addresses is eliminated, and writing the compressed address information into an address record;
6) and writing the low X bits of the N bit values of all elements in each value queue vqueue _ X into the value record, thereby eliminating redundant information of the values.
Preferably, the plurality of value queues vqueue _ X in step 1) comprises value queues vqueue _ N, vqueue _ N/2, vqueue _ N/22…, vqueue _16 and vqueue _8, wherein N is the bit width of the persistent log, and vqueue _ N is used for recording old and new complete dataDifferent values, the lower N bits of each record value are written into the persistent log when the values are compressed; vqueue _ N/2 is used for recording different values of new and old with low N/2 bits, and the low N/2 bits of each recorded value are written into a persistent log when the values are compressed; … …, respectively; vqueue _16 is used for recording different values of old and new values with only lower 16 bits, and the lower 16 bits of each recorded value are written into a persistent log when the values are compressed; vqueue _8 is used to record values whose old and new have only 8 lower bits different, and the lower 8 bits of each recorded value are written into the persistent log when the values are compressed.
Preferably, each persistent log in step 2) includes 3 portions corresponding to the number of values of the respective value queue vqueue _ X, the records of the plurality of compressed addresses, and the records of the plurality of compressed values when the persistent log save area is allocated in the NVM.
Preferably, the persistent log has a bit width of 64 bits, which can be similarly extended to 128 bits or higher.
Preferably, the detailed steps of step 3) include: for the address and value information pair < addr, value > to be written by the persistent write operation tx _ write, judging whether the address addr already exists in the write set WriteSet, and if so, replacing the original value of the existing item in the write set WriteSet with the value of the persistent write operation tx _ write; if address addr is not present in the WriteSet WriteSet, then add the pair of information < addr, value > to the WriteSet WriteSet.
Preferably, the detailed steps of step 5) include:
5.1) constructing a processing sub-queue, initially putting the first two addresses of a queue in the processing sub-queue, and initializing a current compression mode into an incompressible mode, wherein the compression mode comprises three types of the incompressible mode, continuous address compression and high 32-bit same address compression, the continuous address compression mode refers to the compression of a group of address queues with the same difference value of adjacent addresses, and the high 32-bit same address compression mode refers to the compression of a group of address queues with the same high 32-bit;
5.2) sequentially traversing subsequent addresses Addr in the address queue, adding the currently traversed addresses Addr into the processing sub-queue when one address Addr is traversed, and skipping to execute the step 5.3); jumping to execute the step 5.5) until the traversal is completed;
5.3) judging whether the processing sub-queue is incompressible before and after the address Addr is added, if the processing sub-queue is incompressible before and after the address Addr is added, taking out the first address of the processing sub-queue, writing the address record, and skipping to execute the step 5.2); otherwise, skipping to execute the step 5.4);
5.4) judging whether the compression mode of the processing sub-queue changes after the address Addr is added, if the compression mode corresponding to the processing sub-queue changes after the address Addr is added, taking out all addresses in the processing sub-queue before the address Addr for compression, if the original compression mode is continuous address compression, performing continuous address compression, and if the original compression mode is same address compression with 32 bits higher, performing same address compression with 32 bits higher, and writing the same address compression into an address record; skipping to execute step 5.2);
5.5) judging whether the processing sub queue is empty, and if so, skipping to execute the step 5.7); otherwise, skipping to execute the step 5.6);
5.6) judging whether the processing sub-queue is compressible, if the processing sub-queue is compressible, determining a new compression mode of the processing sub-queue according to the content of the processing sub-queue, taking out all addresses of the processing sub-queue for compression, if the new compression mode is continuous address compression, performing continuous address compression, if the new compression mode is high 32-bit same address compression, performing high 32-bit same address compression, writing address records, and skipping to execute the step 5.7); otherwise, all the addresses of the sub-queue are taken out and processed without compression, address records are written, and the step 5.7 is executed by skipping;
5.7) judging that the recording of the write address of the compressed address information is finished.
The invention also provides a device for compressing redundancy information of the fault-tolerant NVM (non-volatile memory) persistent process, which comprises computer equipment, wherein the computer equipment is programmed to execute the steps of the method for compressing the redundancy information of the fault-tolerant NVM persistent process.
Compared with the prior art, the invention has the following advantages:
1. the invention can fully utilize the space-time locality of addresses and values of persistent write operations in transactions, and can obtain compression benefits from a small number of write operations, and particularly can obtain huge compression ratios when a large number of continuous addresses (such as arrays) are written with values with small new and old change ranges, thereby greatly improving the persistence efficiency.
2. The compression method adopted by the invention is simple, and the overhead introduced by the compression method is smaller compared with the write delay of the NVM.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention. In addition, the technical features involved in the embodiments of the present invention described below may be combined with each other as long as they do not conflict with each other. The logging mechanism in this embodiment is redo logging (i.e., logging new values).
As shown in fig. 3, the implementation steps of the method for compressing redundancy information in the fault-tolerant NVM persistence process of this embodiment include:
1) distributing a write set WriteSet, an address queue and a plurality of value queues vqueue _ X in a thread supporting transaction persistence, wherein the value queues vqueue _ X are used for recording values of which old and new only have different low X bits, the low X bits of all recorded values are written into a persistence log when the values are compressed, and the X values of different value queues vqueue _ X are different;
2) allocating a persistent log save area in the NVM;
3) writing all persistent write operation information to the write set WriteSet during the transaction;
4) scanning a write set WriteSet, filtering write set elements with completely the same new and old values, respectively adding values of the write set elements with different values of only low X bits into corresponding value queues vqueue _ X for the write set elements with completely the same new and old values, and adding addresses of the write set elements with incompletely the same new and old values into address queues queue according to the sequence of the values of X in the value queues vqueue _ X from large to small;
5) compressing the address information in the address queue so that the same part of a group of adjacent addresses is eliminated, and writing the compressed address information into an address record;
6) and writing the low X bits of the N bit values of all elements in each value queue vqueue _ X into the value record, thereby eliminating redundant information of the values.
The write set WriteSet is used to record the address and new value of each persistent write operation in a transaction, and if the same address is written multiple times, the last new value is recorded. And respectively recording the value of the corresponding compression mode by each value queue, recording a new value if a Redo Log Redo-Log mechanism is adopted, and recording an old value if the Undo Log Undo-Log mechanism is adopted. The present embodiment records new values using the redo log.
In this embodiment, the address and the value are both 64 bits (bit).
In this embodiment, the multiple value queues vqueue _ X in step 1) include value queues vqueue _ N, vqueue _ N/2 and vqueue _ N/22、…、vqueue_16The vqueue _8, wherein N is the bit width of the persistent log, vqueue _ N is used for recording new and old completely different values, and the lower N bits of each recorded value are written into the persistent log when the values are compressed; vqueue _ N/2 is used for recording different values of new and old with low N/2 bits, and the low N/2 bits of each recorded value are written into a persistent log when the values are compressed; … …, respectively; vqueue _16 is used for recording different values of old and new values with only lower 16 bits, and the lower 16 bits of each recorded value are written into a persistent log when the values are compressed; vqueue _8 is used to record values whose old and new have only 8 lower bits different, and the lower 8 bits of each recorded value are written into the persistent log when the values are compressed.
The address queue is used for recording addresses, the recording sequence is according to the address corresponding to each value of vqueue _ N, the address corresponding to each value of vqueue _ N/2, … …, the address corresponding to each value of vqueue _16 and the address corresponding to each value of vqueue _8, and when the addresses are compressed, the compressed records are written into the persistent log according to the continuity or high order bits of each address.
In this embodiment, each persistent log in the persistent log save area allocated in the NVM in step 2) includes 3 portions, which correspond to the number of values in each value queue vqueue _ X, records of multiple compressed addresses, and records of multiple compressed values.
In this embodiment, the bit width of the persistent log is 64 bits.
Taking the 64-bit value as an example, the data structure of the persistent log is shown in fig. 4. In the table shown in fig. 4, each row represents a 64-bit space. The order after decompression of the address records in the table shown in fig. 4 is the same as the order after decompression of the m1, m2, m3, m4 value records, and is equal in number to m1+ m2+ m3+ m4, so that the one-to-one correspondence < address, value > pairs can be recovered after decompression. In addition, direct continuous filling is allowed during bit number switching in the value recording process, and no space is required, such as: the last 32bit value record only occupies the low 32 bits of the 64bit space, then the high 32 bits can directly follow the padding of the 16bit value record.
In this embodiment, the address records are divided into 3 types according to the address compression mode, and are determined by the low 2 bits of the address (8 aligned), 00 indicates no compression, 01 indicates continuous address compression, and 10 indicates high N/2 bit (for example, 64 bits, high 32 bits) same address compression.
The uncompressed address record is as follows:
64 bit: address (Low 2bit is 00) |
The continuous address compression record is as follows:
the high 32bit same address compression record is as follows:
each address pair is shaped as follows:
32 bit: address 2 low 32bit
|
32 bit: address 1 low 32bit |
The last address pair is allowed to be unfilled, i.e. the number of compressed addresses is now odd.
In this embodiment, taking 64bit as an example, the value records are divided into 4 types according to the value compression mode: record a 64bit value, record value 32 bits lower, record value 16 bits lower, record value 8 bits lower, wherein the 64bit mode represents an uncompressed value. The number of values in each compression mode is recorded in the header of the persistent log. According to the value compression mode, each 64-bit space can record 1/2/4/8 values of recorded bits in turn, allowing the last 64-bit space to be unfilled.
During a transaction, the < address, value > pairs of all persistent write operations are written to the write set WriteSet. If the write set WriteSet already has the address, then the new value written later replaces the original value. In this embodiment, the detailed steps of step 3) include: for the address to be written by the persistent write operation tx _ write and the information pair < addr, value >, judging whether the value addr already exists in the write set WriteSet, and if so, replacing the value of the persistent write operation tx _ write with the original value' of the existing item in the write set WriteSet; if the value addr is not present in the WriteSet WriteSet, then the pair of information < addr, value > is added to the WriteSet WriteSet. As shown in fig. 5, assume that there are currently 4 < address, value > pairs < addr1, value1> … … < addr4, value4> in the WriteSet. The write operation tx _ write writes < addr4, value4 '>, since addr4 already exists in the write set, replacing the value of value4 with the value of value 4'. The write operation tx _ write writes < addr5, value5>, because addr5 is not in the write set, < addr5, value5> is added to the write set. The italicized content in figure 5 represents the variation of the write set WriteSet.
The compression of the WriteSet WriteSet in this embodiment can be divided into three phases:
first, the preparation phase, step 4).
Scanning write set WriteSet in the step 4), filtering write set elements with completely the same new and old values, respectively adding values of the write set elements with the different values of the low X bits of the new and old values into corresponding value queues vqueue _ X, and adding addresses of the write set elements with the incompletely the same new and old values into address queues according to the sequence of the values of X in the value queues vqueue _ X from large to small. Taking 64-bit addresses and values as an example, specifically, the method includes preparing arrays vqueue _64, vqueue _32, vqueue _16, vqueue _8 and queue with proper sizes, wherein the queue serves as an address compression queue, and the other queues serve as value compression queues. 1) Comparing the new value with the old value: and scanning the write set WriteSet, comparing the new and old values to generate a compression mode (keeping 64bit low, keeping 32bit low, keeping 16bit low and keeping 8bit low), and filtering write set elements with completely same new and old values. 2) Value enqueue: and after filtering, adding the values into corresponding value queues vqueue _ N respectively according to a compression mode obtained by comparing the new values with the old values. 3) Address queuing: and finally, adding the corresponding addresses into the queue according to the sequence of vqueue _64- > vqueue _32- > vqueue _16- > vqueue _ 8. The process and the redundancy of information therein is illustrated in fig. 6, where italicized characters represent the redundant part of the information.
And step two, an address compression stage, namely step 5).
And 5) compressing the address information in the address queue so as to eliminate the same part of a group of adjacent addresses, and writing the compressed address information into the address record.
As shown in fig. 7, the detailed steps of step 5) include:
5.1) constructing a processing sub-queue, initially putting the first two addresses of a queue in the processing sub-queue, and initializing a current compression mode into an incompressible mode, wherein the compression mode comprises three types of the incompressible mode, continuous address compression and high 32-bit same address compression, the continuous address compression mode refers to the compression of a group of address queues with the same difference value of adjacent addresses, and the high 32-bit same address compression mode refers to the compression of a group of address queues with the same high 32-bit;
5.2) sequentially traversing subsequent addresses Addr in the address queue, adding the currently traversed addresses Addr into the processing sub-queue when one address Addr is traversed, and skipping to execute the step 5.3); jumping to execute the step 5.5) until the traversal is completed;
5.3) judging whether the processing sub-queue is incompressible before and after the address Addr is added, if the processing sub-queue is incompressible before and after the address Addr is added, taking out the first address of the processing sub-queue, writing the address record, and skipping to execute the step 5.2); otherwise, skipping to execute the step 5.4);
5.4) judging whether the compression mode of the processing sub-queue changes after the address Addr is added, if the compression mode corresponding to the processing sub-queue changes after the address Addr is added, taking out all addresses in the processing sub-queue before the address Addr for compression, if the original compression mode is continuous address compression, performing continuous address compression, and if the original compression mode is same address compression with 32 bits higher, performing same address compression with 32 bits higher, and writing the same address compression into an address record; skipping to execute step 5.2);
5.5) judging whether the processing sub queue is empty, and if so, skipping to execute the step 5.7); otherwise, skipping to execute the step 5.6);
5.6) judging whether the processing sub-queue is compressible, if the processing sub-queue is compressible, determining a new compression mode of the processing sub-queue according to the content of the processing sub-queue, taking out all addresses of the processing sub-queue for compression, if the new compression mode is continuous address compression, performing continuous address compression, if the new compression mode is high 32-bit same address compression, performing high 32-bit same address compression, writing address records, and skipping to execute the step 5.7); otherwise, all the addresses of the sub-queue are taken out and processed without compression, address records are written, and the step 5.7 is executed by skipping;
5.7) judging that the recording of the write address of the compressed address information is finished.
The result of this process is shown in fig. 8, where italicized characters indicate redundant portions of information, and redundant information (the same portion of a set of adjacent addresses) for the compressed address has been eliminated. Continuous address compression case: 5 addresses are continuous, the last two bit values of the addresses are modified to be 1 by taking the first address as a reference, and the description field desc records the number of the addresses 5 and the offset 8; high 32bit same address compression case: the 5 addresses are the same with 32 higher bits, the last two bits are modified to be 2 by taking the first address as a reference, the description field desc records the number of addresses 5, and the subsequent address pairs sequentially record the 5 addresses with 32 lower bits.
And thirdly, a value compression stage, namely step 6).
Step 6) writing the low X bits of the N bit values of all elements in each value queue vqueue _ X into a value record, thereby eliminating redundant information of the values. In this embodiment, the method specifically includes:
estimating and allocating value recording space values according to the number of elements in the value queues vqueue _64, vqueue _32, vqueue _16 and vqueue _8, and sequentially recording the numbers in a head 64bit of the persistent log;
sequentially writing the values of the elements in vqueue _64 into value record values;
successively writing the lower 32 bits of the value of the element in vqueue _32 into value records values;
successively writing the lower 16 bits of the value of the element in vqueue _16 into value records values;
the lower 8 bits of the value of the element in vqueue _8 are successively written to the value record values in turn.
The result of this process is shown in fig. 9, where italics are redundant information and after compression, the redundant information (the same part as the old value) of the value has been eliminated.
In summary, NVM has the disadvantages of higher write latency and limited write times compared to DRAM, and in order to ensure the correctness and consistency of persistent data, conventional transaction log methods are usually adopted, which require two write operations and fail to utilize the redundancy existing in the locality of information during program execution. The core idea of the foregoing method for compressing redundant information in a fault-tolerant NVM persistence process in this embodiment is to compress and eliminate redundant information in a transaction log of a persistent storage as much as possible, including address information redundancy caused by locality of program execution and information redundancy of new and old contents caused by writing a new value, and in the compression process, a high-performance DRAM is fully used as a buffer, only a result of a final compression process is written into the NVM, and by compressing the redundant information in the persistent storage, the size of persistent data can be reduced by compressing and eliminating the redundancy as much as possible, thereby reducing the persistence overhead of a transaction, and achieving the purposes of improving the performance of a persistent transaction and prolonging the service life of the NVM. In addition, the present embodiment further provides a device for compressing fault-tolerant NVM persistence process redundancy information, which includes a computer device programmed to execute the steps of the method for compressing fault-tolerant NVM persistence process redundancy information according to the present embodiment.
The above description is only a preferred embodiment of the present invention, and the protection scope of the present invention is not limited to the above embodiments, and all technical solutions belonging to the idea of the present invention belong to the protection scope of the present invention. It should be noted that modifications and embellishments within the scope of the invention may occur to those skilled in the art without departing from the principle of the invention, and are considered to be within the scope of the invention.