[go: up one dir, main page]

CN109542356B - Fault-tolerant NVM (non-volatile memory) persistence process redundancy information compression method and device - Google Patents

Fault-tolerant NVM (non-volatile memory) persistence process redundancy information compression method and device Download PDF

Info

Publication number
CN109542356B
CN109542356B CN201811457572.7A CN201811457572A CN109542356B CN 109542356 B CN109542356 B CN 109542356B CN 201811457572 A CN201811457572 A CN 201811457572A CN 109542356 B CN109542356 B CN 109542356B
Authority
CN
China
Prior art keywords
address
value
queue
compression
vqueue
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201811457572.7A
Other languages
Chinese (zh)
Other versions
CN109542356A (en
Inventor
彭林
张文喆
陈俊峰
黄春
唐滔
方建滨
左克
姜浩
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
National University of Defense Technology
Original Assignee
National University of Defense Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by National University of Defense Technology filed Critical National University of Defense Technology
Priority to CN201811457572.7A priority Critical patent/CN109542356B/en
Publication of CN109542356A publication Critical patent/CN109542356A/en
Application granted granted Critical
Publication of CN109542356B publication Critical patent/CN109542356B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Links

Images

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601Interfaces specially adapted for storage systems
    • G06F3/0602Interfaces specially adapted for storage systems specifically adapted to achieve a particular effect
    • G06F3/0608Saving storage space on storage systems
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601Interfaces specially adapted for storage systems
    • G06F3/0602Interfaces specially adapted for storage systems specifically adapted to achieve a particular effect
    • G06F3/0614Improving the reliability of storage systems
    • G06F3/0616Improving the reliability of storage systems in relation to life time, e.g. increasing Mean Time Between Failures [MTBF]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601Interfaces specially adapted for storage systems
    • G06F3/0628Interfaces specially adapted for storage systems making use of a particular technique
    • G06F3/0638Organizing or formatting or addressing of data
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601Interfaces specially adapted for storage systems
    • G06F3/0668Interfaces specially adapted for storage systems adopting a particular infrastructure
    • G06F3/0671In-line storage system
    • G06F3/0673Single storage device
    • G06F3/0679Non-volatile semiconductor memory device, e.g. flash memory, one time programmable memory [OTP]

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Debugging And Monitoring (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

本发明公开了一种面向容错的NVM持久化过程冗余信息的压缩方法及装置,本发明实施步骤包括分配写集合、地址队列以及多个值队列,在NVM中分配持久化日志保存区;在事务期间将所有持久化写操作信息写入写集合;扫描写集合,按照值的新旧对比情况分别加入相应的值队列,并按值队列依序加入地址队列;针对地址队列中的地址信息进行压缩使得一组相邻地址的相同部分被消除,将压缩后的地址信息写入地址记录;将各个值队列中所有元素的N位值的低X位写入值记录,从而将值的冗余信息消除。本发明能够减小持久化数据的大小,从而降低事务的持久化开销,达到提升持久化事务性能、延长NVM使用寿命的目的。

Figure 201811457572

The invention discloses a fault-tolerant NVM persistence process redundancy information compression method and device. The implementation steps of the invention include allocating a write set, an address queue and a plurality of value queues, and allocating a persistent log storage area in the NVM; Write all persistent write operation information into the write set during the transaction; scan the write set, add the corresponding value queues according to the old and new comparisons of the values, and add them to the address queues in sequence according to the value queues; compress the address information in the address queues The same part of a group of adjacent addresses is eliminated, and the compressed address information is written into the address record; the lower X bits of the N-bit value of all elements in each value queue are written into the value record, thereby the redundant information of the value is written eliminate. The present invention can reduce the size of persistent data, thereby reducing the persistent overhead of transactions, and achieve the purpose of improving persistent transaction performance and prolonging the service life of NVM.

Figure 201811457572

Description

Fault-tolerant NVM (non-volatile memory) persistence process redundancy information compression method and device
Technical Field
The invention relates to the field of computer persistent memories, in particular to a method and a device for compressing redundant information in a fault-tolerant NVM (non volatile memory) persistent process, which are used for compressing the redundant information in persistent storage to improve the writing performance and the service life of the persistent memory.
Background
In recent years, with the rapid development of the computer field, the processing capacity of a Central Processing Unit (CPU) is greatly improved, the performance of a storage system is relatively slightly improved, and a huge performance gap exists between the traditional main memory and auxiliary memory structure, which becomes the performance bottleneck of the computer system.
On one hand, in a traditional main Memory-auxiliary Memory system, a main Memory (generally, a Dynamic Random Access Memory (DRAM)) generally has the characteristics of small capacity, high speed, volatility (power-off data loss) and the like; while the auxiliary memory (magnetic disk, magnetic tape, optical disk, flash memory) usually has the characteristics of large capacity, slow speed, nonvolatility (no loss of power-off data) and the like. Such architectures are largely determined by the hardware characteristics of the memory devices, but they have become increasingly unable to meet today's memory requirements. For example, a memory database system resides all mass data in a main memory, and an auxiliary memory only performs backup, so that performance can be effectively improved, but due to the volatility of a DRAM, data loss may be caused once the power is off or the system crashes, so various fault-tolerant means need to be adopted to timely store data and maintain the consistency of the data, a common method is to store the data to an auxiliary memory medium periodically, and frequent data backup (from a main memory to the auxiliary memory) causes little performance loss.
On the other hand, as the technology of DRAM is approaching its limit, it is increasingly difficult to increase its speed and capacity. Therefore, the industry has been actively exploring and researching new storage materials, devices and even new storage systems. NVM (Non-Volatile Memory) is a new type of Memory device that has attracted attention in recent years.
NVM typically has the characteristics of byte-granular addressing, large capacity density, fast access speed, non-volatility (no loss of data when power is off), etc. NVM has both primary and secondary Memory characteristics, also known as Persistent Memory (PM) or Storage Class Memory (SCM). Among many NVM technologies, Phase Change Memory (PCM) is the most developed and representative, and its hardware characteristics are shown in table 1.
Table 1, PCM, DRAM hardware characteristic comparison table.
Figure BDA0001888040220000011
Figure BDA0001888040220000021
The emergence and rapid development of NVM has brought new opportunities and changes to memory systems. Based on the existing architecture shown in fig. 1(a), NVM can replace disk, flash memory, etc. in the conventional storage structure to become fast auxiliary memory as shown in fig. 1 (b); can also be used as a main memory instead of DRAM as shown in FIG. 1 (d); it is even possible to coexist with DRAM to form a hybrid main memory as shown in fig. 1 (c). Among them, the hybrid main memory structure as shown in fig. 1(c) is a hot spot of current industry research.
At the software level, NVM-based data persistence also promises a number of new technical approaches. To ensure data consistency, these technical approaches use more or less transaction-based logging mechanisms (transaction logging). The transaction type persistent Log is mainly divided into a Redo Log (Redo-Log) and an Undo Log (Undo-Log), and currently, persistent storage libraries Mnemosyn and NVML in an x86 system are respectively application representatives of the transaction type persistent Log. Both log mechanisms can implement data write operation transacting, that is, it is ensured that write data is either successful or rolled back to the state before transaction commit if there is a failure. Fig. 2 shows these two logging mechanisms. As shown in fig. 2(a), when a transaction is committed, the Redo-Log writes addresses and new values of all write operations in the transaction into a persistent Log, then updates the values of the addresses, and discards the Log after the update is successful; as shown in FIG. 2(b), when a transaction is committed, Undo-Log first copies the address and old value of each write operation into a persistent Log, then updates the value of the address, and discards the Log after the update is successful. As can be seen from FIG. 2, whatever the logging mechanism, there is some overhead in persisting: two write operations (write to log, write to persistent heap) must be performed, each write operation requires two time-consuming operations, namely data update (flush) and operation synchronization (fence), to ensure completion of write persistence, and the write operation of NVM is delayed. On the other hand, such log persistence simply writes address and value information to NVM, and does not address redundancy that exists locally in information during program execution.
Disclosure of Invention
The technical problems to be solved by the invention are as follows: aiming at the problems in the prior art, the invention provides a method and a device for compressing redundant information in a fault-tolerant NVM (non-volatile memory) persistence process.
In order to solve the technical problems, the invention adopts the technical scheme that:
a compression method for redundant information of a fault-tolerant NVM (non-volatile memory) persistence process comprises the following implementation steps:
1) distributing a write set WriteSet, an address queue and a plurality of value queues vqueue _ X in a thread supporting transaction persistence, wherein the value queues vqueue _ X are used for recording values of which old and new only have different low X bits, the low X bits of all recorded values are written into a persistence log when the values are compressed, and the X values of different value queues vqueue _ X are different;
2) allocating a persistent log save area in the NVM;
3) writing all persistent write operation information to the write set WriteSet during the transaction;
4) scanning a write set WriteSet, filtering write set elements with completely the same new and old values, respectively adding values of the write set elements with different values of only low X bits into corresponding value queues vqueue _ X for the write set elements with completely the same new and old values, and adding addresses of the write set elements with incompletely the same new and old values into address queues queue according to the sequence of the values of X in the value queues vqueue _ X from large to small;
5) compressing the address information in the address queue so that the same part of a group of adjacent addresses is eliminated, and writing the compressed address information into an address record;
6) and writing the low X bits of the N bit values of all elements in each value queue vqueue _ X into the value record, thereby eliminating redundant information of the values.
Preferably, the plurality of value queues vqueue _ X in step 1) comprises value queues vqueue _ N, vqueue _ N/2, vqueue _ N/22…, vqueue _16 and vqueue _8, wherein N is the bit width of the persistent log, and vqueue _ N is used for recording old and new complete dataDifferent values, the lower N bits of each record value are written into the persistent log when the values are compressed; vqueue _ N/2 is used for recording different values of new and old with low N/2 bits, and the low N/2 bits of each recorded value are written into a persistent log when the values are compressed; … …, respectively; vqueue _16 is used for recording different values of old and new values with only lower 16 bits, and the lower 16 bits of each recorded value are written into a persistent log when the values are compressed; vqueue _8 is used to record values whose old and new have only 8 lower bits different, and the lower 8 bits of each recorded value are written into the persistent log when the values are compressed.
Preferably, each persistent log in step 2) includes 3 portions corresponding to the number of values of the respective value queue vqueue _ X, the records of the plurality of compressed addresses, and the records of the plurality of compressed values when the persistent log save area is allocated in the NVM.
Preferably, the persistent log has a bit width of 64 bits, which can be similarly extended to 128 bits or higher.
Preferably, the detailed steps of step 3) include: for the address and value information pair < addr, value > to be written by the persistent write operation tx _ write, judging whether the address addr already exists in the write set WriteSet, and if so, replacing the original value of the existing item in the write set WriteSet with the value of the persistent write operation tx _ write; if address addr is not present in the WriteSet WriteSet, then add the pair of information < addr, value > to the WriteSet WriteSet.
Preferably, the detailed steps of step 5) include:
5.1) constructing a processing sub-queue, initially putting the first two addresses of a queue in the processing sub-queue, and initializing a current compression mode into an incompressible mode, wherein the compression mode comprises three types of the incompressible mode, continuous address compression and high 32-bit same address compression, the continuous address compression mode refers to the compression of a group of address queues with the same difference value of adjacent addresses, and the high 32-bit same address compression mode refers to the compression of a group of address queues with the same high 32-bit;
5.2) sequentially traversing subsequent addresses Addr in the address queue, adding the currently traversed addresses Addr into the processing sub-queue when one address Addr is traversed, and skipping to execute the step 5.3); jumping to execute the step 5.5) until the traversal is completed;
5.3) judging whether the processing sub-queue is incompressible before and after the address Addr is added, if the processing sub-queue is incompressible before and after the address Addr is added, taking out the first address of the processing sub-queue, writing the address record, and skipping to execute the step 5.2); otherwise, skipping to execute the step 5.4);
5.4) judging whether the compression mode of the processing sub-queue changes after the address Addr is added, if the compression mode corresponding to the processing sub-queue changes after the address Addr is added, taking out all addresses in the processing sub-queue before the address Addr for compression, if the original compression mode is continuous address compression, performing continuous address compression, and if the original compression mode is same address compression with 32 bits higher, performing same address compression with 32 bits higher, and writing the same address compression into an address record; skipping to execute step 5.2);
5.5) judging whether the processing sub queue is empty, and if so, skipping to execute the step 5.7); otherwise, skipping to execute the step 5.6);
5.6) judging whether the processing sub-queue is compressible, if the processing sub-queue is compressible, determining a new compression mode of the processing sub-queue according to the content of the processing sub-queue, taking out all addresses of the processing sub-queue for compression, if the new compression mode is continuous address compression, performing continuous address compression, if the new compression mode is high 32-bit same address compression, performing high 32-bit same address compression, writing address records, and skipping to execute the step 5.7); otherwise, all the addresses of the sub-queue are taken out and processed without compression, address records are written, and the step 5.7 is executed by skipping;
5.7) judging that the recording of the write address of the compressed address information is finished.
The invention also provides a device for compressing redundancy information of the fault-tolerant NVM (non-volatile memory) persistent process, which comprises computer equipment, wherein the computer equipment is programmed to execute the steps of the method for compressing the redundancy information of the fault-tolerant NVM persistent process.
Compared with the prior art, the invention has the following advantages:
1. the invention can fully utilize the space-time locality of addresses and values of persistent write operations in transactions, and can obtain compression benefits from a small number of write operations, and particularly can obtain huge compression ratios when a large number of continuous addresses (such as arrays) are written with values with small new and old change ranges, thereby greatly improving the persistence efficiency.
2. The compression method adopted by the invention is simple, and the overhead introduced by the compression method is smaller compared with the write delay of the NVM.
Drawings
FIG. 1 is a schematic diagram of a conventional memory architecture and a new NVM-based memory architecture.
FIG. 2 is a schematic diagram of two Log mechanisms of a Redo Log (Redo-Log) and an Undo Log (Undo-Log) in the prior art.
FIG. 3 is a schematic diagram of a basic flow of a method according to an embodiment of the present invention.
FIG. 4 is a diagram illustrating a data structure of a persistent log according to an embodiment of the present invention.
FIG. 5 is a diagram of an example of a write set WriteSet in an embodiment of the invention.
FIG. 6 is a diagram illustrating an example of a preparation phase in an embodiment of the present invention.
FIG. 7 is a detailed flowchart of step 5) in the embodiment of the present invention.
FIG. 8 shows the result of the address compression stage in an embodiment of the present invention.
FIG. 9 shows the results obtained during the value compression phase in an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention. In addition, the technical features involved in the embodiments of the present invention described below may be combined with each other as long as they do not conflict with each other. The logging mechanism in this embodiment is redo logging (i.e., logging new values).
As shown in fig. 3, the implementation steps of the method for compressing redundancy information in the fault-tolerant NVM persistence process of this embodiment include:
1) distributing a write set WriteSet, an address queue and a plurality of value queues vqueue _ X in a thread supporting transaction persistence, wherein the value queues vqueue _ X are used for recording values of which old and new only have different low X bits, the low X bits of all recorded values are written into a persistence log when the values are compressed, and the X values of different value queues vqueue _ X are different;
2) allocating a persistent log save area in the NVM;
3) writing all persistent write operation information to the write set WriteSet during the transaction;
4) scanning a write set WriteSet, filtering write set elements with completely the same new and old values, respectively adding values of the write set elements with different values of only low X bits into corresponding value queues vqueue _ X for the write set elements with completely the same new and old values, and adding addresses of the write set elements with incompletely the same new and old values into address queues queue according to the sequence of the values of X in the value queues vqueue _ X from large to small;
5) compressing the address information in the address queue so that the same part of a group of adjacent addresses is eliminated, and writing the compressed address information into an address record;
6) and writing the low X bits of the N bit values of all elements in each value queue vqueue _ X into the value record, thereby eliminating redundant information of the values.
The write set WriteSet is used to record the address and new value of each persistent write operation in a transaction, and if the same address is written multiple times, the last new value is recorded. And respectively recording the value of the corresponding compression mode by each value queue, recording a new value if a Redo Log Redo-Log mechanism is adopted, and recording an old value if the Undo Log Undo-Log mechanism is adopted. The present embodiment records new values using the redo log.
In this embodiment, the address and the value are both 64 bits (bit).
In this embodiment, the multiple value queues vqueue _ X in step 1) include value queues vqueue _ N, vqueue _ N/2 and vqueue _ N/22、…、vqueue_16The vqueue _8, wherein N is the bit width of the persistent log, vqueue _ N is used for recording new and old completely different values, and the lower N bits of each recorded value are written into the persistent log when the values are compressed; vqueue _ N/2 is used for recording different values of new and old with low N/2 bits, and the low N/2 bits of each recorded value are written into a persistent log when the values are compressed; … …, respectively; vqueue _16 is used for recording different values of old and new values with only lower 16 bits, and the lower 16 bits of each recorded value are written into a persistent log when the values are compressed; vqueue _8 is used to record values whose old and new have only 8 lower bits different, and the lower 8 bits of each recorded value are written into the persistent log when the values are compressed.
The address queue is used for recording addresses, the recording sequence is according to the address corresponding to each value of vqueue _ N, the address corresponding to each value of vqueue _ N/2, … …, the address corresponding to each value of vqueue _16 and the address corresponding to each value of vqueue _8, and when the addresses are compressed, the compressed records are written into the persistent log according to the continuity or high order bits of each address.
In this embodiment, each persistent log in the persistent log save area allocated in the NVM in step 2) includes 3 portions, which correspond to the number of values in each value queue vqueue _ X, records of multiple compressed addresses, and records of multiple compressed values.
In this embodiment, the bit width of the persistent log is 64 bits.
Taking the 64-bit value as an example, the data structure of the persistent log is shown in fig. 4. In the table shown in fig. 4, each row represents a 64-bit space. The order after decompression of the address records in the table shown in fig. 4 is the same as the order after decompression of the m1, m2, m3, m4 value records, and is equal in number to m1+ m2+ m3+ m4, so that the one-to-one correspondence < address, value > pairs can be recovered after decompression. In addition, direct continuous filling is allowed during bit number switching in the value recording process, and no space is required, such as: the last 32bit value record only occupies the low 32 bits of the 64bit space, then the high 32 bits can directly follow the padding of the 16bit value record.
In this embodiment, the address records are divided into 3 types according to the address compression mode, and are determined by the low 2 bits of the address (8 aligned), 00 indicates no compression, 01 indicates continuous address compression, and 10 indicates high N/2 bit (for example, 64 bits, high 32 bits) same address compression.
The uncompressed address record is as follows:
64 bit: address (Low 2bit is 00)
The continuous address compression record is as follows:
Figure BDA0001888040220000061
the high 32bit same address compression record is as follows:
Figure BDA0001888040220000062
each address pair is shaped as follows:
32 bit: address 2 low 32bit 32 bit: address 1 low 32bit
The last address pair is allowed to be unfilled, i.e. the number of compressed addresses is now odd.
In this embodiment, taking 64bit as an example, the value records are divided into 4 types according to the value compression mode: record a 64bit value, record value 32 bits lower, record value 16 bits lower, record value 8 bits lower, wherein the 64bit mode represents an uncompressed value. The number of values in each compression mode is recorded in the header of the persistent log. According to the value compression mode, each 64-bit space can record 1/2/4/8 values of recorded bits in turn, allowing the last 64-bit space to be unfilled.
During a transaction, the < address, value > pairs of all persistent write operations are written to the write set WriteSet. If the write set WriteSet already has the address, then the new value written later replaces the original value. In this embodiment, the detailed steps of step 3) include: for the address to be written by the persistent write operation tx _ write and the information pair < addr, value >, judging whether the value addr already exists in the write set WriteSet, and if so, replacing the value of the persistent write operation tx _ write with the original value' of the existing item in the write set WriteSet; if the value addr is not present in the WriteSet WriteSet, then the pair of information < addr, value > is added to the WriteSet WriteSet. As shown in fig. 5, assume that there are currently 4 < address, value > pairs < addr1, value1> … … < addr4, value4> in the WriteSet. The write operation tx _ write writes < addr4, value4 '>, since addr4 already exists in the write set, replacing the value of value4 with the value of value 4'. The write operation tx _ write writes < addr5, value5>, because addr5 is not in the write set, < addr5, value5> is added to the write set. The italicized content in figure 5 represents the variation of the write set WriteSet.
The compression of the WriteSet WriteSet in this embodiment can be divided into three phases:
first, the preparation phase, step 4).
Scanning write set WriteSet in the step 4), filtering write set elements with completely the same new and old values, respectively adding values of the write set elements with the different values of the low X bits of the new and old values into corresponding value queues vqueue _ X, and adding addresses of the write set elements with the incompletely the same new and old values into address queues according to the sequence of the values of X in the value queues vqueue _ X from large to small. Taking 64-bit addresses and values as an example, specifically, the method includes preparing arrays vqueue _64, vqueue _32, vqueue _16, vqueue _8 and queue with proper sizes, wherein the queue serves as an address compression queue, and the other queues serve as value compression queues. 1) Comparing the new value with the old value: and scanning the write set WriteSet, comparing the new and old values to generate a compression mode (keeping 64bit low, keeping 32bit low, keeping 16bit low and keeping 8bit low), and filtering write set elements with completely same new and old values. 2) Value enqueue: and after filtering, adding the values into corresponding value queues vqueue _ N respectively according to a compression mode obtained by comparing the new values with the old values. 3) Address queuing: and finally, adding the corresponding addresses into the queue according to the sequence of vqueue _64- > vqueue _32- > vqueue _16- > vqueue _ 8. The process and the redundancy of information therein is illustrated in fig. 6, where italicized characters represent the redundant part of the information.
And step two, an address compression stage, namely step 5).
And 5) compressing the address information in the address queue so as to eliminate the same part of a group of adjacent addresses, and writing the compressed address information into the address record.
As shown in fig. 7, the detailed steps of step 5) include:
5.1) constructing a processing sub-queue, initially putting the first two addresses of a queue in the processing sub-queue, and initializing a current compression mode into an incompressible mode, wherein the compression mode comprises three types of the incompressible mode, continuous address compression and high 32-bit same address compression, the continuous address compression mode refers to the compression of a group of address queues with the same difference value of adjacent addresses, and the high 32-bit same address compression mode refers to the compression of a group of address queues with the same high 32-bit;
5.2) sequentially traversing subsequent addresses Addr in the address queue, adding the currently traversed addresses Addr into the processing sub-queue when one address Addr is traversed, and skipping to execute the step 5.3); jumping to execute the step 5.5) until the traversal is completed;
5.3) judging whether the processing sub-queue is incompressible before and after the address Addr is added, if the processing sub-queue is incompressible before and after the address Addr is added, taking out the first address of the processing sub-queue, writing the address record, and skipping to execute the step 5.2); otherwise, skipping to execute the step 5.4);
5.4) judging whether the compression mode of the processing sub-queue changes after the address Addr is added, if the compression mode corresponding to the processing sub-queue changes after the address Addr is added, taking out all addresses in the processing sub-queue before the address Addr for compression, if the original compression mode is continuous address compression, performing continuous address compression, and if the original compression mode is same address compression with 32 bits higher, performing same address compression with 32 bits higher, and writing the same address compression into an address record; skipping to execute step 5.2);
5.5) judging whether the processing sub queue is empty, and if so, skipping to execute the step 5.7); otherwise, skipping to execute the step 5.6);
5.6) judging whether the processing sub-queue is compressible, if the processing sub-queue is compressible, determining a new compression mode of the processing sub-queue according to the content of the processing sub-queue, taking out all addresses of the processing sub-queue for compression, if the new compression mode is continuous address compression, performing continuous address compression, if the new compression mode is high 32-bit same address compression, performing high 32-bit same address compression, writing address records, and skipping to execute the step 5.7); otherwise, all the addresses of the sub-queue are taken out and processed without compression, address records are written, and the step 5.7 is executed by skipping;
5.7) judging that the recording of the write address of the compressed address information is finished.
The result of this process is shown in fig. 8, where italicized characters indicate redundant portions of information, and redundant information (the same portion of a set of adjacent addresses) for the compressed address has been eliminated. Continuous address compression case: 5 addresses are continuous, the last two bit values of the addresses are modified to be 1 by taking the first address as a reference, and the description field desc records the number of the addresses 5 and the offset 8; high 32bit same address compression case: the 5 addresses are the same with 32 higher bits, the last two bits are modified to be 2 by taking the first address as a reference, the description field desc records the number of addresses 5, and the subsequent address pairs sequentially record the 5 addresses with 32 lower bits.
And thirdly, a value compression stage, namely step 6).
Step 6) writing the low X bits of the N bit values of all elements in each value queue vqueue _ X into a value record, thereby eliminating redundant information of the values. In this embodiment, the method specifically includes:
estimating and allocating value recording space values according to the number of elements in the value queues vqueue _64, vqueue _32, vqueue _16 and vqueue _8, and sequentially recording the numbers in a head 64bit of the persistent log;
sequentially writing the values of the elements in vqueue _64 into value record values;
successively writing the lower 32 bits of the value of the element in vqueue _32 into value records values;
successively writing the lower 16 bits of the value of the element in vqueue _16 into value records values;
the lower 8 bits of the value of the element in vqueue _8 are successively written to the value record values in turn.
The result of this process is shown in fig. 9, where italics are redundant information and after compression, the redundant information (the same part as the old value) of the value has been eliminated.
In summary, NVM has the disadvantages of higher write latency and limited write times compared to DRAM, and in order to ensure the correctness and consistency of persistent data, conventional transaction log methods are usually adopted, which require two write operations and fail to utilize the redundancy existing in the locality of information during program execution. The core idea of the foregoing method for compressing redundant information in a fault-tolerant NVM persistence process in this embodiment is to compress and eliminate redundant information in a transaction log of a persistent storage as much as possible, including address information redundancy caused by locality of program execution and information redundancy of new and old contents caused by writing a new value, and in the compression process, a high-performance DRAM is fully used as a buffer, only a result of a final compression process is written into the NVM, and by compressing the redundant information in the persistent storage, the size of persistent data can be reduced by compressing and eliminating the redundancy as much as possible, thereby reducing the persistence overhead of a transaction, and achieving the purposes of improving the performance of a persistent transaction and prolonging the service life of the NVM. In addition, the present embodiment further provides a device for compressing fault-tolerant NVM persistence process redundancy information, which includes a computer device programmed to execute the steps of the method for compressing fault-tolerant NVM persistence process redundancy information according to the present embodiment.
The above description is only a preferred embodiment of the present invention, and the protection scope of the present invention is not limited to the above embodiments, and all technical solutions belonging to the idea of the present invention belong to the protection scope of the present invention. It should be noted that modifications and embellishments within the scope of the invention may occur to those skilled in the art without departing from the principle of the invention, and are considered to be within the scope of the invention.

Claims (6)

1.一种面向容错的NVM持久化过程冗余信息的压缩方法,其特征在于实施步骤包括:1. a method for compressing fault-tolerant NVM persistence process redundancy information, characterized in that the implementation step comprises: 1)在支持事务持久化的线程中分配写集合WriteSet、地址队列queue以及多个值队列vqueue_X,其中值队列vqueue_X用于记录新旧只有低X位不同的值,在压缩值时各记录值的低X位写入持久化日志中,且不同值队列vqueue_X的X值不同;1) Allocate the write set WriteSet, the address queue queue, and multiple value queues vqueue_X in the threads that support transaction persistence. The value queue vqueue_X is used to record the old and new values with only the low X bits. When compressing the value, the low value of each record value is X bits are written into the persistent log, and the X values of different value queues vqueue_X are different; 2)在NVM中分配持久化日志保存区;2) Allocate persistent log storage area in NVM; 3)在事务期间将所有持久化写操作信息写入写集合WriteSet;3) Write all persistent write operation information to the write set WriteSet during the transaction; 4)扫描写集合WriteSet,对新旧值完全相同的写集合元素进行过滤,对新旧值不完全相同的写集合元素,将新旧只有低X位不同的值的写集合元素的值分别加入相应的值队列vqueue_X,并按照值队列vqueue_X中X的值从大到小的顺序将新旧值不完全相同的写集合元素的地址加入地址队列queue;4) Scan the write set WriteSet, filter the write set elements with the same new and old values, and add the values of the write set elements whose old and new values are only different from the old and new values to the corresponding values Queue vqueue_X, and add the addresses of write set elements with different old and new values to the address queue queue according to the value of X in the value queue vqueue_X in descending order; 5)针对地址队列queue中的地址信息进行压缩使得一组相邻地址的相同部分被消除,将压缩后的地址信息写入地址记录;5) Compress the address information in the address queue so that the same part of a group of adjacent addresses is eliminated, and write the compressed address information into the address record; 6)将各个值队列vqueue_X中所有元素的N位值的低X位写入值记录,从而将值的冗余信息消除;6) Write the lower X bits of the N-bit values of all elements in each value queue vqueue_X into the value record, thereby eliminating the redundant information of the value; 步骤5)的详细步骤包括:The detailed steps of step 5) include: 5.1)构造处理子队列,将所述处理子队列中初始放入地址队列queue的前两个地址,初始化当前的压缩模式为不可压缩,压缩模式包括不可压缩、连续地址压缩、高32bit相同地址压缩三种,连续地址压缩模式是指对一组相邻地址的差值相同的地址队列进行压缩,高32bit相同地址压缩模式是指对一组具有相同的高32bit的地址队列进行压缩;5.1) Construct a processing sub-queue, put the first two addresses of the processing sub-queue into the address queue queue initially, initialize the current compression mode as incompressible, and the compression modes include incompressible, continuous address compression, and high 32bit same address compression Three, continuous address compression mode refers to compressing a group of address queues with the same difference between adjacent addresses, and high 32bit same address compression mode refers to compressing a group of address queues with the same high 32bit; 5.2)依次遍历地址队列queue中的后续地址Addr,每遍历一个地址Addr,则将当前遍历的地址Addr加入处理子队列,跳转执行步骤5.3);直至遍历完成,跳转执行步骤5.5);5.2) Traverse the subsequent addresses Addr in the address queue queue in turn, each time an address Addr is traversed, add the currently traversed address Addr to the processing sub-queue, and jump to step 5.3); until the traversal is completed, jump to step 5.5); 5.3)判断处理子队列在地址Addr加入前后是否均为不可压缩,如果处理子队列在Addr加入前后均不可压缩,则取出处理子队列第一个地址,写入地址记录,跳转执行步骤5.2);否则,跳转执行步骤5.4);5.3) Determine whether the processing sub-queue is incompressible before and after the address Addr is added. If the processing sub-queue is incompressible before and after Addr is added, take out the first address of the processing sub-queue, write the address record, and jump to step 5.2) ; otherwise, jump to step 5.4); 5.4)判断处理子队列的压缩模式是否在地址Addr加入后发生变化,如果处理子队列对应的压缩模式在地址Addr加入后发生变化,则取出处理子队列中位于地址Addr之前的所有地址进行压缩,如果原压缩模式为连续地址压缩,则进行连续地址压缩,如果原压缩模式为高32bit相同地址压缩,则进行高32bit相同地址压缩,写入地址记录;跳转执行步骤5.2);5.4) Determine whether the compression mode of the processing sub-queue changes after the address Addr is added. If the compression mode corresponding to the processing sub-queue changes after the address Addr is added, take out all addresses in the processing sub-queue before the address Addr for compression. If the original compression mode is continuous address compression, perform continuous address compression; if the original compression mode is high 32bit same address compression, perform high 32bit same address compression, and write the address record; jump to step 5.2); 5.5)判断处理子队列是否为空,如果判断处理子队列为空,则跳转执行步骤5.7);否则,跳转执行步骤5.6);5.5) Determine whether the processing sub-queue is empty, if it is determined that the processing sub-queue is empty, jump to step 5.7); otherwise, jump to step 5.6); 5.6)判断处理子队列是否可压缩,如果处理子队列可压缩,则根据处理子队列的内容确定处理子队列新的压缩模式,并取出处理子队列所有地址进行压缩,如果新的压缩模式为连续地址压缩,则进行连续地址压缩,如果新的压缩模式为高32bit相同地址压缩,则进行高32bit相同地址压缩,写入地址记录,跳转执行步骤5.7);否则,取出处理子队列所有地址不压缩,写入地址记录,跳转执行步骤5.7);5.6) Determine whether the processing sub-queue can be compressed. If the processing sub-queue can be compressed, determine the new compression mode of the processing sub-queue according to the content of the processing sub-queue, and take out all addresses of the processing sub-queue for compression. If the new compression mode is continuous Address compression, then perform continuous address compression. If the new compression mode is the same address compression of the high 32 bits, perform the same address compression of the high 32 bits, write the address record, and jump to step 5.7); otherwise, take out all addresses of the processing subqueue without Compress, write address record, jump to step 5.7); 5.7)判定将压缩后地址信息的写入地址记录结束。5.7) It is determined that the writing address recording of the compressed address information is completed. 2.根据权利要求1所述面向容错的NVM持久化过程冗余信息的压缩方法,其特征在于,步骤1)中的多个值队列vqueue_X包括值队列vqueue_N、vqueue_N/2、vqueue_N/22、…、vqueue_16、vqueue_8,其中,N为持久化日志的位宽,vqueue_N用于记录新旧完全不同的值,在压缩值时各记录值的低N位写入持久化日志中;vqueue_N/2用于记录新旧只有低N/2位不同的值,在压缩值时各记录值的低N/2位写入持久化日志中;……;vqueue_16用于记录新旧只有低16位不同的值,在压缩值时各记录值的低16位写入持久化日志中;vqueue_8用于记录新旧只有低8位不同的值,在压缩值时各记录值的低8位写入持久化日志中。2. The method for compressing fault-tolerant NVM persistence process redundancy information according to claim 1, wherein the multiple value queues vqueue_X in step 1) include value queues vqueue_N, vqueue_N/2, vqueue_N/2 2 , ..., vqueue_16, vqueue_8, where N is the bit width of the persistent log, vqueue_N is used to record completely different values between the old and new, and the low N bits of each recorded value are written into the persistent log when the value is compressed; vqueue_N/2 is used for Record the old and new values with only the low N/2 bits. When compressing the values, the low N/2 bits of the recorded values are written into the persistent log; ...; vqueue_16 is used to record the old and new values with only the low 16 bits. When compressing the value, the lower 16 bits of each record value are written into the persistent log; vqueue_8 is used to record the old and new values with only the lower 8 bits different. When compressing the value, the lower 8 bits of each record value are written into the persistent log. 3.根据权利要求2所述面向容错的NVM持久化过程冗余信息的压缩方法,其特征在于,步骤2)中在NVM中分配持久化日志保存区时每一个持久化日志包含3个部分,对应各个值队列vqueue_X的值的数目、多个压缩地址的记录、多个压缩值的记录。3. The fault-tolerant NVM persistence process redundancy information compression method according to claim 2, characterized in that, in step 2), each persistent log contains 3 parts when allocating a persistent log storage area in the NVM, The number of values corresponding to each value queue vqueue_X, the records of multiple compression addresses, and the records of multiple compression values. 4.根据权利要求2所述面向容错的NVM持久化过程冗余信息的压缩方法,其特征在于,所述持久化日志的位宽为64位,可类同扩展至128位或更高。4 . The fault-tolerant NVM persistent process redundancy information compression method according to claim 2 , wherein the persistent log has a bit width of 64 bits, which can be extended to 128 bits or higher similarly. 5 . 5.根据权利要求1所述面向容错的NVM持久化过程冗余信息的压缩方法,其特征在于,步骤3)的详细步骤包括:针对持久化写操作tx_write要写入的地址、值构成的信息对<addr,value>,判断地址addr在写集合WriteSet中是否已经存在,如果在写集合WriteSet中已经存在,则将持久化写操作tx_write的值value替换写集合WriteSet中已存在项目的原值value’;如果地址addr在写集合WriteSet中不存在,则将信息对<addr,value>加入写集合WriteSet。5 . The fault-tolerant NVM persistence process redundancy information compression method according to claim 1 , wherein the detailed steps of step 3) include: information composed of addresses and values to be written for the persistent write operation tx_write For <addr,value>, determine whether the address addr already exists in the write set WriteSet, if it already exists in the write set WriteSet, replace the value value of the persistent write operation tx_write with the original value value of the existing item in the write set WriteSet '; If the address addr does not exist in the write set WriteSet, add the information pair <addr, value> to the write set WriteSet. 6.一种面向容错的NVM持久化过程冗余信息的压缩装置,包括计算机设备,其特征在于,所述计算机设备被编程以执行权利要求1~5中任意一项所述面向容错的NVM持久化过程冗余信息的压缩方法的步骤。6. An apparatus for compressing redundant information in a fault-tolerant NVM persistence process, comprising computer equipment, wherein the computer equipment is programmed to execute the fault-tolerant NVM persistence described in any one of claims 1 to 5. The steps of the compression method of the redundant information of the process.
CN201811457572.7A 2018-11-30 2018-11-30 Fault-tolerant NVM (non-volatile memory) persistence process redundancy information compression method and device Active CN109542356B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811457572.7A CN109542356B (en) 2018-11-30 2018-11-30 Fault-tolerant NVM (non-volatile memory) persistence process redundancy information compression method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201811457572.7A CN109542356B (en) 2018-11-30 2018-11-30 Fault-tolerant NVM (non-volatile memory) persistence process redundancy information compression method and device

Publications (2)

Publication Number Publication Date
CN109542356A CN109542356A (en) 2019-03-29
CN109542356B true CN109542356B (en) 2021-12-31

Family

ID=65851950

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811457572.7A Active CN109542356B (en) 2018-11-30 2018-11-30 Fault-tolerant NVM (non-volatile memory) persistence process redundancy information compression method and device

Country Status (1)

Country Link
CN (1) CN109542356B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110737547B (en) * 2019-10-22 2022-08-19 第四范式(北京)技术有限公司 Method and apparatus for restoring an in-memory database using a non-volatile memory NVM

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101165667A (en) * 2006-10-17 2008-04-23 国际商业机器公司 Apparatus and method for managing address conversion in data processing system
CN103985415A (en) * 2013-02-10 2014-08-13 Lsi公司 Retention-drift-history-based non-volatile memory read threshold optimization

Family Cites Families (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8370263B2 (en) * 2011-03-31 2013-02-05 Bank Of America Corporation Providing trusted services management using a hybrid service model
US9075816B2 (en) * 2011-08-24 2015-07-07 Google Inc. System and method of compressing data in font files
US9423978B2 (en) * 2013-05-08 2016-08-23 Nexgen Storage, Inc. Journal management
US9507733B2 (en) * 2013-06-21 2016-11-29 Microsoft Technology Licensing, Llc Cache destaging for virtual storage devices
CN104283906B (en) * 2013-07-02 2018-06-19 华为技术有限公司 Distributed memory system, clustered node and its section management method
CN106294190B (en) * 2015-05-25 2020-10-16 中兴通讯股份有限公司 Storage space management method and device
CN108897642B (en) * 2018-06-27 2020-11-27 清华大学 Method and device for optimizing log mechanism in persistent transactional memory system

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101165667A (en) * 2006-10-17 2008-04-23 国际商业机器公司 Apparatus and method for managing address conversion in data processing system
CN103985415A (en) * 2013-02-10 2014-08-13 Lsi公司 Retention-drift-history-based non-volatile memory read threshold optimization

Also Published As

Publication number Publication date
CN109542356A (en) 2019-03-29

Similar Documents

Publication Publication Date Title
CN104881371B (en) Persistence memory transaction handles buffer memory management method and device
KR102229648B1 (en) Apparatus and method of wear leveling for storage class memory
US5652857A (en) Disk control apparatus for recording and reproducing compression data to physical device of direct access type
US7228381B2 (en) Storage system using fast storage device for storing redundant data
US6859888B2 (en) Data storage array apparatus storing error information without delay in data access, and method, program recording medium, and program for the same
CN110377531B (en) Persistent memory storage engine device based on log structure and control method
WO2016041401A1 (en) Method and device for writing data to cache
WO2019228440A1 (en) Data page access method, storage engine, and computer readable storage medium
CN109165321B (en) Consistent hash table construction method and system based on nonvolatile memory
US8429339B2 (en) Storage device utilizing free pages in compressed blocks
WO2022033269A1 (en) Data processing method, device and system
US20190102111A1 (en) Apparatus and Method of Wear Leveling for Storage Class Memory Using Address Cache
CN104899117A (en) Memory database parallel logging method for nonvolatile memory
CN113360093A (en) Memory system and device
CN107577614B (en) Data writing method and memory system
US20190286569A1 (en) Logical to physical data storage mapping
Leventhal Flash storage today: Can Flash memory become the foundation for a new tier in the storage hierarchy?
CN109542356B (en) Fault-tolerant NVM (non-volatile memory) persistence process redundancy information compression method and device
KR102471966B1 (en) Data input and output method using storage node based key-value srotre
CN111414320A (en) Method and system for constructing disk cache based on non-volatile memory of log file system
US11989090B2 (en) Method and system for ensuring failure atomicity in non-volatile memory
US20230033998A1 (en) Memory system for maintaining data consistency and operation method thereof
CN104811647B (en) The double subregion wiring methods of distributed memory system disk towards video stream data
CN115640238A (en) Reliable memory-mapped I/O implementation method and system for persistent memory
CN104615505B (en) A kind of method of remote copy record data change

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant