CN106844650A

CN106844650A - A kind of daily record merges the merging method and system of tree

Info

Publication number: CN106844650A
Application number: CN201710047936.3A
Authority: CN
Inventors: 潘锋烽; 熊劲
Original assignee: Institute of Computing Technology of CAS
Current assignee: Institute of Computing Technology of CAS
Priority date: 2017-01-20
Filing date: 2017-01-20
Publication date: 2017-06-13

Abstract

The present invention proposes a log merge tree merging method and system, the method includes a real merging step, data merging and metadata merging to generate Real SSTable, data merging is merging SSTable; virtual merging step, generating Virtual SSTable, only for metadata The data is merged, and the data source of the Virtual SSTable is recorded; the reading step of the Real SSTable is to read the Real SSTable, and when the key falls in the key range of the Real SSTable, directly search for the value corresponding to the key on the Real SSTable; The step of reading the Virtual SSTable; during the reading process, the Virtual SSTable is merged, and the Virtual SSTable is changed into a Real SSTable.

Description

Method and system for merging log merging trees

技术领域technical field

本发明涉及日志合并数技术领域，特别涉及一种日志合并树的合并方法及系统。The invention relates to the technical field of log merging numbers, in particular to a method and system for merging log merging trees.

背景技术Background technique

日志合并树(Log-Structured Merge Tree，简称LSM-Tree)由多组件组成，包括一个内存组件和多个磁盘组件，组件大小呈指数增长，其架构如图2所示，内存组件有内存表(Memtable)组成，每个磁盘组件都是由一个或者多个排序字符串表(SSTable)组成，LSM-Tree的主要思想是将对数据的写或者更新保存在内存中，达到指定的阈值后将这些操作顺序地写入到存储设备中，插入、更新、删除等更新操作由内存组件服务，查找和扫描等操作由所有组件服务，它采用异地更新方式，因此，在LSM-tree中，同一个key可能存在多个版本的value，删除操作只是在内存组件中添加一个删除标记，而没有真正删除数据，在后续的compact过程中会根据同一个key的新旧版本和删除标志做数据的删除及合并，不同key之间做排序操作，生成新的SSTable文件，并删除旧的SSTable文件。Log-Structured Merge Tree (LSM-Tree for short) consists of multiple components, including a memory component and multiple disk components. The size of the components grows exponentially. Its architecture is shown in Figure 2. The memory component has a memory table ( Memtable), each disk component is composed of one or more sorted string tables (SSTable), the main idea of LSM-Tree is to save the write or update of data in the memory, after reaching the specified threshold, these Operations are sequentially written to the storage device. Update operations such as insert, update, and delete are served by memory components, and operations such as search and scan are served by all components. It adopts an off-site update method. Therefore, in the LSM-tree, the same key There may be multiple versions of value. The delete operation just adds a delete mark in the memory component, but does not actually delete the data. In the subsequent compact process, the data will be deleted and merged according to the old and new versions of the same key and the delete mark. Perform sorting operations between different keys, generate new SSTable files, and delete old SSTable files.

为了保持LSM-tree对于将来的读、写操作高效，需要不断地将数据从高组件向低组件移动，当某个组件的大小超过阈值时，触发其合并(Compaction)操作，每次数据移动在相邻组件之间进行，期间将两组件部分数据进行排序，删除无效数据和旧数据，这个过程称为compaction，以图3为例，当组件C2超过阈值，则在组件C2中选择一个SSTable T22进行合并，从组件C3中选择与组件C2中T22的key-range存在overlap的SSTable T32和T33。将T22和T32、T33合并为三个新的SSTable，为T35、T36、T37；并将新构造的SSTable放置在C3。以这种方式，将数据T22从组件C2向下移动到组件C3。In order to keep the LSM-tree efficient for future read and write operations, it is necessary to continuously move data from high components to low components. When the size of a certain component exceeds the threshold, its compaction operation is triggered. Each data movement is It is carried out between adjacent components, during which part of the data of the two components is sorted, and invalid data and old data are deleted. This process is called compaction. Take Figure 3 as an example. When component C2 exceeds the threshold, select an SSTable T22 in component C2 To merge, select SSTable T32 and T33 that overlap with the key-range of T22 in component C2 from component C3. Merge T22, T32, and T33 into three new SSTables, T35, T36, and T37; and place the newly constructed SSTable in C3. In this way, data T22 is moved down from component C2 to component C3.

Compaction会控制基于LSM-Tree的key/value存储的IO操作，在Compaction过程中，键值对会从smaller level(低层)流向larger level(高层)，由于每层预设定的容量的限制，键值对像larger level的缓慢流动会导致系统的写暂停，图4显示了一个键值对从smaller level流向larger level过程中的读写操作过程，在compaction过程中，一个键值对会被读入写出很多次，即使是在同一个层中，原因是compaction过程是一个轮询调度的，而在smaller level层中轮询的速度会快过larger level层，结果在一个键值对移动到相邻层之前就已经参与了多次compaction过程，这种严重的写放大现象导致了经常出现写暂停，从而降低系统的写性能。Compaction will control the IO operation of LSM-Tree-based key/value storage. During the Compaction process, key-value pairs will flow from the smaller level (lower level) to the larger level (higher level). Due to the limitation of the preset capacity of each layer, the key The slow flow of value pairs like the larger level will cause the system to suspend writing. Figure 4 shows the process of reading and writing a key-value pair from the smaller level to the larger level. During the compaction process, a key-value pair will be read into Write many times, even in the same layer, the reason is that the compaction process is a round-robin scheduling, and the polling speed in the smaller level layer will be faster than the larger level layer, and the result is that a key-value pair is moved to the corresponding The adjacent layer has participated in multiple compaction processes before, and this severe write amplification phenomenon causes frequent write pauses, thereby reducing the write performance of the system.

现有针对LSM-Tree写放大问题的解决方案，主要有以下几种方式：The existing solutions to the LSM-Tree write amplification problem mainly include the following methods:

技术方案1：将传统LSM-Tree触发compaction的条件放大，即增大每个组件的阈值大小，主要的缺点在于：只能缓解组件0到组件1的写放大问题，而后续层级依旧存在写放大问题。Technical Solution 1: Enlarge the conditions for triggering compaction in traditional LSM-Tree, that is, increase the threshold value of each component. The main disadvantage is that it can only alleviate the problem of write amplification from component 0 to component 1, and write amplification still exists in subsequent levels. question.

技术方案2：bLSM将key分成多个key range，使得compaction落在少量的keyrange中，避免不相关key range中的数据进行compaction，如图5所示，假设N～Q范围内不断有数据插入，则只会引发N～Q范围内的compaction，而不会引发其他范围的compaction，如图5所示。其主要缺点在于：无法避免某一key range内的compaction所带来的写放大问题。Technical solution 2: bLSM divides the key into multiple key ranges, so that the compaction falls in a small number of key ranges, and avoids compaction of data in irrelevant key ranges. As shown in Figure 5, assuming that there are continuous data insertions in the range of N~Q, Then only the compaction in the range of N~Q will be triggered, and the compaction in other ranges will not be triggered, as shown in Fig. 5 . Its main disadvantage is that the write amplification problem caused by compaction in a certain key range cannot be avoided.

bLSM发表在Proceedings of the 2012ACM SIGMOD International Conferenceon Management of Data。bLSM was published in Proceedings of the 2012ACM SIGMOD International Conference on Management of Data.

技术方案3：VT-Tree会判断当一个键值对在连续多层中都没有相同key对应的其它版本的键值对时，可以直接跨越这些层到达含有相同key值的层或最后一层，从而节省了键值对在多层之前移动所带来的I/O开销，如图6所示。其主要缺点在于：无法缓解热点数据进行compaction所带来的写放大问题。Technical solution 3: VT-Tree will judge that when a key-value pair does not have other versions of the key-value pair corresponding to the same key in consecutive layers, it can directly cross these layers to reach the layer containing the same key value or the last layer. This saves the I/O overhead caused by the movement of key-value pairs across multiple layers, as shown in Figure 6. Its main disadvantage is that it cannot alleviate the write amplification problem caused by compaction of hot data.

VT-Tree发表在11th USENIX Conference on File and Storage Technologies(FAST’13)VT-Tree was presented at the 11th USENIX Conference on File and Storage Technologies (FAST’13)

技术方案4：Wisckey中将key与value在LSM-Tree中分开存储，即将key与value的指针存储在LSM-Tree中，而value的真实数据则存储在其他地方，因此在进行compaction过程中，只有key与value的指针会进行重复读写，如图7所示。其主要缺点在于：Key由于compaction被反复读写，在一些key较大的场景下，LSM-Tree的写放大问题仍然存在。Technical solution 4: In Wisckey, the key and value are stored separately in the LSM-Tree, that is, the key and value pointers are stored in the LSM-Tree, while the real data of the value is stored elsewhere. Therefore, during the compaction process, only The key and value pointers will be read and written repeatedly, as shown in Figure 7. Its main disadvantage is that the key is repeatedly read and written due to compaction. In some scenarios with large keys, the write amplification problem of LSM-Tree still exists.

Wisckey发表在14th USENIX Conference on File and Storage Technologies(FAST’16)Wisckey presented at the 14th USENIX Conference on File and Storage Technologies (FAST’16)

发明内容Contents of the invention

本发明的目的是解决上述现有技术无法解决由于LSM-Tree中数据由于合并(同层&跨层)导致反复读写而引发的LSM-Tree写放大问题，提出一种日志合并树的合并方法及系统。The purpose of the present invention is to solve the LSM-Tree write amplification problem caused by the repeated reading and writing of data in the LSM-Tree due to the merging (same layer & cross-layer) of data in the above-mentioned prior art, and propose a method for merging log merge trees and system.

本发明提出一种日志合并树的合并方法，包括：The present invention proposes a method for merging log merging trees, including:

实合并步骤，包括数据合并与元数据合并，并生成Real SSTable，其中数据合并为将SSTable进行合并，元数据合并包括合并key range，file number，file size信息；The real merging step includes data merging and metadata merging, and generates a Real SSTable, wherein the data merging is merging the SSTable, and the metadata merging includes merging key range, file number, and file size information;

虚合并步骤，生成Virtual SSTable，并只对元数据进行合并，元数据进行合并包括合并key range，file number，file size信息，并记录所述Virtual SSTable的数据来源；The virtual merging step generates a Virtual SSTable, and only merges the metadata, and the merging of the metadata includes merging key range, file number, and file size information, and records the data source of the Virtual SSTable;

Real SSTable的读取步骤，对所述Real SSTable进行读取，其中当key落在所述Real SSTable的key range中，则直接在所述Real SSTable上查找key所对应的value值；The step of reading the Real SSTable is to read the Real SSTable, wherein when the key falls in the key range of the Real SSTable, directly search the corresponding value of the key on the Real SSTable;

Virtual SSTable的读取步骤，当key落在Virtual SSTable的key range中，则通过元数据信息查找所述Virtual SSTable与key相对应的数据来源，并在所述Real SSTable中查找key所对应的value值；In the reading step of the Virtual SSTable, when the key falls in the key range of the Virtual SSTable, the data source corresponding to the key in the Virtual SSTable is searched through the metadata information, and the value corresponding to the key is searched in the Real SSTable ;

Virtual SSTable的合并步骤，在读取过程中对所述Virtual SSTable进行合并，将所述Virtual SSTable变成Real SSTable，从而提升读性能。The merging step of the Virtual SSTable is to merge the Virtual SSTable during the reading process, so as to change the Virtual SSTable into a Real SSTable, thereby improving the reading performance.

所述实合并步骤包括：Described real merging step comprises:

11.依次读取各个Real SSTable的key/value；11. Read the key/value of each Real SSTable in turn;

12.通过归并排序，将符合条件的key/value顺序写入到固定大小的Real SSTable中；12. By merging and sorting, write the qualified key/value into the fixed-size Real SSTable in sequence;

13.写满之后，重新创建一个新的Real SSTable，继续将排好序的key/value写入其中，直至写完。13. After it is full, re-create a new Real SSTable, and continue to write the sorted key/value into it until it is finished.

所述虚合并步骤包括：The virtual merging step includes:

21.收集Real SSTable的元数据；21. Collect metadata of Real SSTable;

22.根据Real SSTable中的key range，获取新的key range，覆盖所有RealSSTable的范围；22. Obtain a new key range according to the key range in Real SSTable, covering all Real SSTable ranges;

23.根据Real SSTable的个数N，将新的key range分成N个，表示N个VirtualSSTable，其中Virtual SSTable的文件大小设置成与Real SSTable的文件大小相同，Virtual SSTable的文件编号通过虚合并进程统一分配，数据源则为虚合并中所涉及到的所有Real SSTable。23. Divide the new key range into N according to the number N of Real SSTables, representing N Virtual SSTables, where the file size of the Virtual SSTable is set to be the same as that of the Real SSTable, and the file numbers of the Virtual SSTable are unified through the virtual merge process The data source is all Real SSTables involved in the virtual merge.

所述Virtual SSTable的读取步骤包括The reading steps of the Virtual SSTable include

41.判断需要查找的key是否落在Virtual SSTable的key范围中；41. Determine whether the key to be searched falls in the key range of the Virtual SSTable;

42.如果不在，则返回42. Return if not in

43.如果在，则通过Virtual SSTable的元数据找到数据源，即Real SSTables；43. If yes, find the data source through the metadata of the Virtual SSTable, that is, Real SSTables;

44.根据Real SSTable的索引进行查找key；44. Search key according to the index of Real SSTable;

45.如果找到，则返回key的value值；45. If found, return the value of the key;

46.如果没有找到，则进行下一个Real SSTable的查找，直至将Virtual SSTable的数据源中所包含的Real SSTable都查找一遍；46. If not found, then search for the next Real SSTable until all the Real SSTables contained in the data source of the Virtual SSTable are searched;

47.如果还是没有找到，则返回。47. If still not found, return.

所述Virtual SSTable的合并步骤包括The merging steps of the Virtual SSTable include

51.获取Virtual SSTable的数据源Real SSTables；51. Get the data source Real SSTables of Virtual SSTable;

52.依次读取各个Real SSTable的key/value；52. Read the key/value of each Real SSTable in turn;

53.通过归并排序，将符合条件的key/value顺序写入到固定大小的Real SSTable中；53. By merging and sorting, write the key/value that meets the conditions into the fixed-size Real SSTable in sequence;

54.写满之后，重新创建一个新的Real SSTable，继续将排好序的key/value写入其中，直至写完。54. After it is full, re-create a new Real SSTable, and continue to write the sorted key/value into it until it is finished.

本发明还提出一种日志合并树的合并系统，包括：The present invention also proposes a log merge tree merging system, including:

实合并模块，包括数据合并与元数据合并，并生成Real SSTable，其中数据合并为将SSTable进行合并，元数据合并包括合并key range，file number，file size信息；The real merging module includes data merging and metadata merging, and generates a Real SSTable, where data merging is merging SSTables, and metadata merging includes merging key range, file number, and file size information;

虚合并模块，用于生成Virtual SSTable，并只对元数据进行合并，元数据进行合并包括合并key range，file number，file size信息，并记录所述Virtual SSTable的数据来源；The virtual merging module is used to generate a Virtual SSTable, and only merges metadata, and the merging of metadata includes merging key range, file number, and file size information, and records the data source of the Virtual SSTable;

Real SSTable的读取模块，用于对所述Real SSTable进行读取，其中当key落在所述Real SSTable的key range中，则直接在所述Real SSTable上查找key所对应的value值；The reading module of Real SSTable is used for reading described Real SSTable, wherein when key falls in the key range of described Real SSTable, then directly search the value value corresponding to key on described Real SSTable;

Virtual SSTable的读取模块，用于当key落在Virtual SSTable的key range中，则通过元数据信息查找所述Virtual SSTable与key相对应的数据来源，并在所述RealSSTable中查找key所对应的value值；The reading module of the Virtual SSTable is used to find the data source corresponding to the key in the Virtual SSTable through the metadata information when the key falls in the key range of the Virtual SSTable, and to find the value corresponding to the key in the RealSSTable value;

Virtual SSTable的合并模块，用于在读取过程中对所述Virtual SSTable进行合并，将所述Virtual SSTable变成Real SSTable，从而提升读性能。The merging module of the Virtual SSTable is used for merging the Virtual SSTable during the reading process, changing the Virtual SSTable into a Real SSTable, thereby improving the reading performance.

所述实合并模块包括：Described real merge module comprises:

所述虚合并模块包括：The virtual merging module includes:

21.收集Real SSTable的元数据；21. Collect metadata of Real SSTable;

所述Virtual SSTable的读取模块包括The reading module of the Virtual SSTable includes

42.如果不在，则返回42. Return if not in

47.如果还是没有找到，则返回。47. If still not found, return.

所述Virtual SSTable的合并模块包括The merge module of the Virtual SSTable includes

由以上方案可知，本发明的优点在于：As can be seen from the above scheme, the present invention has the advantages of:

1.本发明能够降低合并所带来的写放大问题，从而提升系统的写性能，并且与其他降低LSM-Tree写放大问题的方法是正交的，可叠加使用；1. The present invention can reduce the write amplification problem caused by merging, thereby improving the writing performance of the system, and it is orthogonal to other methods for reducing the write amplification problem of LSM-Tree, and can be superimposed and used;

2.本发明能够减少合并中的I/O量，对于SSD这类存储设备，延长其使用寿命；2. The present invention can reduce the amount of I/O in merging, and prolong its service life for storage devices such as SSDs;

图1展示了本发明与RocksDB的性能评测，从图1中可以看出，本发明的优势在于：Fig. 1 shows the performance evaluation of the present invention and RocksDB, as can be seen from Fig. 1, the advantage of the present invention is:

Write-intensive(写密集)的负载下，整体性能提升30％～1倍；Under Write-intensive (write-intensive) load, the overall performance is increased by 30% to 1 times;

Read-intensive(读密集)的负载下，整体性能基本上与RocksDB持平；Under Read-intensive (read-intensive) load, the overall performance is basically the same as RocksDB;

本身LSM-Tree主要用于Write-intensive负载。The LSM-Tree itself is mainly used for Write-intensive loads.

附图说明Description of drawings

图1是本发明性能评测图；Fig. 1 is a performance evaluation figure of the present invention;

图2是LSM-Tree架构图；Figure 2 is an LSM-Tree architecture diagram;

图3是LSM-Tree Compaction示例图；Figure 3 is an example diagram of LSM-Tree Compaction;

图4是LSM-Tree进行Compaction过程中键值对的流动过程图；Figure 4 is a flow diagram of the key-value pairs during the Compaction process of the LSM-Tree;

图5是bLSM图；Figure 5 is a bLSM diagram;

图6是VT-Tree图；Figure 6 is a VT-Tree diagram;

图7是Wisckey图；Figure 7 is a Wisckey diagram;

图8是实合并具体流程图；Fig. 8 is the specific flow chart of real merger;

图9是虚合并具体流程图；Fig. 9 is the specific flowchart of virtual merger;

图10是Virtual SSTable的读取流程图；Figure 10 is a flow chart of reading Virtual SSTable;

图11是Virtual SSTable的合并图；Figure 11 is a merged diagram of Virtual SSTable;

图12是不同VCT对于写性能的影响图；Figure 12 is a diagram of the impact of different VCTs on write performance;

图13是不同VCT对于I/O量和合并时间的影响图；Figure 13 is a diagram of the influence of different VCTs on I/O volume and merge time;

图14是不同MCT对于读性能的影响图。Figure 14 is a graph showing the influence of different MCTs on read performance.

具体实施方式detailed description

以下为本发明的整体流程，如下所示：The following is the overall process of the present invention, as follows:

1.实合并1. Real merger

实合并包括数据合并和元数据合并，其产生的SSTable为Real SSTable。其中数据合并是SSTable合并，而元数据合并包括合并key range，file number，file size等信息，具体流程如图8所示。Real merging includes data merging and metadata merging, and the resulting SSTable is Real SSTable. Among them, data merging is SSTable merging, and metadata merging includes merging key range, file number, file size and other information. The specific process is shown in Figure 8.

12.通过归并排序，将符合条件的key/value顺序写入到固定大小的Real SSTable中(默认为2MB)12. By merging and sorting, write the qualified key/value order into a fixed-size Real SSTable (the default is 2MB)

2.虚合并2. Virtual merger

虚合并产生的SSTable为Virtual SSTable。虚合并只对元数据合并。元数据合并包括合并key range，file number，file size等，以及记录该Virtual SSTable是由哪些Real SSTable构成(称为parentSST)，即Virtual SSTable的数据来源。具体流程如图9所示。The SSTable generated by the virtual merge is a Virtual SSTable. Virtual merge only merges metadata. Metadata merging includes merging key range, file number, file size, etc., and recording which Real SSTables the Virtual SSTable is composed of (called parentSST), which is the data source of the Virtual SSTable. The specific process is shown in Figure 9.

21.收集虚合并中所涉及到的Real SSTable的元数据；21. Collect the metadata of the Real SSTable involved in the virtual merge;

22.根据Real SSTable中的key range，得到一个新的key range，覆盖上述所有Real SSTable的范围；22. According to the key range in the Real SSTable, a new key range is obtained, covering the range of all the above Real SSTables;

23.根据虚合并中Real SSTable的个数N，将新的key range分成N个，这N个key的范围表示N个Virtual SSTable，其中文件大小统一设置成与Real SSTable大小一样(默认为2MB)，文件编号通过虚合并进程统一分配，数据源则是本次虚合并中所涉及到的所有Real SSTable。23. Divide the new key range into N according to the number N of Real SSTables in the virtual merge. The range of these N keys represents N Virtual SSTables, and the file size is uniformly set to be the same as the size of the Real SSTable (default is 2MB) , the file number is uniformly assigned through the virtual merge process, and the data source is all the Real SSTables involved in this virtual merge.

3.Real SSTable的读取3. Reading of Real SSTable

Real SSTable的读取流程(与传统基于LSM-Tree的键值系统，例如LevelDB、RocksDB，相同)：当key落在Real SSTable的key range中，则直接在其上查找key所对应的value值。The reading process of Real SSTable (same as traditional LSM-Tree-based key-value systems, such as LevelDB and RocksDB): when the key falls in the key range of Real SSTable, directly search for the value corresponding to the key.

4.Virtual SSTable的读取4. Reading of Virtual SSTable

Virtual SSTable的读取流程：当key落在Virtual SSTable的key range中，则通过元数据信息找到Virtual SSTable相应的数据源，即Real SSTable，并在Real SSTable中查找key所对应的value值，具体流程如图10所示。The reading process of Virtual SSTable: When the key falls in the key range of Virtual SSTable, find the corresponding data source of Virtual SSTable through the metadata information, that is, Real SSTable, and find the value corresponding to the key in Real SSTable. The specific process As shown in Figure 10.

42.如果不在，则返回42. Return if not in

43.如果在，则通过Virtual SSTable的元数据找到其数据源，即Real SSTables；43. If it is, find its data source through the metadata of Virtual SSTable, that is, Real SSTables;

46.如果没有找到，则进行下一个Real SSTable的查找，直至将该VirtualSSTable的数据源中所包含的Real SSTable都找一遍；46. If not found, then search for the next Real SSTable until all the Real SSTables contained in the data source of the VirtualSSTable are searched;

47.如果还是没有找到，则返回；47. If still not found, return;

5.Virtual SSTable的合并5. Merge of Virtual SSTable

为了减少Virtual SSTable对读性能的影响，在读取过程中会对Virtual SSTable进行合并，将Virtual SSTable变成Real SSTable，从而提升读性能。具体流程如图11所示。In order to reduce the impact of the Virtual SSTable on the read performance, the Virtual SSTable will be merged during the read process, turning the Virtual SSTable into a Real SSTable, thereby improving the read performance. The specific process is shown in Figure 11.

51.获取Virtual SSTable的数据源—Real SSTables；51. Obtain the data source of Virtual SSTable—Real SSTables;

53.通过归并排序，将符合条件的key/value顺序写入到固定大小的Real SSTable中(默认为2MB)53. By merging and sorting, write the qualified key/value order into a fixed-size Real SSTable (the default is 2MB)

实例1：合并操作(虚合并与实合并的结合)Example 1: Merge operation (combination of virtual merge and real merge)

在本发明中当LSM-Tree的某一层达到阈值时，需要进行合并，则具体流程如下所示：In the present invention, when a certain layer of the LSM-Tree reaches the threshold, it needs to be merged, and the specific process is as follows:

a.选择需要进行合并的SSTable；a. Select the SSTable that needs to be merged;

b.统计这些SSTable中包含Real SSTable的个数，记为N；b. Count the number of Real SSTables contained in these SSTables, denoted as N;

c.如果N超过阈值VCT，则进行实合并；c. If N exceeds the threshold VCT, perform real merging;

d.N小于VCT，则进行虚合并；d. If N is less than VCT, virtual merge is performed;

其中VCT是区分实合并和虚合并的参数，不同VCT对于合并的影响是不同：Among them, VCT is a parameter to distinguish between real merger and virtual merger. Different VCTs have different effects on mergers:

VCT越小，实合并次数越多，从而导致I/O开销较大；The smaller the VCT, the more times of actual merging, resulting in larger I/O overhead;

VCT越大，虚合并次数越多，虽然能够节省I/O开销，但是后续的实合并开销较大，从而导致合并时间较长；The larger the VCT, the more virtual merge times, although it can save I/O overhead, but the subsequent real merge overhead is larger, resulting in longer merge times;

图12展示了在Write-100％负载下，不同的VCT对写性能的影响，其中当VCT＝12时，写性能是最优的，其主要原因在于两个方面：节约的I/O量以及总的合并时间，具体如图13所示，VCT＝12能够节约一定的I/O量，并且合并的时间最少，因此在本例中VCT＝12是最优的。Figure 12 shows the impact of different VCTs on write performance under the Write-100% load. When VCT=12, the write performance is optimal, mainly due to two aspects: the amount of saved I/O and The total merging time, as specifically shown in FIG. 13 , VCT=12 can save a certain amount of I/O, and the merging time is the least, so in this example, VCT=12 is optimal.

实例2：Get操作(Real SSTable和Virtual SSTable的读取、Virtual SSTable的合并)Example 2: Get operation (reading of Real SSTable and Virtual SSTable, merging of Virtual SSTable)

其基本流程如下所示：Its basic flow is as follows:

a.从LSM-Tree中由上往下找；a. Search from top to bottom in LSM-Tree;

b.对于每一层的SSTable，通过其上的元数据信息—key range，来定位SSTable；b. For each layer of SSTable, locate the SSTable through its metadata information—key range;

c.如果是Real SSTable，则直接进行查找；c. If it is a Real SSTable, search it directly;

d.如果是Virtual SSTable，则通过Virtual SSTable的parentSST，找到其数据源之后，再进行查找；d. If it is a Virtual SSTable, find its data source through the parentSST of the Virtual SSTable, and then search;

e.没有找到，则继续在本层或者下一层定位相应的SSTable进行查找，直到找到或者找完为止；e. If not found, continue to locate the corresponding SSTable in this layer or the next layer to search until it is found or finished;

本发明的Get操作还会涉及Virtual SSTable的合并，具体流程如下所示：Get operation of the present invention also can relate to the merging of Virtual SSTable, and specific process is as follows:

a.当Get操作涉及到Virtual SSTable时，会判断当前Virtual SSTable的两个值：读取该Virtual SSTable的累积次数(R)和该Virtual SSTable所包含的Real SSTable的个数(M)a. When the Get operation involves a Virtual SSTable, it will judge two values of the current Virtual SSTable: the cumulative number of reads of the Virtual SSTable (R) and the number of Real SSTables contained in the Virtual SSTable (M)

b.当R>RCT&&M>MCT时，则进行Virtual SSTable的合并，否则不合并；b. When R>RCT&&M>MCT, merge the Virtual SSTable, otherwise do not merge;

使用读取该Virtual SSTable的累积次数和和该Virtual SSTable所包含的RealSSTable的个数来作为触发Virtual SSTable合并的条件，主要原因在于：Use the cumulative number of reads of the Virtual SSTable and the number of RealSStables contained in the Virtual SSTable as the conditions to trigger the merger of the Virtual SSTable. The main reasons are:

读取累积次数从热度角度考虑，热度越高，virtual SSTable被频繁读，热度越低，virtual SSTable很少被读；The cumulative number of reads is considered from the perspective of popularity. The higher the popularity, the virtual SSTable is frequently read, and the lower the popularity, the virtual SSTable is rarely read;

Virtual SSTable所包含的Real SSTable个数越多，读性能越差；反之，读性能越好；The more Real SSTables included in the Virtual SSTable, the worse the read performance; otherwise, the better the read performance;

因此当R和M同时满足条件时，表明当前Virtual SSTable的热度较高，且所包含的real SSTable个数超过了阈值范围，会严重影响读性能，需要进行合并。RCT的默认值采用5，而对于MCT的选取，图14展示了不同MCT对于读性能的影响。Therefore, when R and M meet the conditions at the same time, it indicates that the popularity of the current virtual SSTable is high, and the number of real SSTables contained exceeds the threshold range, which will seriously affect the read performance and need to be merged. The default value of RCT is 5, and for the selection of MCT, Figure 14 shows the impact of different MCTs on the read performance.

当MCT＝3～5时，读性能与RocksDB基本相同；When MCT=3~5, the read performance is basically the same as RocksDB;

当MCT＝7～11时，读性能较差，其主要原因在于没有被合并的Virtual SSTable影响了读性能；When MCT=7~11, the read performance is poor, the main reason is that the unmerged Virtual SSTable affects the read performance;

因此，MCT的选取需要考虑两个因素：读性能和合并带来的开销，综合上述两个因素，在本例中MCT＝5较为合适，原因在于1)读性能与RocksDB基本上相同；2)相比于MCT＝3,能够触发较少的Virtual SSTable合并，节省资源。Therefore, the selection of MCT needs to consider two factors: read performance and overhead caused by merging. Combining the above two factors, MCT=5 is more appropriate in this example, because 1) the read performance is basically the same as RocksDB; 2) Compared with MCT=3, fewer virtual SSTable merges can be triggered to save resources.

所述实合并模块包括：Described real merge module comprises:

12.通过归并排序，将符合条件的key/value顺序写入到固定大小的Real SSTable中；12. By merging and sorting, write the qualified key/value into the fixed-size Real SSTable in order;

所述虚合并模块包括：The virtual merging module includes:

21.收集Real SSTable的元数据；21. Collect metadata of Real SSTable;

42.如果不在，则返回42. Return if not in

47.如果还是没有找到，则返回。47. If still not found, return.

Claims

1. a kind of daily record merges the merging method of tree, it is characterised in that including：

Real combining step, including data merge and merge with metadata, and generate Real SSTable, wherein data merge into by SSTable is merged, and metadata merges to be included merging key range, file number, file size information；

Empty combining step, generates Virtual SSTable, and only metadata merged, and metadata is merged including closing And key range, file number, file size information, and record the data source of the Virtual SSTable；

The read step of Real SSTable, is read out to the Real SSTable, wherein when key falls in the Real In the key range of SSTable, then the value values corresponding to key are directly searched on the Real SSTable；

The read step of Virtual SSTable, when key falls in the key range of Virtual SSTable, then by unit Data source Virtual SSTable corresponding with key described in data information search, and looked into the Real SSTable Look for the value values corresponding to key；

The combining step of Virtual SSTable, merges in reading process to the Virtual SSTable, by institute State Virtual SSTable and become Real SSTable, so as to lift reading performance.

2. daily record as claimed in claim 1 merges the merging method of tree, it is characterised in that the real combining step includes：

11. key/value for being successively read each Real SSTable；

12. are sorted by merger, and qualified key/value is sequentially written in the Real SSTable of fixed size；

After 13. write completely, a new Real SSTable is re-created, continued sorted key/value write-ins wherein, Until writing.

3. daily record as claimed in claim 1 merges the merging method of tree, it is characterised in that the empty combining step includes：

21. metadata for collecting Real SSTable；

The 22. key range in Real SSTable, obtain new key range, cover all Real SSTable's Scope；

New key range, according to the number N of Real SSTable, are divided into N number of by 23., represent N number of Virtual SSTable, The file size of wherein Virtual SSTable is arranged to identical with the file size of Real SSTable, Virtual The reference number of a document of SSTable passes through empty merging process and distributes unitedly, and data source is then all Real involved in empty merging SSTable。

4. daily record as claimed in claim 1 merges the merging method of tree, it is characterised in that the Virtual SSTable's Read step includes

41. judge whether the key for requiring to look up falls in the key scopes of Virtual SSTable；

42. if it was not then return

If 43. are finding data source, i.e. Real SSTables by the metadata of Virtual SSTable；

44. carry out lookup key according to the index of Real SSTable；

45. if it is found, then return to the value values of key；

If 46. do not find, the lookup of next Real SSTable is carried out, until by the number of Virtual SSTable All searched one time according to the Real SSTable included in source；

If 47., again without finding, return.

5. daily record as claimed in claim 1 merges the merging method of tree, it is characterised in that the Virtual SSTable's Combining step includes

The 51. data source Real SSTables for obtaining Virtual SSTable；

52. key/value for being successively read each Real SSTable；

53. are sorted by merger, and qualified key/value is sequentially written in the Real SSTable of fixed size；

After 54. write completely, a new Real SSTable is re-created, continued sorted key/value write-ins wherein, Until writing.

6. a kind of daily record merges the combination system of tree, it is characterised in that including：

Real merging module, including data merge and merge with metadata, and generate Real SSTable, wherein data merge into by SSTable is merged, and metadata merges to be included merging key range, file number, file size information；

Empty merging module, for generating Virtual SSTable, and only merges to metadata, and metadata merges bag Include merging key range, file number, file size information, and record the data of the Virtual SSTable Source；

The read module of Real SSTable, for being read out to the Real SSTable, wherein when key falls described In the key range of Real SSTable, then the value values corresponding to key are directly searched on the Real SSTable；

The read module of Virtual SSTable, for falling in the key range of Virtual SSTable as key, then leads to Cross metadata information and search the Virtual SSTable data sources corresponding with key, and in the Real SSTable Value values corresponding to middle lookup key；

The merging module of Virtual SSTable, for being merged to the Virtual SSTable in reading process, The Virtual SSTable are become into Real SSTable, so as to lift reading performance.

7. daily record as claimed in claim 6 merges the combination system of tree, it is characterised in that the real merging module includes：

11. key/value for being successively read each Real SSTable；

8. daily record as claimed in claim 6 merges the combination system of tree, it is characterised in that the empty merging module includes：

21. metadata for collecting Real SSTable；

9. daily record as claimed in claim 6 merges the combination system of tree, it is characterised in that the Virtual SSTable's Read module includes

42. if it was not then return

44. carry out lookup key according to the index of Real SSTable；

45. if it is found, then return to the value values of key；

If 47., again without finding, return.

10. daily record as claimed in claim 6 merges the combination system of tree, it is characterised in that the Virtual SSTable's Merging module includes

The 51. data source Real SSTables for obtaining Virtual SSTable；

52. key/value for being successively read each Real SSTable；

53. are sorted by merger, and qualified key/value is sequentially written in the RealSSTable of fixed size；