CN106502587B

CN106502587B - Hard disk data management method and hard disk control device

Info

Publication number: CN106502587B
Application number: CN201610912077.5A
Authority: CN
Inventors: 丁敬文
Original assignee: Huawei Technologies Co Ltd
Current assignee: Huawei Technologies Co Ltd
Priority date: 2016-10-19
Filing date: 2016-10-19
Publication date: 2019-10-25
Anticipated expiration: 2036-10-19
Also published as: CN106502587A

Abstract

An embodiment of the invention discloses a hard disk data management method and a hard disk control device for efficiently managing fragments on a hard disk. The method is applied to a hard disk control device that includes a hard disk, and the hard disk includes a data area and a log area. The method includes: writing data to a cache device; determining whether the data is hot data, where hot data is data that, once stored on the hard disk, can cause the hard disk to produce a preset number of fragments after a preset number of modifications and releases; if the data is not hot data, allocating data area space for the data in the data area and writing the data into that space; and if the data is hot data, allocating log area space for the data in the log area and writing the data into that space. Storing different types of data in different areas of the hard disk and managing them in different ways improves the efficiency of fragment management, and the efficient management of fragments in the log area reduces the generation of hard disk fragments.

Description

Hard disk data management method and hard disk control device

Technical field

The invention relates to the field of data processing, and in particular to a hard disk data management method and a hard disk control device.

Background

An ordinary mechanical hard disk relies on spinning platters and a moving magnetic head to locate the read/write position, so sequential access is its ideal read/write pattern. If the disk space is fragmented, contiguous space cannot be allocated when data is written, which causes heavy head thrashing: most of the transfer time is spent positioning to tracks and sectors, leaving little time for actually transferring data. Because a file's data is then scattered, reading such files is also inefficient.

Most hard disk file systems therefore try hard to avoid creating large amounts of fragmented space, but fragmentation remains unavoidable.

For example, a copy-on-write (COW) mechanism can exploit the advantage of sequential writes to the hard disk. When a block of data is to be modified, the old version is not overwritten in place. Instead, the old version is read, modified, and written to a new location; the data to be written is aggregated and written to the hard disk sequentially, and the old version is released. Because the location of the data has changed, the pointer in the index block one level above the data must also be modified, and so on recursively up to the top level. A large amount of data is released as a result, which produces a large number of fragments on the hard disk.

Summary of the invention

Embodiments of the present invention provide a hard disk data management method and a hard disk control device for efficiently managing fragments on a hard disk.

A first aspect of the present invention provides a hard disk data management method. The method is applied to a hard disk control device that includes a hard disk, the hard disk includes a data area and a log area, and the method includes the following.

The hard disk control device writes data to a cache device. The cache device may be, for example, memory, a flash card, a solid-state drive, or another storage device different from the hard disk. The hard disk control device then determines whether the data is hot data, where hot data is data that, once stored on the hard disk, can cause the hard disk to produce a preset number of fragments after a preset number of modifications and releases. Judging the written data on the cache device determines its type, so that different data can be handled in different ways.

When the data is written to the hard disk: if the data is not hot data, data area space is allocated for it in the data area and the data is written into that space; if the data is hot data, log area space is allocated for it in the log area and the data is written into that space.
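The routing rule above can be illustrated with a minimal Python sketch. The `Disk` class, its two lists, and `write_to_disk` are hypothetical names introduced only for illustration; the hot/non-hot decision is passed in as a flag, and real space allocation is reduced to appending to a list.

```python
from dataclasses import dataclass, field


@dataclass
class Disk:
    # Hypothetical stand-ins for the two kinds of on-disk area.
    data_area: list = field(default_factory=list)  # holds non-hot data
    log_area: list = field(default_factory=list)   # holds hot data


def write_to_disk(disk, payload, is_hot):
    """Write one cached item to the log area if it is hot, else to the data area."""
    target = disk.log_area if is_hot else disk.data_area
    target.append(payload)  # "allocate space, then write" collapsed into one step
    return "log" if is_hot else "data"
```

In this sketch the caller decides whether an item is hot; the method itself makes that decision on the cache device before the write reaches the hard disk.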

The hard disk data management method of the first aspect divides the data to be written to the hard disk into hot data and non-hot data. Hot data tends to cause disk fragmentation, so it is stored in the log area and managed in a log manner; even if frequent modification of data in the log area produces fragments, those fragments are easy to reclaim and otherwise manage. Non-hot data is stored in the data area; releasing non-hot data is unlikely to fragment the disk, so the data area does not need many resources devoted to fragment management. Storing different types of data in different areas of the hard disk and managing them in different ways therefore improves the efficiency of fragment management, manages fragments on the hard disk effectively, and avoids or reduces their generation.

With reference to the first aspect, in a first possible implementation the cache device is memory, and after log area space is allocated for the data in the log area, the method further includes: establishing a mapping relationship between the data and the log area space. That is, after the hard disk control device allocates log area space for the data, it establishes on the cache device a mapping relationship that records the correspondence between the data and its allocated log area space. The mapping relationship records how the data is stored in the log area, and can therefore be used to manage both the data in the log area and the eviction of data from the cache device. In this first possible implementation the cache device is memory, but other cache devices are also possible.

With reference to the first possible implementation of the first aspect, in a second possible implementation, establishing the mapping relationship between the data and the log area space includes: establishing a mapping relationship between multiple pieces of target data and the log area space allocated to them, where the target data is hot data;

and writing the data into the log area space includes: combining the write operations of the multiple pieces of target data into one transaction, and writing all target data of the transaction into the log area space. If the write operation for any one piece of target data in the transaction fails, the write operations for the other target data in the transaction fail as well. "Multiple pieces of target data" means at least two, and correspondingly "multiple write operations" means at least two. This borrows the concept of a transaction from the database field and operates on several pieces of hot data as a unit: the mapping relationship is established for the hot data belonging to one transaction, and the write operations for all hot data of the transaction are executed together against the log area. This improves data processing efficiency.
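The all-or-nothing behaviour of such a transaction can be sketched as follows. This is an illustrative model only: the `LogArea` class, its `capacity` limit, and the use of a `None` payload to stand in for a failed write are assumptions, not details from the method.

```python
class LogArea:
    """Toy log area: an append-only list plus a key -> slot mapping."""

    def __init__(self, capacity):
        self.capacity = capacity  # hypothetical limit, counted in entries
        self.blocks = []          # payloads in write order
        self.mapping = {}         # mapping relationship: key -> index in blocks

    def commit(self, writes):
        """Apply every (key, payload) write of one transaction, or none of them."""
        if len(self.blocks) + len(writes) > self.capacity:
            return False
        staged_blocks = list(self.blocks)
        staged_mapping = dict(self.mapping)
        for key, payload in writes:
            if payload is None:   # stand-in for a write operation that fails
                return False      # staged copies are dropped, nothing is applied
            staged_mapping[key] = len(staged_blocks)
            staged_blocks.append(payload)
        self.blocks, self.mapping = staged_blocks, staged_mapping
        return True
```

Staging the writes on copies keeps the visible state unchanged when any single write fails, which is the property the transaction is meant to provide.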

With reference to the second possible implementation of the first aspect, a third possible implementation further includes: caching the hot data in memory. For example, the hot data may be kept in memory when it is written to the log area, or the hot data may be read from the log area and cached in memory before data is written to memory. Subsequent writes to memory can then modify the data directly in memory, so that data migrates within memory. This reduces the generation of fragments on the hard disk, and the data in the log area can be tidied according to how the data has migrated.

With reference to the third possible implementation of the first aspect, in a fourth possible implementation, before all target data of the transaction is written into the log area space, the method further includes:

establishing a data linked list from the multiple pieces of target data, where the data linked list is used to manage the target data, and the target data it manages is the same as the target data of the transaction; and then managing the target data according to the data linked list. Managing the target data according to the data linked list includes: after a second data linked list is established, when second target data managed by the second data linked list was obtained by modifying first target data managed by a previously established first data linked list, releasing the first target data from management on the first data linked list, and deleting the information about the first target data from the first mapping relationship corresponding to the first data linked list. In this way, the migration of data between different transactions can be managed in memory through the data linked lists.
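The migration of data between transactions described above can be sketched in Python. For brevity an insertion-ordered `dict` stands in for each data linked list; the `Transaction` class, `add_transaction`, and the integer log offsets are hypothetical names and simplifications.

```python
class Transaction:
    def __init__(self, txn_id):
        self.txn_id = txn_id
        self.data_list = {}  # stand-in for the data linked list: key -> cached payload
        self.mapping = {}    # mapping relationship: key -> log-area offset


def add_transaction(transactions, txn_id, writes, next_offset):
    """Create a new transaction; keys it rewrites are released from older ones."""
    txn = Transaction(txn_id)
    for key, payload in writes:
        for old in transactions:
            # The new version supersedes the old one: release it from the older
            # list and delete its entry from the older mapping relationship.
            old.data_list.pop(key, None)
            old.mapping.pop(key, None)
        txn.data_list[key] = payload
        txn.mapping[key] = next_offset
        next_offset += 1
    transactions.append(txn)
    return next_offset
```

After a later transaction rewrites a key, the earlier transaction no longer references it, so the log space holding the superseded version becomes reclaimable.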

The data in memory may be managed according to the data linked lists as follows: under a preset release condition, search the data linked lists, in the order in which they were established (earliest first), for data that has not been released from management; release from memory the data that a target data linked list still manages, while retaining in memory the target mapping relationship corresponding to that target data linked list. Releasing the data on the target data linked list increases the amount of data that memory can manage, and the system can still read the corresponding data from the log area by looking up the target mapping relationship.

With reference to the fourth possible implementation of the first aspect, in a sixth possible implementation, searching the data linked lists under the preset release condition, in order of establishment, for data that has not been released from management includes: when memory reaches a first preset watermark, searching the data linked lists, earliest first, for data that has not been released from management. The method also includes a second stage of memory eviction. That is, after the data still managed by the target data linked list is released from memory, the method further includes: when memory reaches a second preset watermark, reading from the log area the data that the target mapping relationship points to; writing that data into the data area; and deleting the target mapping relationship from memory. This two-stage eviction mechanism increases the amount of data that memory can manage. In the second stage, the data that the target mapping relationship points to is inactive and unlikely to be modified, so it can be migrated from the log area to the data area without adding much fragmentation to the data area.
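The two eviction stages can be sketched as two small Python functions. The dictionary layout of a transaction (`id`, `cached`, `mapping`), the list used as the log area, and the dictionary used as the data area are all assumptions made for illustration; checking the first and second watermarks is left to the caller.

```python
def evict_stage_one(transactions):
    """Stage 1: free the cached payloads of the oldest transaction that still
    holds any, but keep its mapping so the data can be re-read from the log."""
    for txn in sorted(transactions, key=lambda t: t["id"]):
        if txn["cached"]:
            freed = len(txn["cached"])
            txn["cached"].clear()
            return freed
    return 0


def evict_stage_two(transactions, log_area, data_area):
    """Stage 2: for the oldest transaction already released in stage 1, copy
    the data its mapping points to from the log area into the data area,
    then delete the mapping from memory."""
    for txn in sorted(transactions, key=lambda t: t["id"]):
        if not txn["cached"] and txn["mapping"]:
            for key, offset in txn["mapping"].items():
                data_area[key] = log_area[offset]
            txn["mapping"].clear()
            return True
    return False
```

Sorting by `id` reproduces the earliest-first search order, on the assumption that transaction numbers increase with write order.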

With reference to the fourth possible implementation of the first aspect, in a sixth possible implementation the method further includes: assigning transaction numbers, in increasing order, to the data linked lists corresponding to the transactions according to the order in which the transactions are written. Once the data linked lists have transaction numbers, they can be managed by transaction number, which improves management efficiency. For example, starting from the data linked list with the smallest current transaction number and proceeding in ascending order of transaction number, the data linked lists are searched for data that has not been released from management; this realizes the search in order of establishment, earliest first.

With reference to the fourth possible implementation of the first aspect, in a seventh possible implementation the method further includes: under a preset reclamation condition, performing the steps of relocating log area data. For example, these steps include:

looking up a mapping relationship;

determining, from the information recorded in the mapping relationship, whether all data in the first log area corresponding to the mapping relationship has been migrated;

if not all data in the first log area has been migrated, determining the space utilization of the first log area from the information recorded in the mapping relationship;

when the space utilization of the first log area is below a preset utilization threshold, migrating the data in the first log area to a second log area and updating the mapping relationship corresponding to the relocated data, where the second log area is a free log area or a log area already used during log area reclamation; and

when the total space watermark of the current log areas reaches a preset space threshold, stopping the log area data relocation steps, and otherwise continuing to perform them.
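The relocation steps above can be sketched as a single pass over the log areas. This is a simplified model under stated assumptions: each log area is a list of live keys, utilization is the number of live keys divided by `zone_size`, there is one designated `spare_zone` as the destination, and the stop condition is modelled as "enough areas are now empty" (`free_target`). None of these names or the exact stop criterion come from the method itself.

```python
def relocate_log_zones(zones, mapping, zone_size, util_threshold,
                       spare_zone, free_target):
    """Move live keys out of sparsely used log areas into `spare_zone`.

    zones:   dict zone_id -> list of live keys (the spare zone may start empty)
    mapping: dict key -> zone_id, updated in place as keys move
    Returns the ids of the areas that were emptied.
    """
    reclaimed = []
    for zone_id in sorted(zones):
        free_zones = sum(1 for z, keys in zones.items()
                         if not keys and z != spare_zone)
        if free_zones >= free_target:          # assumed stop condition
            break
        live = zones[zone_id]
        if zone_id == spare_zone or not live:  # nothing left to migrate here
            continue
        utilization = len(live) / zone_size
        room = zone_size - len(zones[spare_zone])
        if utilization < util_threshold and len(live) <= room:
            for key in live:
                mapping[key] = spare_zone      # update the mapping relationship
            zones[spare_zone].extend(live)
            zones[zone_id] = []
            reclaimed.append(zone_id)
    return reclaimed
```

Only areas whose utilization is below the threshold are moved, so well-filled areas are left alone and the cost of each pass stays proportional to the amount of live data in sparse areas.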

With reference to the seventh possible implementation of the first aspect, in an eighth possible implementation the method further includes: assigning transaction numbers, in increasing order, to the data linked lists corresponding to the transactions according to the order in which the transactions are written. Managing the data linked lists by transaction number improves management efficiency. For example, starting from the data linked list with the smallest current transaction number and proceeding in ascending order of transaction number, the mapping relationship corresponding to each data linked list is looked up, which realizes the lookup of mapping relationships.

With reference to the seventh possible implementation of the first aspect, in a ninth possible implementation the preset reclamation condition includes at least one of the following: a timer expires, a reclamation operation on memory data completes, or the total watermark of the log areas reaches a preset watermark threshold.

With reference to the first possible implementation of the first aspect, in a tenth possible implementation, hot data can be cached in memory in several ways. For example, after determining whether the data is hot data, the data is kept in memory if it is hot data; alternatively, before data is written to the cache device, data is read from the log area into the cache device and cached there.

With reference to the first aspect or any of the second to tenth possible implementations of the first aspect, in an eleventh possible implementation, hot data includes data whose size is smaller than a preset data threshold and/or metadata. The preset data threshold may be, for example, 128 KB or another size, and the specific value can be adjusted according to the service type. If the size of a piece of data is below the threshold, frequent release of that data may produce a large number of fragments on the hard disk. Metadata is the data used to manage data, for example indirect blocks that store data addresses and metadata blocks that store object management structures. Metadata can likewise cause heavy fragmentation. Such hot data needs to be filtered out so that it can be stored in the log area.
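The hot-data test described above reduces to a short predicate. The function name `is_hot` and its parameters are hypothetical; the 128 KB default is the example value given in the text and would be tuned per service type.

```python
HOT_SIZE_THRESHOLD = 128 * 1024  # example threshold from the text, in bytes


def is_hot(size_bytes, is_metadata, threshold=HOT_SIZE_THRESHOLD):
    """Treat an item as hot if it is metadata or smaller than the threshold."""
    return is_metadata or size_bytes < threshold
```

For instance, a 4 KB data block and a 1 MB metadata block would both be routed to the log area, while a 1 MB data block would go to the data area.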

With reference to the first aspect or any of the second to tenth possible implementations of the first aspect, in a twelfth possible implementation, allocating log area space for the data in the log area and writing the data into the log area space includes: allocating log area space for the data sequentially in the log area, and appending the data sequentially to the log area space. Data in the log area can thus be read and written sequentially, so relocating log area data incurs no metadata overhead. The cost of tidying is relatively small, which effectively keeps system performance stable.
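Sequential allocation with append-only writes can be modelled with a single write pointer. The `LogZone` class below is a hypothetical in-memory stand-in for one log area; a real device would write to disk sectors rather than a `bytearray`.

```python
class LogZone:
    """One log area with a bump-pointer allocator: space is always handed out
    at the current write pointer, so writes land on the medium in order."""

    def __init__(self, size):
        self.size = size
        self.write_pointer = 0
        self.buffer = bytearray(size)

    def append(self, payload):
        """Append payload and return its offset, or None if the area is full."""
        end = self.write_pointer + len(payload)
        if end > self.size:
            return None
        offset = self.write_pointer
        self.buffer[offset:end] = payload
        self.write_pointer = end
        return offset
```

Because allocation never searches for free holes, the allocator needs no free-space bitmap or extent tree for the log area, which is one way to read the "no metadata overhead" remark above.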

With reference to the first aspect or any of the second to tenth possible implementations of the first aspect, in a thirteenth possible implementation the method further includes: when the space utilization of the data area exceeds a preset data area utilization threshold, converting a currently free log area into a data area; and when the space utilization of the log area exceeds a preset log area utilization threshold, converting a data area that was converted from a free log area back into a log area. The log area and the data area are thus converted into each other to follow changes in system capacity, which adapts flexibly to the specific usage scenario and improves the utilization of the hard disk.
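The two conversion rules can be sketched as follows. The zone dictionaries (`role`, `used`, `size`, `converted`) and the function names are hypothetical. The sketch also assumes that a converted data area is only turned back into a log area while it is empty, which the text does not state.

```python
def utilization(zones, role):
    """Used fraction across all zones that currently have the given role."""
    members = [z for z in zones if z["role"] == role]
    total = sum(z["size"] for z in members)
    return sum(z["used"] for z in members) / total if total else 0.0


def rebalance(zones, data_limit, log_limit):
    """Convert at most one zone per call and report which direction was taken."""
    if utilization(zones, "data") > data_limit:
        for z in zones:
            if z["role"] == "log" and z["used"] == 0:
                z["role"], z["converted"] = "data", True
                return "log->data"
    if utilization(zones, "log") > log_limit:
        for z in zones:
            # Assumption: only an empty, previously converted zone goes back.
            if z["role"] == "data" and z.get("converted") and z["used"] == 0:
                z["role"], z["converted"] = "log", False
                return "data->log"
    return None
```

Tracking the `converted` flag keeps areas that were data areas from the start out of the reverse conversion, matching the wording that only a data area converted from a free log area is turned back.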

With reference to the first aspect or any of the second to tenth possible implementations of the first aspect, in a fourteenth possible implementation the hard disk further includes a superblock, each log area is assigned identification information, and the superblock is used to record the identification information of a log area after that log area is modified. The superblock provides further management of the log areas: for example, after recovery from a power failure or crash, the hard disk control device can promptly find the modified log areas from the information recorded in the superblock.

With reference to the first aspect or any of the second to tenth possible implementations of the first aspect, in a fifteenth possible implementation the log areas and data areas are arranged alternately on the hard disk, so that data in the log areas and data in the data areas are placed close to each other.

With reference to the first aspect or any of the second to tenth possible implementations of the first aspect, in a sixteenth possible implementation the hard disk further includes block groups. A block group includes a preset number of log areas and data areas, arranged contiguously, and the block group can be used to coordinate the use of the data areas and log areas within it. For example, after a free target data area is determined from the management information of the block group, data area space is allocated for the data in the target data area. After the data is written into the data area space, the method further includes: generating target metadata from the data and the data area space; writing the target metadata to the cache device; and, after the hard disk control device determines that the metadata is hot data, determining the target block group to which the target data area belongs, determining an available log area of the target block group, and writing the target metadata into that log area. The metadata is thereby placed on the hard disk close to the data it describes, which makes the data convenient to read and write.

With reference to any of the second to tenth possible implementations of the first aspect, in a seventeenth possible implementation the method further includes: allocating log area space in the log area for the mapping relationship, and then writing the mapping relationship into the log area space allocated to it. The mapping relationship is thus also stored in the log area, so that it is kept reliably on the hard disk.

A second aspect of the present invention provides a hard disk control device. The hard disk control device includes a hard disk, the hard disk includes a data area and a log area, and the hard disk control device has the functions of the hard disk control device in the above method. These functions may be implemented in hardware, or by hardware executing corresponding software. The hardware or software includes one or more modules corresponding to the above functions.

In one possible implementation, the hard disk control device includes:

a write unit, configured to write data to the cache device;

a cache manager, configured to determine whether the data is hot data, where hot data is data that, once stored on the hard disk, can cause the hard disk to produce a preset number of fragments after a preset number of modifications and releases;

a data manager, configured to, if the data is not hot data, allocate data area space for the data in the data area and write the data into the data area space; and

a log manager, configured to, if the data is hot data, allocate log area space for the data in the log area and write the data into the log area space.

In another possible implementation, the hard disk control device includes:

a processor;

the processor performs the following action: writing data to the cache device;

the processor performs the following action: determining whether the data is hot data, where hot data is data that, once stored on the hard disk, can cause the hard disk to produce a preset number of fragments after a preset number of modifications and releases;

the processor performs the following action: if the data is not hot data, allocating data area space for the data in the data area and writing the data into the data area space;

the processor performs the following action: if the data is hot data, allocating log area space for the data in the log area and writing the data into the log area space.

In a third aspect, an embodiment of the present application provides a computer storage medium that stores program code, where the program code is used to instruct execution of the method of the first aspect.

As can be seen from the above technical solutions, the embodiments of the present invention have the following advantages:

In a hard disk control device that includes a hard disk with a data area and a log area, after data is written to the cache device, the hard disk control device determines whether the data is hot data. If the data is not hot data, data area space is allocated for it in the data area and the data is written into that space; if the data is hot data, log area space is allocated for it in the log area and the data is written into that space. The data to be written to the hard disk is thus divided into hot data and non-hot data, where hot data is data that, once stored on the hard disk, can cause the hard disk to produce a preset number of fragments after a preset number of modifications and releases. Hot data tends to cause disk fragmentation, so it is stored in the log area and managed in a log manner; even if frequent modification of data in the log area produces fragments, those fragments are easy to reclaim and otherwise manage. Non-hot data is stored in the data area; releasing non-hot data is unlikely to fragment the disk, so the data area does not need many resources devoted to fragment management. Storing different types of data in different areas of the hard disk and managing them in different ways therefore improves the efficiency of fragment management, and the efficient management of fragments in the log area reduces the generation of hard disk fragments.

Description of drawings

FIG. 1 is a logical view of an object in the log area provided by an embodiment of the present invention;

FIG. 2 is a flowchart of a hard disk data management method according to an embodiment of the present invention;

FIG. 3 is a schematic diagram of data migration in memory in the embodiment shown in FIG. 2;

FIG. 4 is a schematic diagram of data cached in memory in the embodiment shown in FIG. 2;

FIG. 5 is a schematic structural diagram of a hard disk control device provided by another embodiment of the present invention;

FIG. 6 is a schematic structural diagram of the reclamation unit of the hard disk control device shown in FIG. 5;

FIG. 7 is a schematic diagram of the hardware structure of a hard disk control device provided by another embodiment of the present invention.

Detailed description

Embodiments of the present invention provide a hard disk data management method and a hard disk control device for efficiently managing fragments on a hard disk.

1. Implementation environment of the hard disk data management method of the embodiments of the present invention

An embodiment of the present invention provides a hard disk data management system that includes a hard disk and memory. The memory can serve as the cache device. The hard disk is divided into data areas (Data zone) and log areas (Journal zone), and a log area manages the data on it in a log manner.

Before the hard disk data management system writes data to the hard disk, it first writes the data to the memory that serves as the cache device, and determines whether the data is hot data. Hot data is data that, once stored on the hard disk, can cause the hard disk to produce a preset number of fragments after a preset number of modifications and releases; modifying such data in the data area of the hard disk would produce a large number of fragments. Therefore, after the data is written to memory, if the data is not hot data, data area space is allocated for it in the data area and the data is then written into that space. If the data is hot data, log area space is allocated for it in the log area and the data is written into that space.

Here, data area space is storage space on a data area, and may be part or all of the space on one data area. Log area space is storage space on a log area, and may be part or all of the space on one log area.

In this way, the data to be written to the hard disk is divided into hot data and non-hot data. Hot data is data that, once stored on the hard disk, can cause the hard disk to generate a preset number of fragments after a preset number of modifications and releases, so it readily fragments the hard disk. Hot data is stored on the log area and managed in a journaling manner; even if frequent modification of data on the log area produces fragments, those fragments are easy to reclaim and otherwise manage. Non-hot data is stored in the data area; because releasing non-hot data does not readily fragment the hard disk, the data area does not need to devote many resources to fragment management. Storing different types of data in different areas of the hard disk and managing them in different ways therefore improves the efficiency of fragment management on the hard disk, and managing hot data through the log area also helps avoid generating fragments in the first place.

The log areas and data areas of the hard disk can be arranged in many ways; one implementation is described in detail below.

The hard disk is divided into two types of area, data areas and log areas. The embodiment of the present invention does not specifically limit the size of a data area or a log area; each may be, for example, 256 MB. The data areas and log areas may be arranged alternately, as shown in Table 1, which is an example of a hard disk space layout in which the hard disk is divided into a super block, data areas and log areas. Optionally, 0.1% of the log areas are designated as fixed log areas and are distributed at even intervals across the hard disk. A fixed log area can only be used as a log area, whereas the other log areas can be converted into data areas when hard disk space is insufficient.

Table 1

Data areas and log areas can be arranged on the hard disk in many ways, and the alternating arrangement described above is only one of them; the present invention does not specifically limit this. For example, the data areas may be arranged contiguously in one region of the hard disk and the log areas contiguously in another region; or several contiguous data areas may form a data area group and several contiguous log areas a log area group, with data area groups and log area groups arranged alternately; and so on.

A data area can store non-hot data; for example, data larger than 128 KB is written directly to a data area. Writing data to a data area generates metadata for managing that data. This metadata can be written to the cache device first, after which space is allocated for it in a log area for storage.

A log area can store hot data; for example, data smaller than 128 KB and metadata are stored in a log area. In some embodiments, hot data may be stored on the log area by appending. In some embodiments, identification information (an ID) may also be assigned to the log areas in sequence. On a log area, data is managed in a journaling manner.

In some embodiments, I/O on a log area is processed as sequential append writes. When a data block needs to be written to a log area, space is allocated starting from the tail of the previous write to that log area. When the log area cannot hold the data block, the log area with the most free space is selected for subsequent append writes.
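The append-write allocation described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the names `JournalZone` and `JournalAllocator`, the in-memory bookkeeping, and the error raised when no log area fits are all hypothetical.

```python
from dataclasses import dataclass

ZONE_SIZE = 256 * 1024 * 1024  # 256 MB, the example log area size from the text


@dataclass
class JournalZone:
    """One log area; `tail` is the offset just past the last appended block."""
    zone_id: int
    size: int = ZONE_SIZE
    tail: int = 0

    @property
    def free(self) -> int:
        return self.size - self.tail


class JournalAllocator:
    """Append-only space allocator over several log areas (illustrative only)."""

    def __init__(self, zones):
        self.zones = list(zones)
        self.current = self.zones[0]

    def allocate(self, length: int):
        """Return (zone_id, offset) for a block of `length` bytes."""
        if self.current.free < length:
            # The current log area cannot hold the block:
            # switch to the log area with the most free space.
            self.current = max(self.zones, key=lambda z: z.free)
            if self.current.free < length:
                raise MemoryError("no log area can hold the block")
        offset = self.current.tail      # allocate from the tail of the last write
        self.current.tail += length
        return self.current.zone_id, offset
```

With two small hypothetical log areas of 100 and 200 bytes, blocks of 60 and 30 bytes land at offsets 0 and 60 of the first area, and a following 50-byte block moves to the second area because the first has only 10 bytes left.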

The layout of a log area is shown in Table 2, where Journal ctrl is the identification information ID, Map is the mapping relationship, and Record is the data, for example data smaller than 128 KB and metadata.

Table 2

As shown in Table 1, in some embodiments the hard disk of the hard disk control device is also provided with a super block. After data is written to a log area, that log area has been modified, and the super block records the identification information ID of the modified log area. For example, when a batch of write operations to log areas is combined into one transaction, then after the data of the transaction has been saved to the hard disk, the IDs of the log areas modified by that transaction can be recorded in the corresponding bitmap in the super block.
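Recording modified log areas in a bitmap might look like the sketch below. It assumes one bit per log area, indexed by the log area ID, with the least significant bit of each byte first; the text does not specify the bit layout, and the function names are hypothetical.

```python
def mark_modified(bitmap: bytearray, journal_ids) -> None:
    """Set the bit of every log area modified by a committed transaction."""
    for jid in journal_ids:
        bitmap[jid // 8] |= 1 << (jid % 8)


def is_modified(bitmap: bytearray, jid: int) -> bool:
    """Report whether the bit for log area `jid` is set."""
    return bool(bitmap[jid // 8] & (1 << (jid % 8)))
```

A transaction that touched log areas 0, 9 and 8191 would set bit 0 of byte 0, bit 1 of byte 1 and bit 7 of byte 1023 in a 1024-byte bitmap.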

The embodiment of the present invention does not specifically limit the size of the super block, which can be adjusted to suit the device. For example, a 4 TB disk has 4 TB / 256 MB / 2 = 8192 log areas, so the super block needs 1024 bytes (one bit per log area) to record the overall usage of the log areas.
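The arithmetic in this example can be checked with a short sketch. It assumes binary units (1 TB = 2^40 bytes, 1 MB = 2^20 bytes), an alternating layout in which half of the areas are log areas, and one bitmap bit per log area; the function name is hypothetical.

```python
TB = 1024 ** 4
MB = 1024 ** 2


def journal_bitmap_size(disk_bytes: int, zone_bytes: int):
    """Return (number of log areas, bitmap size in bytes)."""
    total_zones = disk_bytes // zone_bytes
    journal_zones = total_zones // 2          # data areas and log areas alternate
    bitmap_bytes = (journal_zones + 7) // 8   # one bit per log area, rounded up
    return journal_zones, bitmap_bytes
```

For a 4 TB disk with 256 MB areas this gives 8192 log areas and a 1024-byte bitmap, matching the figures in the text.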

The layout of the super block may be as shown in Table 3. Super blkctrk records overall management information, such as the used capacity of the hard disk, the total capacity, the total free capacity, the number of log areas and the number of data areas. Journalbitmap records the numbers of the transactions that have been processed. Super blkctrk and Journalbitmap may each have a capacity of 4 KB.

Table 3

Super blkctrk | Journalbitmap

To keep data allocated to data areas and log areas close together, in some embodiments several log areas and data areas may be combined into a block group, within which the log areas and data areas are arranged contiguously. Each block group has a space management object that uses a bitmap to track the usage of hard disk space in that block group. For example, 16 contiguous log areas and data areas may be combined into one block group.

Table 4 and Table 5 show the relationship among block groups, data areas and log areas. Table 4 illustrates the layout of the hard disk in units of block groups, and Table 5 illustrates the layout of block group 1 in Table 4.

Table 4

Table 5

As shown in Table 2, a log area also stores a mapping relationship (Map). The mapping relationship may record the correspondence between data on the log area and the log area space allocated to that data. FIG. 1 shows a logical view of an object on the hard disk, and the mapping relationship is illustrated below with reference to that figure.

FIG. 1 shows an object on a log area. An object can be divided into multiple levels. The lowest level, level 0, corresponds to the data blocks of the object. Above level 0 is an indirect block, at level 1. The top level, level 2, is the block that holds the object management structure. When there are many data blocks, a single indirect block cannot hold the address pointers of all of them; multiple indirect blocks are then needed, and the number of levels of the object increases accordingly. Blocks on the same level are numbered from left to right; for example, the data blocks at the lowest level have blkid 0, 1, 2 and 3 in sequence.

When a transaction modifies a data block, the relationship between the data and the log area space allocated to it must be recorded in a mapping relationship. For example, if a transaction creates the object in FIG. 1, the information shown in Table 6 needs to be recorded in the mapping relationship. In Table 6, the columns of the mapping relationship record, in order, objsetid, objid, levelid, blkid, journalid, offset and size.

Here, objsetid is the object set ID; objid is the object ID; levelid is the level on which the data block is located; blkid is the left-to-right sequence number of the data block within its level; journalid is the ID of the log area to which the data block is written; offset is the relative offset at which the data block is written within the log area; and size is the size of the data block written.

It can be understood that the mapping relationship may record all of the information types above, only some of them, or additional information types; this is not specifically limited in the embodiment of the present invention.
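One way to represent a row of the mapping relationship is sketched below. The field names follow the text; the class name `MapRecord`, the helper `records_for_object`, the integer field types, and the assumption that all blocks of the object are appended back to back in a single log area are illustrative only. The values of Table 6 are not reproduced here.

```python
from typing import NamedTuple


class MapRecord(NamedTuple):
    objsetid: int   # object set ID
    objid: int      # object ID
    levelid: int    # level of the block within the object
    blkid: int      # left-to-right sequence number within the level
    journalid: int  # ID of the log area the block was written to
    offset: int     # relative offset of the block within that log area
    size: int       # size of the block written


def records_for_object(objsetid, objid, blocks_per_level, journalid, block_size):
    """Build one record per block, assuming sequential appends into one log area."""
    records, offset = [], 0
    for levelid, count in enumerate(blocks_per_level):
        for blkid in range(count):
            records.append(MapRecord(objsetid, objid, levelid, blkid,
                                     journalid, offset, block_size))
            offset += block_size
    return records
```

For an object shaped like the one in FIG. 1 (four data blocks at level 0, one indirect block at level 1, one management block at level 2), `records_for_object(1, 7, [4, 1, 1], 3, 4096)` yields six records; the IDs and the 4096-byte block size are made-up values.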

Table 6

It can be understood that, in some embodiments, a device other than the memory may serve as the cache device, for example an NVDIMM, a flash card or an SSD (solid state drive). It can also be understood that, in some embodiments, the hard disk may contain no log area, with hot data instead kept on the cache device; this is not specifically limited in the embodiment of the present invention.

It can be understood that the hard disk control device of the embodiment of the present invention can be used in equipment such as computers and servers; this is not specifically limited in the embodiment of the present invention.

FIG. 2 is a flowchart of a hard disk data management method according to an exemplary embodiment. The method is applied to a hard disk control device that includes a hard disk, and the hard disk includes a data area and a log area. With reference to the first part above, namely the implementation environment of the hard disk data management method, and taking as an example the hard disk control device executing the method, the method flow provided by the embodiment of the present invention includes the following steps, as shown in FIG. 2:

Step 201: Write data to the memory.

Before a device writes data to the hard disk, the hard disk control device first writes the data to the cache device so that the data can be managed before it reaches the hard disk, for example by the cache manager of the hard disk control device.

The cache device may be a memory, a flash memory or another cache device; this is not specifically limited in the embodiment of the present invention.

In the embodiment of the present invention, the description takes the memory as the cache device. The data written to the memory includes all external data of the hard disk control device and the metadata generated when data is written to the data area of the hard disk.

Writing data to the memory takes two forms, a modify write and a create write. In a modify write, the data written modifies data already in the memory, and the new data overwrites the data being modified. In a create write, new data is written to the memory, and the memory holds no cached original version of that data.

Step 202: Determine whether the data is hot data. If the data is not hot data, perform step 203; if the data is hot data, perform step 204.

After the data is written to the memory, the hard disk control device determines whether the data is hot data, for example through its cache manager module, and then processes the data differently depending on the result.

Hot data is data that, once stored on the hard disk, can cause the hard disk to generate a preset number of fragments after a preset number of modifications and releases. In this embodiment, for example, hot data may be data whose size is smaller than a preset data threshold, and/or hot data may be metadata.

Data smaller than the preset data threshold is small data. Frequent modification and release of small data readily fragments the hard disk, whereas data larger than the preset data threshold does not produce a large number of fragments even if it is modified frequently on the hard disk. The preset data threshold depends on the service model and may be set to, for example, 64 KB or 128 KB.

Metadata consists of the data blocks that record the object management structure and the addresses of data blocks. The data area generally uses a COW (copy-on-write) mechanism: when a block of data is to be modified, the old version is not overwritten directly; instead the old version is read, modified, and written to a new location, and the old version is released. Because the location of the data has changed, the pointer in the index block one level above the data must be modified, that is, the metadata must be modified. The modified metadata is allocated new space and the old metadata is released, and this repeats recursively up to the top level. A single modification therefore releases a large number of blocks, each release leaves a fragment at the released location, and the hard disk fragments faster.
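The cascade described above can be made concrete with a small sketch. It assumes a deliberately simplified model in which each block is identified by a single address and new blocks are taken from consecutive free addresses; the function name and the addresses are hypothetical.

```python
def cow_modify(path, next_free):
    """Copy-on-write update of one data block.

    `path` lists block addresses from the data block (level 0) up to the
    top-level block. Every block on the path is rewritten at a new address
    and its old address is released.
    """
    new_path = list(range(next_free, next_free + len(path)))
    released = list(path)
    return new_path, released
```

Modifying one data block of a three-level object whose blocks sit at addresses 10, 500 and 9000 releases all three addresses, leaving three separate holes on the disk for a single logical write.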

The embodiment of the present invention therefore classifies data smaller than the preset data threshold and/or metadata as hot data. Such data readily fragments the hard disk and needs to be managed accordingly.
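The classification in step 202 might be sketched as follows. This assumes the variant in which both criteria apply (metadata or small data), a 128 KB threshold taken from the examples in the text, and that data exactly at the threshold counts as non-hot, which the text does not specify; the names `is_hot` and `choose_area` are hypothetical.

```python
DATA_THRESHOLD = 128 * 1024  # bytes; example value from the text, workload dependent


def is_hot(size, is_metadata=False, threshold=DATA_THRESHOLD):
    """Hot data: metadata, or data smaller than the preset data threshold."""
    return is_metadata or size < threshold


def choose_area(size, is_metadata=False):
    """Return the kind of area in which space is allocated for the data."""
    return "journal" if is_hot(size, is_metadata) else "data"
```

A 4 KB write is routed to a log area and a 256 KB write to a data area, while a 256 KB block of metadata still goes to a log area.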

Step 203: Allocate data area space for the data in the data area, and write the data into the data area space.

Data determined not to be hot data does not readily fragment the hard disk and can therefore be stored in a data area on the hard disk. A data area is a region of the hard disk used to store data, and the data on it can be managed using the COW mechanism. A data area may be a region of the hard disk with a preset size, for example 256 MB. For more on the data area, refer to the description in the implementation environment section above.

For example, when a piece of data is larger than 128 KB, the cache manager determines that it is not hot data; the data area manager module of the hard disk control device then allocates data area space for the data on a data area and writes the data into the allocated space.

Writing data to a data area generates metadata, which records the address of the data and so makes the data easier to manage. For example, multiple blocks of data are usually organized with a multi-level index: an index block is allocated one level above the data blocks, its content being the addresses of the data blocks, and through the index block multiple data blocks can be joined into one logically contiguous object. The index block is one type of metadata. After writing data to the data area has generated metadata, the metadata can be written to the memory, that is, step 201 above is executed for it. The cache manager then determines that the metadata is hot data, and the metadata is kept in the log area and in the memory.

It can be understood that, in some embodiments, the hard disk of the hard disk control device further includes block groups. A block group includes a preset number of log areas and data areas arranged contiguously. In a hard disk control device with block groups, data area space is allocated as follows: a suitable block group is selected, for example one with little space in use or with many free data areas, and data area space is allocated for the data on a data area of the selected block group. After the space is allocated, its usage must be recorded: free data blocks are found and allocated in the space management bitmap of the block group, and the management structure of the block group is then modified. The metadata can subsequently be written to a log area belonging to that block group according to the space usage recorded by the block group; if the block group has no free log area, a neighbouring log area is selected instead, so that the metadata stays close on the hard disk to the data it points to.
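Block group selection and bitmap allocation could be sketched as below. The sketch assumes that "suitable" means the block group with the most free blocks, that the bitmap tracks fixed-size blocks, and that space is taken first-fit as a contiguous run; the class and function names are hypothetical, and the placement of metadata in a nearby log area is not modeled.

```python
class BlockGroup:
    """A block group with a space management bitmap (True = block in use)."""

    def __init__(self, group_id, n_blocks):
        self.group_id = group_id
        self.bitmap = [False] * n_blocks

    @property
    def free_blocks(self):
        return self.bitmap.count(False)

    def allocate(self, n):
        """Mark the first run of n free blocks as used; return its start or None."""
        run = 0
        for i, used in enumerate(self.bitmap):
            run = 0 if used else run + 1
            if run == n:
                start = i - n + 1
                self.bitmap[start:i + 1] = [True] * n
                return start
        return None


def allocate_data_space(groups, n):
    """Try block groups from most to least free space; return (group_id, start)."""
    for group in sorted(groups, key=lambda g: g.free_blocks, reverse=True):
        start = group.allocate(n)
        if start is not None:
            return group.group_id, start
    raise MemoryError("no block group has a free run of the requested length")
```

Preferring the emptiest block group is only one possible policy; the text equally allows choosing the group with the most free data areas.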

For the block group, refer to the corresponding description in the implementation environment section above.

It can be understood that the data written to the data area may include, besides data determined to be non-hot, data evicted from the memory because the memory is short of space. For example, in the second stage of the subsequent memory reclamation, the data pointed to by the mapping relationship is migrated from the log area to the data area. The migrated data is likewise written into data area space after the data area manager has allocated that space in a data area.

Step 204: Allocate log area space for the data in the log area.

The hard disk is also provided with a log area, and the data on the log area is managed in a journaling manner.

When the data of step 201 is determined to be hot data, log area space is allocated for it on a log area. For example, the data is passed to the log manager, which allocates log area space for the hot data so that it can be stored on the log area.

In the embodiment of the present invention, to make the data on the log area easier to manage, space on the log area may be allocated to hot data in the order in which the data is written to the log area. Of course, in other embodiments, log area space need not be allocated to hot data sequentially; this is not specifically limited in the embodiment of the present invention.

Hot data is stored on the log area so that, if the hot data in the memory is lost because of a power failure, the data on the log area can be read back into the memory for the device to operate on. In addition, using the memory together with the log area and storing hot data in the log area increases the amount of hot data that the memory can manage.

A log area may be a region of the hard disk of a certain size, for example 256 MB. The hard disk may have multiple log areas and data areas, and identification information may be assigned to the log areas in the order in which they are arranged on the hard disk. For how log areas are arranged, refer to the corresponding description in the implementation environment section above.

Log areas and data areas can be arranged on the hard disk in many ways; for example, they may alternate, as shown in Table 1 above. Data areas and log areas may also be arranged in other ways, which is not specifically limited in the embodiment of the present invention; for details, refer to the corresponding description in the implementation environment section above.

Step 205: Establish a mapping relationship between multiple target data and the log area space allocated to the multiple target data.

The target data is hot data.

After multiple pieces of data have been written to the memory and each has been checked, the memory may hold several pieces of data that are hot data. The hard disk control device identifies these as target data so that multiple target data can be managed together, which improves processing efficiency. Every target data has log area space allocated on a log area, and the hard disk control device establishes the mapping relationship from the target data and the log area space allocated to them. The target data is hot data, that is, it includes metadata and/or data smaller than the preset data threshold.

For all data written to the log area, the mapping between the data and its log area space must be recorded. The mapping relationship thus records the data on the log area, and the hot data in the memory or the data on the log area can be managed according to it. For example, the device reads data stored on the log area by following the index in the mapping relationship cached in the memory, or reclaims a log area according to the data information recorded in the mapping relationship so as to manage fragments on the log area.

The information recorded in the mapping relationship may include objsetid, objid, levelid, blkid, journalid, offset, size and the like.

For more on the mapping relationship, refer to the corresponding description in the implementation environment section above.

Step 206: Cache the mapping relationship in the memory.

Once the mapping relationship has been established, it is kept in the memory in preparation for subsequent operations.

In some embodiments, the mapping relationship may instead be stored in the log area and read from the log area into the memory cache when it is needed there. Of course, in some embodiments, the mapping relationship may be stored in both the memory and the log area.

Step 207: Combine the multiple write operations for the multiple target data into one transaction.

Once multiple target data have been identified, write operations must be performed to write them to the log area. The hard disk control device combines the write operations for these target data into one transaction and performs disk writes in units of transactions. Here a transaction is a name for the combined group of write operations, not the execution of those operations. Multiple target data means at least two target data and, correspondingly, multiple write operations means at least two write operations.

To improve write efficiency and ensure write reliability, a device writing to the log area generally does not perform a single write operation at a time; instead, one operation of writing to the log area carries out write operations for multiple pieces of data. The target data belonging to the same batch of write operations are either all written to the hard disk successfully or all fail. The write operations for the target data of one batch are combined into one transaction.

It can be understood that the embodiment of the present invention does not specifically limit the order in which step 207 and step 205 are performed. The write operations of multiple target data are combined into one transaction, and a mapping relationship can be established from the same target data, so one transaction corresponds to one mapping relationship.

In some embodiments, the method further includes establishing a data linked list from the multiple target data, that is, forming a data linked list from the metadata and the data smaller than the preset threshold that belong to one transaction. The data linked list is used to manage the target data. When the mapping relationship is formed, it can be established from the data on the data linked list and the space allocated to that data on the log area.

Step 208: Write all target data of the transaction into the log area space.

The hard disk control device writes all target data of the transaction into the log area space. Once multiple target data have been combined into one transaction, if the write operation for any one target data of the transaction fails, the write operations for the other target data of the transaction fail as well. The transaction succeeds only if the write operation for every target data succeeds.
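The all-or-nothing behaviour of a transaction can be illustrated with the sketch below. It models a log area as an in-memory byte buffer with a committed tail, and the only failure it simulates is a transaction that does not fit; a real device would also have to cope with I/O errors and power loss, which this sketch does not attempt. The class name `Journal` and its methods are hypothetical.

```python
class Journal:
    """A log area modeled as a byte buffer with a committed tail offset."""

    def __init__(self, capacity: int):
        self.buf = bytearray(capacity)
        self.tail = 0  # everything before this offset belongs to committed transactions

    def commit(self, blocks):
        """Append every block of one transaction, or none of them.

        Returns the (offset, size) of each block, from which a mapping
        relationship could be built.
        """
        total = sum(len(b) for b in blocks)
        if self.tail + total > len(self.buf):
            raise OSError("transaction does not fit; nothing was written")
        placements, pos = [], self.tail
        for block in blocks:
            self.buf[pos:pos + len(block)] = block
            placements.append((pos, len(block)))
            pos += len(block)
        self.tail = pos  # publish the new tail only after every block is in place
        return placements
```

Because the tail moves only after the last block has been copied, a failed transaction leaves the committed contents and the tail unchanged.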

Once the mapping relationship has been established, the log area manager allocates log area space for it. The hot data has also been allocated log area space, so the mapping relationship and the target data can each be written into the log area space allocated to them and thereby stored on the log area.

After the mapping relationship has been stored in the log area, the system can read it into the memory, so the memory can recover the mapping relationship; this is especially useful when the memory resumes work after a power loss. Of course, in some embodiments the mapping relationship need not be stored on the log area, in which case no log area space has to be allocated for it; this still achieves the effect of reducing fragments in the data area, and it is not specifically limited in the embodiment of the present invention.

The embodiment of the present invention does not specifically limit the order in which the mapping relationship and the target data of the transaction are written into the log area space.

In some embodiments, allocating log area space for the data in the log area and writing the data into that space is done as follows: log area space is allocated for the data sequentially in the log area, and the data is appended sequentially to that log area space. Sequential allocation and sequential appending mean that space is allocated, or data is written, in order within the storage space of the log area.

Because log area space is allocated sequentially and data is appended sequentially, data in the log area can be read and written sequentially when it is managed, which improves the efficiency of data management. The data in the log area is located by its order, and the mapping relationship records the correspondence between the data saved in the log area and the log area space of that data. The mapping relationship can therefore take the place of metadata: no metadata is needed to locate data in the log area, so there is no additional metadata management overhead. Writing data to the log area by sequential appending also makes it easy for the mapping relationship to record where the data is stored in the log area.
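Sequential allocation with a mapping relationship can be sketched as follows. The `AppendLog` class, the byte-offset addressing, and the `(offset, length)` tuple are illustrative assumptions; the description only requires that space be allocated in order and that the mapping relationship record where each piece of data lives.

```python
class AppendLog:
    """Hypothetical log area that allocates space strictly in order."""

    def __init__(self, size):
        self.size = size
        self.next_offset = 0  # everything before this offset is already allocated
        self.mapping = {}     # data key -> (offset, length) in the log area

    def append(self, key, length):
        """Allocate the next free range for key and record it in the mapping."""
        if self.next_offset + length > self.size:
            raise ValueError("log area full")
        offset = self.next_offset
        self.next_offset += length
        self.mapping[key] = (offset, length)
        return offset
```

Because allocation only ever moves `next_offset` forward, the mapping alone is enough to locate any entry, which is the role the description assigns to the mapping relationship in place of separate metadata.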

In an embodiment of the present invention, the method further includes: if the data is hot data, caching the data in memory. That is, all target data of the transaction is written into the log area space and is also cached in memory; in other words, after it is determined whether the data is hot data, data that is hot is retained in memory. Then, if a later write to memory modifies target data that is already cached in memory, the data can be modified directly in memory, which carries out the step of managing the target data according to the data linked list. The data is thus retained in memory and migrates within memory. Hot data tends to cause hard disk fragmentation, so caching it in memory rather than storing it in the data area avoids fragments in the data area caused by migration of the data.

In some embodiments, the step of caching the data in memory when it is hot data may be omitted. Instead, before step 201, that is, before data is written to the cache device, data is read from the log area into the cache device (the memory) and cached there. This also allows the data to be modified directly in memory, so as to carry out the step of managing the target data according to the data linked list. The data is thus retained in memory and migrates within memory, which avoids fragments in the data area caused by migration of the data.

In some embodiments, the hard disk control device further includes a super block. When each log area is assigned identification information, the super block can be used to record the identification information of a log area after that log area is modified. Then, after the memory loses power, the data in the corresponding log areas can be read according to the log area identification information recorded in the super block, and the data read from the log areas is cached in memory so that subsequent data operations can continue.

For example, after all target data and the mapping relationship of a transaction have been written into the log area space, and once all write operations of the transaction have completed, the identification information of the modified log area is recorded in the bitmap of the super block.
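A bitmap of modified log areas can be sketched as below. The `SuperBlock` class, the use of a small integer as the log area identifier, and the method names are hypothetical; the description states only that the identification information of a modified log area is recorded in the super block's bitmap.

```python
class SuperBlock:
    """Hypothetical super block tracking modified log areas in a bitmap."""

    def __init__(self, area_count):
        self.area_count = area_count
        self.bitmap = 0  # bit i is set when log area i has been modified

    def mark_modified(self, area_id):
        if not 0 <= area_id < self.area_count:
            raise IndexError("unknown log area")
        self.bitmap |= 1 << area_id

    def clear(self, area_id):
        self.bitmap &= ~(1 << area_id)

    def modified_areas(self):
        """Log areas whose contents should be re-read into memory after power loss."""
        return [i for i in range(self.area_count) if (self.bitmap >> i) & 1]
```

After a power loss, a recovery routine would iterate over `modified_areas()` and read only those log areas back into memory.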

It can be understood that, in some embodiments, the hard disk further includes block groups. A block group includes a preset number of log areas and data areas, and the log areas and data areas of a block group are laid out contiguously. In this case, allocating data area space for the data in the data area, as described above, may specifically be: determining a free target data area according to the management information of the block group, and allocating data area space for the data in the target data area.

After the data is written into the data area space, target metadata is generated according to the data and the data area space. The target metadata records the address of the data in the data area, which makes the data easier to manage and query.

After the metadata is generated, the target metadata is written to memory. Once the metadata is determined to be hot data, the target block group to which the target data area belongs is determined, and then an available log area of the target block group is determined; that is, an available log area in the target block group is looked up and the target metadata is written into it. If the target block group has no available log area, the available log area nearest to the target data area or target block group is used.

When data is written to the log area in units of transactions, the same method can be used to allocate log area space for the multiple target data of a transaction, so that the metadata of the target data is placed close to the data that the metadata points to.

In this way, after the metadata and the data it points to are stored on the hard disk according to the above method, the metadata is stored close to the data it points to, which reduces the distance the magnetic head has to move and improves the read/write efficiency of the hard disk.
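The placement rule for metadata (prefer a log area in the same block group, otherwise the nearest one) can be sketched as follows. Measuring "nearest" by the difference between block group indexes is an assumption, as are the function name and the dictionary layout; the description does not define the distance measure.

```python
def pick_log_area(target_group, free_log_areas):
    """Choose a log area for metadata whose data lives in target_group.

    free_log_areas maps a block group index to the list of free log area ids
    in that group (hypothetical layout). Returns (group, log_area_id) or None.
    """
    own = free_log_areas.get(target_group)
    if own:
        return target_group, own[0]
    candidates = [g for g, areas in free_log_areas.items() if areas]
    if not candidates:
        return None
    # Assumption: block groups are laid out in index order on disk, so the
    # index difference approximates the head travel distance.
    nearest = min(candidates, key=lambda g: (abs(g - target_group), g))
    return nearest, free_log_areas[nearest][0]
```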

In some embodiments of the present invention, the multiple target data in memory that belong to the same transaction are managed as a linked list. As described above, after determining the multiple target data, the method can build a data linked list from them; the data linked list is used to manage the target data. That is, after the data and metadata of each transaction are committed to the hard disk, the corresponding data in the log area is managed by means of a linked list.

Because the write operations for these target data are combined into one transaction, one transaction corresponds to one data linked list, and the target data managed by the data linked list is the same as the target data of the transaction. A mapping relationship is also established from these target data, so one data linked list corresponds to one mapping relationship, which records how the data managed by that data linked list is stored in the log area.

The target data can be managed according to the data linked list. The specific management method is as follows:

Following the steps described above, after multiple first target data that are hot data are determined in memory, a first data linked list is built from these first target data. The first target data belong to a first transaction; that is, they are written to the log area in the same write batch, and if any one of the first target data fails to be written, the writes of the other data of the first transaction fail as well. A first mapping relationship is also established for these first target data and is cached in memory. For example, the first data linked list manages first target data A1, first target data B1, first target data C1, and first target data D1 in memory. The corresponding first mapping relationship records the relationship between first target data A1, B1, C1, and D1 and the log area space allocated to them in the log area. Later, the hard disk control device builds a second data linked list from multiple second target data, which are hot data and belong to a second transaction, and a second mapping relationship is established from the multiple second target data. The second data linked list may, for example, manage second target data A2, second target data E1, second target data F1, and second target data D2.

When second target data managed by the second data linked list is obtained by modifying first target data managed by the previously built first data linked list, management of that first target data is released on the first data linked list, and the information about that first target data is deleted from the first mapping relationship corresponding to the first data linked list. This realizes the migration of the target data across data linked lists and mapping relationships. For example, when second target data A2 managed by the second data linked list is obtained by modifying first target data A1 of the first data linked list, first target data A1 in memory has been modified into second target data A2 and is no longer needed, so the first data linked list can release its management of first target data A1; correspondingly, the information about first target data A1 recorded in the first mapping relationship can also be deleted. Because the information about first target data A1 has been deleted from the first mapping relationship, when the log area is reclaimed it can be determined from the first mapping relationship that the log area corresponding to first target data A1 contains a fragment. When data in the log area is migrated and merged, no information about first target data A1 can be read from the first mapping relationship, so first target data A1 in the log area need not be relocated to a new log area; that is, first target data A1 in the log area can be released.
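The migration of data between data linked lists can be sketched in a few lines of Python. The `Transaction` and `Cache` classes, the use of an ordered dictionary in place of a hand-written linked list, the auto-incremented transaction number, and the `owner` index are all illustrative assumptions, not the structures used by the patent.

```python
from collections import OrderedDict


class Transaction:
    """One transaction: its data list and its mapping relationship."""

    def __init__(self, txn_id):
        self.txn_id = txn_id
        self.data = OrderedDict()  # logical key -> payload (stands in for the linked list)
        self.mapping = {}          # logical key -> location in the log area


class Cache:
    """Hypothetical in-memory cache of hot data grouped by transaction."""

    def __init__(self):
        self.next_txn_id = 1
        self.transactions = {}  # txn_id -> Transaction
        self.owner = {}         # logical key -> txn_id that currently manages it

    def commit(self, writes, locations):
        """Register a new transaction; supersede older versions of its keys."""
        txn = Transaction(self.next_txn_id)
        self.next_txn_id += 1
        for key, payload in writes.items():
            old_id = self.owner.get(key)
            if old_id is not None:
                old = self.transactions[old_id]
                old.data.pop(key, None)     # release management in the old list
                old.mapping.pop(key, None)  # delete the stale mapping entry
            txn.data[key] = payload
            txn.mapping[key] = locations[key]
            self.owner[key] = txn.txn_id
        self.transactions[txn.txn_id] = txn
        return txn
```

With the example from the text, committing A1, B1, C1, D1 and then A2, E1, F1, D2 leaves only B1 and C1 under the first transaction, and the first mapping no longer mentions A or D.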

FIG. 3 is a schematic diagram of data migrating in memory. Each transaction includes multiple target data; FIG. 3 labels only some of the data blocks. As shown in FIG. 3, transactions are generated continuously, and because of I/O locality, later transactions modify data managed by the first transaction. For example, the second transaction modifies first target data A1 to obtain second target data A2; the fifth transaction modifies first target data B1 to obtain fifth target data B2; the fourth transaction modifies first target data C1 to obtain fourth target data C2; the second transaction modifies first target data D1 to obtain second target data D2; and the third transaction modifies second target data D2 to obtain third target data D3. From the first data linked list corresponding to the first transaction, it can then be seen that all data of the first transaction has migrated to later transactions, so the log area can release all target data of the first transaction. As shown in Table 7, log area 1 stores the target data of the first transaction. After all data of the first transaction has migrated to later transactions, all data blocks in log area 1 have migrated to later transactions and been written into other log areas; that is, the other log areas hold the new versions of the data in log area 1. At this point, log area 1 can be released during log area reclamation and becomes an empty log area. Other target data migrates in a similar way. By the time the fifth transaction is executed, the actual cache state in memory is as shown in FIG. 4.

Table 7

To make it easier to manage the target data according to the data linked lists, in some embodiments the method further includes assigning transaction numbers, by an incrementing rule, to the data linked lists corresponding to the transactions according to the order in which the transactions are written. The write order of transactions is the order in which different transactions are written to the log area: the data linked list of a transaction written earlier is assigned a smaller transaction number, and the data linked list of the next transaction written to the log area is assigned a number one unit larger. Each data linked list thus has its own identifier, and the identifiers increase monotonically, so the order in which the data linked lists in memory were built can easily be determined from the transaction numbers.

For example, the data linked list of the first transaction is built first, and the first transaction is written to the log area first, so its data linked list is assigned transaction number 1. The data linked list of the second transaction is then built from the multiple second target data; the data of the second transaction is written to the log area after the first transaction, so the second data linked list corresponding to the second transaction is assigned transaction number 2. Similarly, the third data linked list is assigned transaction number 3, and so on.

The above is part of how data is managed according to the data linked lists. Data that is hot data is written to the log area. By caching that hot data in memory, or by reading data from the log area and caching what was read in memory, the data can be managed in memory according to the data linked lists. Hot data then migrates within memory rather than on the hard disk, which reduces the fragments that the data would otherwise cause on the hard disk.

Caching the data in memory also means that reads can be served directly from memory, without reading the data from the log area.

In addition, managing the data in memory according to the data linked lists establishes and updates the mapping relationships, and the data and fragments in the log area can be organized according to the mapping relationships. This avoids or reduces hard disk fragmentation and also improves the efficiency of fragment handling. To make full use of memory, hot data cached in memory can be released and the memory space reclaimed so that the memory can cache more data. The method of the embodiments of the present invention can therefore reclaim memory space under a preset release condition; for example, cache eviction is triggered when the reclamation water level of system memory is reached. An exemplary reclamation method, divided into two stages, is described below.

First stage

When the memory reaches a first preset water level, the data linked lists are searched in ascending order of transaction number, starting from the data linked list with the smallest current transaction number, for data whose management has not been released. For each target data linked list found, the data it still manages is released from memory, while the target mapping relationship corresponding to the target data linked list is retained in memory.

Because the data is managed through data linked lists, data managed by an earlier data linked list may already have been evicted from memory when later-written data modified it; such evicted data has already been released from its data linked list, as described above for the management of data linked lists. Reclaiming memory therefore means finding the data in the data linked lists that has not been evicted and releasing it from memory. The data linked lists to which the data released during memory reclamation belong are called target data linked lists. The mapping relationships corresponding to the target data linked lists remain in memory. First-stage memory reclamation stops only when the memory space reaches a preset stop water level.

After the above steps, data no larger than the preset data threshold, together with the metadata generated by writing data, has been written to the log area and cached in memory. As described above, this hot data in the log area and in memory is managed in transaction order. When the system has run for some time and the cache reaches the first preset water level, a background cache reclamation thread is triggered and reclaims memory space in ascending order of transaction number. For example, the data linked lists are searched in ascending order of transaction number, starting from the one with the smallest current transaction number, for data whose management has not been released, so that data in memory is released starting from the earliest-built data linked list.

The data that remains un-evicted in the earliest-built data linked lists is relatively inactive, and the device is less likely to read or modify it, so releasing it from memory has little effect on the device's read efficiency. Because the target mapping relationship is kept in memory, when the device needs to read data that is not found in memory, it looks the data up in the mapping relationships retained in memory; if the target mapping relationship identifies the requested data, the corresponding data can be read from the log area according to the target mapping relationship. In this way, the memory can manage more hot data.
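The first-stage reclamation can be sketched as follows. Counting memory usage as the number of cached payloads, the nested dictionary layout, and the function name are assumptions made for illustration; the description leaves the accounting and the water level values open.

```python
def evict_stage_one(transactions, used, stop_level):
    """Release cached payloads of the oldest transactions, keeping their mappings.

    transactions: txn_id -> {"data": {key: payload}, "mapping": {key: location}}
    used:         current memory usage, counted in cached payloads (assumption)
    stop_level:   usage at which first-stage reclamation stops
    Returns the usage after reclamation.
    """
    for txn_id in sorted(transactions):  # smallest transaction number first
        if used <= stop_level:
            break
        txn = transactions[txn_id]
        used -= len(txn["data"])
        txn["data"].clear()  # payloads leave memory; the mapping stays for later reads
    return used
```

After this stage a read that misses in memory can still find the entry in the retained mapping and fetch the payload from the log area.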

It can be understood that the memory reaching the first preset water level is only one kind of preset release condition. The search for data whose management has not been released can also be triggered by other preset release conditions, such as the expiry of a timer. The embodiments of the present invention do not specifically limit this.

Second stage

After first-stage reclamation, when the memory reaches a second preset water level, the data pointed to by the target mapping relationship is read from the log area, that data is written into the data area, and the target mapping relationship is deleted from memory.

To make still fuller use of memory, the memory space can be reclaimed a second time in the second stage. The second preset water level is the water level that triggers second-stage reclamation. It may be the same as or different from the first preset water level; the embodiments of the present invention do not specifically limit this, nor do they limit the specific values of the first and second preset water levels, which can be set flexibly according to, for example, the actual memory capacity and the type of service.

After the data pointed to by the target mapping relationship is read from the log area, it can be written into the data area. By the time the memory reaches the second preset water level, the memory has been in use for some time and more and more mapping relationships have been cached in it. When the mapping relationships themselves eventually reach the cache water level, the data they point to needs to be read and written into the data area.

When the memory reaches the second preset water level, the data linked list management described above has been in effect, so the earlier a data linked list was built, the less likely its remaining, unreleased data is to be modified: if data managed by a data linked list had been modified by later-written data, it would have migrated from that data linked list to the data linked list of the later transaction. The data remaining in the earliest-built data linked lists can therefore be regarded as inactive. Because inactive data is unlikely to be modified, it causes few hard disk fragments through modification, and the data pointed to by the target mapping relationship can be written into the data area. For example, the data is read from the log area according to the target mapping relationship, the data area manager allocates data area space in the data area, and the data is then written into the allocated data area space.
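The second-stage reclamation for one target transaction can be sketched like this. Plain dictionaries stand in for the log area and the data area, and the function name is hypothetical; allocation by the data area manager is reduced to a dictionary assignment.

```python
def evict_stage_two(txn, log_area, data_area):
    """Move the data a retained mapping points to from the log area to the data area.

    txn:       {"data": {...}, "mapping": {key: log_location}}
    log_area:  log_location -> payload (stand-in for reading the log area)
    data_area: key -> payload (stand-in for allocated data area space)
    """
    for key, location in list(txn["mapping"].items()):
        data_area[key] = log_area[location]  # read from the log area, write to the data area
        del txn["mapping"][key]              # the mapping entry is no longer needed in memory
```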

After the above steps, when the device needs to read data on the hard disk, for example to access data that has already been written to the log area, the hard disk control device can first look in memory; on a hit in memory, the data is returned directly. After the first stage of memory reclamation, some hot data has been removed from memory while the corresponding mapping relationships have been retained. So if the requested data is not found in memory but can be found in a mapping relationship cached in memory, it can be read directly from the log area indicated by that mapping relationship. If the requested data cannot be found in the mapping relationships either, it is looked up and read in the data area.
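The three-level lookup order (memory, then retained mapping relationships, then the data area) can be sketched as below. The function name and the dictionary stand-ins for memory, the log area, and the data area are assumptions for illustration.

```python
def read_block(key, memory, mappings, log_area, data_area):
    """Return the payload for key using the lookup order described above.

    memory:    key -> payload for data still cached in memory
    mappings:  iterable of {key: log_location} dictionaries retained in memory
    log_area:  log_location -> payload
    data_area: key -> payload
    """
    if key in memory:                 # 1. hit in memory
        return memory[key]
    for mapping in mappings:          # 2. payload evicted, mapping retained
        if key in mapping:
            return log_area[mapping[key]]
    return data_area.get(key)         # 3. fall back to the data area
```

Because a superseded entry is deleted from its old mapping relationship, a key is expected to appear in at most one retained mapping, so the iteration order over `mappings` does not matter in this sketch.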

It can be understood that, in embodiments where data is written to the log area by sequential appending, once the data linked lists of several consecutive transactions have been fully evicted, for example after the second stage of memory reclamation described above, the log areas corresponding to the deleted mapping relationships are also completely released and can be reused as empty log areas. The specific methods for releasing and reclaiming log areas are described below.

In some embodiments, after the above method is executed, the embodiments of the present invention further include an operation of reclaiming log areas, which reduces fragments in the log areas and makes full use of log area space.

Log area reclamation can build on the method described above. It is explained below for the embodiment in which data is written to the log area by sequential appending; that is, step 208 is to append the target data of the transaction sequentially to the log area space, so that the data in the log area is stored in transaction order.

A specific method of reclaiming log areas may, for example, include the following steps:

A1: Under a preset reclamation condition, perform the log area data relocation step.

The preset reclamation condition includes at least one of the following: a timer expires, a reclamation operation on memory data completes, or the total water level of the log areas reaches a preset water level threshold. Completion of a reclamation operation on memory data means that, during the memory reclamation described above, the completion of each stage triggers the log area data relocation step; that is, it triggers the log reclamation thread to run the log area reclamation procedure.

A2: When the total space water level of the current log areas reaches a preset space threshold, stop performing the log area data relocation step; otherwise, continue performing it.

The log area data relocation step includes:

B1: Starting from the data linked list with the smallest current transaction number, look up the mapping relationships corresponding to the data linked lists in ascending order of transaction number.

After the method has assigned transaction numbers to the data linked lists by the incrementing rule according to the write order of the transactions, the hard disk control device looks up the mapping relationships corresponding to the data linked lists in ascending order of transaction number. Because a mapping relationship records the relationship between data and the space allocated to that data in the log area, the distribution in the log area of the data blocks in the mapping relationship can be analyzed. This shows how much data in the corresponding log area has not yet migrated; the data that has not migrated is still located in that log area.

As transactions advance, the next log area is used only after the current log area has been used up, so the data of consecutive transactions is written into consecutive log areas. Analyzing the mapping relationships in ascending order of transaction number therefore yields the data storage state of consecutive log areas.

B2: According to the information recorded in the mapping relationship, determine whether the data in the first log area corresponding to the mapping relationship has been fully migrated.

When data in memory is managed according to the data linked lists, the corresponding mapping relationship must also be updated for data that migrates within memory. Suppose the recorded information indicates that the data in a log area has already been fully migrated in memory, for example because all data of that log area has a newer version in other log areas, or because the data in that log area was relocated to the data area in the second stage of memory reclamation. In that case, it is determined that the data in the log area corresponding to the mapping relationship has been fully migrated.

B3：若第一日志区的数据迁移完毕，则回收该第一日志区；B3: If the data migration in the first log area is completed, reclaim the first log area;

若第一日志区的数据已经搬迁完毕，则回收该第一日志区。在通过超级块记录日志区的使用情况的实施例中，此时可将超级块中的与第一日志区对应的信息清楚掉。If the data in the first log area has been migrated, the first log area is recovered. In the embodiment where the usage of the log area is recorded by the super block, the information corresponding to the first log area in the super block can be cleared at this time.

B4: If the data on the first log area has not been completely migrated, determine the space utilization of the first log area according to the information recorded in the mapping relationship;

Because the mapping relationship records the relationship between data and the space allocated to that data in the log area, the space utilization of the first log area can be derived from the information recorded in the mapping relationship.

B5: When the space utilization of the first log area is lower than a preset utilization threshold, migrate the data of the first log area to a second log area, and update the mapping relationship corresponding to the relocated data.

The second log area is a free log area or a log area that has been used during log area reclamation, for example the log area used during the previous reclamation. The preset utilization threshold can be set according to actual usage; for example, when many log areas are in use but the utilization of each is low, the preset utilization threshold can be lowered. The threshold may be set to 50%, for example; this embodiment of the present invention does not specifically limit it. Updating the mapping relationship corresponding to the relocated data may mean that, after data has been relocated within the log area, the correspondence between that data and its new log area space is promptly updated in the mapping relationship related to that data.

The log area reclamation method described above reclaims log areas, reduces fragmentation in the log area, and makes full use of log area space. Because the log area is read and written sequentially, in correspondence with the ascending order of the transaction numbers of the data linked lists, the corresponding mapping relationships can be looked up by transaction number and the data relocated accordingly. No metadata is then needed to record the location of data in the log area, which reduces metadata overhead.
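For illustration only, the reclamation steps B1 to B5 can be sketched as follows. The names `LogZone`, `reclaim`, and `spare`, the in-memory dictionary standing in for a mapping relationship, and the default threshold of 0.5 are assumptions made for this sketch and are not part of the embodiment.

```python
from dataclasses import dataclass, field


@dataclass
class LogZone:
    zone_id: int
    capacity: int                               # blocks the zone can hold (assumed unit)
    live: dict = field(default_factory=dict)    # key -> payload still valid in this zone

    def utilization(self) -> float:
        return len(self.live) / self.capacity


def reclaim(zones_in_txn_order, spare, threshold=0.5):
    """Return the ids of zones freed, visiting zones in ascending transaction order."""
    freed = []
    for zone in zones_in_txn_order:                  # B1: ascending transaction number
        if not zone.live:                            # B2/B3: fully migrated, reclaim
            freed.append(zone.zone_id)
        elif zone.utilization() < threshold:         # B4/B5: sparsely used, compact it
            if len(spare.live) + len(zone.live) > spare.capacity:
                continue                             # second zone has no room; skip
            spare.live.update(zone.live)             # relocate data and re-map it
            zone.live.clear()
            freed.append(zone.zone_id)
    return freed
```

A zone whose utilization is at or above the threshold is left alone in this sketch, which matches the condition stated in B5.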

In the method of this embodiment of the present invention, on a hard disk control device that includes a hard disk, where the hard disk includes a data area and a log area, after data is written to the cache device the hard disk control device determines whether the data is hot data. If the data is not hot data, data area space is allocated for it in the data area and the data is written to that data area space; if the data is hot data, log area space is allocated for it in the log area and the data is written to that log area space. The data to be written to the hard disk is thus divided into hot data and non-hot data. Hot data is data that, once stored on the hard disk, can cause the hard disk to produce a preset number of fragments after a preset number of modifications and releases, so hot data readily causes fragmentation. Hot data is stored in the log area and managed as a log; even if frequent modification of data in the log area produces fragments, those fragments are easy to reclaim and otherwise manage. Non-hot data is stored in the data area; releasing non-hot data does not readily cause fragmentation, so the data area need not allocate excessive resources to fragment management. Therefore, by storing different types of data in different areas of the hard disk and managing them in different ways, the efficiency of fragment management on the hard disk can be improved, and the efficient management of fragments in the log area can reduce the generation of hard disk fragments.

Because of the locality of hot data, data in the log area has already been migrated away, so little data actually needs to be read when the log area is compacted, and data management in the log area is efficient. Moreover, because hot data readily produces fragments and is stored in the log area, fragments are concentrated in the log area, which further improves the efficiency of defragmentation.

When data is written to the log area by sequential appending, the log area only needs sequential reads and sequential writes, relocating data incurs no metadata overhead, and defragmenting the log area is efficient. Compacting the data in the log area effectively avoids or reduces hard disk fragmentation.

In some embodiments, data need not be written to the log area by sequential appending. Where the space allocation of the log area needs to be known, a bitmap file (bitmap) corresponding to the journal zone may be used to manage it.

For example, if each bit corresponds to a fixed block size such as 4 KB, a 256 MB journal zone needs 8 KB for management (256 MB / 4 KB = 65,536 blocks, that is 65,536 bits, or 8 KB), and the bitmap must be modified on every write to the journal zone.
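The bitmap sizing in this example can be checked with a short sketch. The names `bitmap_bytes` and `ZoneBitmap` are hypothetical, and the byte-array layout (least significant bit first) is an assumption made for illustration.

```python
def bitmap_bytes(zone_bytes: int, block_bytes: int) -> int:
    """Bytes needed for one bit per block, rounded up to a whole byte."""
    blocks = zone_bytes // block_bytes
    return (blocks + 7) // 8


class ZoneBitmap:
    """One bit per fixed-size block of a journal zone (illustrative layout)."""

    def __init__(self, zone_bytes: int, block_bytes: int):
        self.bits = bytearray(bitmap_bytes(zone_bytes, block_bytes))

    def mark(self, block: int) -> None:
        # Every write to the zone has to update the bitmap like this.
        self.bits[block >> 3] |= 1 << (block & 7)

    def is_used(self, block: int) -> bool:
        return bool(self.bits[block >> 3] & (1 << (block & 7)))


# 256 MB zone, 4 KB blocks: 65,536 bits, stored in 8,192 bytes (8 KB).
size = bitmap_bytes(256 * 1024 * 1024, 4 * 1024)
```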

It can be understood that, in embodiments of the present invention, the data area and the log area may form tiered storage, that is, the data area and the log area are at different tiers, and when a log area is reclaimed the hot data is migrated to a lower storage tier.

It can be understood that, in some embodiments, the mapping relationships may be looked up in other ways, for example by querying them at random, then analyzing the space usage of the corresponding log area from the mapping relationships found, and then relocating the log area data. In this case the target data of a transaction need not be written to the log area space by sequential appending, and the specific write method is not limited. However, because the mapping relationships found in this way may not reflect the complete space usage of the log area, reclamation of the log area may be less effective.

It can be understood that, in embodiments of the present invention that include multiple log areas and multiple data areas, in order to use the data areas and log areas more fully and thereby make full use of hard disk space, the method further includes steps for converting between data areas and log areas. For example, when the space utilization of the data area is greater than a preset data area utilization threshold, a currently free log area is converted into a data area; when the space utilization of the log area is greater than a preset log area utilization threshold, a data area that was converted from a free log area is converted back into a log area.

For example, when the hard disk is initialized, half of the hard disk space is preset as log areas and the rest as data areas. After the system has run for some time and the utilization of the data area is high, a free log area is searched for in ascending order of log area identification information and converted into a data area. After the conversion, in embodiments that include block groups, the converted area is managed by the space management object of the block group: the state of the log area is set to data area, recorded in the management structure of the block group, and written to disk. When data in the data area is deleted and hard disk space is released, and the space of a data area that was converted from a log area has been completely released, the state of that data area is switched back to log area, recorded in the management structure of the block group, and written to disk.
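The conversion between log areas and data areas described above might be sketched as follows. The `Zone` record, the `rebalance` function, the `converted` flag, and the 0.8 thresholds are all assumptions for illustration; persisting the state change to the block group's management structure is omitted.

```python
from dataclasses import dataclass


@dataclass
class Zone:
    zone_id: int
    role: str                 # "log" or "data"
    used: int = 0
    capacity: int = 100
    converted: bool = False   # True while a former log zone serves as a data zone


def utilization(zones, role):
    members = [z for z in zones if z.role == role]
    total = sum(z.capacity for z in members)
    return sum(z.used for z in members) / total if total else 0.0


def rebalance(zones, data_threshold=0.8, log_threshold=0.8):
    """Convert at most one zone; return its id, or None if nothing changed."""
    if utilization(zones, "data") > data_threshold:
        # Search free log zones in ascending order of their identifiers.
        for z in sorted(zones, key=lambda z: z.zone_id):
            if z.role == "log" and z.used == 0:
                z.role, z.converted = "data", True
                return z.zone_id
    if utilization(zones, "log") > log_threshold:
        # Only a fully released data zone that was once a log zone converts back.
        for z in sorted(zones, key=lambda z: z.zone_id):
            if z.role == "data" and z.converted and z.used == 0:
                z.role, z.converted = "log", False
                return z.zone_id
    return None
```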

FIG. 5 is a schematic structural diagram of a hard disk control device according to an exemplary embodiment. The hard disk control device includes a hard disk, the hard disk includes a data area and a log area, and the hard disk control device is configured to perform the functions performed by the hard disk control device in the embodiment corresponding to FIG. 2 above. Referring to FIG. 5, the hard disk control device includes:

A writing unit 501, configured to write data to the cache device;

A cache manager 502, configured to determine whether the data is hot data, where hot data is data that, once stored on the hard disk, can cause the hard disk to produce a preset number of fragments after a preset number of modifications and releases;

A data manager 503, configured to, if the data is not hot data, allocate data area space for the data in the data area and write the data to the data area space;

A log manager 504, configured to, if the data is hot data, allocate log area space for the data in the log area and write the data to the log area space.
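The division of work among units 501 to 504 can be illustrated with a small sketch. The classification rule used here (metadata, or data smaller than a preset size threshold) follows the optional definition of hot data given in this description; the names `is_hot` and `Disk`, and the 4096-byte threshold, are hypothetical.

```python
HOT_SIZE_THRESHOLD = 4096   # assumed preset data threshold, in bytes


def is_hot(payload: bytes, is_metadata: bool, threshold: int = HOT_SIZE_THRESHOLD) -> bool:
    """Cache-manager decision: metadata and small data are treated as hot."""
    return is_metadata or len(payload) < threshold


class Disk:
    def __init__(self):
        self.cache = []       # stands in for the cache device
        self.data_area = []   # managed by the data manager
        self.log_area = []    # managed by the log manager, append-only

    def write(self, payload: bytes, is_metadata: bool = False) -> str:
        self.cache.append(payload)              # writing unit: data goes to cache first
        if is_hot(payload, is_metadata):        # cache manager: classify the data
            self.log_area.append(payload)       # log manager: hot data to the log area
            return "log"
        self.data_area.append(payload)          # data manager: the rest to the data area
        return "data"
```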

Optionally, the cache device is memory, and the hard disk control device further includes:

A mapping relationship establishment unit 505, configured to establish, in memory, a mapping relationship between the data and the log area space;

Optionally, the hard disk control device further includes:

A caching unit 506, configured to cache, in memory, data that is hot data.

Optionally,

The mapping relationship establishment unit 505 is further configured to establish a mapping relationship between multiple target data and the log area space allocated to the multiple target data, where the target data is hot data;

The log manager 504 is further configured to combine the multiple write operations on the multiple target data into one transaction and write all target data of the transaction to the log area space, where, when the write operation for any one target data of the transaction fails, the write operations for the other target data of the transaction also fail.
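The all-or-nothing behavior of a transaction can be sketched as follows. This is a minimal in-memory illustration; the `LogArea` class, its `capacity` limit used to provoke a failure, and the rollback by truncation are assumptions, not the mechanism of the embodiment.

```python
class LogArea:
    def __init__(self, capacity: int):
        self.capacity = capacity
        self.records = []     # payloads appended in write order
        self.mapping = {}     # key -> index of the payload in records

    def write_transaction(self, items) -> bool:
        """Write every (key, payload) pair, or none of them."""
        start = len(self.records)
        staged = {}
        try:
            for key, payload in items:
                if len(self.records) >= self.capacity:
                    raise OSError("log area full")
                staged[key] = len(self.records)
                self.records.append(payload)
        except OSError:
            del self.records[start:]    # one write failed: undo the others
            return False
        self.mapping.update(staged)     # publish mappings only on success
        return True
```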

Optionally,

The hard disk control device further includes:

A linked list establishment unit 509, configured to establish a data linked list according to the multiple target data, where the data linked list is used to manage the target data, and the target data managed by the data linked list is the same as the target data of the transaction;

A linked list management unit 510, configured to manage the target data according to the data linked list;

A search unit 511, configured to search, under a preset release condition and in the order in which the data linked lists were established, from earliest to latest, for data whose management has not been released by a data linked list;

A memory management unit 512, configured to release, in memory, the data whose management has not been released by the target data linked list, and to retain in memory the target mapping relationship corresponding to the target data linked list;

The linked list management unit 510 is further configured to:

After a second data linked list is established, when the second target data managed by the second data linked list is obtained by modifying first target data managed by a previously established first data linked list, release management of the first target data on the first data linked list, and delete the information of the first target data from the first mapping relationship corresponding to the first data linked list.
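The rule applied by the linked list management unit 510 can be sketched as follows: when a newer list holds a modified version of some data, the older list stops managing the old version and its mapping entry is deleted. The `TxnList` class, the `add_list` helper, and the integer log offsets are hypothetical.

```python
class TxnList:
    """One data linked list, reduced here to two dictionaries."""

    def __init__(self, txn_no: int):
        self.txn_no = txn_no
        self.managed = {}   # key -> payload cached in memory
        self.mapping = {}   # key -> offset of the payload in the log area


def add_list(lists, txn_no, items, next_offset):
    """Establish a new list for one transaction; return the next free log offset."""
    new = TxnList(txn_no)
    for key, payload in items:
        for old in lists:
            if key in old.managed:
                del old.managed[key]          # release management of the old version
                old.mapping.pop(key, None)    # delete its entry from the old mapping
        new.managed[key] = payload
        new.mapping[key] = next_offset
        next_offset += 1
    lists.append(new)
    return next_offset
```

A list whose `managed` and `mapping` are both empty no longer references any live data in its log area, which is the condition the reclamation steps look for.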

Optionally,

The search unit 511 is further configured to, when memory reaches a first preset water level, search, in the order in which the data linked lists were established, from earliest to latest, for data whose management has not been released by a data linked list;

The hard disk control device further includes:

A reading unit 523, configured to read, from the log area, the data pointed to by the target mapping relationship when memory reaches a second preset water level;

A mapping data writing unit 513, configured to write the data pointed to by the target mapping relationship to the data area;

A deleting unit 514, configured to delete the target mapping relationship from memory.
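The two stages of memory reclamation can be sketched as two functions that a caller would invoke when the first and second preset water levels are reached. The dictionary layout of a list, the names `stage_one` and `stage_two`, and the use of a Python list as the log area are assumptions for illustration.

```python
def stage_one(lists):
    """First water level: free cached data of the oldest list, keep its mapping."""
    for lst in sorted(lists, key=lambda l: l["txn"]):   # earliest list first
        if lst["cache"]:
            lst["cache"].clear()       # release the data from memory
            return lst                 # the mapping stays for stage two
    return None


def stage_two(lst, log_area, data_area):
    """Second water level: move mapped data from the log area to the data area."""
    for key, offset in list(lst["mapping"].items()):
        data_area[key] = log_area[offset]   # read from log area, write to data area
        del lst["mapping"][key]             # then drop the mapping from memory
```

Keeping the mapping after stage one is what allows the data to be read back from the log area in stage two.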

Optionally,

The hard disk control device further includes:

A transaction number allocation unit 515, configured to assign transaction numbers, by an increasing rule, to the data linked lists corresponding to the transactions according to the write order of the transactions;

The search unit 511 is further configured to search, starting from the data linked list with the smallest current transaction number and in ascending order of transaction number, for data whose management has not been released by a data linked list.

Optionally,

The hard disk control device further includes:

A reclamation unit 516, configured to perform the step of relocating log area data under a preset reclamation condition;

As shown in FIG. 6, for performing the step of relocating log area data, the reclamation unit 516 includes:

A reclamation search module 517, configured to look up a mapping relationship;

A reclamation judging module 518, configured to determine, according to the information recorded in the mapping relationship, whether the data on the first log area corresponding to the mapping relationship has been completely migrated;

A reclamation determining module 519, configured to, if the data on the first log area has not been completely migrated, determine the space utilization of the first log area according to the information recorded in the mapping relationship;

A reclamation execution module 520, configured to, when the space utilization of the first log area is lower than a preset utilization threshold, migrate the data of the first log area to a second log area and update the mapping relationship corresponding to the relocated data, where the second log area is a free log area or a log area that has been used during log area reclamation;

A reclamation module 521, configured to stop performing the step of relocating log area data when the total space water level of the current log areas reaches a preset space threshold, and otherwise to continue performing the step of relocating log area data.

Optionally,

The hard disk control device further includes:

The reclamation search module 517 is further configured to look up, starting from the data linked list with the smallest current transaction number and in ascending order of transaction number, the mapping relationships corresponding to the data linked lists;

Optionally, the preset reclamation condition includes at least one of the following: a timer expires, the reclamation operation on memory data is completed, or the total water level of the log areas reaches a preset water level threshold.

Optionally,

The caching unit 506 is further configured to cache the data in memory if the data is hot data; or to read the data from the log area into the cache device for caching.

Optionally, the hot data includes data whose size is smaller than a preset data threshold, and/or the hot data includes metadata.

Optionally,

The log manager 504 is further configured to allocate log area space for the data sequentially in the log area, and to write the data to the log area space by sequential appending.

Optionally,

The hard disk control device further includes:

A data area conversion unit 522, configured to convert a currently free log area into a data area when the space utilization of the data area is greater than a preset data area utilization threshold;

A log area conversion unit 524, configured to convert a data area that was converted from a free log area back into a log area when the space utilization of the log area is greater than a preset log area utilization threshold.

Optionally,

The hard disk further includes a superblock, each log area is assigned identification information, and the superblock is used to record the identification information of a log area after that log area has been modified.

Optionally, log areas and data areas are arranged alternately on the hard disk.

Optionally,

The hard disk further includes block groups, each block group includes a preset number of log areas and data areas, and the log areas and data areas of a block group are arranged contiguously;

The data manager 503 includes:

A free area determining module 525, configured to determine a free target data area according to the management information of the block group;

An allocation module 508, configured to allocate data area space for the data in the target data area;

The hard disk control device further includes:

A metadata generation module 507, configured to generate target metadata according to the data and the data area space;

The writing unit 501 is further configured to write the target metadata to the cache device;

After the cache manager determines that the metadata is hot data, the log manager 504 is further configured to determine the target block group to which the target data area belongs, determine an available log area of the target block group, and write the target metadata to the available log area.
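The placement of target metadata in a log area of the same block group as the target data area might look like the following sketch. The `BlockGroup` record, the `write_with_metadata` function, the block-count bookkeeping, and the choice of the lowest-numbered log area as the "available" one are assumptions for illustration.

```python
from dataclasses import dataclass


@dataclass
class BlockGroup:
    group_id: int
    data_zones: dict   # data zone id -> number of free blocks
    log_zones: dict    # log zone id -> list of metadata records


def write_with_metadata(groups, blocks_needed, name):
    """Allocate data space, then log its metadata in the same block group."""
    for group in groups:
        for zone_id, free in group.data_zones.items():
            if free >= blocks_needed:                     # free target data area found
                group.data_zones[zone_id] = free - blocks_needed
                meta = {"name": name, "zone": zone_id, "blocks": blocks_needed}
                log_id = min(group.log_zones)             # a log area of this group
                group.log_zones[log_id].append(meta)      # metadata is hot data
                return group.group_id, zone_id, log_id
    raise OSError("no free data area")
```

Because the log areas and data areas of a block group are contiguous, keeping the metadata in the same group keeps it physically close to the data it describes.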

Optionally,

The log manager 504 is further configured to allocate log area space for the mapping relationship in the log area, and to write the mapping relationship to the log area space allocated to it.

In summary, on a hard disk control device that includes a hard disk, where the hard disk includes a data area and a log area, after the writing unit 501 writes data to the cache device, the cache manager 502 determines whether the data is hot data. If the data is not hot data, the data manager 503 allocates data area space for it in the data area and writes the data to that data area space; if the data is hot data, the log manager 504 allocates log area space for it in the log area and writes the data to that log area space. The data to be written to the hard disk is thus divided into hot data and non-hot data. Hot data is data that, once stored on the hard disk, can cause the hard disk to produce a preset number of fragments after a preset number of modifications and releases, so hot data readily causes fragmentation. Hot data is stored in the log area and managed as a log; even if frequent modification of data in the log area produces fragments, those fragments are easy to reclaim and otherwise manage. Non-hot data is stored in the data area; releasing non-hot data does not readily cause fragmentation, so the data area need not allocate excessive resources to fragment management. Therefore, by storing different types of data in different areas of the hard disk and managing them in different ways, the efficiency of fragment management on the hard disk can be improved, and the efficient management of fragments in the log area can reduce the generation of hard disk fragments.

FIG. 7 is a schematic diagram of the hardware structure of a hard disk control device according to another embodiment of the present invention. The hard disk control device includes a processor (CPU) 701, a cache device 703, a hard disk 702, a hard disk controller 705, and a bus 704. The hard disk 702 includes a data area and a log area; in some embodiments the cache device may be, for example, memory.

The steps performed by the hard disk control device in the above embodiments may be based on the structure of the hard disk control device shown in FIG. 7.

The processor 701 executes a program so that the hard disk control device performs the hard disk data management method described above. Various optional designs are given below by way of example.

The processor 701 executes a program so that the hard disk control device has the following functions: writing data to the cache device; determining whether the data is hot data, where hot data is data that, once stored on the hard disk, can cause the hard disk to produce a preset number of fragments after a preset number of modifications and releases; if the data is not hot data, allocating data area space for the data in the data area and writing the data to the data area space; if the data is hot data, allocating log area space for the data in the log area and writing the data to the log area space.

In an optional design, the cache device is memory, and the processor 701 executes a program so that the hard disk control device has the following function: after allocating log area space for the data in the log area, establishing a mapping relationship between the data and the log area space.

In an optional design, the processor 701 executes a program so that the hard disk control device has the following functions: establishing a mapping relationship between multiple target data and the log area space allocated to the multiple target data, where the target data is hot data; combining the multiple write operations on the multiple target data into one transaction; and writing all target data of the transaction to the log area space, where, when the write operation for any one target data of the transaction fails, the write operations for the other target data of the transaction also fail.

In an optional design, the processor 701 executes a program so that the hard disk control device has the following function: caching, in memory, data that is hot data.

In an optional design, the processor 701 executes a program so that the hard disk control device has the following functions: before writing all target data of the transaction to the log area space, establishing a data linked list according to the multiple target data, where the data linked list is used to manage the target data, and the target data managed by the data linked list is the same as the target data of the transaction; managing the target data according to the data linked list; under a preset release condition, searching, in the order in which the data linked lists were established, from earliest to latest, for data whose management has not been released by a data linked list; and releasing, in memory, the data whose management has not been released by the target data linked list, while retaining in memory the target mapping relationship corresponding to the target data linked list. Managing the target data according to the data linked list includes: after a second data linked list is established, when the second target data managed by the second data linked list is obtained by modifying first target data managed by a previously established first data linked list, releasing management of the first target data on the first data linked list, and deleting the information of the first target data from the first mapping relationship corresponding to the first data linked list.

In an optional design, the processor 701 executes a program so that the hard disk control device has the following functions: when memory reaches a first preset water level, searching, in the order in which the data linked lists were established, from earliest to latest, for data whose management has not been released by a data linked list;

After releasing, in memory, the data whose management has not been released by the target data linked list, and when memory reaches a second preset water level, reading from the log area the data pointed to by the target mapping relationship;

Writing the data pointed to by the target mapping relationship to the data area;

Deleting the target mapping relationship from memory.

In an optional design, the processor 701 executes a program so that the hard disk control device has the following functions:

Assigning transaction numbers, by an increasing rule, to the data linked lists corresponding to the transactions according to the write order of the transactions;

Searching, starting from the data linked list with the smallest current transaction number and in ascending order of transaction number, for data whose management has not been released by a data linked list.

In an optional design, the processor 701 executes a program so that the hard disk control device has the following functions: performing the step of relocating log area data under a preset reclamation condition;

Performing the step of relocating log area data includes:

Looking up a mapping relationship;

Assigning transaction numbers, by an increasing rule, to the data linked lists corresponding to the transactions according to the write order of the transactions; and looking up, starting from the data linked list with the smallest current transaction number and in ascending order of transaction number, the mapping relationships corresponding to the data linked lists;

The preset reclamation condition includes at least one of the following: a timer expires, the reclamation operation on memory data is completed, or the total water level of the log areas reaches a preset water level threshold.

After determining whether the data is hot data, if the data is hot data, caching the data in memory; or,

Before writing the data to the cache device, reading the data from the log area into the cache device for caching.

The hot data includes data whose size is smaller than a preset data threshold, and/or the hot data includes metadata.

Allocating log area space for the data sequentially in the log area, and writing the data to the log area space by sequential appending.

When the space utilization of the data area is greater than a preset data area utilization threshold, converting a currently free log area into a data area;

When the space utilization of the log area is greater than a preset log area utilization threshold, converting a data area that was converted from a free log area back into a log area.

Log areas and data areas are arranged alternately on the hard disk.

In an optional design, the processor 701 executes the program so that the hard disk control device has the following functions: the hard disk further includes block groups, each block group includes a preset number of log areas and data areas, and the log areas and data areas of a block group are arranged contiguously; determining an idle target data area according to the management information of the block group; and allocating data area space for the data in the target data area;

After the data is written into the data area space, generating target metadata according to the data and the data area space; writing the target metadata to the cache device; after determining that the metadata is hot data, determining the target block group to which the target data area belongs; determining an available log area of the target block group; and writing the target metadata into the available log area.
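The block group placement described above can be sketched as a lookup that keeps the metadata in a log area of the same block group as the data it describes. `BlockGroup`, `place_write` and the free-byte bookkeeping are hypothetical, and the fallback of trying the next candidate when no log area in the group has room is an assumption.

```python
from dataclasses import dataclass, field


@dataclass
class BlockGroup:
    """Management info for one block group: free bytes per area, keyed by area id."""
    data_free: dict = field(default_factory=dict)  # data area id -> free bytes
    log_free: dict = field(default_factory=dict)   # log area id -> free bytes


def place_write(groups, size, meta_size):
    """Pick a data area for `size` bytes and a same-group log area for metadata.

    Returns (group index, data area id, log area id), or None if nothing fits.
    """
    for g, group in enumerate(groups):
        for data_id, free in group.data_free.items():
            if free < size:
                continue
            # The metadata goes to a log area of the group owning the data area.
            for log_id, log_free in group.log_free.items():
                if log_free >= meta_size:
                    group.data_free[data_id] -= size
                    group.log_free[log_id] -= meta_size
                    return g, data_id, log_id
    return None
```

Since the areas of a block group are contiguous, placing the metadata in the same group presumably keeps it physically close to the data, which would limit head movement on a mechanical disk.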

In an optional design, the processor 701 executes the program so that the hard disk control device has the following functions: allocating log area space for the mapping relationship in the log area; and writing the mapping relationship into the log area space allocated to the mapping relationship.

In summary, in a hard disk control device that includes a hard disk, the hard disk includes a data area and a log area. After the processor 701 writes data to the cache device, the processor 701 judges whether the data is hot data. If the data is not hot data, the processor 701 allocates data area space for the data in the data area and writes the data into the data area space; if the data is hot data, the processor 701 allocates log area space for the data in the log area and writes the data into the log area space. In this way, the data to be written to the hard disk is divided into hot data and non-hot data. Hot data is data that, once stored on the hard disk, can cause the hard disk to generate a preset number of fragments after a preset number of modifications and releases; that is, hot data readily causes fragmentation. The hot data is stored in the log area and managed as a log, so that even if frequent modification of the data in the log area produces fragments, those fragments are easy to recycle and otherwise manage. The non-hot data is stored in the data area; because releasing non-hot data does not readily fragment the hard disk, the data area does not need many resources allocated for fragment management. Therefore, storing different types of data in different areas of the hard disk and managing them in different ways can improve the efficiency of fragment management on the hard disk, and the efficient management of fragments in the log area can reduce the generation of hard disk fragments.
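The overall write path summarized above reduces to a two-way dispatch. In the sketch below the hot data test is passed in as a function, and plain Python lists stand in for the cache device, the log area and the data area; the function name `write_to_disk` and these stand-ins are assumptions made for illustration.

```python
def write_to_disk(data, is_hot, cache, log_area, data_area):
    """Stage `data` in the cache, then route it by hotness; return the area used."""
    cache.append(data)  # the data is first written to the cache device
    if is_hot(data):
        log_area.append(data)   # hot data: kept in the log area, managed as a log
        return "log"
    data_area.append(data)      # non-hot data: kept in the ordinary data area
    return "data"
```

For example, with a size-based test such as `lambda d: len(d) < 4`, a one-byte payload is routed to the log area and a longer payload to the data area.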

Those skilled in the art will clearly understand that, for convenience and brevity of description, the specific working processes of the system, device and units described above may be found in the corresponding processes of the foregoing method embodiments, and are not repeated here.

In the several embodiments provided in this application, it should be understood that the disclosed system, device and method may be implemented in other ways. For example, the device embodiments described above are merely illustrative. The division into units is only a division by logical function, and other divisions are possible in an actual implementation; for example, multiple units or components may be combined or integrated into another system, or some features may be omitted or not performed. In addition, the mutual coupling, direct coupling or communication connection shown or discussed may be an indirect coupling or communication connection through some interfaces, devices or units, and may be electrical, mechanical or in another form.

The units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; that is, they may be located in one place or distributed across multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.

In addition, the functional units in the embodiments of the present invention may be integrated into one processing unit, each unit may exist physically on its own, or two or more units may be integrated into one unit. The integrated unit may be implemented in the form of hardware or in the form of a software functional unit.

If the integrated unit is implemented in the form of a software functional unit and sold or used as an independent product, it may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present invention in essence, or the part that contributes to the prior art, or all or part of the technical solution, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, a network device or the like) to perform all or some of the steps of the methods described in the embodiments of the present invention. The aforementioned storage medium includes various media that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM, Read-Only Memory), a random access memory (RAM, Random Access Memory), a magnetic disk or an optical disc.

The above embodiments are only intended to illustrate the technical solutions of the present invention, not to limit them. Although the present invention has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that they may still modify the technical solutions described in the foregoing embodiments, or make equivalent replacements of some of the technical features; such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of the present invention.

Claims

1. A hard disk data management method, characterized in that, the method is applied to a hard disk control device comprising a hard disk, and the hard disk comprises a data area and a log area, and the method comprises:

Writing data to a cache device;

Judging whether the data is hot data, wherein the hot data is data that can cause the hard disk to generate a preset number of fragments after a preset number of modifications and releases after being stored on the hard disk;

If the data is not hot data, allocating data area space for the data in the data area, and writing the data into the data area space;

If the data is hot data, allocating log area space for the data in the log area, and writing the data into the log area space;

The method also includes:

When the space utilization rate of the data area is greater than the preset data area utilization threshold, the currently idle log area is converted into a data area;

When the space utilization rate of the log area is greater than the preset log area utilization threshold, the data area converted from the idle log area is converted into the log area.

2. The method according to claim 1, wherein the cache device is a memory,

After the log area space is allocated for the data in the log area, the method further includes:

A mapping relationship between the data and the log area space is established.

3. The method of claim 2, wherein,

The establishment of the mapping relationship between the data and the log area space includes:

Establishing a mapping relationship between multiple target data and the log area space to which the multiple target data are allocated, wherein the target data belongs to hot data;

The writing of the data into the log area space includes:

Combining multiple write operations of the multiple target data into one transaction;

Writing all target data of the transaction into the log area space, wherein when the write operation for any one item of target data of the transaction fails, the write operations for the other target data of the transaction also fail.

4. The method according to claim 3, wherein the method further comprises:

Caching the data belonging to the hot data in the memory.

5. The method of claim 4, wherein,

Before writing all target data of the transaction into the log area space, the method also includes:

Establishing a data linked list according to the multiple target data, wherein the data linked list is used to manage the target data, and the target data managed by the data linked list is the same as the target data of the transaction;

Managing the target data according to the data linked list;

Under a preset release condition, searching, in the order in which the data linked lists were established from first to last, for data that has not been released from management by a data linked list;

Releasing, in the memory, the data of the target data linked list that has not been released from management, and retaining in the memory the target mapping relationship corresponding to the target data linked list;

Wherein the managing of the target data according to the data linked list includes:

After a second data linked list is established, when second target data managed by the second data linked list is obtained by modifying first target data managed by a previously established first data linked list, releasing the first data linked list's management of the first target data; and deleting the information of the first target data from the first mapping relationship corresponding to the first data linked list.

6. The method of claim 5, wherein,

The searching, under the preset release condition and in the order in which the data linked lists were established from first to last, for data that has not been released from management by a data linked list includes:

When the memory reaches a first preset water level, searching, in the order in which the data linked lists were established from first to last, for data that has not been released from management by a data linked list;

After the releasing, in the memory, of the data of the target data linked list that has not been released from management, the method further includes:

When the memory reaches a second preset water level, reading the data pointed to by the target mapping relationship from the log area;

Writing the data pointed to by the target mapping relationship into the data area;

Deleting the target mapping relationship from the memory.

7. The method of claim 5, wherein,

The method also includes:

Under a preset recycling condition, performing the step of relocating log area data;

The step of relocating log area data includes:

Searching for the mapping relationship;

Judging, according to information recorded in the mapping relationship, whether the data in a first log area corresponding to the mapping relationship has been relocated;

If the data in the first log area has not been relocated, determining the space utilization of the first log area according to the information recorded in the mapping relationship;

When the space utilization of the first log area is less than a preset utilization threshold, relocating the data in the first log area to a second log area and updating the mapping relationship corresponding to the relocated data, wherein the second log area is an idle log area or a log area used during recycling of log areas;

When the total space water level of the current log area reaches a preset space threshold, stopping the step of relocating log area data; otherwise, continuing the step of relocating log area data.

8. The method of claim 7, wherein,

The method also includes:

Assigning transaction numbers, according to an incrementing rule, to the data linked lists corresponding to the transactions in the order in which the transactions are written;

In the step of relocating log area data, the searching for the mapping relationship includes:

Starting from the data linked list with the smallest current transaction number, searching for the mapping relationships corresponding to the data linked lists in ascending order of transaction number.

9. The method of claim 4, wherein,

The caching, in the memory, of the data belonging to the hot data includes:

After judging whether the data is hot data, if the data is hot data, retaining the data in the memory; or,

Before the data is written to the cache device, reading the data from the log area into the cache device for caching.

10. The method according to any one of claims 1 to 9, wherein the hot data includes data whose size is smaller than a preset data threshold and/or the hot data includes metadata.

11. The method according to any one of claims 1 to 9, characterized in that,

The allocating of log area space for the data in the log area and the writing of the data into the log area space include:

Allocating log area space sequentially for the data in the log area, and appending the data sequentially to the log area space.

12. The method according to any one of claims 2 to 9, wherein

The method also includes:

Allocating log area space on the log area for the mapping relationship;

Writing the mapping relationship into the log area space to which the mapping relationship is allocated.

13. A hard disk control device, characterized in that the hard disk control device comprises:

A write unit, configured to write data to a cache device;

A cache manager, configured to judge whether the data is hot data, wherein the hot data is data that can cause the hard disk to generate a preset number of fragments after a preset number of modifications and releases after being stored on the hard disk;

The data manager is used to allocate data area space for the data in the data area of the hard disk if the data is not hot data, and write the data into the data area space;

The log manager is used to allocate log area space for the data in the log area of the hard disk if the data is hot data, and write the data into the log area space;

The hard disk control device also includes:

A data area conversion unit, configured to convert the currently idle log area into a data area when the space utilization rate of the data area is greater than the preset data area utilization threshold;

The log area conversion unit is configured to convert the data area converted from the idle log area into the log area when the space utilization rate of the log area is greater than the preset log area utilization threshold.

14. The hard disk control device according to claim 13, wherein the cache device is a memory, and the hard disk control device further comprises:

A mapping relationship establishing unit, configured to establish a mapping relationship between the data and the log area space.

15. The hard disk control device according to claim 14, characterized in that,

The mapping relationship establishment unit is further configured to establish a mapping relationship between multiple target data and the log area space to which the multiple target data are allocated, wherein the target data belongs to hot data;

The log manager is further configured to combine multiple write operations of the multiple target data into one transaction, and to write all target data of the transaction into the log area space, wherein when the write operation for any one item of target data of the transaction fails, the write operations for the other target data of the transaction also fail.

16. The hard disk control device according to claim 15, wherein the hard disk control device further comprises:

A cache unit, configured to cache data belonging to the hot data in the memory.

17. The hard disk control device according to claim 16, characterized in that,

The hard disk control device also includes:

A linked list establishment unit, configured to establish a data linked list according to the plurality of target data, wherein the data linked list is used to manage the target data, and the target data managed by the data linked list is the same as the target data of the transaction;

a linked list management unit, configured to manage the target data according to the data linked list;

A search unit, configured to search, under a preset release condition and in the order in which the data linked lists were established from first to last, for data that has not been released from management by a data linked list;

A memory management unit, configured to release, in the memory, the data of the target data linked list that has not been released from management, and to retain in the memory the target mapping relationship corresponding to the target data linked list;

Wherein, the linked list management unit is also used for:

18. The hard disk control device according to claim 17, wherein:

The search unit is further configured to search, when the memory reaches a first preset water level and in the order in which the data linked lists were established from first to last, for data that has not been released from management by a data linked list;

The hard disk control device also includes:

A reading unit, configured to read the data pointed to by the target mapping relationship from the log area when the memory reaches a second preset water level;

a mapping data writing unit, configured to write the data pointed to by the target mapping relationship into the data area;

A deletion unit, configured to delete the target mapping relationship on the memory.

19. The hard disk control device according to claim 17, wherein:

The hard disk control device also includes:

A recycling unit, configured to perform the step of relocating log area data under a preset recycling condition;

For performing the step of relocating log area data, the recycling unit includes:

A recycling search module, configured to search for the mapping relationship;

A recycling judging module, configured to judge, according to information recorded in the mapping relationship, whether the data in a first log area corresponding to the mapping relationship has been relocated;

A recycling determination module, configured to determine the space utilization of the first log area according to the information recorded in the mapping relationship if the data in the first log area has not been relocated;

A recycling execution module, configured to relocate the data in the first log area to a second log area when the space utilization of the first log area is less than a preset utilization threshold, and to update the mapping relationship corresponding to the relocated data, wherein the second log area is an idle log area or a log area used during recycling of log areas;

A recycling module, configured to stop the step of relocating log area data when the total space water level of the current log area reaches a preset space threshold, and otherwise to continue the step of relocating log area data.

20. The hard disk control device according to claim 19, wherein:

The hard disk control device also includes:

A transaction number allocation unit, configured to assign transaction numbers, according to an incrementing rule, to the data linked lists corresponding to the transactions in the order in which the transactions are written;

21. The hard disk control device according to claim 16, characterized in that,

The cache unit is further configured to retain the data in the memory if the data is hot data; or to read data from the log area into the cache device for caching.

22. The hard disk control device according to any one of claims 13 to 21, wherein the hot data includes data whose data size is smaller than a preset data threshold and/or the hot data includes metadata.

23. The hard disk control device according to any one of claims 13 to 21, characterized in that,

The log manager is further configured to sequentially allocate log area space for the data in the log area, and sequentially write the data into the log area space.

24. The hard disk control device according to any one of claims 15 to 21, characterized in that,

The log manager is further configured to allocate log area space for the mapping relationship in the log area; and write the mapping relationship into the log area space allocated to the mapping relationship.