
CN104714758A - Method for building array by adding mirror image structure to check-based RAID and read-write system - Google Patents

Info

Publication number
CN104714758A
CN104714758A CN201510025251.XA CN201510025251A CN104714758A CN 104714758 A CN104714758 A CN 104714758A CN 201510025251 A CN201510025251 A CN 201510025251A CN 104714758 A CN104714758 A CN 104714758A
Authority
CN
China
Prior art keywords
data
disk
segment
mirror
array
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201510025251.XA
Other languages
Chinese (zh)
Other versions
CN104714758B (en)
Inventor
姚杰
曹强
吴思
谢长生
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Huazhong University of Science and Technology
Original Assignee
Huazhong University of Science and Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Huazhong University of Science and Technology
Priority to CN201510025251.XA
Publication of CN104714758A
Application granted
Publication of CN104714758B
Active
Anticipated expiration

Landscapes

  • Signal Processing For Digital Recording And Reproducing (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses an array construction method and a read-write system that add a mirror structure to parity-based RAID. The array construction method comprises four steps: address layout, data layout, data access, and data reconstruction. The read-write system comprises an I/O module, a mirror management module, an address translation module, and a parity-based RAID module. Adding a mirror structure on top of parity-based RAID improves the I/O performance of the array, and reconstructing from mirrored data shortens reconstruction time. The novel address layout places original data and its mirrored data as close together as possible, which shortens the distance the disk head must travel. Parity information is updated in the background, which alleviates the write amplification problem of parity-based RAID and improves the write performance of the array. Because mirrored data exists, the fault tolerance of the array is improved, so the availability and reliability of the array are greatly improved.

Description

An array construction method and read-write system that add a mirror structure to parity-based RAID

Technical Field

The invention belongs to the technical field of data storage and, more specifically, relates to an array construction method and a read-write system that add a mirror structure to parity-based RAID.

Background

Redundant Arrays of Independent Disks (RAID) are widely used in large-scale distributed parallel storage systems because of their performance and reliability. For performance, RAID achieves high throughput through striping; for reliability, RAID uses redundant data such as mirrors or parity codes.

RAID can be divided into three classes according to its redundancy structure.

First, arrays with no redundancy, represented by RAID0. RAID0 uses data striping to spread data across all disks so that the disks operate in parallel, which improves read and write speed. Because it has no redundancy, its space utilization is the highest among array structures, but its reliability is correspondingly low: the failure of any one disk in the array destroys the whole system.

Second, parity-based RAID, represented by RAID5. RAID5 also uses data striping, storing data in blocks across all disks so that the disks operate in parallel. Unlike RAID0, each stripe has a parity value, and the parity values are distributed evenly across all disks. The advantage is improved reliability: if any one disk fails, the system can compute the failed disk's data from the remaining disks. The redundancy lowers space utilization, consuming the capacity of one disk in total. It also causes write amplification: a single write operation produces four actual I/O operations, two reads and two writes. In reconstruction mode, the data of all remaining disks must be read to compute the failed disk's data; in degraded mode, a request to the failed disk is decomposed into multiple I/O requests. As a result, reconstruction takes a long time and I/O performance is low in both degraded mode and reconstruction mode.

Third, mirror-based RAID, represented by RAID1. RAID1, also called disk mirroring, mirrors the data of one disk onto another. Because copies of the data exist on different disks, I/O can be split so that accesses are spread evenly across the mirrors, and no data is lost if any one disk fails. A request to the failed disk is converted directly into a request to its mirror disk rather than decomposed into multiple I/Os, and reconstruction can read the mirror disk directly, so reconstruction time is short and I/O performance is high. However, space utilization is poor: only half of the array capacity is usable.
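The four I/O operations behind a RAID5 write come from the usual read-modify-write parity update: read the old data and old parity, then write the new data and new parity. The sketch below illustrates that arithmetic and the parity-based recovery of a lost block. It is a minimal illustration, not text from the patent; the function names are hypothetical and single-parity bytewise XOR is assumed.

```python
from functools import reduce


def xor_blocks(*blocks: bytes) -> bytes:
    """Bytewise XOR of equally sized blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def small_write_parity(old_data: bytes, old_parity: bytes, new_data: bytes) -> bytes:
    # Two reads (old data, old parity) feed this XOR;
    # two writes (new data, new parity) follow it.
    return xor_blocks(old_parity, old_data, new_data)


def rebuild_from_stripe(surviving_blocks: list) -> bytes:
    # A lost block equals the XOR of every surviving block in its stripe,
    # which is why reconstruction must read all remaining disks.
    return xor_blocks(*surviving_blocks)
```

Updating parity this way touches only two disks per write, but each of them is accessed twice, which is the write amplification the background section describes.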

Summary of the Invention

In view of the above defects or improvement needs of the prior art, the invention provides an elastic data mirroring system and method for parity disk arrays. Its purpose is to add a mirror structure on top of parity-based RAID and thereby improve the fault tolerance of the array.

To achieve this purpose, according to one aspect of the invention, an array construction method that adds a mirror structure to parity-based RAID is provided, characterized in that the method comprises four steps, namely address layout, data layout, data access, and data reconstruction, as follows:

(1) Address layout:

(1.1) Set the number of stripes per segment according to the number of disks M; the number of stripes per segment is an integer multiple of M;

(1.2) Determine the number of segments K from the number of data blocks on a disk and the number of stripes per segment: K = (number of data blocks on a disk) / (number of stripes per segment);

(1.3) Let N = K/2, rounded to an integer if K/2 is not an integer. Interleave the segments numbered 1 to N with the segments numbered N+1 to K: segment 1 is followed by segment N+1, then segment 2, then segment N+2, and so on;

(2) Data layout:

(2.1) Data within an original data segment is laid out exactly as in the parity-based RAID;

(2.2) A mirrored data segment follows the layout of the original data segment but changes the disk number on which each data block of a stripe is stored, shifting all disk numbers uniformly to the right or to the left. Shifting right means adding i to the disk number and subtracting M if the result is greater than M; shifting left means subtracting i from the disk number and adding M if the result is less than 0;

(3) Data read and write:

(3.1) Compute the physical address of the data to be accessed; for a write request, also compute the physical address of the parity information;

(3.2) Determine whether the data to be accessed has a mirror; if it has no mirror, go to step (3.3); if it has a mirror, go to step (3.4);

(3.3) Issue the read or write request to the disk that the data's physical address points to; for a write request, also update the parity information of the original data;

(3.4) Compute the physical address of the mirrored data. For a write request, also compute the physical address of the mirrored data's parity, issue the write request to the disk that the data's physical address points to and to the disk that the mirror's physical address points to, and update the parity information of both the original data and the mirrored data. For a read request, compare the load on the disk holding the original data with the load on the disk holding the mirrored data, and issue the read request to the less loaded one;

(4) Data reconstruction:

If a disk fails, determine whether the data on the failed disk has a mirror. If it does, read the corresponding data from the mirror and write it to the spare disk. If it does not, follow the reconstruction method of the parity-based RAID: read the other blocks in the stripe, compute the failed disk's data, and write it to the spare disk.
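Step (4) can be sketched as a per-block choice between a plain copy and a parity computation. This is a minimal illustration under stated assumptions: the name `recover_block` and its arguments are hypothetical, and the parity path assumes single-parity XOR as in RAID5, whereas the method itself only requires some parity-based RAID.

```python
from typing import List, Optional


def recover_block(mirror_copy: Optional[bytes], stripe_survivors: List[bytes]) -> bytes:
    """Return the contents to write to the spare disk for one lost block."""
    if mirror_copy is not None:
        # Mirrored data: one read from the mirror disk is enough.
        return mirror_copy
    # No mirror: XOR all surviving blocks of the stripe (assumed RAID5 parity).
    out = bytearray(len(stripe_survivors[0]))
    for block in stripe_survivors:
        for j, value in enumerate(block):
            out[j] ^= value
    return bytes(out)
```

The mirror path reads one block from one disk, while the parity path reads one block from every surviving disk, which is where the shorter reconstruction time comes from.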

Preferably, when data is written and array space is insufficient, mirrored data space is released elastically according to user-specified priorities. Specifically: when the space that the user data needs under the parity-based RAID organization is less than or equal to half of the array capacity, all data has mirror redundancy; when that space exceeds half of the array capacity, the user can assign priorities to data, and the mirrors of lower-priority data segments are released first; as user data grows to saturation, the array degenerates into a plain parity-based RAID.

Preferably, whether a mirror exists is determined from the stripe segment bitmap in the mirror management module. Each segment's state is recorded in 2 bits: "00" means free; "01" means data without a mirror; "10" means data whose mirror is in the downward-adjacent position; "11" means data with a mirror whose position must be looked up in the mirror mapping table.
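A 2-bit-per-segment bitmap of this kind can be packed four segments to a byte. The sketch below is one possible encoding, not the patent's implementation: the class name, constant names, 0-based segment index, and bit order within a byte are all assumptions made for illustration.

```python
FREE = 0b00             # "00": segment is free
DATA_NO_MIRROR = 0b01   # "01": data, no mirror
MIRROR_ADJACENT = 0b10  # "10": data, mirror in the downward-adjacent segment
MIRROR_MAPPED = 0b11    # "11": data, mirror located via the mirror mapping table


class SegmentBitmap:
    def __init__(self, segments: int) -> None:
        self.bits = bytearray((segments * 2 + 7) // 8)

    def get(self, segment: int) -> int:
        byte, shift = divmod(segment * 2, 8)
        return (self.bits[byte] >> shift) & 0b11

    def set(self, segment: int, state: int) -> None:
        byte, shift = divmod(segment * 2, 8)
        cleared = self.bits[byte] & ~(0b11 << shift) & 0xFF
        self.bits[byte] = cleared | (state << shift)


def has_mirror(bitmap: SegmentBitmap, segment: int) -> bool:
    return bitmap.get(segment) in (MIRROR_ADJACENT, MIRROR_MAPPED)
```

Only the "11" state requires a second lookup in the mirror mapping table; the "10" state resolves the mirror position from the address layout alone.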

Preferably, the parity information is updated in one of two ways. First, if the user does not require the parity to be updated synchronously, it is processed in the background using the parity delay queue in the I/O module. Second, it is processed synchronously: the parity of the original data and the parity of the mirrored data are updated at the same time as the original data and the mirrored data are written.

Preferably, when there is no mirror and the disk holding the original data fails during a read request, the read follows the parity-based RAID method; a read performed this way is a degraded read.

Preferably, when there is a mirror and the disk holding the original data fails during a read request, the user's read request is issued directly to the disk holding the mirrored data.
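The read cases described above (load balancing between the two copies, redirecting to the mirror on failure, and falling back to a degraded read) can be combined into one routing decision. The sketch below is illustrative only: `route_read`, its return convention, and the per-disk `load` counters are assumptions, and ties are arbitrarily resolved in favor of the original copy.

```python
from typing import Dict, Optional, Set, Tuple


def route_read(
    data_disk: int,
    mirror_disk: Optional[int],
    failed: Set[int],
    load: Dict[int, int],
) -> Tuple[str, Optional[int]]:
    """Return ("direct", disk) or ("degraded", None) for one block read."""
    if data_disk in failed:
        if mirror_disk is not None and mirror_disk not in failed:
            return ("direct", mirror_disk)  # the mirror serves the read
        return ("degraded", None)           # rebuild from the stripe's parity
    if mirror_disk is None or mirror_disk in failed:
        return ("direct", data_disk)
    # Both copies are healthy: send the read to the less loaded disk.
    target = min((data_disk, mirror_disk), key=lambda d: load.get(d, 0))
    return ("direct", target)
```

For example, `route_read(0, 1, set(), {0: 5, 1: 2})` selects disk 1 because it carries the lighter load.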

To achieve the purpose of the invention, according to another aspect of the invention, an array read-write system that adds a mirror structure to parity-based RAID is provided, characterized in that the system comprises an I/O module, a mirror management module, an address translation module, and a parity-based RAID module.

The I/O module receives read and write requests from the upper layer and outputs the logical address information they contain. The mirror management module uses that logical address information to look up the mirrored data of the original data the address points to, and outputs the logical address of the mirrored data. The address translation module receives the logical addresses of the original data and the mirrored data and outputs their physical addresses; for a write request, it also outputs the physical addresses of the parity blocks of the original data and the mirrored data. The I/O module issues the read or write request to the corresponding disks according to those physical addresses. The parity-based RAID module executes read requests when there is no mirror and the disk holding the original data has failed.

Preferably, the I/O module includes a parity delay queue used to update the parity information of data segments in the background.

Further preferably, the parity delay queue can be extended by 8 or 16 bits; the extended queue records the segment number of the data block whose parity needs updating together with the corresponding stripe offset, enabling fine-grained updates.

Preferably, the mirror management module includes a stripe segment bitmap and a mirror mapping table. The stripe segment bitmap records the state of each stripe segment, and the mirror mapping table records the positions of non-adjacent mirrors. Free segments are found by traversing the stripe segment bitmap and are allocated to original data and mirrored data; if the space allocated to mirrored data is not downward-adjacent, its position is recorded in the mirror mapping table. The mirror management module manages space and mirrored data: it allocates space for original data, allocates mirrors, finds the mirror corresponding to given original data, and, when space is insufficient, elastically releases mirrored data space according to user-specified priorities and the array capacity.

In general, compared with the prior art, the above technical solutions conceived by the invention achieve the following beneficial effects:

(1) Because a mirror structure is added on top of parity-based RAID and free space is used to store mirrors of the data, reads can be balanced across disks using the mirrors, and a degraded read can return the mirrored data directly; during writes, the parity update can be deferred, which alleviates the write amplification problem of parity-based RAID and improves the read and write performance of the array;

(2) Because a mirror structure is added, data is reconstructed from mirrored data, which shortens reconstruction time;

(3) Because the invention uses an interleaved address layout that places original data and mirrored data as close together as possible, the distance the disk head must travel is shortened and the read and write speed of the array is improved;

(4) Because the I/O module of the invention introduces a delay queue through which parity information is updated in the background, the write amplification problem of parity-based RAID is alleviated and the write performance of the array is improved;

(5) Because a mirror structure is added and mirrored data exists, the array can tolerate the failure of 1 to M/2 disks (M being the total number of disks in the array), which improves the availability and reliability of the array and the performance of the system.

Brief Description of the Drawings

Fig. 1 is a block diagram of the array read-write system that adds a mirror structure to parity-based RAID, provided by Embodiment 1 of the invention;

Fig. 2 is a schematic diagram of the data structures provided by Embodiment 1 of the invention;

Fig. 3 is a schematic diagram of the RAID5 data layout in Embodiment 2 of the invention;

Fig. 4 is a schematic diagram of the data layout obtained by applying the array construction method that adds a mirror structure to parity-based RAID, provided by Embodiment 2 of the invention, to RAID5;

Fig. 5 is a schematic diagram of the segment address layout in Embodiment 2 of the invention;

Fig. 6 is a flow chart of the data writing method in Embodiment 2 of the invention;

Fig. 7 is a flow chart of the data reading method in Embodiment 2 of the invention;

Fig. 8 is a flow chart of the data reconstruction method in Embodiment 2 of the invention.

Detailed Description

To make the purpose, technical solutions, and advantages of the invention clearer, the invention is described in further detail below with reference to the drawings and embodiments. It should be understood that the specific embodiments described here only explain the invention and do not limit it. In addition, the technical features involved in the embodiments described below can be combined with one another as long as they do not conflict.

Fig. 1 shows an array read-write system that adds a mirror structure to parity-based RAID, provided by Embodiment 1 of the invention; for ease of description, only the parts relevant to the invention are shown.

The system provided by Embodiment 1 comprises an I/O module, a mirror management module, an address translation module, and a parity-based RAID module.

The I/O module accepts read and write requests from the upper layer. The mirror management module uses the address information in the request to look up the mirrored data segment of the original data segment the address points to, and returns the address of the mirrored data segment. The address translation module maps logical addresses to physical addresses on disk according to the segment address layout and the data layout of the parity-based RAID; for a write request, it also computes the physical addresses of the parity blocks corresponding to the original data and the mirrored data. The I/O module decomposes the read or write request and issues it to the disks corresponding to the physical addresses. The parity-based RAID module executes read requests when there is no mirror and the disk holding the original data has failed.

The I/O module of Embodiment 1 has a parity delay queue, shown as parity delay queue 11 in Fig. 2, used to update the parity information of data segments in the background. During a write, if the user does not require the parity to be updated synchronously, the segment number is recorded in the parity delay queue, and the parity of the recorded data blocks is updated in the background when the system is idle or lightly loaded. The parity delay queue can be extended by 8 or 16 bits; the extended queue can record the segment number of the data block whose parity needs updating together with the corresponding stripe offset, enabling fine-grained updates.
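One way to picture the parity delay queue is as a deduplicated, ordered set of (segment, stripe offset) entries that is drained only while the system is idle. The sketch below is a hypothetical illustration: the class name, the `recompute_parity` and `system_idle` callbacks, and the choice to deduplicate entries are assumptions rather than details given in the patent.

```python
from collections import OrderedDict
from typing import Callable


class ParityDelayQueue:
    def __init__(self) -> None:
        # Keys are (segment, stripe offset); values are unused.
        self._pending: "OrderedDict[tuple, None]" = OrderedDict()

    def defer(self, segment: int, stripe: int) -> None:
        """Record that the parity of this stripe is stale."""
        self._pending[(segment, stripe)] = None

    def drain(
        self,
        recompute_parity: Callable[[int, int], None],
        system_idle: Callable[[], bool],
    ) -> int:
        """Refresh stale parity, oldest first, while the system stays idle."""
        done = 0
        while self._pending and system_idle():
            (segment, stripe), _ = self._pending.popitem(last=False)
            recompute_parity(segment, stripe)
            done += 1
        return done
```

Recording the stripe offset alongside the segment number corresponds to the extended queue described above: only the stale stripe is recomputed instead of the whole segment.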

The mirror management module manages space and mirrored data: it allocates space for original data, allocates space for mirrored data, finds the mirror corresponding to given original data, and, when space is insufficient, elastically releases mirrored data space according to user-specified priorities and the array capacity. Specifically: when the space that the user data needs under the parity-based RAID organization is less than or equal to half of the array capacity, all data has mirror redundancy; when that space exceeds half of the array capacity, the user can assign priorities to data, and the mirrors of lower-priority data segments are released first; as user data grows to saturation, the array degenerates into a plain parity-based RAID.
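The elastic release policy amounts to picking the lowest-priority mirrors first when more segments are needed. The sketch below shows one possible selection rule. The function name, the convention that a smaller number means lower priority, and the tie-break by segment number are all assumptions; the patent only states that lower-priority mirrors are released first.

```python
from typing import Dict, List


def mirrors_to_release(mirrored: Dict[int, int], segments_needed: int) -> List[int]:
    """Pick original segments whose mirrors should be freed.

    `mirrored` maps an original segment number to its user-assigned priority
    (assumption: smaller value = lower priority = released earlier).
    """
    by_priority = sorted(mirrored, key=lambda seg: (mirrored[seg], seg))
    return by_priority[:max(segments_needed, 0)]
```

If `segments_needed` exceeds the number of mirrored segments, every mirror is released, which matches the saturated case where the array behaves as a plain parity-based RAID.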

The mirror management module has a mirror mapping table and a stripe segment bitmap, shown as mirror mapping table 21 and stripe segment bitmap 22 in Fig. 2. The stripe segment bitmap records the state of each segment. Each segment needs 2 bits, giving four states: "00" means free; "01" means data is present (with no mirror); "10" means data is present and its mirror is in the downward-adjacent position (for example, stripe segment 2 and stripe segment N+2 are downward-adjacent); "11" means data is present with a mirror whose position must be looked up in the mirror mapping table. The mirror mapping table records the positions of non-adjacent mirrors: the first field of each entry is the segment number of the original data's stripe segment, and the second field is the segment number of the mirror segment.

The address translation module maps logical addresses to physical addresses on disk according to the segment address layout and the data layout of the parity-based RAID.

The parity-based RAID module handles the data layout within a segment and the parity computation and, when no mirrored data exists, degraded reads and data reconstruction.

In Embodiment 2, the array construction method that adds a mirror structure to parity-based RAID is applied to the array read-write system of the invention. It comprises four steps, namely address layout, data layout, data access, and data reconstruction, as follows:

(1) Address layout:

(1.1) Set the number of stripes per segment according to the number of disks M; the number of stripes per segment is an integer multiple of M;

(1.2) Determine the number of segments K from the number of data blocks on a disk and the number of stripes per segment: K = (number of data blocks on a disk) / (number of stripes per segment);

(1.3) Let N = K/2, rounded to an integer if K/2 is not an integer. Interleave the segments numbered 1 to N with the segments numbered N+1 to K: segment 1 is followed by segment N+1, then segment 2, then segment N+2, and so on;

(2) Data layout:

(2.1) Data within an original data segment is laid out exactly as in the parity-based RAID;

(2.2) A mirrored data segment follows the layout of the original data segment but changes the disk number on which each data block of a stripe is stored, shifting all disk numbers uniformly to the right or to the left. Shifting right means adding i to the disk number and subtracting M if the result is greater than M; shifting left means subtracting i from the disk number and adding M if the result is less than 0;

(3) Data read and write:

(3.1) Compute the physical address of the data to be accessed; for a write request, also compute the physical address of the parity information;

(3.2) Determine whether the data to be accessed has a mirror; if it has no mirror, go to step (3.3); if it has a mirror, go to step (3.4);

(3.3) Issue the read or write request to the disk that the data's physical address points to; for a write request, also update the parity information of the original data;

(3.4) Compute the physical address of the mirrored data. For a write request, also compute the physical address of the mirrored data's parity, issue the write request to the disk that the data's physical address points to and to the disk that the mirror's physical address points to, and update the parity information of both the original data and the mirrored data. For a read request, compare the load on the disk holding the original data with the load on the disk holding the mirrored data, and issue the read request to the less loaded one;

(4) Data reconstruction:

If a disk fails, determine whether the data on the failed disk has a mirror. If it does, read the corresponding data from the mirror and write it to the spare disk. If it does not, follow the reconstruction method of the parity-based RAID: read the other blocks in the stripe, compute the failed disk's data, and write it to the spare disk.

The parity-based RAID in the array construction method of Embodiment 2 is RAID5. Fig. 3 shows the RAID5 data layout: Ai, Bi, and Ci are data blocks, and Pi is the parity of the data in stripe i; for example, P1 is the parity of A1, B1, and C1.

Fig. 4 shows the data layout method of Embodiment 2: Ai, Bi, and Ci are data blocks, Pi is the parity of the data in stripe i, and Ai', Bi', Ci', and Pi' are the mirrors of Ai, Bi, Ci, and Pi respectively. Data is divided into segments (sections), and each segment is divided into several stripes; the number of stripes per segment is specified by the user, and in this embodiment a segment contains 4 stripes. Data within an original data segment is laid out as in RAID5. A mirrored data segment follows the original layout but changes the disk numbers, shifting them uniformly right by 1, as in part (a) of Fig. 4, or uniformly left by 1, as in part (b) of Fig. 4. For example, data originally stored on disk m is stored in the mirror on disk m+1 or on disk m-1.
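The uniform disk shift for mirrored blocks is a rotation of the disk number. The sketch below expresses it with modular arithmetic. It is an illustration only: `mirror_disk` is a hypothetical name, and disks are assumed to be numbered 0 to M-1, a convention the source does not fix.

```python
def mirror_disk(disk: int, m: int, i: int = 1, direction: str = "right") -> int:
    """Disk that holds the mirror of a block stored on `disk` (0-based, M disks)."""
    if direction not in ("right", "left"):
        raise ValueError("direction must be 'right' or 'left'")
    step = i if direction == "right" else -i
    # Wrap around the array, so the last disk maps back to the first.
    return (disk + step) % m
```

As long as i is not a multiple of M, a block and its mirror never land on the same disk, so a single disk failure cannot remove both copies.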

In Embodiment 2, the segment address layout is as follows. Assume there are 2N segments; the segments numbered 1 to N are stored interleaved with the segments numbered N+1 to 2N. As shown in Figure 5, segment 1 is followed by segment N+1, then segment 2, then segment N+2, and so on. In the mirror management module, space for original data is allocated by searching segment numbers 1 to 2N in order for a free segment. When mirror space is allocated for original data, the stripe segment physically adjacent below the original data is preferred. For example, in Figure 5, suppose stripe segment k holds original data. When a new file is written, stripe segment k+1 is allocated if it is free; if stripe segment k+1 already holds data, stripe segment k+2 is checked, and the search for free space continues in order. When a mirror is allocated for the original data in stripe segment k, stripe segment N+k is preferred. If stripe segment N+k already holds data, the preceding stripe segments are searched first for a free one, and if none is found, the following stripe segments are searched.
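The interleaved numbering can be illustrated with a short sketch. The function names and the 0-based "slot" index are assumptions introduced here; only the ordering 1, N+1, 2, N+2, ... and the preference for segment N+k come from the text. The fallback search used when segment N+k is occupied is deliberately left out.

```python
def physical_slot(segment: int, n: int) -> int:
    """0-based physical position of a 1-based segment number.

    Segments 1..N land on even slots and segments N+1..2N on odd slots,
    which yields the order 1, N+1, 2, N+2, ...
    """
    if not 1 <= segment <= 2 * n:
        raise ValueError("segment number out of range")
    if segment <= n:
        return 2 * (segment - 1)
    return 2 * (segment - n - 1) + 1


def preferred_mirror_segment(k: int, n: int) -> int:
    """Segment tried first when allocating a mirror for segment k (1 <= k <= N)."""
    if not 1 <= k <= n:
        raise ValueError("k must be in 1..N")
    return n + k


if __name__ == "__main__":
    n = 4
    order = sorted(range(1, 2 * n + 1), key=lambda s: physical_slot(s, n))
    print(order)
```

With this numbering, segment N+k always occupies the slot immediately after segment k, so the preferred mirror is physically adjacent to its original.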

In Embodiment 2, user data is written as follows. For a write request from the upper layer, if the targeted data has no mirror, the physical address of the original data is calculated, the write request is issued, and the parity is updated. If the targeted data has a mirror, the data layout and the mirror management are used to find the physical addresses of the original data and of the mirrored data, the write requests are issued, and the parity is updated. The parity can be updated in two ways. The first is background processing: the original parity and its mirror parity are updated when the system is idle or lightly loaded. The second is simultaneous processing: the original parity and the mirror parity are updated at the same time as the original data and the mirrored data are written.

More specifically, as shown in Figure 6, writing user data comprises the following steps:

Step S401: calculate the physical address of the original data and the physical address of its parity according to the data layout and the segment address layout;

Step S402: determine whether the data has a mirror. If it does, go to step S403; otherwise, go to step S408. The decision is based on the value recorded in the stripe segment bitmap of the mirror management module: "10" or "11" means there is a mirror, and "01" means there is no mirror;

Step S403: calculate the physical address of the mirrored data and the physical address of its parity;

Step S404: issue the request to the disks holding the original data and the mirrored data;

Step S405: determine whether the parity is to be updated synchronously. If so, go to step S406; otherwise, go to step S407;

Step S406: update the parity of the original data and of the mirrored data;

Step S407: update the parity of the original data and of the mirrored data in the background when the system is idle or the disks are lightly loaded. For a background update, the corresponding segment number and the stripe offset within the segment are added to the parity delay queue 11;

Step S408: issue the write request to the disk holding the original data and update the parity of the original data.
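The branching in steps S402 to S408 can be sketched as a small planner that returns the actions a write would trigger. Everything named here (`SegState`, `plan_write`, the action strings) is hypothetical; the bitmap codes and the use of a parity delay queue come from the text and claims, and address calculation and real I/O are left out.

```python
from collections import deque
from enum import Enum


class SegState(Enum):
    """Two-bit codes recorded in the stripe segment bitmap."""
    FREE = "00"
    DATA_NO_MIRROR = "01"
    MIRROR_ADJACENT = "10"   # mirror sits in the adjacent segment below
    MIRROR_MAPPED = "11"     # mirror location comes from the mirror mapping table


def plan_write(state: SegState, segment: int, stripe: int,
               sync_parity: bool, delay_queue: deque) -> list:
    """Return the ordered actions for one write (illustrative only)."""
    has_mirror = state in (SegState.MIRROR_ADJACENT, SegState.MIRROR_MAPPED)
    if not has_mirror:
        # S408: write the original data and update its parity.
        return ["write original", "update original parity"]
    # S404: write both copies.
    actions = ["write original", "write mirror"]
    if sync_parity:
        # S406: update both parities together with the data.
        actions += ["update original parity", "update mirror parity"]
    else:
        # S407: defer parity; remember which stripe needs it.
        delay_queue.append((segment, stripe))
    return actions


if __name__ == "__main__":
    queue = deque()
    print(plan_write(SegState.MIRROR_ADJACENT, 3, 1, False, queue), list(queue))
```

Deferring parity this way keeps a mirrored write to two data writes on the request path, at the cost of parity that is temporarily stale for the stripes listed in the queue.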

User data is read as follows. For a read request from the upper layer, the physical address of the original data is calculated. If the data has a mirror, the physical address of the mirrored data is calculated and the read request is issued to the less loaded of the two disks. If the data has no mirror, the request is issued directly to the disk holding the original data. When the disk holding the data has failed, and the data has a mirror, the physical address of the mirrored data is calculated and the user read request is issued to the disk holding the mirrored data; if the data has no mirror, a degraded read is performed by the parity-based RAID module.

More specifically, as shown in Figure 7, reading user data comprises the following steps:

Step S501: calculate the physical address of the original data according to the data layout and the segment address layout;

Step S502: determine whether the data has a mirror. If it does, go to step S503; otherwise, go to step S507;

Step S503: calculate the physical address of the mirrored data;

Step S504: determine whether there is a failed disk. If the data disk or the mirror disk has failed, go to step S505; otherwise, go to step S506;

Step S505: issue the request to the disk holding the mirrored data;

Step S506: compare the loads of the disks holding the original data and the mirrored data, and issue the request to the less loaded disk;

Step S507: determine whether there is a failed disk. If the disk holding the original data has failed, go to step S509; otherwise, go to step S508;

Step S508: issue the request to the disk holding the original data;

Step S509: perform a degraded read according to the method of the parity-based RAID. This embodiment follows the RAID5 method. For example, in Figure 4, if data block C5 is to be read and disk C has failed, data blocks A5, B5, and P5 are read and the data of C5 is computed from them.
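The read dispatch of steps S502 to S509 can be sketched as below. The function name, the tuple return value, and the `loads` dictionary are assumptions for illustration. The text only states that the request goes to the mirror disk when a failure is detected; the branch that falls back to the original disk when the mirror disk is the failed one is an assumption added here, and a double failure is simply rejected.

```python
def choose_read_target(orig_disk, mirror_disk, loads, failed):
    """Decide where a read goes.

    orig_disk   -- disk holding the original block
    mirror_disk -- disk holding the mirror, or None if there is no mirror
    loads       -- dict: disk number -> current load (e.g. queued requests)
    failed      -- set of failed disk numbers
    Returns ("read", disk) or ("degraded", None).
    """
    if mirror_disk is not None:
        if orig_disk in failed and mirror_disk in failed:
            raise RuntimeError("both copies are on failed disks")
        if orig_disk in failed:
            return ("read", mirror_disk)          # S505
        if mirror_disk in failed:
            return ("read", orig_disk)            # assumption, see lead-in
        # S506: pick the less loaded disk (ties go to the original).
        return ("read", min((orig_disk, mirror_disk), key=lambda d: loads[d]))
    if orig_disk in failed:
        return ("degraded", None)                 # S509: rebuild from the stripe
    return ("read", orig_disk)                    # S508


if __name__ == "__main__":
    print(choose_read_target(0, 1, {0: 5, 1: 2}, set()))
```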

In Embodiment 2, user data is reconstructed as follows. If a disk fails, determine whether the data on the failed disk has a mirror. If it does, read the corresponding data from the mirror and write it to the backup disk. If it does not, follow the reconstruction method of the parity-based RAID: read the other blocks in the stripe, compute the failed disk's data, and write it to the backup disk.

More specifically, as shown in Figure 8, the reconstruction method comprises the following steps:

Step S601: determine whether the data has a mirror. If it does, go to step S603; otherwise, go to step S602;

Step S602: read the parity and the data on the other healthy disks, and compute the data of the failed disk;

Step S603: read the failed disk's data from the mirror;

Step S604: write the data to the backup disk.
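Steps S601 to S603 can be sketched for the RAID5 case, where the missing block is the XOR of the surviving blocks in the stripe. The function names are hypothetical, blocks are modelled as equal-length `bytes` objects, and writing to the backup disk (S604) is left to the caller.

```python
from functools import reduce


def xor_blocks(blocks):
    """Byte-wise XOR of equal-length blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def rebuild_block(mirror_copy, surviving_blocks):
    """Recover one block of the failed disk.

    mirror_copy      -- the mirrored block, or None if the data has no mirror
    surviving_blocks -- remaining data and parity blocks of the same stripe
    """
    if mirror_copy is not None:
        return mirror_copy                 # S603: copy straight from the mirror
    return xor_blocks(surviving_blocks)    # S602: RAID5 parity reconstruction


if __name__ == "__main__":
    a5, b5, c5 = b"\x01\x02", b"\x10\x20", b"\xff\x00"
    p5 = xor_blocks([a5, b5, c5])
    print(rebuild_block(None, [a5, b5, p5]) == c5)
```

The mirror path reads one block per rebuilt block, while the parity path reads every surviving block in the stripe, which is why rebuilding from the mirror is the preferred route when a mirror exists.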

The array construction method of Embodiment 2 adds a mirror structure to a parity-based RAID and uses free space to store mirrors of the data. This improves the array's read and write performance, alleviates the write amplification problem of parity-based RAID, and improves the availability and reliability of the array.

Those skilled in the art will readily understand that the above are merely preferred embodiments of the present invention and are not intended to limit it. Any modification, equivalent replacement, or improvement made within the spirit and principles of the present invention shall fall within its scope of protection.

Claims (10)

1. An array construction method that adds a mirror structure to a parity-based RAID, characterized in that the array construction method comprises four steps, namely address layout, data layout, data access, and data reconstruction, as follows:
(1) Address layout:
(1.1) Set the number of stripes per segment according to the number of disks M, the number of stripes per segment being an integer multiple of M;
(1.2) Determine the number of segments K from the number of data blocks on a disk and the number of stripes per segment: K = number of data blocks on a disk / number of stripes per segment;
(1.3) Take N = K/2; store the segments numbered 1 to N interleaved with the segments numbered N+1 to K: the segment numbered 1 is followed by the segment numbered N+1, then the segment numbered 2, then the segment numbered N+2, and so on; if K/2 is not an integer, it is rounded;
(2) Data layout:
(2.1) The data within an original data segment has the same layout as in the parity-based RAID;
(2.2) A mirror data segment starts from the layout of the original data segment and changes the disk numbers on which the data blocks of a stripe are stored, shifting the disk numbers uniformly to the right or to the left; a uniform right shift adds i to the disk number, and if the disk number after adding i is greater than M, M is subtracted; a uniform left shift subtracts i from the disk number, and if the disk number after subtracting i is less than 0, M is added;
(3) Data read/write:
(3.1) Calculate the physical address of the data to be accessed; for a write request, also calculate the physical address of the parity;
(3.2) Determine whether the data to be accessed has a mirror; if it has no mirror, go to step (3.3); if it has a mirror, go to step (3.4);
(3.3) Issue the read or write request to the disk addressed by the physical address of the data; for a write request, also update the parity of the original data;
(3.4) Calculate the physical address of the mirrored data; for a write request, also calculate the physical address of the mirrored data's parity, issue the write request to the disk addressed by the original data's physical address and to the disk addressed by the mirror's physical address, and update the parity of the original data and of the mirrored data; for a read request, compare the load on the disk holding the original data with the load on the disk holding the mirrored data, and issue the read request to the less loaded disk;
(4) Data reconstruction:
If a disk fails, determine whether the data on the failed disk has a mirror; if it does, read the corresponding data from the mirror and write it to the backup disk; if it does not, follow the reconstruction method of the parity-based RAID, reading the other data blocks in the stripe, computing the failed disk's data, and writing it to the backup disk.

2. The array construction method according to claim 1, characterized in that, in step (3), when data is written and the array space is insufficient, mirror data space is released elastically according to user-specified priorities: when the space required to organize the user data as parity-based RAID is less than or equal to half of the array capacity, all data has mirror redundancy; when that space is greater than half of the array capacity, the mirrors of lower-priority data segments are released first, according to the user-specified data priorities; as user data grows to saturation, the array degenerates into a parity-based RAID.

3. The array construction method according to claim 1 or 2, characterized in that, in step (3), whether a mirror exists is determined from the content recorded in the stripe segment bitmap of the mirror management module: "00" means free; "01" means the segment holds data without a mirror; "10" means the segment holds data and its mirror is in the adjacent position below; "11" means the segment holds data that has a mirror whose location must be looked up in the mirror mapping table.

4. The array construction method according to any one of claims 1 to 3, characterized in that the parity is updated in step (3) in one of two ways: first, if the user does not require the parity to be updated synchronously, it is processed in the background using the parity delay queue in the I/O module; second, synchronous processing, in which the parity of the original data and the parity of the mirrored data are updated synchronously when the original data and the mirrored data are written.

5. The array construction method according to any one of claims 1 to 4, characterized in that, in step (3), while a read request is being executed, if the disk holding the original data has failed, the read is performed according to the method of the parity-based RAID.

6. The array construction method according to any one of claims 1 to 5, characterized in that, in step (3), while a read request is being executed, if the disk holding the original data has failed, the user read request is issued directly to the disk holding the mirrored data.

7. An array read-write system that adds a mirror structure to a parity-based RAID, characterized in that the system comprises an I/O module, a mirror management module, an address translation module, and a parity-based RAID module;
the I/O module receives read and write requests from the upper layer and outputs logical address information; the mirror management module uses the logical address information to look up the mirrored data of the original data that the address points to, and outputs the logical address of the mirrored data; the address translation module receives the logical addresses of the original data and the mirrored data and outputs their physical addresses, and for a write request also outputs the physical addresses of the parity blocks of the original data and the mirrored data; the I/O module issues the read and write requests to the corresponding disks according to the physical addresses; the parity-based RAID module executes read requests when there is no mirror and the disk holding the original data has failed.

8. The array read-write system according to claim 7, characterized in that the I/O module comprises a parity delay queue used to update the parity of data segments in the background.

9. The array read-write system according to claim 8, characterized in that the parity delay queue can be extended by 8 bits or 16 bits; the extended queue records the segment number of each data block whose parity needs updating and the corresponding stripe offset, enabling fine-grained updates.

10. The array read-write system according to any one of claims 7 to 9, characterized in that the mirror management module allocates space and mirrors for original data and comprises a stripe segment bitmap and a mirror mapping table; free segments are found by traversing the stripe segment bitmap and are allocated to original data and mirrored data; if the space allocated for mirrored data is not adjacent below the original data, the location of the non-adjacent mirror is recorded in the mirror mapping table.
CN201510025251.XA 2015-01-19 2015-01-19 A kind of array construction method and read-write system based on verification RAID addition mirror-image structures Active CN104714758B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201510025251.XA CN104714758B (en) 2015-01-19 2015-01-19 A kind of array construction method and read-write system based on verification RAID addition mirror-image structures

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201510025251.XA CN104714758B (en) 2015-01-19 2015-01-19 A kind of array construction method and read-write system based on verification RAID addition mirror-image structures

Publications (2)

Publication Number Publication Date
CN104714758A true CN104714758A (en) 2015-06-17
CN104714758B CN104714758B (en) 2017-07-07

Family

ID=53414143

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201510025251.XA Active CN104714758B (en) 2015-01-19 2015-01-19 A kind of array construction method and read-write system based on verification RAID addition mirror-image structures

Country Status (1)

Country Link
CN (1) CN104714758B (en)

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106339276A (en) * 2016-08-16 2017-01-18 浪潮(北京)电子信息产业有限公司 Data backup state-based data recovery method and system
CN105022673B (en) * 2015-07-15 2018-07-20 南京师范大学 A kind of fault-tolerant fast parallel double calculation method of data-oriented parallel computation
CN109032513A (en) * 2018-07-16 2018-12-18 山东大学 Based on the RAID framework of SSD and HDD and its backup, method for reconstructing
CN113407125A (en) * 2021-08-20 2021-09-17 苏州浪潮智能科技有限公司 Method, system and related device for determining block number in RAID6 array
CN114510379A (en) * 2022-04-21 2022-05-17 山东百盟信息技术有限公司 Distributed array video data storage device
CN115562594A (en) * 2022-12-06 2023-01-03 苏州浪潮智能科技有限公司 Method, system and related device for constructing RAID card
CN118885132A (en) * 2024-09-29 2024-11-01 苏州元脑智能科技有限公司 A data processing method, device, equipment, medium, product and system
CN119415044A (en) * 2025-01-03 2025-02-11 麒麟软件有限公司 A disk selection method and system for RAID1 disk array read operation

Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1247608A (en) * 1997-02-27 2000-03-15 国际商业机器公司 Transformational raid for hierarchical storage management system
CN1518697A (en) * 2001-04-19 2004-08-04 �Ҵ���˾ Method, apparatus and program for providing hybrid disk mirroring and striping
KR20050060804A (en) * 2003-12-17 2005-06-22 한국전자통신연구원 Data mirroring system to improve the performance of read operation for large data
US20100191907A1 (en) * 2009-01-26 2010-07-29 Lsi Corporation RAID Converter and Methods for Transforming a First RAID Array to a Second RAID Array Without Creating a Backup Copy
CN101866307A (en) * 2010-06-24 2010-10-20 杭州华三通信技术有限公司 Data storage method and device based on mirror image technology
CN102662607A (en) * 2012-03-29 2012-09-12 华中科技大学 RAID6 level mixed disk array, and method for accelerating performance and improving reliability
CN103761058A (en) * 2014-01-23 2014-04-30 天津中科蓝鲸信息技术有限公司 RAID1 and RAID4 hybrid structure network storage system and method
US20140304470A1 (en) * 2013-04-04 2014-10-09 Lsi Corporation Reverse mirroring in raid level 1
CN104281499A (en) * 2014-10-28 2015-01-14 苏州工业职业技术学院 Odd-even check-based RAID (redundant arrays of inexpensive disks) striped mirror data distribution method

Cited By (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105022673B (en) * 2015-07-15 2018-07-20 南京师范大学 A kind of fault-tolerant fast parallel double calculation method of data-oriented parallel computation
CN106339276A (en) * 2016-08-16 2017-01-18 浪潮(北京)电子信息产业有限公司 Data backup state-based data recovery method and system
CN106339276B (en) * 2016-08-16 2019-10-18 浪潮(北京)电子信息产业有限公司 A data recovery method and system based on data backup state
CN109032513A (en) * 2018-07-16 2018-12-18 山东大学 Based on the RAID framework of SSD and HDD and its backup, method for reconstructing
CN113407125A (en) * 2021-08-20 2021-09-17 苏州浪潮智能科技有限公司 Method, system and related device for determining block number in RAID6 array
CN113407125B (en) * 2021-08-20 2021-11-09 苏州浪潮智能科技有限公司 Method, system and related device for determining block number in RAID6 array
CN114510379A (en) * 2022-04-21 2022-05-17 山东百盟信息技术有限公司 Distributed array video data storage device
CN114510379B (en) * 2022-04-21 2022-11-01 山东百盟信息技术有限公司 A distributed array video data storage device
CN115562594A (en) * 2022-12-06 2023-01-03 苏州浪潮智能科技有限公司 Method, system and related device for constructing RAID card
CN115562594B (en) * 2022-12-06 2023-03-24 苏州浪潮智能科技有限公司 Method, system and related device for constructing RAID card
CN118885132A (en) * 2024-09-29 2024-11-01 苏州元脑智能科技有限公司 A data processing method, device, equipment, medium, product and system
CN119415044A (en) * 2025-01-03 2025-02-11 麒麟软件有限公司 A disk selection method and system for RAID1 disk array read operation
CN119415044B (en) * 2025-01-03 2025-03-25 麒麟软件有限公司 A disk selection method and system for RAID1 disk array read operation

Also Published As

Publication number Publication date
CN104714758B (en) 2017-07-07

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant