
CN107301106A - Recovery method and device for a RAID system failure

Info

Publication number
CN107301106A
Authority
CN
China
Prior art keywords
disk
added
raid system
metadata
raid
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201710509018.8A
Other languages
Chinese (zh)
Inventor
成金祥
刘相乐
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhengzhou Yunhai Information Technology Co Ltd
Original Assignee
Zhengzhou Yunhai Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhengzhou Yunhai Information Technology Co Ltd
Priority to CN201710509018.8A
Publication of CN107301106A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/14 Error detection or correction of the data by redundancy in operation
    • G06F11/1402 Saving, restoring, recovering or retrying
    • G06F11/1446 Point-in-time backing up or restoration of persistent data
    • G06F11/1448 Management of the data involved in backup or backup restore
    • G06F11/1456 Hardware arrangements for backup

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Quality & Reliability (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Debugging And Monitoring (AREA)

Abstract

The present application provides a method and device for recovering from a RAID system failure. When a disk to be added is detected, it is first determined whether that disk is a member disk of the RAID system. If it is, it is then determined whether the metadata of the disk to be added is consistent with the metadata of the existing member disks of the RAID system. If the metadata is consistent, the disk is added to the RAID system. If not, the metadata of the disk is updated to that of the existing member disks, and the disk is then added to the RAID system. This effectively solves the problem of RAID system failures caused by hardware link failures: the data on the dropped disk does not need to be migrated, only its metadata needs to be updated, which shortens the recovery time.

Description

Recovery method and device for a RAID system failure

Technical field

The present invention relates to the technical field of information storage and, more specifically, to a method and device for recovering from a RAID system failure.

Background

RAID is a complex system. On the Linux platform, software RAID is presented to users as a block device and uses the Linux generic block layer for RAID cache and buffer management. Software RAID has no I/O scheduling module of its own; it relies on the disk drivers to schedule and process I/O. After a RAID system is created, a copy of the RAID metadata is stored on every member disk, and the RAID system manages its member disks through this metadata.

When a hardware link failure in a storage system disconnects the SAS link between a JBOD (Just a Bunch Of Disks) enclosure or backplane and the main cabinet, the metadata on the member disks of the RAID system becomes inconsistent. Even after the hardware link is restored, the RAID system cannot recover on its own. Recovering it requires migrating the data on the dropped disks, which is very time-consuming.

Summary of the invention

In view of this, the present invention provides a method and device for recovering from a RAID system failure, which solve the problem that a RAID system cannot be recovered after a hardware link failure and shorten the recovery time.

To achieve the above object, the present invention provides the following technical solution:

A method for recovering from a RAID system failure, comprising:

when a disk to be added is detected, determining whether the disk to be added is a member disk of the RAID system;

when the disk to be added is a member disk of the RAID system, determining whether the metadata of the disk to be added is consistent with the metadata of the existing member disks of the RAID system;

if so, adding the disk to be added to the RAID system;

if not, updating the metadata of the disk to be added to the metadata of the existing member disks of the RAID system, and adding the disk to be added to the RAID system.

Preferably, determining whether the disk to be added is a member disk of the RAID system comprises:

extracting the RAID information of the disk to be added;

determining whether the RAID information of the disk to be added matches the RAID information in the configuration file of the RAID system; if so, the disk to be added is a member disk of the RAID system; if not, the disk to be added is not a member disk of the RAID system.
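
The membership check above can be sketched as follows. This is a minimal illustration only: the source does not say which fields the "RAID information" contains or how the configuration file is parsed, so the `RaidInfo` fields and the `is_member_disk` helper are hypothetical names chosen for the example.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class RaidInfo:
    # Hypothetical fields: the source does not specify what the
    # "RAID information" of a disk consists of.
    array_uuid: str
    raid_level: int


def is_member_disk(disk_info: Optional[RaidInfo],
                   config_entries: Sequence[RaidInfo]) -> bool:
    """Return True if the RAID information read from the disk matches
    an entry taken from the RAID system's configuration file."""
    if disk_info is None:
        # No RAID information could be extracted from the disk.
        return False
    # Frozen dataclasses compare field by field, so this is an exact match.
    return disk_info in config_entries
```

A disk that carries no RAID information, or information for a different array, is treated as a non-member in this sketch.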

Preferably, determining whether the metadata of the disk to be added is consistent with the metadata of the existing member disks of the RAID system comprises:

calculating the difference between the metadata value of the disk to be added and the metadata value of the existing member disks of the RAID system;

determining whether the difference is greater than a preset value;

if so, the metadata of the disk to be added is inconsistent with the metadata of the existing member disks of the RAID system;

if not, the metadata of the disk to be added is consistent with the metadata of the existing member disks of the RAID system.

Preferably, the method further comprises:

when the disk to be added is not a member disk of the RAID system, prompting that RAID failure recovery has failed.
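
Taken together, the method reduces to two checks and three possible outcomes. The sketch below only encodes that decision table; the `Action` enum and `choose_action` function are illustrative names, not part of the source.

```python
from enum import Enum


class Action(Enum):
    REPORT_FAILURE = "prompt that RAID failure recovery failed"
    ADD = "add the disk to the RAID system"
    UPDATE_THEN_ADD = "update the disk's metadata, then add it"


def choose_action(is_member: bool, metadata_consistent: bool) -> Action:
    """Map the results of the two checks to the action the method takes."""
    if not is_member:
        # Not a member disk: the metadata check is never reached.
        return Action.REPORT_FAILURE
    if metadata_consistent:
        return Action.ADD
    return Action.UPDATE_THEN_ADD
```

Note that the metadata comparison only matters for disks that already passed the membership check.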

A device for recovering from a RAID system failure, comprising:

a first judging unit, configured to determine, when a disk to be added is detected, whether the disk to be added is a member disk of the RAID system;

a second judging unit, configured to determine, when the disk to be added is a member disk of the RAID system, whether the metadata of the disk to be added is consistent with the metadata of the existing member disks of the RAID system, and to trigger the first adding unit if so or the second adding unit if not;

the first adding unit, configured to add the disk to be added to the RAID system;

the second adding unit, configured to update the metadata of the disk to be added to the metadata of the existing member disks of the RAID system, and to add the disk to be added to the RAID system.

Preferably, the first judging unit comprises:

an extraction subunit, configured to extract the RAID information of the disk to be added;

a first judging subunit, configured to determine whether the RAID information of the disk to be added matches the RAID information in the configuration file of the RAID system and, if so, to trigger the second judging unit.

Preferably, the second judging unit comprises:

a calculation subunit, configured to calculate the difference between the metadata value of the disk to be added and the metadata value of the existing member disks of the RAID system;

a second judging subunit, configured to determine whether the difference is greater than a preset value, and to trigger the second adding unit if so or the first adding unit if not.

Preferably, the device further comprises:

a prompting unit, configured to prompt that RAID failure recovery has failed when the disk to be added is not a member disk of the RAID system.

Compared with the prior art, the present invention has the following beneficial effects:

In the method and device provided by the present invention, when a disk to be added is detected, it is first determined whether that disk is a member disk of the RAID system. If it is, it is then determined whether its metadata is consistent with the metadata of the existing member disks of the RAID system. If so, the disk is added to the RAID system; if not, its metadata is updated to that of the existing member disks and the disk is then added to the RAID system. This effectively solves the problem of RAID system failures caused by hardware link failures: the data on the dropped disk does not need to be migrated, only its metadata needs to be updated, which shortens the recovery time.

Description of the drawings

To illustrate the technical solutions of the embodiments of the present invention or of the prior art more clearly, the drawings used in describing them are briefly introduced below. The drawings described below are merely embodiments of the present invention; a person of ordinary skill in the art can derive other drawings from them without creative effort.

FIG. 1 is a flowchart of a method for recovering from a RAID system failure disclosed in an embodiment of the present invention;

FIG. 2 is a flowchart of another method for recovering from a RAID system failure disclosed in an embodiment of the present invention;

FIG. 3 is a schematic structural diagram of a device for recovering from a RAID system failure disclosed in an embodiment of the present invention.

Detailed description

The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings. The described embodiments are only some, not all, of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art from these embodiments without creative effort fall within the scope of protection of the present invention.

The recovery method disclosed in this embodiment addresses the problem that, after some RAID member disks drop out of a RAID system because of a SAS link failure, a hard disk failure, or another hardware link failure, they cannot rejoin the RAID system and the RAID system cannot be started. Referring to FIG. 1, this embodiment discloses a method for recovering from a RAID system failure, comprising the following steps:

S101: When a disk to be added is detected, determine whether the disk to be added is a member disk of the RAID system; if so, execute S102.

After a hardware link failure causes some RAID member disks to drop out of the RAID system, an operator must manually pull the dropped member disks and reinsert them into their slots. Disk hot-plug events are monitored to detect whether there is a disk to be added; the disk to be added is the dropped RAID member disk.
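
One simple way to picture the hot-plug monitoring is to compare two successive scans of the attached disks. The source does not describe how hot-plug events are actually captured, so the polling approach, the function name, and the device names below are assumptions made for illustration.

```python
from typing import Iterable, List


def newly_inserted_disks(previous: Iterable[str],
                         current: Iterable[str]) -> List[str]:
    """Return device names present in the current scan but absent
    from the previous one, i.e. candidate disks to be added."""
    return sorted(set(current) - set(previous))


# Hypothetical device names: "sdc" dropped out earlier and was reinserted.
before_reinsert = ["sda", "sdb"]
after_reinsert = ["sda", "sdb", "sdc"]
candidates = newly_inserted_disks(before_reinsert, after_reinsert)
```

Each name returned this way would then go through the membership check in S101.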

When a disk to be added is detected, it must first be determined whether it is a member disk of the RAID system. The procedure is as follows:

extract the RAID information of the disk to be added;

determine whether the RAID information of the disk to be added matches the RAID information in the configuration file of the RAID system; if so, the disk to be added is a member disk of the RAID system and S102 is executed; if not, the disk to be added is not a member disk of the RAID system.

Preferably, referring to FIG. 2, when the disk to be added is not a member disk of the RAID system, the method further comprises:

S105: Prompt that RAID failure recovery has failed.

The prompt may take any form that serves the purpose, such as a voice prompt, a text prompt, or a flashing indicator light.

S102: Determine whether the metadata of the disk to be added is consistent with the metadata of the existing member disks of the RAID system; if so, execute S103; if not, execute S104.

Specifically, S102 is executed as follows:

calculate the difference between the metadata value of the disk to be added and the metadata value of the existing member disks of the RAID system;

determine whether the difference is greater than a preset value;

if so, the metadata of the disk to be added is inconsistent with the metadata of the existing member disks of the RAID system;

if not, the metadata of the disk to be added is consistent with the metadata of the existing member disks of the RAID system.

Preferably, the preset value is 2.
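
The threshold test in S102 can be sketched as below, using the preferred preset value of 2. The source does not define what the "metadata value" is or whether the difference is signed, so treating it as an integer and taking the absolute difference are assumptions of this example.

```python
PRESET_VALUE = 2  # preferred preset value stated in the embodiment


def metadata_consistent(candidate_value: int,
                        member_value: int,
                        preset: int = PRESET_VALUE) -> bool:
    """Return True if the two metadata values count as consistent.

    Assumption: the difference is taken as an absolute value.
    """
    difference = abs(candidate_value - member_value)
    # "Greater than the preset value" means inconsistent; equal is still consistent.
    return not difference > preset
```

With this reading, values of 100 and 98 differ by 2 and are treated as consistent, while 100 and 97 differ by 3 and are treated as inconsistent.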

S103: Add the disk to be added to the RAID system.

S104: Update the metadata of the disk to be added to the metadata of the existing member disks of the RAID system, and add the disk to be added to the RAID system.

Specifically, when the metadata of the disk to be added is inconsistent with the metadata of the existing member disks of the RAID system, the metadata of the disk to be added is deleted and the metadata of the existing member disks is copied to the disk to be added, so that its metadata is updated to that of the existing member disks.
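
The delete-then-copy update in S104 can be illustrated with an in-memory model. The `Disk` class and its fields are hypothetical stand-ins for on-disk structures the source does not describe; the point of the sketch is that only the metadata is rewritten while the data area is left untouched.

```python
import copy
from dataclasses import dataclass, field


@dataclass
class Disk:
    # Hypothetical in-memory stand-in for a physical disk.
    name: str
    metadata: dict = field(default_factory=dict)
    data: bytes = b""


def update_metadata(candidate: Disk, member: Disk) -> None:
    """Replace the candidate's stale metadata with a copy of the
    metadata held by an existing member disk."""
    candidate.metadata.clear()                                 # delete stale metadata
    candidate.metadata.update(copy.deepcopy(member.metadata))  # copy from member
    # candidate.data is deliberately not modified: no data migration.
```

A deep copy is used so that later changes to the member's metadata do not silently alter the candidate's copy.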

In the recovery method provided by this embodiment, when a disk to be added is detected, it is first determined whether that disk is a member disk of the RAID system. If it is, it is then determined whether its metadata is consistent with the metadata of the existing member disks of the RAID system. If so, the disk is added to the RAID system; if not, its metadata is updated to that of the existing member disks and the disk is then added to the RAID system. This effectively solves the problem of RAID system failures caused by hardware link failures: the data on the dropped disk does not need to be migrated, only its metadata needs to be updated, which shortens the recovery time.

Based on the recovery method disclosed in the above embodiment, and referring to FIG. 3, this embodiment correspondingly discloses a device for recovering from a RAID system failure, comprising:

a first judging unit 101, configured to determine, when a disk to be added is detected, whether the disk to be added is a member disk of the RAID system.

Preferably, the first judging unit 101 comprises:

an extraction subunit, configured to extract the RAID information of the disk to be added;

a first judging subunit, configured to determine whether the RAID information of the disk to be added matches the RAID information in the configuration file of the RAID system and, if so, to trigger the second judging unit 102.

Preferably, the device further comprises:

a prompting unit, configured to prompt that RAID failure recovery has failed when the disk to be added is not a member disk of the RAID system.

That is, when the first judging unit 101 determines that the disk to be added is not a member disk of the RAID system, the prompting unit is triggered.

a second judging unit 102, configured to determine, when the disk to be added is a member disk of the RAID system, whether the metadata of the disk to be added is consistent with the metadata of the existing member disks of the RAID system, and to trigger the first adding unit 103 if so or the second adding unit 104 if not.

Preferably, the second judging unit 102 comprises:

a calculation subunit, configured to calculate the difference between the metadata value of the disk to be added and the metadata value of the existing member disks of the RAID system;

a second judging subunit, configured to determine whether the difference is greater than a preset value, and to trigger the second adding unit 104 if so or the first adding unit 103 if not.

the first adding unit 103, configured to add the disk to be added to the RAID system;

the second adding unit 104, configured to update the metadata of the disk to be added to the metadata of the existing member disks of the RAID system, and to add the disk to be added to the RAID system.
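
The way the units trigger one another can be sketched as a small class whose collaborators are passed in as callables. This is only one possible arrangement under stated assumptions: the class name, the callable signatures, and the use of a plain list to stand for the RAID system are all hypothetical, and the unit numbers in the comments refer to FIG. 3.

```python
from typing import Callable, List


class RecoveryDevice:
    """Illustrative wiring of the judging, adding, and prompting units."""

    def __init__(self,
                 is_member: Callable[[str], bool],
                 is_consistent: Callable[[str], bool],
                 update_metadata: Callable[[str], None],
                 prompt: Callable[[str], None]) -> None:
        self.is_member = is_member              # first judging unit 101
        self.is_consistent = is_consistent      # second judging unit 102
        self.update_metadata = update_metadata  # used by second adding unit 104
        self.prompt = prompt                    # prompting unit
        self.array: List[str] = []              # stands in for the RAID system

    def on_disk_detected(self, disk: str) -> bool:
        """Handle one detected disk; return True if it was added."""
        if not self.is_member(disk):
            self.prompt("RAID failure recovery failed")
            return False
        if not self.is_consistent(disk):
            self.update_metadata(disk)          # unit 104 updates metadata first
        self.array.append(disk)                 # units 103 and 104 both add the disk
        return True
```

Passing the checks in as callables keeps the control flow of the device separate from how membership and consistency are actually evaluated.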

In the recovery device provided by this embodiment, when a disk to be added is detected, it is first determined whether that disk is a member disk of the RAID system. If it is, it is then determined whether its metadata is consistent with the metadata of the existing member disks of the RAID system. If so, the disk is added to the RAID system; if not, its metadata is updated to that of the existing member disks and the disk is then added to the RAID system. This effectively solves the problem of RAID system failures caused by hardware link failures: the data on the dropped disk does not need to be migrated, only its metadata needs to be updated, which shortens the recovery time.

The above description of the disclosed embodiments enables a person skilled in the art to implement or use the present invention. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the general principles defined herein may be implemented in other embodiments without departing from the spirit or scope of the invention. Therefore, the present invention is not limited to the embodiments shown herein, but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Claims (8)

1. A method for recovering from a RAID system failure, characterized by comprising:
when a disk to be added is detected, determining whether the disk to be added is a member disk of the RAID system;
when the disk to be added is a member disk of the RAID system, determining whether the metadata of the disk to be added is consistent with the metadata of the existing member disks of the RAID system;
if so, adding the disk to be added to the RAID system;
if not, updating the metadata of the disk to be added to the metadata of the existing member disks of the RAID system, and adding the disk to be added to the RAID system.

2. The method according to claim 1, characterized in that determining whether the disk to be added is a member disk of the RAID system comprises:
extracting the RAID information of the disk to be added;
determining whether the RAID information of the disk to be added matches the RAID information in the configuration file of the RAID system; if so, the disk to be added is a member disk of the RAID system; if not, the disk to be added is not a member disk of the RAID system.

3. The method according to claim 1, characterized in that determining whether the metadata of the disk to be added is consistent with the metadata of the existing member disks of the RAID system comprises:
calculating the difference between the metadata value of the disk to be added and the metadata value of the existing member disks of the RAID system;
determining whether the difference is greater than a preset value;
if so, the metadata of the disk to be added is inconsistent with the metadata of the existing member disks of the RAID system;
if not, the metadata of the disk to be added is consistent with the metadata of the existing member disks of the RAID system.

4. The method according to claim 1, characterized in that the method further comprises:
when the disk to be added is not a member disk of the RAID system, prompting that RAID failure recovery has failed.

5. A device for recovering from a RAID system failure, characterized by comprising:
a first judging unit, configured to determine, when a disk to be added is detected, whether the disk to be added is a member disk of the RAID system;
a second judging unit, configured to determine, when the disk to be added is a member disk of the RAID system, whether the metadata of the disk to be added is consistent with the metadata of the existing member disks of the RAID system, and to trigger the first adding unit if so or the second adding unit if not;
the first adding unit, configured to add the disk to be added to the RAID system;
the second adding unit, configured to update the metadata of the disk to be added to the metadata of the existing member disks of the RAID system, and to add the disk to be added to the RAID system.

6. The device according to claim 5, characterized in that the first judging unit comprises:
an extraction subunit, configured to extract the RAID information of the disk to be added;
a first judging subunit, configured to determine whether the RAID information of the disk to be added matches the RAID information in the configuration file of the RAID system and, if so, to trigger the second judging unit.

7. The device according to claim 5, characterized in that the second judging unit comprises:
a calculation subunit, configured to calculate the difference between the metadata value of the disk to be added and the metadata value of the existing member disks of the RAID system;
a second judging subunit, configured to determine whether the difference is greater than a preset value, and to trigger the second adding unit if so or the first adding unit if not.

8. The device according to claim 5, characterized in that the device further comprises:
a prompting unit, configured to prompt that RAID failure recovery has failed when the disk to be added is not a member disk of the RAID system.
CN201710509018.8A 2017-06-28 2017-06-28 Recovery method and device for a RAID system failure Pending CN107301106A (en)

Priority Applications (1)

Application Number: CN201710509018.8A
Publication: CN107301106A (en)
Priority Date: 2017-06-28
Filing Date: 2017-06-28
Title: Recovery method and device for a RAID system failure

Publications (1)

Publication Number: CN107301106A (en)
Publication Date: 2017-10-27

Family

ID=60135432

Family Applications (1)

Application Number: CN201710509018.8A
Status: Pending
Publication: CN107301106A (en)
Priority Date: 2017-06-28
Filing Date: 2017-06-28
Title: Recovery method and device for a RAID system failure

Country Status (1)

Country: CN
Link: CN107301106A (en)

Patent Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101196799A (en) * 2008-01-09 2008-06-11 杭州华三通信技术有限公司 Magnetic disk redundant array and its controller and synchronization process
CN101699389A (en) * 2009-10-30 2010-04-28 中兴通讯股份有限公司 Method and device for processing hot removal of magnetic disk
CN101840311A (en) * 2009-12-30 2010-09-22 创新科存储技术有限公司 Self-repairing method suitable for RAID system and RAID system
US20140089581A1 (en) * 2012-09-27 2014-03-27 Hewlett-Packard Development Company, L.P. Capacity-expansion of a logical volume
CN103019894A (en) * 2012-12-25 2013-04-03 创新科存储技术(深圳)有限公司 Reconstruction method for redundant array of independent disks
CN103116474A (en) * 2013-01-25 2013-05-22 浪潮电子信息产业股份有限公司 Redundant array of independent disks (RAID) card design method for data recovery and self restoring
CN105824572A (en) * 2015-01-05 2016-08-03 中兴通讯股份有限公司 Disk storage space managing method, apparatus and storage device
CN106095330A (en) * 2016-05-30 2016-11-09 杭州宏杉科技有限公司 A kind of storage method and apparatus of metadata
CN106227627A (en) * 2016-08-22 2016-12-14 浪潮(北京)电子信息产业有限公司 A kind of raid is inserted into data distribution method and the system of new disk after data are recovered

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20171027