CN117573451B

CN117573451B - Disk management method, device, equipment and computer readable storage medium

Info

Publication number: CN117573451B
Application number: CN202311562486.3A
Authority: CN
Inventors: 潘阳
Original assignee: Zhongdian Cloud Computing Technology Co ltd
Current assignee: Zhongdian Cloud Computing Technology Co ltd
Priority date: 2023-11-20
Filing date: 2023-11-20
Publication date: 2025-01-28
Anticipated expiration: 2043-11-20
Also published as: CN117573451A

Abstract

A disk management method, apparatus, device, and computer readable storage medium. The method represents each disk as a custom resource and carries out operation and maintenance on the basis of those custom resources. According to the application, the influence of an abnormal disk on the system can be eliminated promptly at the software level without intervention by operation and maintenance personnel, ensuring stable operation of the system.

Description

Disk management method, apparatus, device and computer readable storage medium

Technical Field

The present application relates to the field of operation and maintenance technologies, and in particular, to a disk management method, apparatus, device, and computer readable storage medium.

Background

In a storage cluster, the running state of each disk is monitored. When a disk enters an abnormal running state, an alarm is triggered to notify operation and maintenance personnel, who must then replace or isolate the disk so that the cluster can continue to run normally.

The existing strategies have the following technical problems:

(1) Operation and maintenance personnel must intervene after a disk fault alarm is reported, so the time from discovering a problem to resolving it is long; the problem cannot be resolved immediately, and fault handling efficiency is relatively low.

(2) Until operation and maintenance personnel intervene, the system keeps running at risk; if another abnormality occurs, more serious problems may arise and normal operation of the service may even be affected.

(3) Handling disk faults depends on the experience of operation and maintenance personnel, so human error may introduce further abnormalities, and the labor cost of system maintenance increases.

Disclosure of Invention

The present application provides a disk management method, apparatus, device, and computer readable storage medium, which can solve at least one of the above-mentioned technical problems.

In a first aspect, an embodiment of the present application provides a disk management method, where the disk management method includes:

Creating corresponding custom resources for each disk based on the definition of the custom resources, wherein the custom resources comprise disk roles, disk attributes, expected states and actual states, the expected states and the actual states are normal states, the disks comprise working disks and standby disks, the disk roles in the custom resources corresponding to the working disks are first disk roles, and the disk roles in the custom resources corresponding to the standby disks are second disk roles;

detecting the running state of each disk to obtain a detection result of each disk;

updating the actual state in the custom resource of each spare disk based on the detection result of each spare disk;

selecting a target custom resource from custom resources with a disk role being a second disk role and an actual state being a normal state aiming at any work disk in an abnormal state, wherein the disk attribute in the target custom resource is matched with the disk attribute in the custom resource of any work disk in the abnormal state;

changing the disk roles in the custom resources of any working disk in an abnormal state into a second disk role and changing the actual state into an abnormal state;

and changing the disk role in the target custom resource into a first disk role.

With reference to the first aspect, in one implementation manner, the step of performing running state detection on each disk to obtain a detection result of each disk includes:

for each disk, detecting whether the average IO latency of the disk within a preset time period is greater than a threshold, whether a bad track exists, whether a fault exists, and whether the disk has been pulled out;

if the average IO latency of the disk within the preset time period is greater than the threshold, determining that the disk is in a first abnormal state;

if a bad track or a fault exists, determining that the disk is in a second abnormal state;

if the disk has been pulled out, determining that the disk is in a third abnormal state;

if the average IO latency of the disk within the preset time period is not greater than the threshold, the disk has not been pulled out, and no bad track or fault exists, determining that the disk is in a normal state.
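The classification above can be sketched as a small function. This is only an illustrative sketch in Python: the names `DiskState` and `classify_disk` are hypothetical, and the precedence applied when several conditions hold at once (pulled out first, then bad track or fault, then latency) is an assumption, since the text does not specify one.

```python
from enum import Enum


class DiskState(Enum):
    NORMAL = "normal"
    FIRST_ABNORMAL = "first-abnormal"    # average IO latency above the threshold
    SECOND_ABNORMAL = "second-abnormal"  # bad track or fault present
    THIRD_ABNORMAL = "third-abnormal"    # disk pulled out


def classify_disk(avg_io_latency_ms: float, threshold_ms: float,
                  has_bad_track: bool, has_fault: bool,
                  pulled_out: bool) -> DiskState:
    """Map the four checks onto one state (the precedence order is an assumption)."""
    if pulled_out:
        return DiskState.THIRD_ABNORMAL
    if has_bad_track or has_fault:
        return DiskState.SECOND_ABNORMAL
    if avg_io_latency_ms > threshold_ms:
        return DiskState.FIRST_ABNORMAL
    return DiskState.NORMAL
```

A disk is reported as normal only when none of the three abnormal conditions holds, which mirrors the last clause of the list above.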

With reference to the first aspect, in an implementation manner, the step of changing a disk role in the custom resource of the working disk in an abnormal state to a second disk role and changing an actual state to an abnormal state includes:

When the abnormal state is a first abnormal state, changing the disk role in the custom resource of any work disk in the abnormal state into a second disk role and changing the actual state into the first abnormal state;

When the abnormal state is a second abnormal state, changing the disk role in the custom resource of any work disk in the abnormal state into a second disk role and changing the actual state into the second abnormal state;

When the abnormal state is a third abnormal state, changing the disk role in the custom resource of any work disk in the abnormal state into a second disk role and changing the actual state into the third abnormal state.

With reference to the first aspect, in one implementation manner, the disk attributes include a disk type and a disk capacity, and two disk attributes match when their disk types are the same and the difference between their disk capacities is smaller than a preset capacity.

With reference to the first aspect, in an implementation manner, before the step of changing a disk role in the custom resource of the working disk in an abnormal state to a second disk role and changing an actual state to an abnormal state, the method further includes:

Changing the actual state in the target custom resource into an initialized state, wherein when the actual state in the custom resource corresponding to the disk is the initialized state, the disk is in an unusable state;

after the step of changing the disk role in the target custom resource to the first disk role, the method further comprises:

And changing the actual state in the target custom resource into a normal state.

In a second aspect, an embodiment of the present application provides a disk management apparatus, including:

a creation module, configured to create a corresponding custom resource for each disk based on the definition of the custom resource, wherein the custom resource comprises a disk role, disk attributes, an expected state and an actual state, the expected state and the actual state are both the normal state, the disks comprise working disks and spare disks, the disk role in the custom resource corresponding to a working disk is a first disk role, and the disk role in the custom resource corresponding to a spare disk is a second disk role;

a detection module, configured to detect the running state of each disk to obtain a detection result of each disk;

the first updating module is used for updating the actual state in the custom resource of each spare disk based on the detection result of each spare disk;

The selection module is used for selecting a target custom resource from custom resources with a disk role being a second disk role and an actual state being a normal state aiming at any work disk in an abnormal state, wherein the disk attribute in the target custom resource is matched with the disk attribute in the custom resource of any work disk in the abnormal state;

the second updating module is used for changing the disk roles in the custom resources of any work disk in the abnormal state into a second disk role and changing the actual state into the abnormal state, and changing the disk roles in the target custom resources into a first disk role.

With reference to the second aspect, in one embodiment, the detection module is configured to:

With reference to the second aspect, in one embodiment, the disk attributes include a disk type and a disk capacity, and two disk attributes match when their disk types are the same and the difference between their disk capacities is smaller than a preset capacity.

In a third aspect, an embodiment of the present application provides a disk management device, where the disk management device includes a processor, a memory, and a disk management program stored on the memory and executable by the processor, where the disk management program, when executed by the processor, implements the steps of the disk management method according to the first aspect.

In a fourth aspect, an embodiment of the present application provides a computer readable storage medium having a disk management program stored thereon, where the disk management program, when executed by a processor, implements the steps of the disk management method according to the first aspect.

In the embodiment of the application, a corresponding custom resource is created for each disk based on the definition of the custom resource. The custom resource comprises a disk role, disk attributes, an expected state and an actual state, and the expected state and the actual state are both the normal state. The disks comprise working disks and spare disks; the disk role in the custom resource of a working disk is the first disk role, and the disk role in the custom resource of a spare disk is the second disk role. The running state of each disk is detected to obtain a detection result for each disk, and the actual state in the custom resource of each spare disk is updated based on that spare disk's detection result. For any working disk in an abnormal state, a target custom resource is selected from the custom resources whose disk role is the second disk role and whose actual state is the normal state, the disk attributes in the target custom resource matching those in the custom resource of the abnormal working disk. The disk role in the custom resource of the abnormal working disk is changed to the second disk role and its actual state is changed to the abnormal state, and the disk role in the target custom resource is changed to the first disk role. In this way, each disk is represented as a custom resource and operation and maintenance work is carried out on the basis of the custom resources, so that the influence of an abnormal disk on the system can be eliminated promptly at the software level without intervention by operation and maintenance personnel, ensuring stable operation of the system.

Drawings

FIG. 1 is a flow chart of an embodiment of a disk management method according to the present application;

FIG. 2 is a schematic diagram of functional modules of an embodiment of a disk management apparatus according to the present application;

FIG. 3 is a schematic diagram of the hardware structure of a disk management device according to an embodiment of the present application.

Detailed Description

In order that those skilled in the art may better understand the present application, the technical solutions in the embodiments of the present application are described clearly and completely below with reference to the accompanying drawings. The described embodiments are only some, not all, of the embodiments of the present application. All other embodiments obtained by those skilled in the art based on the embodiments of the application without inventive effort fall within the scope of protection of the application.

First, some technical terms in the present application are explained so as to facilitate understanding of the present application by those skilled in the art.

Kubernetes provides an extension mechanism that allows users to define their own resource types, called custom resources. A pod running on the cluster watches custom resources of a given type and manages other resources based on them; this is the Operator pattern.
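As a rough illustration of the Operator pattern, the sketch below compares the expected and actual state recorded in each observed resource and decides whether action is needed. It is plain Python with no Kubernetes client involved; the `reconcile` and `control_loop` functions, the dictionary layout, and the action strings are all hypothetical and stand in for whatever a real controller would do.

```python
def reconcile(resource: dict) -> str:
    """Decide what a controller would do for one observed custom resource."""
    if resource["actual_state"] == resource["expected_state"]:
        return "no-op"
    return "handle-abnormal"


def control_loop(observed: list) -> dict:
    """Run one pass over all observed resources, keyed by resource name."""
    return {res["name"]: reconcile(res) for res in observed}
```

For example, a resource whose actual state has drifted from `"normal"` would be flagged with `"handle-abnormal"`, while a resource whose two states agree is left alone.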

For the purpose of making the objects, technical solutions and advantages of the present application more apparent, the embodiments of the present application will be described in further detail with reference to the accompanying drawings.

In a first aspect, an embodiment of the present application provides a disk management method.

In an embodiment, referring to FIG. 1, FIG. 1 is a flowchart of a first embodiment of a disk management method according to the present application. As shown in FIG. 1, the disk management method includes:

step S10, creating corresponding custom resources for each disk based on the definition of the custom resources, wherein the custom resources comprise disk roles, disk attributes, expected states and actual states, the expected states and the actual states are normal states, the disks comprise working disks and standby disks, the disk roles in the custom resources corresponding to the working disks are first disk roles, and the disk roles in the custom resources corresponding to the standby disks are second disk roles;

In this embodiment, based on the Operator mechanism, each disk is modeled as a resource and registered in Kubernetes in the form of a custom resource. All of the disks in the storage cluster, or a selected portion of them, may be modeled as resources in this way.

The definition of the custom resource is formulated according to actual needs, and the core data is designed as follows:

disk role, disk attributes, expected state, and actual state.

On this basis, for each disk, a custom resource (CR) creation interface is called for registration, i.e., a corresponding custom resource is created. The disk role is filled in according to the role of the disk: if the disk is a working disk, the disk role is set to the first disk role (for example, working disk); if the disk is a spare disk, the disk role is set to the second disk role (for example, spare disk). The disk attributes are filled in according to the actual situation, and the expected state and the actual state are both set to the normal state.

Optionally, the disk attributes include the hard disk pool to which the disk belongs, a disk serial number, a disk type, and a disk capacity. For a working disk, the hard disk pool field is filled in with the hard disk pool that the disk has actually joined; for a spare disk, the hard disk pool field is filled in as spare disk.
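The core data of step S10 can be pictured as a small record type. The following is a minimal sketch, not the actual resource schema: the class names, field names, string labels, and the `register_disk` helper are hypothetical, and treating a disk with no joined pool as a spare is an assumption made only for illustration.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical labels; the text only names a first and a second disk role.
ROLE_WORKING = "working"  # first disk role
ROLE_SPARE = "spare"      # second disk role
STATE_NORMAL = "normal"


@dataclass
class DiskAttributes:
    pool: str           # hard disk pool the disk belongs to
    serial: str         # disk serial number
    disk_type: str      # for example "HDD" or "SSD"
    capacity_tb: float  # disk capacity in TB


@dataclass
class DiskResource:
    role: str
    attributes: DiskAttributes
    expected_state: str = STATE_NORMAL  # both states start as normal
    actual_state: str = STATE_NORMAL


def register_disk(serial: str, disk_type: str, capacity_tb: float,
                  pool: Optional[str]) -> DiskResource:
    """Build the record for one disk (assumption: no joined pool means spare)."""
    if pool is None:
        attrs = DiskAttributes("spare", serial, disk_type, capacity_tb)
        return DiskResource(ROLE_SPARE, attrs)
    attrs = DiskAttributes(pool, serial, disk_type, capacity_tb)
    return DiskResource(ROLE_WORKING, attrs)
```

In a real deployment the record would be submitted through the custom resource creation interface rather than returned as a Python object.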

Step S20, detecting the running state of each disk to obtain a detection result of each disk;

In this embodiment, running state detection may be performed on each disk in real time or once every preset interval. Here, the disks include both working disks and spare disks.

Step S30, updating the actual state in the custom resource of each spare disk based on the detection result of each spare disk;

In this embodiment, a detection result is either the normal state or an abnormal state. If the detection result of a spare disk is an abnormal state, the actual state in its custom resource is changed to the abnormal state; if the detection result of a spare disk is the normal state, the actual state in its custom resource is kept unchanged.
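Step S30 can be sketched as follows. The dictionary layout, the state strings, and the function name are hypothetical; treating a disk with no detection result as unchanged is also an assumption.

```python
def update_spare_states(resources: dict, detection_results: dict) -> None:
    """Write abnormal detection results into the records of spare disks only."""
    for serial, record in resources.items():
        if record["role"] != "spare":
            continue  # working disks are handled by the replacement steps
        result = detection_results.get(serial, "normal")
        if result != "normal":
            record["actual_state"] = result
        # A "normal" result leaves the recorded actual state unchanged.
```

Because only abnormal results are written, a spare disk that has been marked abnormal is not returned to the candidate set by a later normal reading, which follows the "kept unchanged" wording above.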

Step S40, selecting a target custom resource from custom resources with a disk role being a second disk role and an actual state being a normal state aiming at any work disk in an abnormal state, wherein the disk attribute in the target custom resource is matched with the disk attribute in the custom resource of any work disk in the abnormal state;

In this embodiment, take disk A as an example of a working disk in an abnormal state. If disk A is in an abnormal state, a spare disk is needed to replace it. Because each disk is modeled as a resource in the embodiment of the application, and operation and maintenance are carried out around the custom resources of the disks, a target custom resource needs to be found for disk A.

For example, if disk a is a spare disk, the actual state in custom resource a of disk a is the normal state, and disk attribute a in custom resource a matches disk attribute A in custom resource A of disk A, then custom resource a can be used as the target custom resource.
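The selection in step S40 can be sketched as a filter over all custom resources. The record layout and names below are hypothetical, and returning the first matching spare is an assumption; the text does not say how to choose among several candidates.

```python
from typing import Optional


def select_target(failed: dict, resources: list,
                  preset_capacity_tb: float = 0.1) -> Optional[dict]:
    """Return the first healthy spare whose attributes match the failed disk."""
    for candidate in resources:
        # Only spares (second disk role) in the normal state are eligible.
        if candidate["role"] != "spare" or candidate["actual_state"] != "normal":
            continue
        same_type = candidate["disk_type"] == failed["disk_type"]
        close_capacity = (abs(candidate["capacity_tb"] - failed["capacity_tb"])
                          < preset_capacity_tb)
        if same_type and close_capacity:
            return candidate
    return None  # no suitable spare is currently available
```

When no candidate qualifies, the function returns `None`; how that case is handled is outside what the text describes.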

Step S50, changing the disk roles in the custom resources of any working disk in an abnormal state into a second disk role and changing the actual state into an abnormal state;

And step S60, changing the disk role in the target custom resource into a first disk role.

In this embodiment, in order for disk a, which corresponds to target custom resource a, to take over the work of disk A, the disk role in custom resource A is changed to the second disk role (indicating that disk A is now a spare disk) and its actual state is changed to the abnormal state, and the disk role in target custom resource a is changed to the first disk role (indicating that disk a is now a working disk).
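Steps S50 and S60 amount to two field updates on the two custom resources. A minimal sketch, with a hypothetical record layout and hypothetical role and state labels:

```python
def swap_roles(failed: dict, target: dict, abnormal_state: str) -> None:
    """Demote the abnormal working disk and promote the selected spare."""
    # Step S50: the abnormal working disk becomes a spare marked as abnormal.
    failed["role"] = "spare"
    failed["actual_state"] = abnormal_state
    # Step S60: the selected spare becomes a working disk.
    target["role"] = "working"
```

After the swap, the demoted disk has the second disk role but an abnormal actual state, so it no longer satisfies the selection condition of step S40.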

Further, in an embodiment, step S20 includes:

In this embodiment, for each disk, IO link data on the disk is collected by the kernel's disk IO detection tool; if the average latency of the disk over a certain period is detected to be higher than the threshold, the disk is considered to be in a sub-healthy state and is determined to be in the first abnormal state. The state of the disk is obtained by a disk SMART status detection tool; if the disk is found to have a bad track or a fault, it is determined to be in the second abnormal state. Disk list information for the node is obtained by periodically scanning the disks on the node, and the list obtained this time is compared with the list obtained last time. If a disk has been pulled out, the new list differs from the previous one, and this difference shows which disk was pulled out; that disk is determined to be in the third abnormal state.
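Two of the checks described above reduce to simple computations once the raw data has been collected: averaging latency samples over the window, and diffing two consecutive disk lists. The sketch below assumes the samples and disk names are already available as Python lists; the function names are hypothetical, and how the data is gathered from the kernel or the node is not shown.

```python
from statistics import mean


def average_latency_exceeds(samples_ms: list, threshold_ms: float) -> bool:
    """True when the mean IO latency over the window is above the threshold."""
    return bool(samples_ms) and mean(samples_ms) > threshold_ms


def pulled_out_disks(previous_scan: list, current_scan: list) -> set:
    """Disks present in the previous scan but missing from the current one."""
    return set(previous_scan) - set(current_scan)
```

An empty sample window is treated here as "not exceeding", which is an assumption; a real detector might instead treat missing IO data as a signal in its own right.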

Further, in an embodiment, step S50 includes:

When the abnormal state is the first abnormal state, the disk role in the custom resource of the working disk in the abnormal state is changed to the second disk role and the actual state is changed to the first abnormal state. When the abnormal state is the second abnormal state, the disk role in the custom resource of the working disk in the abnormal state is changed to the second disk role and the actual state is changed to the second abnormal state. When the abnormal state is the third abnormal state, the disk role in the custom resource of the working disk in the abnormal state is changed to the second disk role and the actual state is changed to the third abnormal state.

In this embodiment, the abnormal states are classified into three types, and for any working disk in the abnormal state, the actual state is changed to the corresponding abnormal state according to the specific abnormal state. By the processing, the abnormal state labeling of the working disk in the abnormal state is more refined, and the processing of the abnormal state is facilitated.

Further, in an embodiment, the disk attributes include a disk type and a disk capacity, and two disk attributes match when their disk types are the same and the difference between their disk capacities is smaller than the preset capacity.

In this embodiment, the preset capacity is set based on actual needs, for example, 0.1TB.
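The matching rule is a two-part predicate. A minimal sketch, using the 0.1 TB example value as the default; the function and parameter names are hypothetical.

```python
def attributes_match(type_a: str, capacity_a_tb: float,
                     type_b: str, capacity_b_tb: float,
                     preset_capacity_tb: float = 0.1) -> bool:
    """Same disk type, and capacities differ by less than the preset capacity."""
    return (type_a == type_b
            and abs(capacity_a_tb - capacity_b_tb) < preset_capacity_tb)
```

With the default, a 4.0 TB HDD matches a 4.05 TB HDD but not a 4.2 TB HDD or a 4.0 TB SSD.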

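Further, in an embodiment, before step S50, the method further includes:

changing the actual state in the target custom resource into an initialized state, wherein when the actual state in the custom resource corresponding to a disk is the initialized state, the disk is in an unusable state;

after step S60, the method further includes:

changing the actual state in the target custom resource into the normal state.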

In this embodiment, after the target custom resource is determined, the actual state in the target custom resource is changed to the initialized state so that the corresponding disk cannot be used by other processes before the disk replacement operation is completed. On this basis, the actual state in the target custom resource is changed back to the normal state when the disk replacement operation ends. The action of changing the actual state in the target custom resource to the normal state may be performed synchronously with step S60 or performed before step S60.
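The guard around the replacement can be sketched as an ordered sequence of updates. The record layout, the `"initializing"` label standing in for the initialized state, and the returned history list are hypothetical; in this sketch the return to the normal state is placed right after the role change.

```python
def replace_with_guard(failed: dict, target: dict, abnormal_state: str) -> list:
    """Swap roles while the target is marked unusable; return its state history."""
    history = [target["actual_state"]]
    # Before S50: mark the target so that other processes do not pick it.
    target["actual_state"] = "initializing"
    history.append(target["actual_state"])
    # S50: demote the abnormal working disk.
    failed["role"] = "spare"
    failed["actual_state"] = abnormal_state
    # S60: promote the target, then release the guard.
    target["role"] = "working"
    target["actual_state"] = "normal"
    history.append(target["actual_state"])
    return history
```

While the target's actual state is not normal, a selection that filters on the normal state (as in step S40) would skip it, which is what keeps a second replacement from claiming the same spare.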

In a second aspect, an embodiment of the present application further provides a disk management apparatus.

In an embodiment, referring to FIG. 2, FIG. 2 is a schematic functional block diagram of a disk management apparatus according to an embodiment of the application. As shown in FIG. 2, the disk management apparatus includes:

The creating module 10 is configured to create a corresponding custom resource for each disk based on the definition of the custom resource, where the custom resource includes a disk role, a disk attribute, an expected state and an actual state, the expected state and the actual state are both normal states, the disk includes a working disk and a standby disk, the disk role in the custom resource corresponding to the working disk is a first disk role, and the disk role in the custom resource corresponding to the standby disk is a second disk role;

The detection module 20 is configured to detect an operation state of each disk, so as to obtain a detection result of each disk;

A first updating module 30, configured to update an actual state in the custom resource of each spare disk based on a detection result of each spare disk;

A selection module 40, configured to select, for any working disk in an abnormal state, a target custom resource from custom resources whose disk role is a second disk role and whose actual state is a normal state, where a disk attribute in the target custom resource matches a disk attribute in the custom resource of the any working disk in an abnormal state;

The second updating module 50 is configured to change a disk role in the custom resource of any working disk in an abnormal state to a second disk role and change an actual state to an abnormal state, and change a disk role in the target custom resource to a first disk role.

Further, in an embodiment, the detection module 20 is configured to:

Further, in an embodiment, the second updating module is configured to:

The function of each module in the disk management apparatus corresponds to a step in the disk management method embodiments above; the functions and their implementation are therefore not described again here.

In a third aspect, an embodiment of the present application provides a disk management device, which may be a device having a data processing function, such as a personal computer (PC), a notebook computer, or a server.

Referring to FIG. 3, FIG. 3 is a schematic diagram of the hardware structure of a disk management device according to an embodiment of the present application. In an embodiment of the present application, the disk management device may include a processor, a memory, a communication interface, and a communication bus.

The communication bus may be of any type and interconnects the processor, the memory, and the communication interface.

The communication interface includes input/output (I/O) interfaces, physical interfaces, logical interfaces, and the like for interconnecting components inside the disk management device, as well as interfaces for connecting the disk management device to other devices (e.g., other computing devices or user devices). The physical interface may be an Ethernet interface, an optical fiber interface, an ATM interface, etc., and the user device may be a display screen, a keyboard, etc.

The memory may be any of various types of storage media, such as random access memory (RAM), read-only memory (ROM), non-volatile RAM (NVRAM), flash memory, optical memory, hard disk, programmable ROM (PROM), erasable PROM (EPROM), electrically erasable PROM (EEPROM), and the like.

The processor may be a general-purpose processor, which may call the disk management program stored in the memory and execute the disk management method provided by the embodiments of the present application. For example, the general-purpose processor may be a central processing unit (CPU). For the method executed when the disk management program is called, reference may be made to the embodiments of the disk management method of the present application, which are not repeated here.

Those skilled in the art will appreciate that the hardware configuration shown in FIG. 3 does not limit the application; the disk management device may include more or fewer components than shown, combine certain components, or use a different arrangement of components.

In a fourth aspect, embodiments of the present application also provide a computer-readable storage medium.

The computer readable storage medium of the present application stores a disk management program, wherein the disk management program, when executed by a processor, implements the steps of the disk management method as described above.

The method implemented when the disk management program is executed may refer to various embodiments of the disk management method of the present application, which are not described herein.

It should be noted that the sequence numbers of the foregoing embodiments of the present application are for description only and do not represent the relative merits of the embodiments.

The terms "comprising" and "having" and any variations thereof in the description and claims of the application and in the foregoing drawings are intended to cover non-exclusive inclusions. For example, a process, method, system, article, or apparatus that comprises a list of steps or elements is not limited to only those listed steps or elements but may include other steps or elements not listed or inherent to such process, method, article, or apparatus. The terms "first," "second," and "third," etc. are used to distinguish between different objects rather than to describe a particular order, and the objects so labeled are not necessarily different from one another.

In describing embodiments of the present application, "exemplary," "such as," or "for example," etc., are used to indicate by way of example, illustration, or description. Any embodiment or design described herein as "exemplary," "such as" or "for example" is not necessarily to be construed as preferred or advantageous over other embodiments or designs. Rather, the use of words such as "exemplary," "such as" or "for example," etc., is intended to present related concepts in a concrete fashion.

In the description of the embodiments of the present application, "/" means "or"; for example, A/B may mean A or B. "And/or" merely describes an association between associated objects and indicates that three relationships may exist; for example, "A and/or B" covers three cases: A alone, both A and B, and B alone. Further, in the description of the embodiments of the present application, "a plurality" means two or more.

In some of the processes described in the embodiments of the present application, a plurality of operations or steps occurring in a particular order are included, but it should be understood that the operations or steps may be performed out of the order in which they occur in the embodiments of the present application or in parallel, the sequence numbers of the operations merely serve to distinguish between the various operations, and the sequence numbers themselves do not represent any order of execution. In addition, the processes may include more or fewer operations, and the operations or steps may be performed in sequence or in parallel, and the operations or steps may be combined.

From the above description of the embodiments, it will be clear to those skilled in the art that the methods of the above embodiments may be implemented by means of software plus a necessary general-purpose hardware platform, or by means of hardware, although in many cases the former is the preferred implementation. Based on this understanding, the technical solution of the present application may be embodied, in essence or in the part that contributes over the prior art, in the form of a software product stored in a storage medium (e.g., ROM/RAM, magnetic disk, optical disk) as described above, comprising several instructions for causing a terminal device to perform the method according to the embodiments of the present application.

The foregoing description is only of the preferred embodiments of the present application, and is not intended to limit the scope of the application, but rather is intended to cover any equivalents of the structures or equivalent processes disclosed herein or in the alternative, which may be employed directly or indirectly in other related arts.

Claims

1. A disk management method is characterized in that, the disk management method comprises the following steps:

2. The disk management method according to claim 1, wherein the step of performing running state detection on each disk to obtain the detection result of each disk comprises:

3. The disk management method as claimed in claim 2, wherein the step of changing the disk role in the custom resource of any one of the working disks in the abnormal state to the second disk role and changing the actual state to the abnormal state includes:

4. The disk management method according to claim 1, wherein the disk attributes include a disk type and a disk capacity, and two disk attributes match when their disk types are the same and the difference between their disk capacities is smaller than a preset capacity.

5. The disk management method as claimed in claim 1, further comprising, before the step of changing the disk role in the custom resource of any of the working disks in the abnormal state to the second disk role and changing the actual state to the abnormal state:

6. A disk management apparatus, characterized in that the disk management apparatus comprises:

7. The disk management apparatus of claim 6, wherein the detection module is configured to:

8. The disk management apparatus according to claim 6, wherein the disk attributes include a disk type and a disk capacity, and two disk attributes match when their disk types are the same and the difference between their disk capacities is smaller than a preset capacity.

9. A disk management device comprising a processor, a memory, and a disk management program stored on the memory and executable by the processor, wherein the disk management program, when executed by the processor, implements the steps of the disk management method according to any one of claims 1 to 5.

10. A computer readable storage medium, wherein a disk management program is stored on the computer readable storage medium, wherein the disk management program, when executed by a processor, implements the steps of the disk management method according to any one of claims 1 to 5.