
WO2020135889A1 - Method for dynamic loading of disk and cloud storage system - Google Patents


Info

Publication number
WO2020135889A1
Authority
WO
WIPO (PCT)
Prior art keywords
disk
storage
storage node
node
management
Application number
PCT/CN2019/130169
Other languages
French (fr)
Chinese (zh)
Inventor
黄华东
夏伟强
王伟
林起芊
Original Assignee
杭州海康威视系统技术有限公司
Priority date: 2018-12-28 (the priority date is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed)
Application filed by 杭州海康威视系统技术有限公司 (Hangzhou Hikvision System Technology Co., Ltd.)
Publication of WO2020135889A1

Classifications

    • G06F 3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F 3/0614 Improving the reliability of storage systems
    • G06F 3/0631 Configuration or reconfiguration of storage systems by allocating resources to storage systems
    • G06F 3/064 Management of blocks
    • G06F 3/067 Distributed or networked storage systems, e.g. storage area networks [SAN], network attached storage [NAS]
    • G06F 3/0689 Disk arrays, e.g. RAID, JBOD
    • G06F 11/10 Adding special bits or symbols to the coded information, e.g. parity check, casting out 9's or 11's
    • G06F 11/1004 Adding special bits or symbols to the coded information to protect a block of data words, e.g. CRC or checksum

Definitions

  • The present application relates to the field of data storage technology, and in particular to a method for dynamically loading a disk and a cloud storage system.
  • Cloud storage provides elastic storage space for storing massive video data. The storage space of cloud storage is maintained by a storage cluster, across which the data is generally scattered; that is, massive video data can be stored through the storage cluster.
  • Cloud storage can use replica mode or EC (Erasure Code) mode to ensure data integrity.
  • In a storage cluster, after a device fails, the data on the failed storage node must be recovered from replicas or EC data, a process called reconstruction.
  • When the storage cluster of the cloud storage is large, storage node failures become frequent. Some of these failures are software failures, such as a service failing to start or an operating-system abnormality; although the data on the failed storage node can be computed from replicas or EC data, doing so consumes the computing power of the storage cluster and increases the cluster's burden.
  • Embodiments of the present application provide a method for dynamically loading a disk and a cloud storage system, which can reduce the system resource consumption caused by data reconstruction. The technical solution is as follows:
  • In one aspect, a method for dynamically loading a disk is provided, applied to a cloud storage system. The cloud storage system includes a management node and multiple storage nodes, and the multiple storage nodes access the same SAS switch. The method includes:
  • when the management node detects that a first storage node among the multiple storage nodes has a software failure, sending a disk loading instruction to a second storage node among the multiple storage nodes;
  • after receiving the disk loading instruction, the second storage node loading the disk of the first storage node through the SAS switch;
  • the management node updating the locally stored storage node information corresponding to the disk.
  • The method may further include:
  • when the management node receives a read request for the data on the disk, sending the read request to the second storage node according to the updated locally stored storage node information corresponding to the disk;
  • the second storage node reading the data on the disk through the SAS switch according to the received read request.
  • Similarly, when the management node receives a write request to write data to the disk, the write request is sent to the second storage node according to the updated locally stored storage node information corresponding to the disk;
  • the second storage node writes the data to the disk through the SAS switch according to the received write request.
  • Loading the disk of the first storage node through the SAS switch includes:
  • the second storage node updating the index information of the disk in the first storage node into the database of the second storage node.
  • The management node updating the locally stored storage node information corresponding to the disk includes:
  • the management node updating the correspondence between the disk and the storage node information of the second storage node in a local database.
  • Before the management node updates the locally stored storage node information corresponding to the disk, the method may further include:
  • the management node receiving a message, sent by the second storage node, indicating that the disk was loaded successfully.
  • The cloud storage system includes a management node and multiple storage nodes; the multiple storage nodes access the same SAS switch and include a first storage node and a second storage node, where:
  • the management node is configured to send a disk loading instruction to the second storage node when a software failure is detected in the first storage node;
  • the second storage node is configured to load the disk of the first storage node through the SAS switch after receiving the disk loading instruction;
  • the management node is further configured to update the locally stored storage node information corresponding to the disk;
  • the management node is further configured to, upon receiving a read request for the data on the disk, send the read request to the second storage node according to the updated locally stored storage node information corresponding to the disk;
  • the second storage node is further configured to read the data on the disk through the SAS switch according to the received read request;
  • the management node is further configured to, upon receiving a write request to write data to the disk, send the write request to the second storage node according to the updated locally stored storage node information corresponding to the disk;
  • the second storage node is further configured to write the data to the disk through the SAS switch according to the received write request;
  • the second storage node is further configured to update the index information of the disk in the first storage node into the database of the second storage node;
  • the management node is further configured to update the correspondence between the disk and the storage node information of the second storage node in a local database;
  • the management node is further configured to receive a message, sent by the second storage node, indicating that the disk was loaded successfully.
  • A disk dynamic loading device is also provided, including:
  • one or more processors, and a storage apparatus storing one or more programs;
  • when the one or more programs are executed by the one or more processors, the one or more processors implement the disk dynamic loading method.
  • A computer-readable storage medium is also provided, on which a computer program is stored; when the computer program is executed by a processor, the disk dynamic loading method is implemented.
  • A computer program product containing instructions is also provided which, when run on a computer, causes the computer to implement the disk dynamic loading method described in the above aspect.
  • In the method for dynamically loading a disk of the present application, the storage nodes access the same SAS switch, so a storage node can access the disks of all storage nodes connected to that switch. When a storage node suffers a software failure, its disk can therefore be loaded by another storage node, realizing dynamic loading of the disk, reducing the performance loss of system reconstruction, and improving the availability of object storage disks.
  • FIG. 1 shows a first overall flowchart of a method for dynamically loading a disk according to an embodiment of the present application.
  • FIG. 2 shows a schematic structural diagram of a storage node accessing a SAS switch according to an embodiment of the present application.
  • FIG. 3 shows a second overall flowchart of a method for dynamically loading a disk according to an embodiment of the present application.
  • FIG. 4 shows a first schematic flowchart of disk drift by the MDS according to an embodiment of the present application.
  • FIG. 5 shows a second schematic flowchart of disk drift by the MDS according to an embodiment of the present application.
  • FIG. 6 shows a third schematic flowchart of disk drift by the MDS according to an embodiment of the present application.
  • Database: a collection of related, structured data stored in an organized way on a computer's storage device. A database contains various objects, including tables, views, fields, indexes, etc.
  • Video positioning: in this application, given a time entered by the user, the system can quickly locate the stored video data corresponding to that time according to the related information recorded in the database.
  • Byte: data is stored in units of bytes; every 8 bits (bit, abbreviated b) form one byte (Byte, abbreviated B), the smallest unit of information used here.
  • Video stream: the video data to be transmitted, which can be processed as a stable, continuous stream over the network.
  • Object storage: an object storage system is a massive, secure, highly reliable, and easily expandable cloud storage service provided to users. Instead of organizing files into a directory hierarchy, it stores them in a flat container organization and retrieves them by unique IDs. As a result, object storage systems need less metadata than file systems to store and access files, and they reduce the overhead of managing file metadata. The object storage system serves users through the platform-independent RESTful protocol and supports convenient storage and management of massive numbers of objects over the web. It can store arbitrary objects in a durable, highly available system; applications and users access the data in object storage through simple APIs (Application Programming Interfaces), usually based on the Representational State Transfer (REST) architecture, although programming-language-oriented interfaces also exist, as sketched below.
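For illustration only, a minimal sketch of such a REST-style object interface; the endpoint, port, and URL layout below are hypothetical placeholders, not part of this disclosure:

```python
# Sketch of PUT/GET object access over a REST-style API.
# The base URL and the /objects/<id> layout are assumptions.
import urllib.request

BASE = "http://storage.example.com:8080"  # hypothetical endpoint

def put_object(object_id: str, data: bytes) -> int:
    req = urllib.request.Request(f"{BASE}/objects/{object_id}",
                                 data=data, method="PUT")
    with urllib.request.urlopen(req) as resp:
        return resp.status  # e.g. 200 or 201 on success

def get_object(object_id: str) -> bytes:
    with urllib.request.urlopen(f"{BASE}/objects/{object_id}") as resp:
        return resp.read()
```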
  • OSD (Object-Based Storage Device): in this solution, it represents a storage node and is the module that reads and writes objects in the object storage system. The OSD stores data on the tracks and sectors of a disk, combines several tracks and sectors into an object, and provides external access to the data through that object.
  • MDS (Metadata Server): the management node in the object storage system; it stores the index information of objects, including the object name, the specific location where the object is stored, the object's last modification time, and so on.
  • Allocating resources: in this solution, the MDS allocates storage resources for writing an object, specifically allocating an OSD and a disk for the object.
  • File object: responsible for file access operations; after a file object is obtained, it can be used to read the data on the disk. A file object is uploaded to cloud storage by the user in one shot, completed in a single interaction using the PUT protocol.
  • Cluster technology: a cluster is a group of independent computers interconnected by a high-speed network; they form a group and are managed as a single system. When a client interacts with the cluster, the cluster behaves like a single independent server. Cluster configurations are used to improve availability and scalability.
  • Disk loading: cloud storage persists data to multiple disks, which are the media storing data in cloud storage; each disk usually contains multiple partitions. In the Linux operating system, loading a disk means mounting the disk of a device (usually a storage device) onto an existing directory. Specifically, to access a file on a disk of a storage device, the partition containing the file must be mounted onto an existing directory, and the file is then accessed through that directory. Only after a disk has been loaded by cloud storage can it be read and written; a sketch follows.
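A minimal sketch of this mount-based loading on Linux; the partition path and mount point are hypothetical, and a real system would derive both from its disk inventory:

```python
# Sketch: load (mount) and unload (umount) a disk partition on Linux.
# /dev/sdb1 and /data/disk01 are hypothetical placeholders.
import subprocess
from pathlib import Path

def load_disk(partition: str = "/dev/sdb1",
              mount_point: str = "/data/disk01") -> None:
    Path(mount_point).mkdir(parents=True, exist_ok=True)
    subprocess.run(["mount", partition, mount_point], check=True)

def unload_disk(mount_point: str = "/data/disk01") -> None:
    subprocess.run(["umount", mount_point], check=True)
```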
  • Disk drift: a disk drifting between OSDs means that read and write control of the disk is switched from one OSD to another OSD.
  • Reconstruction: the process of computing and recovering damaged data blocks from the valid data blocks and check blocks in EC data, illustrated below.
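For intuition only, a toy single-parity example of this kind of recovery; production EC uses Reed-Solomon codes over more blocks, and XOR here merely stands in for the general idea:

```python
# Toy reconstruction: with one XOR parity block, any single lost
# data block can be recomputed from the surviving blocks.
from functools import reduce

def xor_blocks(blocks):
    # XOR the blocks together column by column.
    return bytes(reduce(lambda a, b: a ^ b, col) for col in zip(*blocks))

data = [b"\x01\x02", b"\x10\x20", b"\x0a\x0b"]
parity = xor_blocks(data)

# Suppose data[1] is lost; recover it from the survivors plus parity.
recovered = xor_blocks([data[0], data[2], parity])
assert recovered == data[1]
```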
  • SAS (Serial Attached SCSI, where SCSI stands for Small Computer System Interface) switch: a switch that uses the SAS protocol for disk discovery and simulated network communication. After a storage node is connected to a SAS switch, it can discover and use the disks of all storage nodes connected to that switch.
  • FIG. 1 is a schematic flowchart of the method for dynamically loading a disk provided by this embodiment; each step is described in detail below.
  • The method can be applied to a cloud storage system, where the cloud storage system includes a management node and multiple storage nodes, and the multiple storage nodes access the same SAS switch.
  • In one embodiment, the cloud storage system may include multiple management nodes that form a management cluster. As shown in FIG. 2, the signaling ports of the management nodes MDS1, MDS2, MDS3, ..., MDSN in the management cluster are interconnected through an ordinary gigabit switch, and signaling is exchanged through this interconnection.
  • Similarly, the multiple storage nodes form a storage cluster: the signaling ports of the storage nodes OSD1, OSD2, OSD3, ..., OSDN in the storage cluster are interconnected through the ordinary gigabit switch for signaling exchange.
  • Meanwhile, the data ports of the storage nodes OSD1, OSD2, OSD3, ..., OSDN of the storage cluster are interconnected through the SAS switch, through which they exchange data with one another.
  • The following description takes a cloud storage system including one management node MDS as an example.
  • The signaling exchange between the management node MDS and the ordinary gigabit switch is bidirectional, so signaling can be transmitted in both directions between them; the same holds for the signaling exchange between each storage node OSD and the gigabit switch.
  • Likewise, the data exchange between each storage node OSD and the SAS switch is bidirectional, so data can be transferred in both directions between them.
  • The SAS switch uses the SAS protocol for disk discovery and simulated network communication.
  • Once a storage node is connected to the SAS switch, it can discover and use the disks of all storage nodes connected to that switch.
  • In other words, a storage node OSD can access the disks of the other storage nodes connected to the SAS switch, as sketched below.
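On Linux, disks exposed through the SAS fabric would simply appear as additional block devices on each attached node; a hedged sketch of enumerating them (device naming varies by system):

```python
# Sketch: list block devices visible to this node, including disks of
# peer nodes exposed through the SAS switch. /dev/disk/by-path is a
# standard Linux location; exact entries depend on the hardware.
from pathlib import Path

def visible_disks():
    by_path = Path("/dev/disk/by-path")
    if not by_path.exists():
        return []
    return sorted(p.name for p in by_path.iterdir())

for name in visible_disks():
    print(name)
```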
  • In one embodiment, the method for dynamically loading a disk may include the following steps:
  • S1: when the management node detects that the first storage node has a software failure, it sends a disk loading instruction to the second storage node.
  • When a storage node has a software-level failure, such as a failed service start or an operating-system abnormality, that storage node is referred to as the faulty storage node, also called the first storage node.
  • Because the faulty storage node can no longer report its heartbeat, the management node MDS considers it offline. The MDS then requests another storage node connected to the same SAS switch to try to load the faulty node's disk.
  • That other storage node is referred to as the second storage node; that is, the management node sends a disk loading instruction to the second storage node to instruct it to load the disk of the first storage node. A sketch of this trigger follows.
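A minimal sketch of this detection-and-dispatch step in the management node; the heartbeat bookkeeping, the timeout value, and the send_load_instruction callback are all illustrative assumptions, not a real API:

```python
# Sketch of the MDS failover trigger: when a storage node misses its
# heartbeat deadline, ask a healthy peer on the same SAS switch to
# load the failed node's disks.
import time

HEARTBEAT_TIMEOUT = 15.0  # seconds; illustrative value

def check_and_drift(nodes, last_heartbeat, send_load_instruction):
    now = time.time()
    failed = [n for n in nodes if now - last_heartbeat[n] > HEARTBEAT_TIMEOUT]
    healthy = [n for n in nodes if n not in failed]
    for first_node in failed:
        if healthy:
            # Simplest choice; a real MDS might balance load instead.
            second_node = healthy[0]
            send_load_instruction(second_node, first_node)
```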
  • S2: after receiving the disk loading instruction, the second storage node loads the disk of the first storage node through the SAS switch.
  • Once the second storage node has loaded it successfully, the data on the disk of the first storage node can be read normally by the second storage node, and data can also be written to that disk by the second storage node, thereby avoiding the data recovery process.
  • Specifically, the second storage node updates the index information of the disk in the first storage node into the database of the second storage node.
  • The index information of the disk in the first storage node can be sent to the second storage node through the SAS switch, and the second storage node copies it into the database of the local node.
  • The purpose of this update is to later use the disk index information to read the data on the disk of the first storage node that had the software failure; a sketch follows.
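A hedged sketch of merging the drifted disk's index records into the second node's local database; SQLite and the table layout stand in for whatever store and schema the OSD actually uses:

```python
# Sketch: merge the drifted disk's object index into the loading
# node's local database. SQLite and the schema are illustrative only.
import sqlite3

def merge_disk_index(local_db: str, disk_index_rows):
    # disk_index_rows: iterable of (object_name, disk_id, offset, length)
    conn = sqlite3.connect(local_db)
    conn.execute("""CREATE TABLE IF NOT EXISTS object_index (
                        object_name TEXT PRIMARY KEY,
                        disk_id     TEXT,
                        offset      INTEGER,
                        length      INTEGER)""")
    conn.executemany(
        "INSERT OR REPLACE INTO object_index VALUES (?, ?, ?, ?)",
        disk_index_rows)
    conn.commit()
    conn.close()
```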
  • In this way, the management node MDS can dynamically re-assign the disks of a failed storage node to other storage nodes for loading, reading, and writing, according to the status of the storage nodes. While the MDS observes no abnormality in a storage node, the data on its disks is read and written normally; when the MDS finds a storage node abnormal, it requests another storage node on the same switch to load the failed node's disk, and the data on that disk is then read and written normally through the other node, achieving disk drift.
  • In other words, the MDS implements disk drift according to the status of the storage nodes: when a storage node has a software failure, the read and write permissions of its disk drift from the failed node to a normal node in the storage cluster. After the disk drifts, all read and write requests for it are executed through the normal node.
  • The normal storage node uses the drifted disk like a local disk: it accesses the disk in the failed storage node through the SAS switch, and the disk can be loaded normally. In this way, the data on the failed node's disk can still be read and written normally without recovery through replica mode or EC mode.
  • S3: the management node updates the locally stored storage node information corresponding to the disk.
  • Here the disk is the disk in the first storage node, i.e., the disk that drifted to the second storage node.
  • The management node updates the correspondence between the disk and the storage node information of the second storage node in its local database.
  • The storage node information uniquely identifies a storage node.
  • Before this update, the method may further include: the management node receiving a message from the second storage node indicating that the disk was loaded successfully. That is, the management node updates the correspondence between the disk and the second storage node in its local database only after determining that the second storage node has successfully loaded the disk of the first storage node.
  • After the second storage node successfully loads the disk of the first storage node, it sends a corresponding load-success message to the management node.
  • Upon receiving that message, the management node records the disk together with the storage node information of the second storage node in its local database, so that if the first storage node fails again and the disk needs to be loaded again, the management node need not search for a new storage node and can directly assign the second storage node to load the disk; a sketch of this record follows.
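A sketch of the ownership record the MDS might keep so that a repeated failure can reuse the prior assignment; the structure and names are illustrative assumptions:

```python
# Sketch: MDS-side record of which node currently serves each disk.
# Updated only after the "disk loaded successfully" message arrives.
disk_owner = {}  # disk_id -> storage node id

def on_load_success(disk_id: str, node_id: str) -> None:
    disk_owner[disk_id] = node_id

def node_for_disk(disk_id: str):
    # Returns the current owner, or None if the disk is unknown.
    return disk_owner.get(disk_id)
```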
  • In this way, the disk in a faulty storage node with a software-level abnormality can be loaded, read, and written by another storage node. Reading and writing the data no longer requires reconstruction to recover it, avoiding unnecessary computation; moreover, after a storage node becomes abnormal, reading and writing data in the cloud storage system as a whole suffers little performance impact.
  • In addition, the management node MDS may request the second storage node to unload a loaded disk. For example, after the failed storage node returns to normal, the MDS may first request the second storage node to unload the failed node's disk and then request the recovered node to load it, so that the local disk of the previously failed node is taken over by that node itself again, dispersing the disk-serving pressure among the storage nodes in the system. A sketch of this giveback sequence follows.
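A minimal sketch of the giveback sequence, under the same illustrative callback and mapping assumptions as above:

```python
# Sketch: once the failed node recovers, unload the disk on the
# stand-in node first, then reload it on its original owner.
def give_back(disk_id, recovered_node, stand_in_node,
              send_unload, send_load):
    send_unload(stand_in_node, disk_id)   # second node releases the disk
    send_load(recovered_node, disk_id)    # original node takes it back
    disk_owner[disk_id] = recovered_node  # keep the MDS record current
```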
  • In summary, this application realizes dynamic loading of disks by letting disks drift between storage nodes inside the object storage.
  • Through the SAS switch, other storage nodes can continue to access the data on the disks of failed storage nodes, improving disk availability.
  • In one embodiment, the disk dynamic loading method may further include:
  • when the management node receives a read request for the data on the disk, it sends the read request to the second storage node according to the updated locally stored storage node information corresponding to the disk;
  • the second storage node reads the data on the disk through the SAS switch according to the received read request.
  • Similarly, when the management node receives a write request to write data to the disk, it sends the write request to the second storage node according to the updated locally stored storage node information corresponding to the disk;
  • the second storage node writes the data to the disk through the SAS switch according to the received write request.
  • Thus, the disks of the failed storage node can be read and written normally after being loaded by another storage node; once the failed node's disk has been successfully loaded by a normal storage node, subsequent disk reads and writes are performed through the normal node that loaded it.
  • The SAS switch allows a storage node to access the disks of other storage nodes on the same switch just as if it were accessing a local disk. A sketch of this routing follows.
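A minimal sketch of the routing decision, reusing the illustrative disk_owner mapping from above; the request shape and the send_to_node callback are assumptions:

```python
# Sketch: the MDS forwards a read or write request to whichever node
# currently owns the target disk, per the updated mapping.
def route_request(request, disk_owner, send_to_node):
    node = disk_owner[request["disk_id"]]  # the second node after drift
    return send_to_node(node, request)     # served via the SAS switch
```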
  • An embodiment of the present application further provides a cloud storage system. The cloud storage system includes a management node and multiple storage nodes; the multiple storage nodes access the same SAS switch and include a first storage node and a second storage node, where:
  • the management node is configured to send a disk loading instruction to the second storage node when a software failure is detected in the first storage node;
  • the second storage node is configured to load the disk of the first storage node through the SAS switch after receiving the disk loading instruction;
  • the management node is further configured to update the locally stored storage node information corresponding to the disk;
  • the management node is further configured to, upon receiving a read request for the data on the disk, send the read request to the second storage node according to the updated locally stored storage node information corresponding to the disk;
  • the second storage node is further configured to read the data on the disk through the SAS switch according to the received read request;
  • the management node is further configured to, upon receiving a write request to write data to the disk, send the write request to the second storage node according to the updated locally stored storage node information corresponding to the disk;
  • the second storage node is further configured to write the data to the disk through the SAS switch according to the received write request;
  • the second storage node is further configured to update the index information of the disk in the first storage node into the database of the second storage node;
  • the management node is further configured to update the correspondence between the disk and the storage node information of the second storage node in a local database;
  • the management node is further configured to receive a message, sent by the second storage node, indicating that the disk was loaded successfully.
  • As shown in FIG. 4, the steps by which the management node MDS implements disk drift may include:
  • The storage node OSD1 becomes abnormal.
  • The software level of the storage node OSD1 fails, for example through a service startup failure or an operating-system abnormality.
  • The disk and the data on the disk remain intact, and the disk can still be accessed.
  • The management node MDS requests the storage node OSD2 to load the disk of the storage node OSD1.
  • After the software failure occurs on the storage node OSD1, OSD1 can no longer report its heartbeat to the management node MDS.
  • The management node MDS therefore considers OSD1 offline and requests the other storage node OSD2 to try to load OSD1's disk. After OSD2 loads it successfully, the disk data of OSD1 can be read normally by OSD2, and OSD2 can also write data to the disk, thereby avoiding the data recovery process.
  • In this way, the management node MDS can dynamically re-assign the disk to the other storage node OSD2 for loading, reading, and writing. While the MDS observes no abnormality, the disk data is read and written normally; when the MDS finds the storage node OSD1 abnormal, it requests the storage node OSD2 on the same switch to load OSD1's disk, and OSD1's disk data is then read and written normally through OSD2, realizing disk drift.
  • The storage node OSD2 successfully loads the disk of OSD1.
  • The management node MDS implements disk drift according to the state of the storage nodes: the read and write permissions of the disk drift from the faulty storage node OSD1 to the normal storage node OSD2 in the storage cluster.
  • The storage node OSD2 then uses the drifted disk like a local disk: OSD2 can normally access the disk in OSD1 through the SAS switch, and the disk in OSD1 can be loaded normally.
  • The disk of the faulty storage node OSD1 can thus be read and written normally after being loaded by the other storage node OSD2; subsequent reads and writes of the disk data are performed by OSD2, which loaded the disk.
  • The SAS switch allows the storage node OSD2 to access the disk of the other storage node OSD1 on the same switch just as if it were a local disk.
  • The disk can therefore be loaded, read, and written by OSD2 without reconstruction to recover the data, avoiding unnecessary computation; moreover, after OSD1 becomes abnormal, reading and writing data in the cloud storage system as a whole suffers little performance impact.
  • The MDS requests the other storage node to unload the loaded disk.
  • After the failed storage node recovers, the management node MDS may first request the other storage node OSD2 to unload the disk of the previously failed storage node OSD1 and then request OSD1 to load the disk.
  • In this way, the local disk of the storage node OSD1 is taken over by OSD1 itself again, dispersing the disk-serving pressure among the storage nodes OSD in the system.
  • As shown in FIG. 5, when multiple storage nodes fail, the management node MDS requests another storage node on the same SAS switch to load the disks of the failed storage nodes. The steps of disk drift by the MDS are as follows:
  • The storage nodes OSD1 and OSD3 become abnormal.
  • The management node MDS requests the storage node OSD2 to load the disks of the storage nodes OSD1 and OSD3.
  • Because the storage nodes OSD1 and OSD3 can no longer report their heartbeats to the management node MDS, the MDS considers them offline and requests the other storage node OSD2 to try to load the disks in OSD1 and OSD3. After OSD2 loads them successfully, the data on the failed nodes' disks can be read normally through OSD2, and OSD2 can also write data to those disks, thereby avoiding the data recovery process.
  • In this way, the management node MDS can dynamically re-assign the disks to another storage node for loading, reading, and writing. While the MDS observes no abnormality, the disk data is read and written normally; when the MDS finds the storage nodes OSD1 and OSD3 abnormal, it requests the storage node OSD2 on the same switch to load the disks of OSD1 and OSD3, and the data on those disks is then read and written normally through OSD2, realizing disk drift.
  • The storage node OSD2 successfully loads the disks in the storage nodes OSD1 and OSD3.
  • The management node MDS implements disk drift according to the state of the storage nodes: the read and write permissions of the disks drift from the failed storage nodes OSD1 and OSD3 to the normal storage node OSD2 in the storage cluster.
  • The storage node OSD2 then uses the drifted disks like local disks: OSD2 can normally access the disks in OSD1 and OSD3 through the SAS switch, and those disks can be loaded normally.
  • As shown in FIG. 6, the management node MDS may also request multiple other storage nodes on the switch to load the disks of the failed storage nodes. The steps of disk drift by the MDS are then as follows:
  • The storage nodes OSD1 and OSD3 become abnormal.
  • The management node MDS requests the storage nodes OSD2 and OSD4 to load the disks of the storage nodes OSD1 and OSD3.
  • Because the storage nodes OSD1 and OSD3 can no longer report their heartbeats to the management node MDS, the MDS considers them offline and requests the other storage nodes OSD2 and OSD4 to try to load the disks of OSD1 and OSD3. After OSD2 and OSD4 load them successfully, the data on the failed nodes' disks can be read normally through OSD2 and OSD4, which can also write data to those disks, thereby avoiding the data recovery process.
  • In this way, the management node MDS can dynamically re-assign the disks to other storage nodes for loading, reading, and writing. While the MDS observes no abnormality, the disk data is read and written normally; when the MDS finds the storage nodes OSD1 and OSD3 abnormal, it requests the storage nodes OSD2 and OSD4 on the same switch to load the disks of OSD1 and OSD3, which are then read and written normally through OSD2 and OSD4. For example, OSD2 may read and write the data on OSD1's disk while OSD4 reads and writes the data on OSD3's disk, thereby realizing disk drift.
  • The storage nodes OSD2 and OSD4 successfully load the disks of the storage nodes OSD1 and OSD3.
  • The management node MDS implements disk drift according to the state of the storage nodes: the read and write permissions of the disks drift from the failed storage nodes OSD1 and OSD3 to the normal storage nodes OSD2 and OSD4 in the storage cluster.
  • The storage nodes OSD2 and OSD4 can normally access the disks in OSD1 and OSD3 through the SAS switch, and those disks can be loaded normally; a sketch of one way to spread such assignments follows.
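A hedged sketch of distributing the failed nodes' disks across several healthy nodes in round-robin order, as in the OSD1/OSD3 to OSD2/OSD4 example; the naming is illustrative:

```python
# Sketch: assign the disks of failed nodes across healthy nodes in
# round-robin order, matching the OSD1/OSD3 -> OSD2/OSD4 example.
from itertools import cycle

def assign_drifts(failed_disks, healthy_nodes):
    return {disk: node
            for disk, node in zip(failed_disks, cycle(healthy_nodes))}

# e.g. assign_drifts(["osd1_disk", "osd3_disk"], ["OSD2", "OSD4"])
#      -> {"osd1_disk": "OSD2", "osd3_disk": "OSD4"}
```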
  • An embodiment of the present application further provides a disk dynamic loading device, which may be the aforementioned management node or one of the aforementioned storage nodes, and which may include:
  • one or more processors, and a storage apparatus storing one or more programs;
  • when the one or more programs are executed by the one or more processors, the one or more processors implement the disk dynamic loading method.
  • An embodiment further provides a computer-readable storage medium on which a computer program is stored; when the computer program is executed by a processor, the disk dynamic loading method is implemented.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Human Computer Interaction (AREA)
  • Quality & Reliability (AREA)
  • Computer Security & Cryptography (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

A method for dynamic loading of a disk and a cloud storage system. The method is applied to a cloud storage system. The cloud storage system comprises a management node and multiple storage nodes, and the multiple storage nodes access the same SAS switch. The method comprises: when detecting that a software failure occurs in a first storage node, the management node sends a disk loading instruction to a second storage node (S1); the second storage node loads the disk of the first storage node by means of the SAS switch after receiving the disk loading instruction (S2); the management node updates locally stored storage node information corresponding to the disk (S3). According to the method, the use of the SAS switch enables the storage nodes to access the disks of all the storage nodes on the switch, thereby enabling the disk of a failed storage node to be loaded by means of other storage nodes, implementing the dynamic loading of the disk, reducing the performance loss of system reconstruction, and improving the usability of object storage disks.

Description

Method for dynamic loading of a disk and cloud storage system

This application claims priority to Chinese patent application No. 201811625675.X, filed on December 28, 2018 and titled "Method for dynamic loading of a disk and cloud storage system", the entire content of which is incorporated in this application by reference.
Technical field

The present application relates to the field of data storage technology, and in particular to a method for dynamically loading a disk and a cloud storage system.
Background

With the development of society, safety has increasingly become a focus of attention. Projects such as Safe City provide a degree of assurance for safe living, and security surveillance plays a vital role in such projects. Security surveillance produces massive amounts of video data, and cloud storage provides elastic storage space for storing it. The storage space of cloud storage is maintained by a storage cluster, across which the data is generally scattered; that is, massive video data can be stored through the storage cluster.

Cloud storage can use replica mode or EC (Erasure Code) mode to ensure data integrity. In a storage cluster, after a device fails, the data on the failed storage node must be recovered from replicas or EC data, a process called reconstruction. When the storage cluster of the cloud storage is large, storage node failures become frequent. In particular, some of these failures are software failures, such as a service failing to start or an operating-system abnormality; although the data on the failed storage node can be computed from replicas or EC data, doing so consumes the computing power of the storage cluster and increases the cluster's burden.
Summary of the invention

Embodiments of the present application provide a method for dynamically loading a disk and a cloud storage system, which can reduce the system resource consumption caused by data reconstruction. The technical solution is as follows:
In one aspect, a method for dynamically loading a disk is provided, applied to a cloud storage system; the cloud storage system includes a management node and multiple storage nodes, and the multiple storage nodes access the same SAS switch. The method includes:

when the management node detects that a first storage node among the multiple storage nodes has a software failure, sending a disk loading instruction to a second storage node among the multiple storage nodes;

after receiving the disk loading instruction, the second storage node loading the disk of the first storage node through the SAS switch.

In a possible implementation of the present application, the management node updates the locally stored storage node information corresponding to the disk.

In a possible implementation of the present application, the method further includes:

when the management node receives a read request for the data on the disk, sending the read request to the second storage node according to the updated locally stored storage node information corresponding to the disk;

the second storage node reading the data on the disk through the SAS switch according to the received read request.

In a possible implementation of the present application, when the management node receives a write request to write data to the disk, the write request is sent to the second storage node according to the updated locally stored storage node information corresponding to the disk;

the second storage node writes the data to the disk through the SAS switch according to the received write request.

In a possible implementation of the present application, loading the disk of the first storage node through the SAS switch includes:

the second storage node updating the index information of the disk in the first storage node into the database of the second storage node.

In a possible implementation of the present application, the management node updating the locally stored storage node information corresponding to the disk includes:

the management node updating the correspondence between the disk and the storage node information of the second storage node in a local database.

In a possible implementation of the present application, before the management node updates the locally stored storage node information corresponding to the disk, the method further includes:

the management node receiving a message, sent by the second storage node, indicating that the disk was loaded successfully.
In another aspect, a cloud storage system is provided. The cloud storage system includes a management node and multiple storage nodes; the multiple storage nodes access the same SAS switch and include a first storage node and a second storage node, where:

the management node is configured to send a disk loading instruction to the second storage node when a software failure is detected in the first storage node;

the second storage node is configured to load the disk of the first storage node through the SAS switch after receiving the disk loading instruction.

In a possible implementation of the present application, the management node is further configured to update the locally stored storage node information corresponding to the disk.

In a possible implementation of the present application, the management node is further configured to, upon receiving a read request for the data on the disk, send the read request to the second storage node according to the updated locally stored storage node information corresponding to the disk;

the second storage node is further configured to read the data on the disk through the SAS switch according to the received read request.

In a possible implementation of the present application, the management node is further configured to, upon receiving a write request to write data to the disk, send the write request to the second storage node according to the updated locally stored storage node information corresponding to the disk;

the second storage node is further configured to write the data to the disk through the SAS switch according to the received write request.

In a possible implementation of the present application, the second storage node is further configured to update the index information of the disk in the first storage node into the database of the second storage node.

In a possible implementation of the present application, the management node is further configured to update the correspondence between the disk and the storage node information of the second storage node in a local database.

In a possible implementation of the present application, the management node is further configured to receive a message, sent by the second storage node, indicating that the disk was loaded successfully.
In another aspect, a disk dynamic loading device is provided, including:

one or more processors, and a storage apparatus storing one or more programs;

when the one or more programs are executed by the one or more processors, the one or more processors implement the disk dynamic loading method.

In another aspect, a computer-readable storage medium is provided, on which a computer program is stored; when the computer program is executed by a processor, the disk dynamic loading method is implemented.

In another aspect, a computer program product containing instructions is provided which, when run on a computer, causes the computer to implement the disk dynamic loading method described in the above aspect.

In the method for dynamically loading a disk of the present application, the storage nodes access the same SAS switch, so a storage node can access the disks of all storage nodes connected to that switch. When a storage node suffers a software failure, its disk can therefore be loaded by another storage node, realizing dynamic loading of the disk, reducing the performance loss of system reconstruction, and improving the availability of object storage disks.
Brief description of the drawings

By reading the following detailed description of optional embodiments, various other advantages and benefits will become clear to those of ordinary skill in the art. The drawings are only for the purpose of illustrating optional embodiments and are not to be considered limiting of the present application. Throughout the drawings, the same reference symbols denote the same components. In the drawings:

FIG. 1 shows a first overall flowchart of a method for dynamically loading a disk according to an embodiment of the present application.

FIG. 2 shows a schematic structural diagram of storage nodes accessing a SAS switch according to an embodiment of the present application.

FIG. 3 shows a second overall flowchart of a method for dynamically loading a disk according to an embodiment of the present application.

FIG. 4 shows a first schematic flowchart of disk drift by the MDS according to an embodiment of the present application.

FIG. 5 shows a second schematic flowchart of disk drift by the MDS according to an embodiment of the present application.

FIG. 6 shows a third schematic flowchart of disk drift by the MDS according to an embodiment of the present application.
Detailed description

Exemplary embodiments of the present disclosure are described in more detail below with reference to the accompanying drawings. Although the drawings show exemplary embodiments of the present disclosure, it should be understood that the present disclosure can be implemented in various forms and should not be limited by the embodiments set forth here; rather, these embodiments are provided so that the present disclosure will be understood more thoroughly and so that its scope is fully conveyed to those skilled in the art.
本申请中,以下术语定义如下:In this application, the following terms are defined as follows:
数据库:数据库(Data Base,DB)是指在计算机的存储设备上合理存放的、相关联的、有结构的数据集合。一个数据库含有各种内容,包括表、视图、字段、索引等。Database: Database (Data, Base, DB) refers to a collection of related, structured data that is reasonably stored on a computer's storage device. A database contains various contents, including tables, views, fields, indexes, etc.
录像定位:本申请指的是根据用户输入的时间,系统可以根据数据库中记录的相关信息,快速找到此时间所对应存储的录像数据。Video positioning: This application refers to the time entered by the user. The system can quickly find the stored video data corresponding to this time according to the relevant information recorded in the database.
Byte:数据存储是以“字节”(Byte)为单位,每8个位(bit,简写为b)组成一个字节(Byte,简写为B),是最小一级的信息单位。Byte: Data storage is in the unit of "Byte" (Byte), and every 8 bits (bit, abbreviated as b) form a byte (Byte, abbreviated as B), which is the smallest level of information unit.
视频流:包括待传输的视频数据,它能够被作为一个稳定的和连续的流通过网络进行处理。Video stream: includes the video data to be transmitted, which can be processed as a stable and continuous stream through the network.
对象存储:对象存储系统是为用户提供的海量、安全、高可靠和易扩展的云存储服务。它并非将文件组织成一个目录层次结构,而是在一个扁平化的容器组织中存储文件,并使用唯一的ID来检索它们。其结果是对象存储系统相比文件系统需要更少的元数据来存储和访问文件,并且它们还减少了因存储元数据而产生的管理文件元数据的开销。对象存储系统通过与平台无关的RESTFUL协议为用户提供服务,支持通过web方便的存储和管理海量对象。对象存储系统可以在一个持久稳固且高度可用的系统中存储任意的对象,应用和用户可以在对象存储中使用简单的API(Application Programming Interface,应用程序接口)访问数据;这些通常都基于表属性状态转移(REST)架构,但是也有面向编程语言的界面。Object storage: The object storage system is a massive, safe, highly reliable and easily expandable cloud storage service provided to users. Instead of organizing files into a directory hierarchy, it stores files in a flat container organization and uses unique IDs to retrieve them. The result is that object storage systems require less metadata to store and access files than file systems, and they also reduce the overhead of managing file metadata due to storing metadata. The object storage system provides services for users through the platform-independent RESTFUL protocol and supports convenient storage and management of massive objects through the web. The object storage system can store arbitrary objects in a durable and highly available system. Applications and users can use simple APIs (Application Programming Interface) to access data in the object storage; these are usually based on the state of table attributes Transfer (REST) architecture, but there are also interfaces for programming languages.
OSD(Object-Based Storage Device,对象存储设备):本方案代表存储节点,是对象存储系统中的读写对象的模块。OSD将数据存放到磁盘的磁道和扇区,将若干磁道和扇区组合起来构成对象,并且通过此对象向外界提供对数据的访问。OSD (Object-Based Storage Device): This solution represents a storage node and is a module for reading and writing objects in an object storage system. The OSD stores data to the tracks and sectors of the disk, and combines several tracks and sectors to form an object, and provides access to the data to the outside world through this object.
MDS(Metadata Server,元数据服务器):是对象存储系统中的管理节点,保存了对象的索引信息,包括对象名字、对象保存的具体位置信息、对象的最后修改时间等。MDS (Metadata Server): It is the management node in the object storage system, which stores the index information of the object, including the name of the object, the specific location information of the object, and the last modification time of the object.
分配资源:本方案指MDS为对象的写入分配存储资源,具体的指分配OSD以及对象的磁盘。Allocation of resources: This solution refers to MDS allocating storage resources for object writing, and specifically refers to allocating OSD and object disks.
文件对象:负责文件的访问操作,在得到文件对象后,可以使用该文件对象来读取磁盘内的数据。文件对象是用户一次性上传到云存储的,是使用PUT协议一次交互完成上传的。File object: Responsible for file access operations. After obtaining the file object, you can use the file object to read the data on the disk. The file object is uploaded to the cloud storage by the user at a time, and the upload is completed in one interaction using the PUT protocol.
集群技术:集群是一组相互独立的、通过高速网络互联的计算机,它们构成了一个组,并以单一系统的模式加以管理。一个客户与集群相互作用时,集群像是一个独立的服务器。集群配置是用于提高可用性和可缩放性。Cluster technology: A cluster is a group of independent computers interconnected by a high-speed network. They form a group and are managed as a single system. When a client interacts with the cluster, the cluster acts as an independent server. The cluster configuration is used to improve availability and scalability.
磁盘加载:云存储将数据持久化到多个磁盘中,该多个磁盘是云存储中存储数据的介质,每个磁盘通常包括多个分区。在linux操作系统中,磁盘加载是指将一个设备(通常是存储设备)的磁盘挂接到一个已存在的目录上,具体来说,如果要访问存储设备的某个磁盘中的文件,必须将文件所在的分区挂载到 一个已存在的目录上,然后通过访问这个目录来访问该文件。磁盘只有被云存储加载起来后,才能进行读写操作。Disk loading: Cloud storage persists data to multiple disks, which are media for storing data in cloud storage. Each disk usually includes multiple partitions. In the Linux operating system, disk loading refers to mounting the disk of a device (usually a storage device) to an existing directory. Specifically, if you want to access files in a disk of a storage device, you must Mount the partition where the file is located on an existing directory, and then access the file by accessing this directory. The disk can only be read and written after it is loaded by cloud storage.
Disk drift: a disk drifting between OSDs means that read and write control of the disk is switched from one OSD to another OSD.
Reconstruction: the process of computing and recovering a damaged data block from the valid data blocks and parity blocks of EC-coded data.
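As a hedged illustration, the sketch below reconstructs a lost block in the simplest erasure code, a single XOR parity stripe; production systems typically use Reed-Solomon codes, which this example does not implement.

```python
from functools import reduce

def xor_blocks(blocks):
    """Byte-wise XOR of equal-length blocks."""
    return reduce(lambda a, b: bytes(x ^ y for x, y in zip(a, b)), blocks)

# A stripe of three data blocks plus one XOR parity block.
data_blocks = [b"\x01\x02", b"\x10\x20", b"\x0a\x0b"]
parity = xor_blocks(data_blocks)

# Suppose data_blocks[1] is lost: recover it from the surviving data
# blocks and the parity block.
recovered = xor_blocks([data_blocks[0], data_blocks[2], parity])
assert recovered == data_blocks[1]
```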
SAS (Serial Attached SCSI, where SCSI is the Small Computer System Interface) switch: a switch that uses the SAS protocol for disk discovery and simulated network communication. After a storage node is connected to a SAS switch, it can discover and use the disks in all storage nodes connected to that switch.
FIG. 1 is a schematic flowchart of the method for dynamically loading a disk provided by this embodiment; each step is described in detail below. The method may be applied to a cloud storage system that includes a management node and multiple storage nodes, where the multiple storage nodes are connected to the same SAS switch.
Before the method itself is introduced, the structural framework in which multiple storage nodes access the same SAS switch, as provided by this application, is described first. Referring to FIG. 2, in one embodiment of this application the cloud storage system may include multiple management nodes that form a management cluster. As shown in FIG. 2, the signaling ports of the management nodes MDS1, MDS2, MDS3, ..., MDSN in the management cluster are interconnected through an ordinary Gigabit switch, and signaling exchange is achieved through this interconnection. Likewise, the multiple storage nodes form a storage cluster; the signaling ports of the storage nodes OSD1, OSD2, OSD3, ..., OSDN in the storage cluster are interconnected through an ordinary Gigabit switch for signaling exchange, while the data ports of the storage nodes OSD1, OSD2, OSD3, ..., OSDN are interconnected through the SAS switch, through which they exchange data with one another.
The following description takes a cloud storage system that includes one management node MDS as an example. The signaling exchange between the management node MDS and the ordinary Gigabit switch is bidirectional, so signaling can be transmitted in both directions between them; the signaling exchange between a storage node OSD and the ordinary Gigabit switch is likewise bidirectional; and the data exchange between a storage node OSD and the SAS switch is also bidirectional, so data can be transmitted in both directions between the storage node OSD and the SAS switch.
Because the SAS switch uses the SAS protocol for disk discovery and simulated network communication, once a storage node is connected to the SAS switch it can discover and use the disks in all storage nodes connected to that switch. By connecting the storage nodes OSD in the cloud storage system to the SAS switch, a storage node OSD can access the disks of the other storage nodes connected to the same SAS switch.
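On a Linux storage node such discovery can be pictured with the sketch below; the /dev/disk/by-path layout and the presence of "sas" in the link names are assumptions that depend on the HBA and distribution.

```python
from pathlib import Path

# Assumed Linux layout: SAS-attached block devices show up as symlinks
# under /dev/disk/by-path; exact names depend on the HBA and distribution.
BY_PATH = Path("/dev/disk/by-path")

sas_disks = sorted(p for p in BY_PATH.iterdir() if "sas" in p.name)
for link in sas_disks:
    # On a node attached to a SAS switch, this enumeration also covers
    # disks physically housed in other storage nodes on the same switch.
    print(link.name, "->", link.resolve())
```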
Specifically, as shown in FIG. 1, the method for dynamically loading a disk provided by this application may include the following steps:
S1: When the management node detects that a first storage node has a software failure, it sends a disk loading instruction to a second storage node.
Suppose a software-level failure occurs on some storage node, such as a failed service start or an operating-system exception. For ease of description, this storage node is called the faulty storage node, and here it is also called the first storage node. After the software failure occurs, the faulty storage node can no longer report its heartbeat to the management node MDS, so the management node MDS considers the faulty storage node offline. At this point the management node MDS requests another storage node connected to the same SAS switch to attempt to load the disk of the faulty storage node. For ease of description this other storage node is called the second storage node; that is, the management node sends a disk loading instruction to the second storage node to instruct it to load the disk of the first storage node.
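The failure-detection side of S1 can be pictured with the following sketch; the timeout value, the node registry, and the send_load_instruction helper are hypothetical stand-ins rather than parts of the disclosed protocol.

```python
import time

HEARTBEAT_TIMEOUT = 10.0  # seconds; an assumed threshold, not from the patent

# Last heartbeat time reported by each storage node (illustrative state).
last_heartbeat = {"OSD1": time.time() - 60.0, "OSD2": time.time()}

def send_load_instruction(target_node, failed_node):
    # Hypothetical helper: tell `target_node`, over the signaling network,
    # to load `failed_node`'s disks through the shared SAS switch.
    print(f"MDS -> {target_node}: load disks of {failed_node}")

def check_heartbeats(now):
    for node, seen in last_heartbeat.items():
        if now - seen > HEARTBEAT_TIMEOUT:
            # Node is considered offline; pick a healthy peer on the same
            # SAS switch as the stand-in.
            peer = next(n for n, t in last_heartbeat.items()
                        if n != node and now - t <= HEARTBEAT_TIMEOUT)
            send_load_instruction(peer, node)

check_heartbeats(time.time())
```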
S2: After receiving the disk loading instruction, the second storage node loads the disk of the first storage node through the SAS switch.
After receiving the disk loading instruction, the second storage node loads the disk in the first storage node. Once the second storage node has loaded it successfully, the data on the disk in the first storage node can be read normally through the second storage node, and of course data can also be written to the disk of the first storage node through the second storage node, thereby avoiding a data-recovery process.
In one possible implementation of this application, the second storage node updates the index information of the disk in the first storage node into the database of the second storage node.
In this application, the disk index information in the first storage node can be sent to the second storage node through the SAS switch, and the second storage node copies that disk index information into its own database as an update, so that this index information can later be used to read the data on the disk of the first storage node, which has the software failure.
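A sketch of this index merge, assuming for illustration that each node keeps its index in a SQLite database with a simple object table; the schema and values are invented for the example.

```python
import sqlite3

# Hypothetical schema: one row per object stored on the drifted disk.
SCHEMA = """CREATE TABLE IF NOT EXISTS disk_index (
    object_id   TEXT PRIMARY KEY,
    disk_id     TEXT,
    byte_offset INTEGER,
    byte_length INTEGER
)"""

def merge_disk_index(local_db_path, drifted_rows):
    """Copy the failed node's disk-index rows into this node's database."""
    conn = sqlite3.connect(local_db_path)
    conn.execute(SCHEMA)
    conn.executemany(
        "INSERT OR REPLACE INTO disk_index VALUES (?, ?, ?, ?)",
        drifted_rows,
    )
    conn.commit()
    conn.close()

# Rows read from the drifted disk over the SAS switch (illustrative values).
merge_disk_index("osd2_index.db", [("video-0001", "disk-7", 0, 4096)])
```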
That is, depending on the state of the storage nodes, the management node MDS can dynamically shift the disks of a faulty storage node to other storage nodes for read-write loading. For example, if the management node MDS finds no storage-node anomaly, the data on the disks is read and written normally; when the management node MDS finds that a storage node is abnormal, it requests another storage node on the same switch to load the disk of the faulty storage node, and the data on the faulty storage node's disk is then read and written normally through that other storage node, achieving disk drift.
As the above flow shows, the management node MDS realizes disk drift according to the state of the storage nodes: when a storage node suffers a software failure, read-write control of its disks drifts from the faulty storage node to a normal storage node in the storage cluster. After the drift, all read and write requests for the disk are executed through the normal storage node, which uses the drifted disk just as it would a local disk. The normal storage node thus accesses the disk of the faulty storage node normally through the SAS switch and can load it normally. In this way the data on the faulty storage node's disk can still be read and written normally without recovery in replica mode or EC mode.
S3: The management node updates the locally stored storage-node information corresponding to the disk.
Here, the disk is the disk of the first storage node, i.e., the disk that has drifted to the second storage node.
In one possible implementation of this application, the management node updates its local database so that the disk corresponds to the storage-node information of the second storage node, where the storage-node information uniquely identifies a storage node.
Further, before the management node updates the locally stored storage-node information corresponding to the disk, the method may also include: the management node receiving, from the second storage node, a message indicating that the disk was loaded successfully. In other words, after determining that the second storage node has successfully loaded the disk of the first storage node, the management node updates its local database with the correspondence between the disk and the storage-node information of the second storage node.
For example, when the second storage node has successfully loaded the disk of the first storage node, it sends a corresponding load-success message to the management node. Upon receiving this message, the management node updates its local database with the correspondence between the disk and the storage-node information of the second storage node as a record. In this way, if the first storage node fails again and the disk needs to be loaded once more, there is no need to search for a new storage node to load the disk; the second storage node can be assigned directly to load it.
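This record-keeping can be pictured as follows; the mapping structure and function names are assumptions made for the example, not the disclosed database layout.

```python
# Hypothetical in-memory view of the MDS database: disk -> owning node.
disk_owner = {"disk-7": "OSD1"}

def on_load_success(disk_id, new_owner):
    """Record a load-success message by re-pointing the disk's owner."""
    disk_owner[disk_id] = new_owner

def pick_loader(disk_id, fallback_peer):
    """On a repeat failure, reuse the recorded owner instead of searching."""
    return disk_owner.get(disk_id, fallback_peer)

on_load_success("disk-7", "OSD2")               # OSD2 reported success
assert pick_loader("disk-7", "OSD4") == "OSD2"  # reused on the next failure
```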
After the above steps, when the software level of a storage node becomes abnormal, the disks of that faulty storage node can still be smoothly loaded, read, and written by other storage nodes; reading and writing the data does not require recovery by reconstruction or similar means, which avoids unnecessary computation. Moreover, after a storage node becomes abnormal, the reading and writing of data in the whole cloud storage system suffers little performance impact.
After the faulty storage node returns to normal, the management node MDS may request the second storage node to unload the loaded disk. For example, once the faulty storage node has recovered, the management node MDS may first request the second storage node to unload the disk of the faulty storage node that it had loaded, and then request the faulty storage node to load that disk, so that the faulty storage node's local disk is taken over by the faulty storage node itself again; this disperses the disk-operation load across the storage nodes of the system.
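A sketch of this failback ordering, with an illustrative transport in place of the real signaling channel:

```python
def fail_back(send, disk_id, recovered_node, stand_in_node):
    """Hypothetical failback sequence once the failed node is healthy again:
    unload on the stand-in node first, then reload on the original node."""
    send(stand_in_node, ("unload", disk_id))
    send(recovered_node, ("load", disk_id))

# Illustrative transport: print instead of a real signaling channel.
fail_back(lambda node, msg: print(node, msg), "disk-7", "OSD1", "OSD2")
```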
In this way, by letting disks drift between storage nodes inside the object store, this application achieves dynamic disk loading: after a software failure on a storage node, the drift is realized through the SAS switch, the data on the faulty storage node's disks remains accessible, and disk availability is improved.
In one possible implementation of this application, as shown in FIG. 3, the method for dynamically loading a disk may further include:
S4: When the management node receives a read request for data on the disk, it delivers the read request to the second storage node according to the updated, locally stored storage-node information corresponding to the disk.
As an example, the second storage node reads the data on the disk through the SAS switch according to the received read request.
As an example, when the management node receives a write request for writing data to the disk, it delivers the write request to the second storage node according to the updated, locally stored storage-node information corresponding to the disk;
the second storage node then writes the data to the disk through the SAS switch according to the received write request.
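On the second storage node, serving the routed requests can be pictured as reads and writes against the mount point of the drifted disk; the paths below are the same hypothetical ones used in the mount sketch above.

```python
from pathlib import Path

MOUNT = Path("/mnt/osd-disk")  # where the second node mounted the drifted disk

def handle_read(object_path):
    """Serve a routed read request from the drifted disk."""
    return (MOUNT / object_path).read_bytes()

def handle_write(object_path, data):
    """Serve a routed write request to the drifted disk."""
    target = MOUNT / object_path
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_bytes(data)

handle_write("objects/video-0002", b"frame data")
assert handle_read("objects/video-0002") == b"frame data"
```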
Thus the disks of a faulty storage node, once loaded by other storage nodes, can be read and written normally. After the faulty storage node's disk is successfully loaded by a normal storage node, all subsequent reading and writing of the disk's data can be performed through the normal storage node that loaded it; the SAS switch lets a storage node access the disks of the other storage nodes on the same switch just as if they were local disks.
Correspondingly, this application further proposes a cloud storage system. The cloud storage system includes a management node and multiple storage nodes, where the multiple storage nodes are connected to the same SAS switch and include a first storage node and a second storage node, wherein:
the management node is configured to send a disk loading instruction to the second storage node upon detecting that the first storage node has a software failure;
the second storage node is configured to load the disk of the first storage node through the SAS switch after receiving the disk loading instruction.
As an example, the management node is further configured to update the locally stored storage-node information corresponding to the disk.
As an example, the management node is further configured to, upon receiving a read request for data on the disk, deliver the read request to the second storage node according to the updated, locally stored storage-node information corresponding to the disk.
As an example, the second storage node is further configured to read the data on the disk through the SAS switch according to the received read request.
As an example, the management node is further configured to, upon receiving a write request for writing data to the disk, deliver the write request to the second storage node according to the updated, locally stored storage-node information corresponding to the disk;
the second storage node is further configured to write the data to the disk through the SAS switch according to the received write request.
As an example, the second storage node is further configured to update the index information of the disk in the first storage node into the database of the second storage node.
As an example, the management node is further configured to update the correspondence between the disk and the storage-node information of the second storage node into a local database.
As an example, the management node is further configured to receive the message, sent by the second storage node, indicating that the disk was loaded successfully.
As shown in FIG. 4, to aid understanding, a specific example is used next to explain how the management node MDS provided by this embodiment requests other storage nodes on the switch to load the disk of a faulty storage node; that is, the steps by which the management node MDS drifts a disk may include:
A1: The storage node OSD1 becomes abnormal.
Suppose the software level of the storage node OSD1 fails, e.g., a service fails to start or the operating system behaves abnormally. In this case the disk and the data on it are intact, and the disk can still be accessed.
A2: The management node MDS requests the storage node OSD2 to load the disk of the storage node OSD1.
After the software failure occurs on OSD1, OSD1 can no longer report its heartbeat to the management node MDS, so the MDS considers OSD1 offline. The MDS then requests the other storage node OSD2 to attempt to load OSD1's disk. Once OSD2 has loaded it successfully, the disk data of OSD1 can be read normally through OSD2, and of course OSD2 can also write data to that disk, thereby avoiding a data-recovery process.
Specifically, depending on the state of the storage nodes, the management node MDS can dynamically shift a disk to another storage node OSD2 for read-write loading. For example, if the MDS finds no storage-node anomaly, the disk data is read and written normally; when the MDS finds that OSD1 is abnormal, it requests OSD2 on the same switch to load OSD1's disk, and OSD1's disk data is then read and written normally through OSD2, achieving disk drift.
As the above flow shows, the management node MDS realizes disk drift according to the state of the storage nodes. After the software failure on OSD1, read-write control of the disk drifts from the faulty storage node OSD1 to the normal storage node OSD2 in the storage cluster.
A3: The storage node OSD2 successfully loads OSD1's disk.
After the drift, all read and write requests for the disk go through the normal storage node OSD2, which uses the drifted disk just as it would a local disk.
OSD2 thus accesses the disk in OSD1 normally through the SAS switch and can load it normally.
The disk of the faulty storage node OSD1, once loaded by the other storage node OSD2, can be read and written normally. After OSD1's disk has been successfully loaded by the normal storage node OSD2, all subsequent reading and writing of the disk data can be performed through OSD2, which loaded the disk; the SAS switch lets OSD2 access the disk of the other storage node OSD1 on the same switch just as if it were a local disk.
After these steps, once the software level of OSD1 is abnormal, the disk can be smoothly loaded, read, and written by the other storage node OSD2; data access requires no recovery by reconstruction or similar means, which avoids unnecessary computation. Moreover, after OSD1 becomes abnormal, the reading and writing of data in the whole cloud storage system suffers little performance impact.
After the faulty storage node returns to normal, the MDS requests the other storage node to unload the loaded disk. For example, once OSD1 has recovered, the MDS may first request OSD2 to unload OSD1's disk and then request OSD1 to load it, so that OSD1 takes over the reading and writing of its local disk again, dispersing the disk-operation load of the storage nodes in the system.
As shown in FIG. 5, in another optional embodiment, when multiple storage nodes on the SAS switch suffer software failures, the management node MDS requests another storage node on the SAS switch to load the disks of the faulty storage nodes. The steps by which the MDS drifts the disks are as follows:
B1: The storage nodes OSD1 and OSD3 become abnormal.
Suppose the software level of OSD1 and OSD3 fails, e.g., a service fails to start or the operating system behaves abnormally. In this case the disks and the data on them are intact, and the disks can still be accessed.
B2: The management node MDS requests the storage node OSD2 to load the disks of the storage nodes OSD1 and OSD3.
After the software failures occur, OSD1 and OSD3 can no longer report their heartbeats to the management node MDS, so the MDS considers them offline. The MDS then requests the other storage node OSD2 to attempt to load the disks in OSD1 and OSD3. Once OSD2 has loaded them successfully, the data on the disks of the faulty storage nodes can be read normally through OSD2, and of course OSD2 can also write data to those disks, thereby avoiding a data-recovery process.
Specifically, depending on the state of the storage nodes, the MDS can dynamically shift the disks to other storage nodes for read-write loading. For example, if the MDS finds no storage-node anomaly, the disk data is read and written normally; when the MDS finds OSD1 and OSD3 abnormal, it requests OSD2 on the same switch to load the disks in OSD1 and OSD3, and the data on those disks is then read and written normally through OSD2, achieving disk drift.
As the above flow shows, the MDS realizes disk drift according to the state of the storage nodes. After the software failures on OSD1 and OSD3, read-write control of the disks drifts automatically from the faulty storage nodes OSD1 and OSD3 to the normal storage node OSD2 in the storage cluster.
B3: The storage node OSD2 successfully loads the disks in the storage nodes OSD1 and OSD3.
After the drift, all read and write requests for the disks go through the normal storage node OSD2, which uses the drifted disks just as it would local disks.
OSD2 thus accesses the disks in OSD1 and OSD3 normally through the SAS switch and can load them normally.
As shown in FIG. 6, in another optional embodiment, when multiple storage nodes on the SAS switch suffer software failures, the management node MDS requests multiple other storage nodes on the switch to load the disks of the faulty storage nodes. The steps by which the MDS drifts the disks are as follows:
C1: The storage nodes OSD1 and OSD3 become abnormal.
Suppose the software level of OSD1 and OSD3 fails, e.g., a service fails to start or the operating system behaves abnormally. In this case the disks and the data on them are intact, and the disks can still be accessed.
C2: The management node MDS requests the storage nodes OSD2 and OSD4 to load the disks of the storage nodes OSD1 and OSD3.
After the software failures occur, OSD1 and OSD3 can no longer report their heartbeats to the management node MDS, so the MDS considers them offline. The MDS then requests the other storage nodes OSD2 and OSD4 to attempt to load the disks of OSD1 and OSD3. Once OSD2 and OSD4 have loaded them successfully, the data on the disks of the faulty storage nodes can be read normally through OSD2 and OSD4, and of course OSD2 and OSD4 can also write data to those disks, thereby avoiding a data-recovery process.
Specifically, depending on the state of the storage nodes, the MDS can dynamically shift the disks to other storage nodes for loading, reading, and writing. For example, if the MDS finds no storage-node anomaly, the data on the disks is read and written normally; when the MDS finds OSD1 and OSD3 abnormal, it requests OSD2 and OSD4 on the same switch to load the disks of OSD1 and OSD3, and the data on those disks is then read and written normally through OSD2 and OSD4. Illustratively, the data on OSD1's disk may be read and written normally through OSD2 while the data on OSD3's disk is read and written normally through OSD4, thereby achieving disk drift.
As the above flow shows, the MDS realizes disk drift according to the state of the storage nodes. After the software failures on OSD1 and OSD3, read-write control of the disks drifts from the faulty storage nodes OSD1 and OSD3 to the normal storage nodes OSD2 and OSD4 in the storage cluster.
C3: The storage nodes OSD2 and OSD4 successfully load the disks of the storage nodes OSD1 and OSD3.
After the drift, all read and write requests for the disks go through the normal storage nodes OSD2 and OSD4, which use the drifted disks just as they would local disks.
OSD2 and OSD4 thus access the disks in OSD1 and OSD3 normally through the SAS switch and can load them normally.
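The application does not prescribe how drifted disks are assigned when several healthy peers are available; one plausible policy for the pairing illustrated in FIG. 6 is a simple round-robin, sketched below with invented node and disk names.

```python
from itertools import cycle

def assign_drifted_disks(failed_nodes, healthy_nodes, disks_of):
    """Round-robin the failed nodes' disks across the healthy peers."""
    plan = {}
    peers = cycle(healthy_nodes)
    for node in failed_nodes:
        for disk in disks_of[node]:
            plan[disk] = next(peers)
    return plan

disks_of = {"OSD1": ["disk-1", "disk-2"], "OSD3": ["disk-5"]}
plan = assign_drifted_disks(["OSD1", "OSD3"], ["OSD2", "OSD4"], disks_of)
print(plan)  # {'disk-1': 'OSD2', 'disk-2': 'OSD4', 'disk-5': 'OSD2'}
```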
In other embodiments of this application, a disk dynamic-loading device is proposed. The device may be the management node described above, or may be one of the storage nodes described above, and may include:
one or more processors, and a storage apparatus storing one or more programs;
when the one or more programs are executed by the one or more processors, the one or more processors implement the disk dynamic-loading method described above.
In other embodiments of this application, a computer-readable storage medium is further proposed, on which a computer program is stored; when the computer program is executed by a processor, the disk dynamic-loading method described above is implemented.
The above is a detailed introduction to the method and device for dynamically loading a disk provided by this application. By letting disks drift between storage nodes inside the object store, this application achieves dynamic disk loading: after the software of a storage node fails, the drift is realized through the SAS switch, the disk data of the faulty storage node can still be accessed, and the availability of the object-storage disks is improved. Specific examples are used herein to explain the principles and implementations of this application; the descriptions of the above embodiments are only intended to help in understanding the method of this application and its core idea. Meanwhile, for those of ordinary skill in the art, there will be changes in the specific implementations and the scope of application based on the ideas of this application.
The above are only preferred specific implementations of this application, but the protection scope of this application is not limited thereto. Any change or replacement that a person skilled in the art could readily conceive within the technical scope disclosed by this application shall fall within the protection scope of this application. Therefore, the protection scope of this application shall be subject to the protection scope of the claims.

Claims (14)

1. A method for dynamically loading a disk, applied to a cloud storage system, wherein the cloud storage system comprises a management node and a plurality of storage nodes, the plurality of storage nodes being connected to the same SAS switch, the method comprising:
    when the management node detects that a first storage node of the plurality of storage nodes has a software failure, sending a disk loading instruction to a second storage node of the plurality of storage nodes; and
    after receiving the disk loading instruction, loading, by the second storage node, the disk of the first storage node through the SAS switch.
2. The method according to claim 1, wherein the method further comprises:
    updating, by the management node, the locally stored storage-node information corresponding to the disk.
3. The method according to claim 2, wherein the method further comprises:
    when the management node receives a read request for data on the disk, delivering the read request to the second storage node according to the updated, locally stored storage-node information corresponding to the disk; and
    reading, by the second storage node, the data on the disk through the SAS switch according to the received read request.
4. The method according to claim 2, wherein the method further comprises:
    when the management node receives a write request for writing data to the disk, delivering the write request to the second storage node according to the updated, locally stored storage-node information corresponding to the disk; and
    writing, by the second storage node, data to the disk through the SAS switch according to the received write request.
5. The method according to claim 1, wherein loading the disk of the first storage node through the SAS switch comprises:
    updating, by the second storage node, the index information of the disk in the first storage node into the database of the second storage node.
6. The method according to claim 2, wherein updating, by the management node, the locally stored storage-node information corresponding to the disk comprises:
    updating, by the management node, the correspondence between the disk and the storage-node information of the second storage node into a local database.
7. The method according to claim 1, wherein before the management node updates the locally stored storage-node information corresponding to the disk, the method further comprises:
    receiving, by the management node, a message, sent by the second storage node, indicating that the disk was loaded successfully.
8. A cloud storage system, wherein the cloud storage system comprises a management node and a plurality of storage nodes, the plurality of storage nodes being connected to the same SAS switch and comprising a first storage node and a second storage node, wherein:
    the management node is configured to send a disk loading instruction to the second storage node upon detecting that the first storage node has a software failure; and
    the second storage node is configured to load the disk of the first storage node through the SAS switch after receiving the disk loading instruction.
9. The system according to claim 8, wherein:
    the management node is further configured to update the locally stored storage-node information corresponding to the disk.
10. The system according to claim 9, wherein:
    the management node is further configured to, upon receiving a read request for data on the disk, deliver the read request to the second storage node according to the updated, locally stored storage-node information corresponding to the disk; and
    the second storage node is further configured to read the data on the disk through the SAS switch according to the received read request.
11. The system according to claim 9, wherein:
    the management node is further configured to, upon receiving a write request for writing data to the disk, deliver the write request to the second storage node according to the updated, locally stored storage-node information corresponding to the disk; and
    the second storage node is further configured to write data to the disk through the SAS switch according to the received write request.
12. The system according to claim 8, wherein:
    the second storage node is further configured to update the index information of the disk in the first storage node into the database of the second storage node.
13. The system according to claim 9, wherein:
    the management node is further configured to update the correspondence between the disk and the storage-node information of the second storage node into a local database.
14. The system according to claim 8, wherein:
    the management node is further configured to receive the message, sent by the second storage node, indicating that the disk was loaded successfully.
PCT/CN2019/130169 2018-12-28 2019-12-30 Method for dynamic loading of disk and cloud storage system WO2020135889A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201811625675.X 2018-12-28
CN201811625675.XA CN111381766B (en) 2018-12-28 2018-12-28 Method for dynamically loading disk and cloud storage system

Publications (1)

Publication Number Publication Date
WO2020135889A1

Family

ID=71129699

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2019/130169 WO2020135889A1 (en) 2018-12-28 2019-12-30 Method for dynamic loading of disk and cloud storage system

Country Status (2)

Country Link
CN (1) CN111381766B (en)
WO (1) WO2020135889A1 (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111880751B (en) * 2020-09-28 2020-12-25 浙江大华技术股份有限公司 Hard disk migration method, distributed storage cluster system and storage medium
TWI784750B (en) * 2021-10-15 2022-11-21 啟碁科技股份有限公司 Data processing method of terminal device and data processing system of terminal device

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101969465A (en) * 2010-10-13 2011-02-09 北京神州融信信息技术股份有限公司 Cluster read-write method, apparatus and system and controller
CN103608784A (en) * 2013-06-26 2014-02-26 华为技术有限公司 Method for creating network volumes, data storage method, storage device and storage system
US20160070622A1 (en) * 2010-09-24 2016-03-10 Hitachi Data Systems Corporation System and method for enhancing availability of a distributed object storage system during a partial database outage
CN107046575A (en) * 2017-04-18 2017-08-15 南京卓盛云信息科技有限公司 A kind of cloud storage system and its high density storage method
CN107124469A (en) * 2017-06-07 2017-09-01 郑州云海信息技术有限公司 A kind of clustered node communication means and system

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105657066B (en) * 2016-03-23 2019-06-14 天津书生云科技有限公司 Load rebalancing method and device for storage system
CN103067485A (en) * 2012-12-25 2013-04-24 曙光信息产业(北京)有限公司 Disk monitoring method for cloud storage system
CN103152397B (en) * 2013-02-06 2017-05-03 浪潮电子信息产业股份有限公司 Method for designing multi-protocol storage system
CN104967577B (en) * 2015-06-25 2019-09-03 北京百度网讯科技有限公司 SAS switch and server

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20160070622A1 (en) * 2010-09-24 2016-03-10 Hitachi Data Systems Corporation System and method for enhancing availability of a distributed object storage system during a partial database outage
CN101969465A (en) * 2010-10-13 2011-02-09 北京神州融信信息技术股份有限公司 Cluster read-write method, apparatus and system and controller
CN103608784A (en) * 2013-06-26 2014-02-26 华为技术有限公司 Method for creating network volumes, data storage method, storage device and storage system
CN107046575A (en) * 2017-04-18 2017-08-15 南京卓盛云信息科技有限公司 A kind of cloud storage system and its high density storage method
CN107124469A (en) * 2017-06-07 2017-09-01 郑州云海信息技术有限公司 A kind of clustered node communication means and system

Also Published As

Publication number Publication date
CN111381766B (en) 2022-08-02
CN111381766A (en) 2020-07-07

Similar Documents

Publication Publication Date Title
US11153380B2 (en) Continuous backup of data in a distributed data store
US10387673B2 (en) Fully managed account level blob data encryption in a distributed storage environment
US9098466B2 (en) Switching between mirrored volumes
US9703803B2 (en) Replica identification and collision avoidance in file system replication
US9582213B2 (en) Object store architecture for distributed data processing system
US7913046B2 (en) Method for performing a snapshot in a distributed shared file system
US20190007208A1 (en) Encrypting existing live unencrypted data using age-based garbage collection
US8386707B2 (en) Virtual disk management program, storage device management program, multinode storage system, and virtual disk managing method
US9262087B2 (en) Non-disruptive configuration of a virtualization controller in a data storage system
US9823955B2 (en) Storage system which is capable of processing file access requests and block access requests, and which can manage failures in A and storage system failure management method having a cluster configuration
US11681443B1 (en) Durable data storage with snapshot storage space optimization
US9760457B2 (en) System, method and computer program product for recovering stub files
CN103037004A (en) Implement method and device of cloud storage system operation
US20050234916A1 (en) Method, apparatus and program storage device for providing control to a networked storage architecture
US12001724B2 (en) Forwarding operations to bypass persistent memory
US8386741B2 (en) Method and apparatus for optimizing data allocation
US20090024768A1 (en) Connection management program, connection management method and information processing apparatus
US20240103744A1 (en) Block allocation for persistent memory during aggregate transition
CN106528338A (en) Remote data replication method, storage equipment and storage system
WO2020135889A1 (en) Method for dynamic loading of disk and cloud storage system
US20130275670A1 (en) Multiple enhanced catalog sharing (ecs) cache structure for sharing catalogs in a multiprocessor system
US11169728B2 (en) Replication configuration for multiple heterogeneous data stores
CN114490540A (en) Data storage method, medium, device and computing equipment
CN119127095B (en) Multi-tenant distributed file system, request method and device based on gRPC

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 19904425

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 19904425

Country of ref document: EP

Kind code of ref document: A1

32PN Ep: public notification in the ep bulletin as address of the addressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 03.02.2022)
