
CN108287660B - Data storage method and device - Google Patents


Info

Publication number
CN108287660B
CN108287660B
Authority
CN
China
Prior art keywords
storage
node
data
data storage
storage device
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201710012670.9A
Other languages
Chinese (zh)
Other versions
CN108287660A (en)
Inventor
付永振
靳晓嘉
魏春来
汤云峰
王靖
付旭轮
单雷光
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China Mobile Communications Group Co Ltd
China Mobile Group Hebei Co Ltd
Original Assignee
China Mobile Group Hebei Co Ltd
China Mobile Communications Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by China Mobile Group Hebei Co Ltd, China Mobile Communications Corp filed Critical China Mobile Group Hebei Co Ltd
Priority to CN201710012670.9A
Publication of CN108287660A
Application granted
Publication of CN108287660B

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0602 Interfaces specially adapted for storage systems specifically adapted to achieve a particular effect
    • G06F3/061 Improving I/O performance
    • G06F3/0611 Improving I/O performance in relation to response time
    • G06F3/0628 Interfaces specially adapted for storage systems making use of a particular technique
    • G06F3/0629 Configuration or reconfiguration of storage systems
    • G06F3/0638 Organizing or formatting or addressing of data
    • G06F3/0668 Interfaces specially adapted for storage systems adopting a particular infrastructure
    • G06F3/067 Distributed or networked storage systems, e.g. storage area networks [SAN], network attached storage [NAS]

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The present application discloses a data storage method and device. The data storage method includes: dividing the data to be stored into N objects, where N is a positive integer; allocating the N objects into M placement groups, where M is a positive integer smaller than N; and, for any one of the M placement groups, determining at least three object storage devices corresponding to the placement group based on a storage mapping table, where the storage mapping table contains the mapping relationship between placement groups and object storage devices, and storing each object contained in the placement group into the corresponding object storage devices based on a pseudo-random data distribution algorithm. The data storage method and device of the embodiments of the present application can improve data storage efficiency in a Ceph distributed data storage system and effectively realize high-speed reading and writing of data in the Ceph distributed data storage system.


Description

Data storage method and device
Technical Field
The present application relates to the field of computers, and in particular, to a data storage method and device.
Background
With the advent of the big data era, traditional centralized data storage systems can no longer meet the requirements of large-scale data storage. To satisfy those requirements while ensuring the reliability and safety of stored data, distributed data storage systems have emerged. Ceph is an open-source distributed data storage system that stores data in a distributed manner across a plurality of storage nodes, i.e., a plurality of storage servers, thereby realizing distributed storage of data and improving the reliability, availability, and access efficiency of the storage system.
In practice, the Ceph distributed data storage system realizes distributed storage through three levels of mapping. First, the file data to be stored (File) is divided into a plurality of object data (Object) of consistent size, realizing the File-to-Object mapping; then, each Object is assigned to a Placement Group (PG) through a hash algorithm, realizing the Object-to-PG mapping; finally, the Objects contained in each PG are stored in different Object Storage Devices (OSDs) in the object storage cluster through a pseudo-random data distribution algorithm (CRUSH), realizing the PG-to-OSD mapping.
However, because the Ceph distributed data storage system must perform hash operations during data storage to realize this mapped storage, its storage efficiency is low and cannot meet the requirement of high-speed reading and writing.
Disclosure of Invention
In view of this, embodiments of the present application provide a data storage method and device, so as to improve data storage efficiency of a Ceph distributed data storage system.
The data storage method according to the embodiments of the present application is applied to a Ceph distributed data storage system and comprises the following steps: dividing data to be stored into N objects, where N is a positive integer; allocating the N objects into M placement groups, where M is a positive integer less than N; and, for any one of the M placement groups, determining at least three object storage devices corresponding to the placement group based on a storage mapping table, where the storage mapping table contains the mapping relationship between placement groups and object storage devices, and storing each object contained in the placement group into the corresponding object storage devices based on a pseudo-random data distribution algorithm.
A data storage device according to the embodiments of the present application is applied to a Ceph distributed data storage system and comprises: a dividing unit configured to divide data to be stored into N objects, where N is a positive integer; an allocating unit configured to allocate the N objects into M placement groups, where M is a positive integer less than N; and a storage unit configured to, for any one of the M placement groups, determine at least three object storage devices corresponding to the placement group based on a storage mapping table, where the storage mapping table contains the mapping relationship between placement groups and object storage devices, and store each object contained in the placement group into the corresponding object storage devices based on a pseudo-random data distribution algorithm.
According to the data storage method and device provided by the embodiments of the present application, after the data to be stored is divided into N objects and the N objects are allocated into M placement groups, the at least three object storage devices corresponding to any placement group are looked up directly in a predetermined storage mapping table, and the objects contained in that placement group are then stored into the corresponding object storage devices through a pseudo-random data distribution algorithm. This improves data storage efficiency in the Ceph distributed data storage system and effectively realizes high-speed reading and writing of data.
Drawings
The accompanying drawings, which are included to provide a further understanding of the application and are incorporated in and constitute a part of this application, illustrate embodiment(s) of the application and together with the description serve to explain the application and not to limit the application. In the drawings:
FIG. 1 is a schematic flow chart illustrating a data storage method according to an embodiment of the present application;
fig. 2 is a schematic diagram of a system boot process of the Ceph distributed data storage system according to an embodiment of the present application;
fig. 3 is a schematic diagram illustrating a process of setting storage nodes in a Ceph distributed data storage system according to an embodiment of the present application;
fig. 4 is a schematic process diagram of data storage of a Ceph distributed data storage system according to an embodiment of the present application;
fig. 5 is a schematic diagram of a process of repairing a failure of an object storage device of the Ceph distributed data storage system according to an embodiment of the present application;
fig. 6 is a schematic diagram of a process of repairing a failure of a storage node of the Ceph distributed data storage system according to an embodiment of the present application;
fig. 7 is a schematic structural diagram of a data storage device according to an embodiment of the present application.
Detailed Description
In practical applications, the Ceph distributed data storage system mainly consists of four parts: a client, a metadata server, an object storage cluster, and a monitor (hereinafter referred to as Ceph Mon), where: the client represents the storage node where the current data user is located; the metadata server caches and synchronizes information describing data attributes (such as storage locations of data, historical data, record files, and the like); the object storage cluster comprises a plurality of storage nodes for data storage; and the monitor performs the monitoring function for the whole Ceph distributed data storage system.
When data is stored in a Ceph distributed data storage system, an inode number (INO) is allocated to the file to be stored (File); the INO is the unique identifier of the File. When the File is large, it is divided into a series of uniformly sized Objects for storage, where only the last Object may differ in size from the preceding ones.
In a large-scale Ceph storage cluster, the number of Objects is huge while each Object contains only a small amount of data; if Objects were read and written through traversal addressing, the data storage rate would suffer severely. Moreover, if an Object were mapped to an OSD through some fixed hash mapping, it could not be automatically migrated to another idle OSD when that OSD is damaged, causing data loss. Therefore, the large set of Objects is typically organized into a number of PGs.
The Object identifier (OID) of any Object is determined by the INO and the object number (ONO). For any Object, a static hash function is applied to the OID to obtain a hash value, and a modulo operation between this hash value and the number of PGs yields the PG identifier (PGID) of the PG corresponding to the Object, thereby realizing the Object-to-PG mapping.
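For illustration only, this conventional mapping can be sketched in a few lines of Python; the OID layout and the hash function (md5 here) are assumptions of this sketch, not Ceph's actual implementation.

```python
import hashlib

def make_oid(ino: int, ono: int) -> str:
    """Build an object identifier from the inode number (INO) and object
    number (ONO). The hex layout is illustrative, not Ceph's exact format."""
    return f"{ino:x}.{ono:08x}"

def object_to_pgid(oid: str, pg_num: int) -> int:
    """Static hash of the OID, then modulo the PG count, as described above.
    md5 stands in for Ceph's real hash purely for illustration."""
    h = int.from_bytes(hashlib.md5(oid.encode("utf-8")).digest()[:4], "little")
    return h % pg_num

# Example: object number 3 of the file with INO 0x1234 lands in one of 128 PGs.
print(object_to_pgid(make_oid(0x1234, 3), 128))
```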
A PG is a conceptual container of Objects and a purely logical construct; it exists only virtually in the Ceph distributed data storage system and serves to organize and map the storage of Objects. One PG organizes several Objects, but one Object can be mapped into only one PG, i.e., there is a "one-to-many" mapping between a PG and its Objects. A reasonable choice of the number of PGs ensures uniformity of data distribution.
The Ceph distributed data storage system determines the OSDs corresponding to any PG through the pseudo-random data distribution algorithm (CRUSH) and then stores the Objects in that PG into those OSDs, realizing the PG-to-OSD mapping. A single OSD carries a large number of PGs, i.e., there is a "many-to-many" mapping between PGs and OSDs. The CRUSH algorithm avoids data loss when a single point of failure occurs at a storage node, frees storage nodes from relying on metadata for storage, and effectively improves data storage efficiency.
However, since the Ceph distributed data storage system needs to perform hash operation and modulo operation between the hash value and the PG number in the data storage process, the data storage efficiency is low, and the requirement of high-speed reading and writing cannot be met.
To achieve the purpose of the present application, embodiments of the present application provide a data storage method and device: after dividing the data to be stored into N Objects and allocating the N Objects to M PGs, the at least three OSDs corresponding to any PG are looked up directly in a predetermined storage mapping table, and each Object contained in that PG is then stored into the corresponding OSDs through the CRUSH algorithm. This improves data storage efficiency in the Ceph distributed data storage system and effectively realizes high-speed reading and writing of data.
The technical solutions of the present application will be described clearly and completely below with reference to the specific embodiments of the present application and the accompanying drawings. It should be apparent that the described embodiments are only some of the embodiments of the present application, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
The technical solutions provided by the embodiments of the present application are described in detail below with reference to the accompanying drawings.
Example 1
Fig. 1 is a schematic flowchart of a data storage method according to an embodiment of the present application. The data storage method can be applied to a Ceph distributed data storage system and can comprise the following steps.
Step 11: dividing data to be stored into N objects, wherein N is a positive integer.
In step 11, to implement Object storage, the data to be stored is divided into N objects. Here, each Object has an Object identification code different from the other objects. The data amount of the N objects may be the same or different, and is not limited specifically here.
Step 12: allocating the N objects into M placement groups (PGs) according to Object size, where M is a positive integer smaller than N.
In step 12, the N Objects obtained in step 11 are allocated to M PGs according to Object size, realizing grouped storage of the Objects. Note that each PG has a placement-group identifier distinct from those of the other PGs. To achieve a uniform distribution of data, the N Objects are distributed evenly across the M PGs by Object size. For example, when 500 Objects are allocated to 100 PGs, each PG contains 5 Objects.
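The following Python sketch illustrates one way to realize this size-balanced allocation; the greedy rule (next-largest Object to the currently lightest PG) is an assumption, since the embodiment requires only that the N Objects be distributed evenly by Object size.

```python
def allocate_objects_to_pgs(objects: dict, m: int) -> list:
    """Distribute N objects across M PGs so the total bytes per PG stay even.
    objects maps OID -> bytes; returns a list of M PG descriptors."""
    pgs = [{"pgid": i, "objects": [], "bytes": 0} for i in range(m)]
    # Greedy: place the next-largest object into the currently lightest PG.
    for oid, data in sorted(objects.items(), key=lambda kv: -len(kv[1])):
        target = min(pgs, key=lambda pg: pg["bytes"])
        target["objects"].append(oid)
        target["bytes"] += len(data)
    return pgs

# Example from the text: 500 equal objects over 100 PGs -> 5 objects per PG.
pgs = allocate_objects_to_pgs({f"obj{i}": b"x" * 4096 for i in range(500)}, 100)
assert all(len(pg["objects"]) == 5 for pg in pgs)
```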
Step 13: determining at least three object storage devices (OSDs) corresponding to any PG through a storage mapping table, where the storage mapping table contains the mapping relationship between PGs and OSDs.
In step 13, Ceph Mon determines the at least three OSDs corresponding to any PG according to a storage mapping table pre-stored in the Ceph distributed data storage system. In a Ceph distributed data storage system, at least three replicas of each Object are to be kept, i.e., each Object is to be stored on at least three OSDs. Since an Object is mapped to only one PG, any PG must be mapped to at least three OSDs to ensure that each Object can be stored on at least three OSDs.
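The lookup in step 13 is what removes hash computation from the storage path: with the table prepared in advance, determining the OSDs reduces to a table read. A minimal sketch, with a hypothetical table layout mapping PGIDs to OSD identifiers:

```python
# Hypothetical storage mapping table held by Ceph Mon: PGID -> OSD ids.
storage_map = {
    0: ["osd.1", "osd.4", "osd.7"],
    1: ["osd.2", "osd.5", "osd.8"],
}

def osds_for_pg(pgid: int) -> list:
    """Step 13 as a constant-time lookup: no hash on the storage path."""
    osds = storage_map[pgid]
    assert len(osds) >= 3, "each PG must map to at least three OSDs (replicas)"
    return osds

print(osds_for_pg(0))  # ['osd.1', 'osd.4', 'osd.7']
```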
Step 14: for any PG, storing each Object contained in the PG into a corresponding OSD corresponding to the PG through a CRUSH algorithm.
In step 14, for any PG, since at least three OSDs corresponding to the PG have been determined in step 13, each Object contained in the PG can be stored in the corresponding OSD corresponding to the PG by the CRUSH algorithm, thereby realizing distributed storage of data to be stored.
In alternative embodiments of the present application, the memory mapping table may be created in the following manner. Specifically, the process of creating the memory mapping table includes:
first, the hash value of each OSD is read from the memory. Specifically, Ceph Mon reads the hash value of each OSD stored in the memory.
Secondly, a mapping relationship between any one PG and at least three OSDs is established. Specifically, Ceph Mon establishes the mapping relationship between each PG and at least three OSDs. Note that an OSD for which a mapping relationship with a PG is established must be an idle OSD, that is, an OSD capable of providing the data storage function.
And finally, storing the mapping relation between the PG and the OSD in a storage mapping table.
Because the mapping relationship between each PG and at least three OSDs is established from the OSD hash values already stored in memory, the low storage efficiency that on-path hash computation would cause is avoided.
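The following sketch shows one way Ceph Mon might assemble the storage mapping table from the OSD hash values held in memory; the selection rule (rotating through the hash-ordered list of idle OSDs) is an assumption, as the embodiment requires only that each PG be mapped to at least three idle OSDs.

```python
def build_storage_map(osd_hashes: dict, pg_count: int, replicas: int = 3) -> dict:
    """Map every PG to `replicas` idle OSDs using precomputed hash values.
    osd_hashes maps OSD id -> hash value read from memory."""
    # Deterministic order by hash value keeps the table stable across restarts.
    idle = sorted(osd_hashes, key=osd_hashes.get)
    table = {}
    for pgid in range(pg_count):
        # Rotate through the ordered OSD list so PGs spread evenly (assumed rule).
        table[pgid] = [idle[(pgid + r) % len(idle)] for r in range(replicas)]
    return table

# Example with six OSDs and their in-memory hash values.
table = build_storage_map({f"osd.{i}": hash(f"osd.{i}") for i in range(6)}, pg_count=8)
assert all(len(osds) == 3 for osds in table.values())
```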
In an alternative embodiment of the present application, the hash value of each OSD may be determined in the following manner. Specifically, the process of determining the hash value of each OSD includes:
first, device information of a preset number of storage nodes is called from a system folder, wherein any one storage node comprises at least three OSDs. The device information of the storage node includes, but is not limited to, device information such as an IP address and a machine name corresponding to the storage node.
Secondly, according to the device information of any storage node, the hash value of each OSD in the storage node is calculated.
And finally, storing the hash value of each OSD in the storage nodes with the preset number in the memory.
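A sketch of this startup-time computation, assuming the device information consists of an IP address and a machine name as described above; the concrete hash function (sha1 here) is an assumption.

```python
import hashlib

def osd_hashes_for_node(node: dict) -> dict:
    """Derive a stable hash value for every OSD on a node from the node's
    device information (IP address and machine name) plus the OSD id."""
    hashes = {}
    for osd_id in node["osds"]:
        material = f"{node['ip']}|{node['name']}|{osd_id}".encode("utf-8")
        hashes[osd_id] = int.from_bytes(hashlib.sha1(material).digest()[:8], "big")
    return hashes

# Example node record; in the embodiment three such nodes are read at startup.
node = {"ip": "10.0.0.3", "name": "storage-node-3", "osds": ["osd.6", "osd.7", "osd.8"]}
in_memory_hashes = osd_hashes_for_node(node)  # stored in memory by Ceph Mon
```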
Fig. 2 is a schematic diagram of a system boot process of the Ceph distributed data storage system according to an embodiment of the present application. As shown in fig. 2, when the Ceph distributed data storage system is started, Ceph Mon calls the device information of the 3 storage nodes stored in the system folder, calculates the hash value of each OSD in a storage node according to that node's device information, and stores the hash values of these OSDs in the memory.
In an alternative embodiment of the present application, the device information of the storage node may be determined by:
firstly, setting equipment information of a preset number of storage nodes in a node scanning script;
then, the device information of a preset number of storage nodes is stored in a system folder by parsing the node scanning script.
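The embodiment does not specify the format of the node scanning script, so the sketch below assumes a simple one-node-per-line layout ("ip name osd.a,osd.b,osd.c") and writes the parsed device information to a JSON file standing in for the system folder.

```python
import json

def parse_node_scan_script(script_text: str, out_path: str) -> list:
    """Parse node device information from the (assumed) scan-script format
    and store it in the system folder for later retrieval."""
    nodes = []
    for line in script_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blanks and comments
        ip, name, osds = line.split()
        nodes.append({"ip": ip, "name": name, "osds": osds.split(",")})
    with open(out_path, "w") as f:
        json.dump(nodes, f, indent=2)
    return nodes

script = """# node scan script (assumed format)
10.0.0.1 storage-node-1 osd.0,osd.1,osd.2
10.0.0.2 storage-node-2 osd.3,osd.4,osd.5
10.0.0.3 storage-node-3 osd.6,osd.7,osd.8
"""
nodes = parse_node_scan_script(script, "nodes.json")
```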
Fig. 3 is a schematic diagram of a process of setting storage nodes in a Ceph distributed data storage system according to an embodiment of the present application. As shown in fig. 3, Ceph Mon sets device information of 3 storage nodes in a node scanning script of the Ceph distributed data storage system, and further stores the device information of the 3 storage nodes in a system folder by analyzing the node scanning script.
Fig. 4 is a schematic process diagram of data storage of the Ceph distributed data storage system according to an embodiment of the present application. As shown in fig. 4, after determining three OSDs (OSD1, OSD2, and OSD3) corresponding to a PG, Ceph Mon stores each Object included in the PG in the corresponding OSD corresponding to the PG by the CRUSH algorithm.
In an optional embodiment of the present application, the data storage method further includes: when an OSD storing an Object fails, calculating the updated hash value of each OSD in the storage node where the failed OSD is located; determining an idle OSD in that storage node according to the updated hash values; and storing the Objects stored on the failed OSD into the idle OSD.
Fig. 5 is a schematic diagram illustrating an OSD fault repair process of the Ceph distributed data storage system according to an embodiment of the present application. As shown in fig. 5, when the OSD2 storing the Object in the Object storage cluster of the Ceph distributed data storage system fails, the storage node where the OSD2 is located starts fault repair, the hash values of the OSDs in the storage node are recalculated (i.e., the updated hash values of the OSDs are calculated), and the idle OSDx in the storage node is determined according to the updated hash values of the OSDs in the storage node, so that the Object stored in the failed OSD2 is stored in the idle OSDx.
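A sketch of this repair flow; how the updated hash values identify the idle OSD is left open by the text, so the rule below (lowest updated hash among OSDs not in use) is an assumption.

```python
def repair_osd_failure(updated_hashes: dict, failed_osd: str,
                       objects_on_failed: list, busy_osds: set) -> dict:
    """Given the updated hash value of each OSD in the node, choose an idle
    OSD and reassign the failed OSD's objects to it (fig. 5 flow)."""
    idle = [osd for osd in updated_hashes
            if osd != failed_osd and osd not in busy_osds]
    target = min(idle, key=updated_hashes.get)  # assumed rule: lowest hash wins
    return {oid: target for oid in objects_on_failed}

# Example: osd.2 fails and its objects move to the idle OSDx.
placement = repair_osd_failure(
    {"osd.1": 11, "osd.2": 22, "osd.x": 5},
    failed_osd="osd.2", objects_on_failed=["obj-a", "obj-b"],
    busy_osds={"osd.1"})
assert placement == {"obj-a": "osd.x", "obj-b": "osd.x"}
```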
In an optional embodiment of the present application, the data storage method further includes: when a storage node storing a placement group fails, adding an idle storage node in the node scanning script; and storing the placement groups stored on the failed storage node into the idle storage node through the CRUSH algorithm.
Fig. 6 is a schematic diagram of a process of repairing a failure of a storage node of the Ceph distributed data storage system according to an embodiment of the present application. As shown in fig. 6, when any storage Node3 in the object storage cluster of the Ceph distributed data storage system fails, the Ceph Mon updates the Node scanning script, adds the device information of the free storage Node4 in the Node scanning script, and further stores the PG stored in the failed storage Node3 in the free storage Node4 by using the CRUSH algorithm.
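A sketch of this node-level repair under the same assumed script format as above; the function crush_place stands in for the CRUSH algorithm, which this sketch does not reimplement.

```python
def repair_node_failure(scan_script_path: str, spare_node_line: str,
                        pgs_on_failed_node: list, crush_place) -> dict:
    """Register the free storage node in the node scanning script, then
    re-place the failed node's PGs via the supplied CRUSH function (fig. 6 flow)."""
    with open(scan_script_path, "a") as f:
        f.write(spare_node_line + "\n")  # e.g. "10.0.0.4 storage-node-4 osd.9,osd.10,osd.11"
    # crush_place(pgid) -> list of target OSDs; a placeholder for CRUSH.
    return {pgid: crush_place(pgid) for pgid in pgs_on_failed_node}

# Example usage with a trivial stand-in for CRUSH.
new_placement = repair_node_failure(
    "scan_script.txt", "10.0.0.4 storage-node-4 osd.9,osd.10,osd.11",
    pgs_on_failed_node=[4, 5],
    crush_place=lambda pgid: [f"osd.{9 + (pgid + r) % 3}" for r in range(3)])
```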
In the Ceph distributed data storage system, after the data to be stored is divided into N Objects and the N Objects are allocated into M PGs by Object size, the at least three OSDs corresponding to any PG can be found directly in the predetermined storage mapping table, and each Object in that PG can then be stored into the corresponding OSDs through the CRUSH algorithm. This improves data storage efficiency in the Ceph distributed data storage system and effectively realizes high-speed reading and writing of data.
Example 2
Fig. 7 is a schematic structural diagram of a data storage device according to an embodiment of the present application. As shown in fig. 7, the data storage device 70 of this embodiment includes a dividing unit 701, an allocating unit 702, and a storage unit 703, where: the dividing unit 701 is configured to divide data to be stored into N objects, where N is a positive integer; the allocating unit 702 is configured to allocate the N objects into M placement groups (PGs) according to Object size, where M is a positive integer smaller than N; and the storage unit 703 is configured to, for any one of the M PGs, determine at least three object storage devices (OSDs) corresponding to the PG based on a storage mapping table, where the storage mapping table contains the mapping relationship between PGs and OSDs, and store each object contained in the PG into the corresponding OSDs based on the pseudo-random data distribution (CRUSH) algorithm.
In an alternative embodiment of the present application, the data storage device 70 further comprises a reading unit 704 and a mapping unit 705, where: the reading unit 704 is configured to read the hash value of each OSD from the memory; the mapping unit 705 is configured to establish, for any one of the M placement groups, a mapping relationship between the PG and at least three OSDs, and to store the established mapping relationship in the storage mapping table.
In an alternative embodiment of the present application, the data storage device 70 further comprises a calling unit 706 and a calculating unit 707, wherein: the calling unit 706 is configured to call device information of a preset number of storage nodes from a system folder, where any one storage node includes at least three OSDs; the calculation unit 707 is configured to, for any one of a preset number of storage nodes, calculate a hash value of each OSD in the storage node according to the device information of the storage node, and store the hash value of each OSD in the memory.
In an alternative embodiment of the present application, the data storage device 70 further comprises a setting unit 708, where: the setting unit 708 is configured to set the device information of a preset number of storage nodes in the node scanning script, and to store the device information of the preset number of storage nodes in the system folder by parsing the node scanning script.
In an optional embodiment of the present application, the calculating unit 707 is further configured to, when an OSD storing an object fails, calculate an updated hash value of each OSD in a storage node where the OSD is located; the storage unit 703 is further configured to determine a free OSD in the storage node according to the updated hash value of each OSD in the storage node, and store an object stored in the failed OSD into the free OSD.
In an optional embodiment of the present application, the setting unit 708 is further configured to add an idle storage node in the node scanning script when any storage node storing a placement group fails; the storage unit 703 is further configured to store the PGs stored on the failed storage node into the idle storage node through the CRUSH algorithm.
According to the data storage device of the embodiments of the present application, after the data to be stored is divided into N Objects and the N Objects are allocated into M PGs by Object size, the at least three OSDs corresponding to any PG can be found directly in the predetermined storage mapping table, and each Object in that PG can then be stored into the corresponding OSDs through the CRUSH algorithm, improving data storage efficiency in the Ceph distributed data storage system and effectively realizing high-speed reading and writing of data.
As will be appreciated by one skilled in the art, embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The above description is only an example of the present application and is not intended to limit the present application. Various modifications and changes may occur to those skilled in the art. Any modification, equivalent replacement, improvement, etc. made within the spirit and principle of the present application should be included in the scope of the claims of the present application.

Claims (12)

1. A data storage method is applied to a Ceph distributed data storage system and comprises the following steps:
dividing data to be stored into N objects, wherein N is a positive integer;
allocating the N objects to M placement groups, wherein M is a positive integer less than N;
for any one of the M placement groups,
determining at least three object storage devices corresponding to the placement group based on a storage mapping table, wherein the storage mapping table comprises mapping relationships between placement groups and object storage devices, and
storing each object contained in the placement group into the corresponding object storage devices based on a pseudo-random data distribution algorithm.
2. The data storage method of claim 1, further comprising:
reading the hash value of each object storage device from the memory;
for any one of the M placement groups,
establishing a mapping relationship between the placement group identifier of the placement group and the hash values of at least three object storage devices,
and storing the mapping relationship in the storage mapping table.
3. The data storage method of claim 2, further comprising:
the method comprises the steps of calling equipment information of a preset number of storage nodes from a system folder, wherein any one of the preset number of storage nodes comprises at least three object storage equipment;
for any one storage node in the preset number of storage nodes, calculating hash values of all object storage devices in the storage nodes based on the device information of the storage nodes;
and storing the hash value of each object storage device in the storage nodes with the preset number in the memory.
4. The data storage method of claim 3, further comprising:
setting the equipment information of the storage nodes with the preset number in the node scanning script;
and storing the equipment information of the storage nodes with the preset number in the system folder by analyzing the node scanning script.
5. The data storage method of claim 4, further comprising:
when any one object storage device storing an object fails, calculating an updated hash value of each object storage device in a storage node where the object storage device is located;
determining an idle object storage device in the storage node according to the updated hash value of each object storage device in the storage node; and
storing the objects stored in the failed object storage device into the idle object storage device.
6. The data storage method of claim 3, further comprising:
when any storage node storing a placement group fails, adding an idle storage node in the node scanning script; and
storing the placement group stored in the failed storage node into the idle storage node based on the pseudo-random data distribution algorithm.
7. A data storage device, wherein the data storage device is applied to a Ceph distributed data storage system, and comprises:
the device comprises a dividing unit, a storage unit and a processing unit, wherein the dividing unit is configured to divide data to be stored into N objects, and N is a positive integer;
an allocating unit configured to allocate the N objects to M placement groups, wherein M is a positive integer less than N;
a storage unit configured to, for any one of the M placement groups,
determine at least three object storage devices corresponding to the placement group based on a storage mapping table, wherein the storage mapping table comprises mapping relationships between placement groups and object storage devices, and
store each object contained in the placement group into the corresponding object storage devices based on a pseudo-random data distribution algorithm.
8. The data storage device of claim 7, further comprising:
the reading unit is configured to read the hash value of each object storage device from the memory;
a mapping unit configured to, for any one of the M placement groups,
establish a mapping relationship between the placement group identifier of the placement group and the hash values of at least three object storage devices,
and store the mapping relationship in the storage mapping table.
9. The data storage device of claim 8, further comprising:
the system comprises a calling unit and a processing unit, wherein the calling unit is configured to call device information of a preset number of storage nodes from a system folder, and any one of the preset number of storage nodes comprises at least three object storage devices;
a calculating unit configured to calculate, for any one of the preset number of storage nodes, hash values of respective object storage devices in the storage nodes based on the device information of the storage node, and store the hash values of the respective object storage devices in the preset number of storage nodes in the memory.
10. The data storage device of claim 9, further comprising:
a setting unit configured to set the device information of the preset number of storage nodes in a node scanning script, and store the device information of the preset number of storage nodes in the system folder by parsing the node scanning script.
11. The data storage device of claim 10,
the calculating unit is further configured to, when any object storage device storing an object fails, calculate an updated hash value of each object storage device in the storage node where the failed object storage device is located;
the storage unit is further configured to determine an idle object storage device in the storage node according to the updated hash value of each object storage device in the storage node, and to store the objects stored in the failed object storage device into the idle object storage device.
12. The data storage device of claim 10,
the setting unit is further configured to add an idle storage node in the node scanning script when any storage node storing a placement group fails;
the storage unit is further configured to store the placement group stored in the failed storage node into the idle storage node based on the pseudo-random data distribution algorithm.
CN201710012670.9A 2017-01-09 2017-01-09 Data storage method and device Active CN108287660B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710012670.9A CN108287660B (en) 2017-01-09 2017-01-09 Data storage method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201710012670.9A CN108287660B (en) 2017-01-09 2017-01-09 Data storage method and device

Publications (2)

Publication Number Publication Date
CN108287660A CN108287660A (en) 2018-07-17
CN108287660B (en) 2021-07-09

Family

ID=62819128

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710012670.9A Active CN108287660B (en) 2017-01-09 2017-01-09 Data storage method and device

Country Status (1)

Country Link
CN (1) CN108287660B (en)

Families Citing this family (16)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109464564A (en) * 2018-10-24 2019-03-15 岭南师范学院 A kind of supercritical extraction method of phenolic acid compounds
CN109669822B (en) * 2018-11-28 2023-06-06 平安科技(深圳)有限公司 Electronic device, method for creating backup storage pool, and computer-readable storage medium
US10963332B2 (en) 2018-12-17 2021-03-30 Western Digital Technologies, Inc. Data storage systems and methods for autonomously adapting data storage system performance, capacity and/or operational requirements
CN109933284A (en) * 2019-02-26 2019-06-25 启迪云计算有限公司 A kind of data distribution algorithms of distributed block storage system
CN110780819A (en) * 2019-10-25 2020-02-11 浪潮电子信息产业股份有限公司 Data read-write method of distributed storage system
CN110908606B (en) * 2019-11-15 2021-06-29 浪潮电子信息产业股份有限公司 A Data Reconstruction Method for Distributed File System
CN111026720B (en) * 2019-12-20 2023-05-12 深信服科技股份有限公司 File processing method, system and related equipment
CN111125011B (en) * 2019-12-20 2024-02-23 深信服科技股份有限公司 File processing method, system and related equipment
CN110955733A (en) * 2020-01-02 2020-04-03 北京同有飞骥科技股份有限公司 Data equalization method and system for distributed system
CN111258508B (en) * 2020-02-16 2020-11-10 西安奥卡云数据科技有限公司 A Metadata Management Method in Distributed Object Storage
CN112596973A (en) * 2020-11-17 2021-04-02 新华三大数据技术有限公司 Data object storage method and device and storage medium
CN112486413B (en) * 2020-11-27 2022-08-05 杭州朗和科技有限公司 A data reading method, apparatus, medium and computing device
CN113778341B (en) * 2021-09-17 2024-12-10 航天科工(北京)空间信息应用股份有限公司 Remote sensing data distributed storage method and device and remote sensing data reading method
CN114253481A (en) * 2021-12-23 2022-03-29 深圳市名竹科技有限公司 Data storage method and device, computer equipment and storage medium
CN114253482A (en) * 2021-12-23 2022-03-29 深圳市名竹科技有限公司 Data storage method and device, computer equipment and storage medium
CN117609195B (en) * 2024-01-24 2024-06-18 济南浪潮数据技术有限公司 Object management method, device, equipment and medium of distributed storage system

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103905540A (en) * 2014-03-25 2014-07-02 浪潮电子信息产业股份有限公司 Object storage data distribution mechanism based on two-stage Hash
CN105450734A (en) * 2015-11-09 2016-03-30 上海爱数信息技术股份有限公司 Distributed storage CEPH data distribution optimization method

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8055734B2 (en) * 2008-08-15 2011-11-08 International Business Machines Corporation Mapping of logical volumes to host clusters

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103905540A (en) * 2014-03-25 2014-07-02 浪潮电子信息产业股份有限公司 Object storage data distribution mechanism based on two-stage Hash
CN105450734A (en) * 2015-11-09 2016-03-30 上海爱数信息技术股份有限公司 Distributed storage CEPH data distribution optimization method

Also Published As

Publication number Publication date
CN108287660A (en) 2018-07-17

Similar Documents

Publication Publication Date Title
CN108287660B (en) Data storage method and device
CN108780386B (en) A method, device and system for data storage
WO2019144553A1 (en) Data storage method and apparatus, and storage medium
CN103677752B (en) Distributed data based concurrent processing method and system
CN110868435B (en) A bare metal server scheduling method, device and storage medium
CN104750757B (en) A kind of date storage method and equipment based on HBase
US20140258221A1 (en) Increasing distributed database capacity
CN103384550B (en) The method of storage data and device
US9952778B2 (en) Data processing method and apparatus
CN108881512A (en) Virtual IP address equilibrium assignment method, apparatus, equipment and the medium of CTDB
US20190079791A1 (en) Data Storage Method and Apparatus
CN115756955A (en) Data backup and data recovery method and device and computer equipment
CN111522811A (en) Database processing method and device, storage medium and terminal
CN110765094B (en) File creation method, device, system and storage medium
CN107391039B (en) Data object storage method and device
CN112256204B (en) Storage resource allocation method and device, storage node and storage medium
US20240205292A1 (en) Data processing method and apparatus, computer device, and computer-readable storage medium
CN110765190B (en) Automatic database cluster capacity expansion method and device and electronic equipment
CN111309260B (en) Data storage node selection method
US11249952B1 (en) Distributed storage of data identifiers
CN113326099A (en) Resource management method, device, electronic equipment and storage medium
WO2015062371A1 (en) Memory allocation method and device
CN114070740A (en) Node deployment method and device, electronic equipment and storage medium
CN108572993A (en) Db divides library hash methods, electronic equipment, storage medium and the device to data access
CN115454328A (en) Data storage method and device and electronic equipment

Legal Events

PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant