CN113778331B - Data processing method, master node and storage medium
- Publication number: CN113778331B (application CN202110925774.5A)
- Authority: CN (China)
- Legal status: Active
Classifications
- G06F3/0611—Improving I/O performance in relation to response time
- G06F16/1815—Journaling file systems
- G06F16/182—Distributed file systems
- G06F16/2365—Ensuring data consistency and integrity
- G06F16/27—Replication, distribution or synchronisation of data between databases or within a distributed database system; distributed database system architectures therefor
- G06F3/0655—Vertical data movement, i.e. input-output transfer; data movement between one or more hosts and one or more storage devices
- G06F3/067—Distributed or networked storage systems, e.g. storage area networks [SAN], network attached storage [NAS]
Abstract
The application provides a data processing method comprising the following steps: a master node in a cluster receives data to be written sent by a client and writes it; the master node sends the data to be written to at least one slave node in the cluster; if the master node receives first confirmation information sent by a first slave node among all the slave nodes, it confirms that the data to be written has been successfully written to the cluster, the first confirmation information indicating that the first slave node has successfully written the data to be written. The application also provides a master node and a storage medium. With the data processing method, master node and storage medium, the number of write operations per data write is reduced, which in turn reduces the latency caused by multiple write operations.
Description
Technical Field
The present application relates to the field of distributed storage technologies, and in particular, to a data processing method, a master node, and a storage medium.
Background
In a conventional log-based data model, writing data to disk takes two steps: the data and its operation information are first written as a log to a log partition, and the master node of the cluster replication group then sends this log to the other slave nodes in the group. Each slave node also writes the log to its log partition and sends a response to the master node. Once more than half of the members of the replication group have confirmed that the write can be performed, the master node sends write instructions to the slave nodes, which move the data from the log partition to the data partition.
The advantage of this model is that, while guaranteeing strong data consistency, incremental data recovery can be performed from the log after a system failure. However, under this model a single data write results in multiple write operations, which increases write latency.
Disclosure of Invention
The application provides a data processing method, a master node and a storage medium, intended to solve at least the above technical problems in the prior art.
The first aspect of the present application provides a data processing method, including:
a master node in a cluster receives data to be written sent by a client and writes it;
The master node sends the data to be written to at least one slave node in the cluster;
If the master node receives first confirmation information sent by a first slave node among all the slave nodes, it confirms that the data to be written has been successfully written to the cluster; the first confirmation information indicates that the first slave node has successfully written the data to be written.
In the above scheme, the master node and/or the first slave node writes the version identifier corresponding to the data to be written.
In the above solution, after the master node receives the first acknowledgement information sent by the first slave node in all the slave nodes, the method further includes:
the master node confirms that the data block corresponding to the data to be written is valid in the first slave node;
And/or the master node confirms that the data block corresponding to the data to be written is invalid in a second slave node, wherein the second slave node is a slave node except the first slave node in all the slave nodes.
In the above scheme, the method further comprises:
The master node acquires data blocks corresponding to the slave nodes;
And the master node determines the validity of each data block according to the attribute information of the data block corresponding to each slave node, and obtains a confirmation result.
In the above solution, if the confirmation result indicates that at least one of the data blocks is invalid, the method further includes:
the master node confirms the version identifiers corresponding to the data blocks in all nodes included in the cluster respectively;
Determining a first node corresponding to a data block corresponding to a version identifier with the latest update time;
if the first node is the master node, the master node determines a third slave node, and the version identifier of the data block corresponding to the third slave node is not the version identifier with the latest update time;
and sending the data corresponding to the version identifier with the latest update time to the third slave node.
In the above solution, if the confirmation result indicates that at least one of the data blocks is invalid, the method further includes:
the master node confirms the version identifiers corresponding to the data blocks in all nodes included in the cluster respectively;
Determining a first node corresponding to a data block corresponding to a version identifier with the latest update time;
if the first node is not the master node, the master node writes a data block corresponding to the version identifier with the latest update time;
the master node determines a third slave node, wherein the version identifier of the data block corresponding to the third slave node is not the version identifier with the latest update time;
and sending the data corresponding to the version identifier with the latest update time to the third slave node.
In the above solution, after the confirming that the data to be written is successfully written into the cluster, the method further includes:
The master node sends second confirmation information to the client; the second confirmation information is used for indicating that the cluster successfully writes the data to be written.
A second aspect of the present application provides a master node comprising:
The receiving unit is used for receiving and writing the data to be written sent by the client;
a sending unit, configured to send the data to be written to at least one slave node in the cluster;
A confirmation unit, configured to confirm that the data to be written has been successfully written to the cluster if first confirmation information sent by a first slave node among all the slave nodes is received; the first confirmation information indicates that the first slave node has successfully written the data to be written.
A third aspect of the present application provides an electronic apparatus, comprising: a memory for storing executable instructions; and the processor is used for realizing the method steps of the data processing method when executing the executable instructions stored in the memory.
A fourth aspect of the application provides a computer-readable storage medium having stored therein a computer program which, when executed by a processor, implements the method steps of the data processing method described above.
With the data processing method provided by the application, a master node in the cluster receives data to be written sent by a client and writes it; the master node sends the data to be written to at least one slave node in the cluster; if the master node receives first confirmation information sent by a first slave node among all the slave nodes, it confirms that the data to be written has been successfully written to the cluster, the first confirmation information indicating that the first slave node has successfully written the data to be written. In this way, the number of write operations per data write is reduced, which in turn reduces the write latency caused by multiple write operations.
Drawings
FIG. 1 is a schematic diagram showing a flow of writing data to a disk in the related art;
FIG. 2 is a schematic diagram of a first alternative flow chart of a data processing method according to an embodiment of the present application;
FIG. 3 is a schematic diagram of a second alternative flow chart of a data processing method according to an embodiment of the present application;
FIG. 4 is a schematic diagram of a third alternative flow chart of a data processing method according to an embodiment of the present application;
FIG. 5 is a schematic diagram of a fourth alternative flow chart of a data processing method according to an embodiment of the present application;
FIG. 6 is a schematic diagram of a fifth alternative flow chart of a data processing method according to an embodiment of the present application;
FIG. 7 is a schematic diagram of a sixth alternative flow chart of a data processing method according to an embodiment of the present application;
FIG. 8 is a schematic diagram illustrating an alternative configuration of a master node according to an embodiment of the present application;
Fig. 9 shows a schematic diagram of a hardware composition structure of a master node according to an embodiment of the present application.
Detailed Description
In order to make the objects, features and advantages of the present application more comprehensible, the technical solutions according to the embodiments of the present application will be clearly described in the following with reference to the accompanying drawings, and it is obvious that the described embodiments are only some embodiments of the present application, not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the application without making any inventive effort, are intended to be within the scope of the application.
In conventional log-based data models, when a user modifies data (which includes writing new data and modifying already-written data), the data and its operation information must first be written as a log to the local log partition of each node in the cluster, and the system must then wait for the replication group to acknowledge the operation (write operation) before the data can be written to the corresponding data partition. This scheme is commonly referred to as Write-Ahead Logging (WAL), and it introduces a write amplification problem.
The process of writing data to disk takes two steps: the data and operation information are first written as a log to the log partition, and the master node of the cluster replication group then sends this log to the other slave nodes in the group. Each slave node also writes the log to its log partition and sends a response to the master node. Once more than half of the members of the replication group have confirmed that the write can be performed, the master node sends write instructions to the slave nodes, which move the data from the log partition to the data partition.
Specifically, to ensure the consistency of multiple copies of the data, the common technical scheme is log-based. Before data is written to the local persistent storage medium of each node, the system must write the user's operation (Operation), version identifier (Version), actual data (Data), etc. in the form of a log (Log) to the log area of the master node (Primary Node), and the master node then transmits the log to each slave node (Replicated Node).
If more than half of the slave nodes receive the master node's log and respond, the master node can confirm that this write operation can be performed correctly. After that, the master node writes the user data to the data partition and sends a message confirming the write to each slave node. Fig. 1 is a schematic diagram of the flow of writing data to disk in the related art, described step by step below.
In step S101, the client sends data to be written to the master node.
The master node first writes the data to be written and the operation information, in the form of a log, to a log partition; the operation information may include at least one of the time of the write, the data source, and the at least one slave node to which the data is to be written.
Step S102, the master node sends the log to each slave node.
Step S103, each slave node writes the log.
Each slave node writes the log to its own log partition and, after confirming that the log can be written, sends confirmation information to the master node.
Step S104, the master node confirms the write information.
If the master node receives confirmation information from more than half of the slave nodes, it confirms that the data to be written can be written on the slave nodes and sends a write request to each slave node.
Step S105, writing data.
After receiving the write request, each slave node moves the data to be written from the log partition to the data partition, completing the write operation; and/or the master node moves the data to be written from its log partition to its data partition, completing the write operation.
As the above flow shows, a single data write is decomposed at the bottom layer into two write operations: the first to the log area (steps S101 to S103) and the second to the data partition (step S105). Although this guarantees strong consistency, it introduces write amplification: each piece of data is written twice at the bottom layer, which degrades system write performance and increases write latency.
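To make the write amplification concrete, the following is a minimal sketch of the log-then-apply write path described above. It is an interpretive illustration, not code from the patent; all class and function names are invented, and durability details (fsync, networking) are omitted.

```python
import time

class WalNode:
    """A toy replica that stages writes in a log partition before the data partition."""
    def __init__(self):
        self.log_partition = []    # staged log records (first disk write)
        self.data_partition = {}   # applied data (second disk write)

    def append_log(self, log):
        self.log_partition.append(log)
        return True                # acknowledgement back to the master

    def apply(self, log):
        # move the data from the log partition to the data partition
        self.data_partition[log["key"]] = log["data"]

def wal_write(master, slaves, key, data):
    log = {"op": "write", "time": time.time(), "key": key, "data": data}
    master.append_log(log)                                # disk write 1 on the master
    acks = sum(1 for s in slaves if s.append_log(log))    # disk write 1 on each slave
    if acks + 1 > (len(slaves) + 1) // 2:                 # more than half of the members confirmed
        for node in [master, *slaves]:
            node.apply(log)                               # disk write 2 on every node
        return True
    return False
```

Each logical write touches every node's disk twice, once for the log and once for the data partition; this double write is exactly the overhead the application's method removes.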
In some scenarios that are very sensitive to write latency, the schemes in the related art cannot meet the needs of the user.
Therefore, in view of the drawbacks of the current distributed storage technology, the present application provides a data processing method, which can overcome some or all of the drawbacks of the prior art.
Fig. 2 is a schematic diagram of a first alternative flow of a data processing method according to an embodiment of the present application, and will be described according to the steps.
In step S201, a master node in the cluster receives and writes data to be written sent by the client.
In some embodiments, the cluster includes one master node and at least one slave node; when there is data to be written, the client typically transfers it to the master node.
In some optional embodiments, the master node may further write the data to be written into a persistent storage medium corresponding to the master node; the master node can also write the version identifier corresponding to the data to be written into the persistent storage medium corresponding to the master node.
The version identifier corresponding to the data to be written can be used for representing the number of times of writing the data to be written, and the version identifier corresponding to the data to be written can monotonically increase along with the number of times of writing operation; the version identifier corresponding to the data to be written can be used for carrying out version comparison when the data is recovered so as to confirm the data corresponding to the latest version identifier; the version identifier corresponding to the data to be written can also be used for confirming the node which lacks the data corresponding to the latest version identifier.
Step S202, the master node sends the data to be written to at least one slave node in the cluster.
In some embodiments, the master node sends the data to be written, and/or a version identifier corresponding to the data to be written, to at least one slave node in the cluster; the master node may send the data to be written to all slave nodes in the cluster, or may send the data to be written to some slave nodes in the cluster.
Step S203, if the master node receives the first acknowledgement information sent by the first slave node in all the slave nodes, it is confirmed that the data to be written is successfully written into the cluster.
In some embodiments, after any slave node receives the data to be written, writing the data to be written into the slave node; if any slave node also receives the version identifier corresponding to the data to be written sent by the master node, any slave node can also write the version identifier corresponding to the data to be written into the slave node. The writing of the data to be written and/or the version identification into the slave node means that the data to be written and/or the version identification are written into a data partition of the slave node. And the slave node sends first confirmation information to the master node under the condition that the data to be written is successfully written.
In the implementation, if the master node receives first confirmation information sent by a first slave node in all the slave nodes (or a plurality of slave nodes allowing the data to be written), the data to be written is confirmed to be successfully written into the cluster; the first slave node may be any slave node in the cluster that successfully writes data to be written. Optionally, after the first slave node writes the data to be written, a data block (Chunk) corresponding to the data to be written is generated, and after the master node receives the first acknowledgement information sent by the first slave node, the data block corresponding to the data to be written in the first slave node is marked as valid.
And if the master node does not receive the first confirmation information sent by any slave node, confirming that the data to be written is not successfully written into the cluster, and marking all the data blocks corresponding to the data to be written in the slave nodes as invalid.
If the master node does not receive the confirmation information sent by the second slave node, confirming that the data block corresponding to the data to be written is invalid in the second slave node; the second slave node is a slave node except the first slave node among all the slave nodes.
The first confirmation information is used for indicating that the first slave node successfully writes the data to be written; the validity (valid or invalid) of the data block corresponding to the data to be written is used for subsequent data recovery. The first slave node may include one slave node, or may include at least one slave node.
In some embodiments, if the master node confirms that the data to be written is successfully written into the cluster, the master node may further send second confirmation information to the client; the second confirmation information is used for indicating that the cluster successfully writes the data to be written.
Thus, with the data processing method provided by this embodiment of the application, the master node in the cluster receives and writes the data to be written sent by the client; the master node sends the data to be written to at least one slave node in the cluster; if the master node receives first confirmation information sent by a first slave node among all the slave nodes, it confirms that the data to be written has been successfully written to the cluster, the first confirmation information indicating that the first slave node has successfully written the data. When writing data, the extra performance overhead of writing to a log partition is avoided, disk performance can be used to the fullest, and write efficiency improves, making the method suitable for scenarios with strict write-latency requirements.
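As a rough illustration of steps S201 to S203, the sketch below writes the data and its monotonically increasing version identifier directly to each node's data partition and confirms the write as soon as any one slave acknowledges. This is an interpretation under stated assumptions, not the patent's reference implementation; the Node and Master classes are invented for illustration.

```python
class Node:
    """A node that writes data and its version identifier straight to the data partition."""
    def __init__(self):
        self.data_partition = {}   # key -> (version, data), persisted directly, no log

    def write(self, key, version, data):
        self.data_partition[key] = (version, data)
        return True                # stands in for the first confirmation information

class Master(Node):
    def __init__(self, slaves):
        super().__init__()
        self.slaves = slaves
        self.version = 0           # monotonically increasing version identifier

    def handle_client_write(self, key, data):
        self.version += 1                                       # step S201: receive and write
        self.write(key, self.version, data)
        acks = [s.write(key, self.version, data) for s in self.slaves]   # step S202
        if any(acks):              # step S203: one successful slave suffices
            # here the master would mark acknowledging slaves' chunks valid,
            # and non-acknowledging slaves' chunks invalid, for later recovery
            return True            # second confirmation information to the client
        return False
```

The single `write` per node replaces the two-phase log-then-apply sequence of the background model.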
In some embodiments, for a slave node that does not successfully write the data to be written, the embodiment of the present application further provides a data processing method, where the data to be written may be recovered at each node.
Fig. 3 is a schematic diagram showing a second alternative flow of the data processing method according to the embodiment of the present application, and will be described according to the steps.
In step S301, the master node obtains a data block corresponding to each slave node.
In some embodiments, the master node obtains a data block corresponding to each slave node; and/or attribute information of the data block corresponding to each slave node.
The attribute information of a data block indicates whether the block is valid or invalid. Optionally, the attribute information may be stored as an extended attribute of the underlying file system (e.g., ext4, xfs, etc.). Specifically, the master node can mark the validity of a data block by setting the value of a key (e.g., isValid) in the data block's extended attributes: a value of 1 indicates that the data block is valid, and a value of 0 indicates that it is invalid.
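On Linux, this kind of extended-attribute marking can be done with os.setxattr/os.getxattr. The following is a minimal sketch assuming an ext4/xfs file system; the "user.isValid" key follows the patent's isValid example, with the "user." namespace prefix required for unprivileged extended attributes.

```python
import os

def mark_chunk(path: str, valid: bool) -> None:
    # b"1" marks the chunk file valid, b"0" marks it invalid
    os.setxattr(path, "user.isValid", b"1" if valid else b"0")

def chunk_is_valid(path: str) -> bool:
    try:
        return os.getxattr(path, "user.isValid") == b"1"
    except OSError:
        return False               # attribute never set: treat the chunk as invalid
```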
Step S302, the master node determines the validity of each data block according to the attribute information of the data block corresponding to each slave node, and obtains a confirmation result.
In some embodiments, the master node confirms validity of each data block according to attribute information of the data block corresponding to each slave node, and obtains a confirmation result.
In some optional embodiments, after the master node confirms the failed data block, the node corresponding to the failed data block may also be added to the queue to be recovered.
In some embodiments, if the validation result characterizes at least one of the data blocks as invalid, the method may further include:
step S303, confirming the version identifiers corresponding to the data blocks in all the nodes included in the cluster respectively.
In some embodiments, all nodes include a master node and all slave nodes, and the master node confirms version identifiers respectively corresponding to the data blocks in all nodes included in the cluster.
In some embodiments, when a node writes the data to be written, it also writes the corresponding version identifier at the same time. Whether or not the write succeeds, the node generates a corresponding data block; the difference is that in the nodes where the write succeeded the data blocks are identical and valid, while in a node where the write did not succeed the data block is invalid.
In some embodiments, the version identification of the data block corresponding to the same data to be written may be different at different nodes. For example, for the data to be written of the first version identifier, the slave node a successfully writes, and the slave node B does not successfully write, at this time, the data block corresponding to the slave node a is valid, and the data block corresponding to the slave node B is invalid; for the data to be written of the second version identifier, the slave node A does not successfully write, and the slave node B successfully writes, at the moment, the data blocks corresponding to the slave node A and the slave node B are valid, except that the version identifier corresponding to the slave node A is the first version identifier and the version identifier corresponding to the slave node B is the second version identifier.
Step S304, determining a first node corresponding to a data block corresponding to the version identification with the latest update time.
In some embodiments, the master node determines a first node corresponding to a data block corresponding to a version identifier with a latest update time.
Taking the version identifier as a number as an example, if the version identifier monotonically increases, the master node can confirm that the data block with the largest version identifier is the data block with the latest update time.
In step S305, it is determined whether the first node is a master node.
In some embodiments, the master node determines whether the first node is a master node; if the first node is the master node, executing step S306; if the first node is not the master node, step S307 is performed.
In step S306, the master node determines a third slave node.
In some embodiments, if the first node is the master node, the master node determines a third slave node, where the version identifier of the data block corresponding to the third slave node is not the version identifier with the latest update time; and sending the data corresponding to the version identifier with the latest update time to the third slave node.
In some embodiments, after the slave node successfully writes data, acknowledgement information may also be sent to the master node, where the master node marks validity of the data block corresponding to the version identifier with the latest update time in each slave node according to the acknowledgement information.
In step S307, the master node writes the data corresponding to the version identifier with the latest update time.
In some embodiments, the master node writes data corresponding to the version identifier with the latest update time and/or the version identifier with the latest update time.
In some embodiments, the master node may further determine a third slave node that does not include the data block corresponding to the version identifier with the latest update time; and sending the data corresponding to the version identifier with the latest update time to the third slave node. And the version identifier of the data block corresponding to the third slave node is not the version identifier with the latest updating time.
In some embodiments, after the slave node successfully writes data, acknowledgement information may also be sent to the master node, where the master node marks validity of the data block corresponding to the version identifier with the latest update time in each slave node according to the acknowledgement information.
In some alternative embodiments, steps S301 to S307 may be performed within a first time threshold range (e.g., early morning or idle time) to avoid preempting disk bandwidth and central processor resources. Optionally, the execution time and/or frequency of steps S301 to S307 may also be set based on the priority of the data to be written, so as to reduce the influence of steps S301 to S307 on the normal traffic of the client.
For example, recovery of high-priority data to be written may be performed even outside the first time threshold range, and at a higher frequency than for low-priority data, so that when a cluster failure leaves the written data inconsistent, programs corresponding to high-priority data are not prevented from running normally.
Thus, with the data processing method provided by this embodiment of the application, silent scanning is used to determine whether data needs to be recovered, the nodes that need recovery are identified, and data recovery is performed. When a cluster failure or similar cause leaves the written data inconsistent, full recovery combined with version comparison guarantees the eventual consistency of the data in the cluster.
Fig. 4 is a schematic diagram showing a third alternative flow of the data processing method according to the embodiment of the present application, and will be described according to the steps.
In step S401, the client sends data to be written to a master node in the cluster.
In some optional embodiments, the master node may further write the data to be written into a persistent storage medium corresponding to the master node, and/or the master node may further write a version identifier corresponding to the data to be written into the persistent storage medium corresponding to the master node.
The version identifier corresponding to the data to be written can be used for representing the number of times of writing the data to be written, and the version identifier corresponding to the data to be written can monotonically increase along with the number of times of writing operation; the version identifier corresponding to the data to be written can be used for carrying out version comparison when the data is recovered so as to confirm the data corresponding to the latest version identifier; the version identifier corresponding to the data to be written can also be used for confirming the node which lacks the data corresponding to the latest version identifier.
Step S402, the master node sends the data to be written to at least one slave node in the cluster.
In some embodiments, the master node sends the data to be written, and/or a version identifier corresponding to the data to be written, to at least one slave node in the cluster; the master node may send the data to be written to all slave nodes in the cluster, or may send the data to be written to some slave nodes in the cluster.
Step S403, the slave node transmits the first acknowledgement information to the master node.
In some embodiments, after any slave node receives the data to be written, writing the data to be written into the slave node; if any slave node also receives the version identifier corresponding to the data to be written sent by the master node, any slave node can also write the version identifier corresponding to the data to be written into the slave node. The writing of the data to be written and/or the version identification into the slave node means that the data to be written and/or the version identification are written into a data partition of the slave node. And the slave node sends first confirmation information to the master node under the condition that the data to be written is successfully written.
Thus, with the data processing method provided by this embodiment of the application, the master node in the cluster receives and writes the data to be written sent by the client; the master node sends the data to be written to at least one slave node in the cluster; if the master node receives first confirmation information sent by a first slave node among all the slave nodes, it confirms that the data to be written has been successfully written to the cluster, the first confirmation information indicating that the first slave node has successfully written the data. When data is written, the extra performance overhead of log writing is avoided, disk performance can be used to the fullest, and write efficiency improves, making the method suitable for scenarios with strict write-latency requirements.
Fig. 5 is a schematic diagram showing a fourth alternative flow chart of the data processing method according to the embodiment of the present application, and will be described according to the steps.
In step S501, the client sends data to be written to a master node in the cluster.
In step S502, the master node writes the data to be written.
In some optional embodiments, the master node may further write the data to be written into a persistent storage medium corresponding to the master node, and the master node may further write a version identifier corresponding to the data to be written into the persistent storage medium corresponding to the master node.
The version identifier corresponding to the data to be written can be used for representing the number of times of writing the data to be written, and the version identifier corresponding to the data to be written can monotonically increase along with the number of times of writing operation; the version identifier corresponding to the data to be written can be used for carrying out version comparison when the data is recovered so as to confirm the data corresponding to the latest version identifier; the version identifier corresponding to the data to be written can also be used for confirming the node which lacks the data corresponding to the latest version identifier.
In some alternative embodiments, if the master node fails to write the data to be written, the master node may further send a retransmission request to the client to request the client to resend the data to be written to the master node.
In step S503, the master node transmits data to be written to at least one slave node.
Step S504, the slave node writes the data to be written.
In some embodiments, after receiving the data to be written and the version identifier corresponding to the data to be written, the at least one slave node writes the data to be written and the version identifier corresponding to the data to be written into the slave node.
In step S505, when the slave node succeeds in writing, the first acknowledgement information is sent to the master node.
In step S506, the master node sends the second acknowledgement information to the client.
In some embodiments, if the master node confirms that the data to be written is successfully written into the cluster, the master node may further send second confirmation information to the client; the second confirmation information is used for indicating that the cluster successfully writes the data to be written.
In the implementation, if the master node receives first confirmation information sent by a first slave node in all the slave nodes (or a plurality of slave nodes allowing the data to be written), the data to be written is confirmed to be successfully written into the cluster; optionally, after the first slave node writes the data to be written, a corresponding data block is generated, and after the master node receives the first acknowledgement information sent by the first slave node, the data block corresponding to the data to be written in the first slave node is marked as valid.
Or if the master node does not receive the first confirmation information sent by any slave node, confirming that the data to be written is not successfully written into the cluster, and marking all the data blocks corresponding to the data to be written in the slave nodes as invalid.
In particular implementations, the master node may mark the validity of a data block by setting the value of a key (e.g., isValid) in the data block's extended attributes: a value of 1 indicates that the data block (Chunk) is valid, and a value of 0 indicates that it is invalid. Optionally, this can be implemented with the extended-attribute mechanism of the underlying file system (such as ext4, xfs, etc.).
The first confirmation information is used for indicating that the first slave node successfully writes the data to be written; the validity (valid or invalid) of the data block corresponding to the data to be written is used for subsequent data recovery. The first slave node may include one slave node, or may include two or more slave nodes.
In some embodiments, in order to detect failed Chunks in the storage system in a timely manner, a timed task may be set up to perform a Chunk silent scan (scrub): the Chunks corresponding to each node are scanned according to the schedule, their extended attributes are read, and each Chunk is determined to be a valid Chunk or a failed Chunk. In other embodiments, the master node may also read a Chunk's extended attributes to determine its validity when performing a read operation.
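A minimal sketch of such a timed silent scan follows. The scheduling interval, directory layout and ".chunk" file suffix are assumptions for illustration; the patent only requires that chunks be scanned on a timer and their extended attributes read.

```python
import glob
import os
import threading

def chunk_is_valid(path: str) -> bool:
    # read the isValid extended attribute; a missing attribute counts as invalid
    try:
        return os.getxattr(path, "user.isValid") == b"1"
    except OSError:
        return False

def scrub(chunk_dir: str, to_recover: list) -> None:
    """Scan every chunk file, read its validity attribute, and queue failures."""
    for path in glob.glob(os.path.join(chunk_dir, "*.chunk")):
        if not chunk_is_valid(path):
            to_recover.append(path)        # the failed chunk joins the queue to be recovered
    # re-arm the timer, e.g. once a day during idle hours (interval is an assumption)
    threading.Timer(24 * 3600, scrub, args=(chunk_dir, to_recover)).start()
```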
Fig. 6 shows a fifth alternative flow chart of the data processing method according to the embodiment of the present application, which will be described according to the steps.
In step S601, the master node acquires a data block list.
In some embodiments, the data block list includes information of data blocks corresponding to each slave node corresponding to the master node in the cluster. The information of the data block may include a node identifier of a node corresponding to the data block, and may further include attribute information of the data block, for example, the data block is valid or invalid.
In step S602, the master node reads the value of each data block.
In some embodiments, the master node reads the "isValid" value in the metadata corresponding to each data block and determines each block's validity from that value.
In other embodiments, the master node obtains the data block corresponding to each slave node, and the attribute information of the data block corresponding to each slave node.
Wherein the attribute information of the data block includes that the data block is valid or that the data block is invalid.
In some optional embodiments, the master node further determines whether values of all data blocks have been read, and if all the values have been read, ends the flow; if not all the readings are taken, step S602 is repeatedly performed.
Step S603, adding the node corresponding to the failed data block into the queue to be restored.
In some embodiments, after the master node confirms the failed data block, the node corresponding to the failed data block is added to the queue to be recovered.
In some embodiments, if a data block is determined to be invalid, a data recovery operation may need to be performed. Fig. 7 is a schematic diagram of a sixth alternative flow chart of a data processing method according to an embodiment of the present application, and will be described according to the steps.
Step S801, the master node confirms version identifiers corresponding to the data blocks in all the nodes included in the cluster, respectively.
In some embodiments, the master node confirms version identifiers respectively corresponding to the data blocks in all nodes included in the cluster; all the nodes comprise the master node and at least one slave node corresponding to the master node.
In some embodiments, when a node writes the data to be written, it also writes the corresponding version identifier at the same time. Whether or not the write succeeds, the node generates a corresponding data block; the difference is that in a node where the write succeeded the data block is valid, and these data blocks are identical across all nodes where the write succeeded, while in a node where the write did not succeed the data block is invalid.
In some embodiments, the version identification of the data block corresponding to the same data to be written may be different at different nodes. For example, for the data to be written of the first version identifier, the slave node a successfully writes, and the slave node B does not successfully write, at this time, the data block corresponding to the slave node a is valid, and the data block corresponding to the slave node B is invalid; for the data to be written of the second version identifier, the slave node A does not successfully write, and the slave node B successfully writes, at the moment, the data blocks corresponding to the slave node A and the slave node B are valid, except that the version identifier corresponding to the slave node A is the first version identifier and the version identifier corresponding to the slave node B is the second version identifier.
Step S802, determining the data block corresponding to the version identifier with the latest update time.
In some embodiments, taking the version identifier as a number as an example, if the version identifier monotonically increases with the number of writing times, the master node may confirm that the data block with the largest version identifier is the data block with the latest update time.
In some alternative embodiments, the master node determines a first node corresponding to a data block corresponding to a version identifier with a latest update time.
In step S803, it is confirmed whether the first node is a master node.
In some embodiments, the master node determines whether the first node is a master node; if the first node is the master node, executing step S806; if the first node is not the master node, step S804 is performed.
In step S804, the master node confirms the first node.
In some embodiments, the master node determines, based on the version identifier, a first node corresponding to a data block with a latest update time.
In step S805, the master node writes data corresponding to the version identifier with the latest update time.
In some embodiments, the master node writes data corresponding to the version identifier with the latest update time, and the version identifier with the latest update time.
In step S806, the master node determines a third slave node, where the version identifier of the data block corresponding to the third slave node is not the version identifier with the latest update time.
In some embodiments, the master node determines a third slave node according to the version identifier of the data block corresponding to each slave node, where the version identifier of the data block corresponding to the third slave node is not the version identifier with the latest update time.
Step S807, transmitting data corresponding to the version identifier with the latest update time to the third slave node.
In some embodiments, the master node may further determine a third slave node that does not include the data block corresponding to the version identifier with the latest update time; and sending the data corresponding to the version identifier with the latest update time to the third slave node.
In some embodiments, after the slave node successfully writes data, acknowledgement information may also be sent to the master node, where the master node marks validity of the data block corresponding to the version identifier with the latest update time in each slave node according to the acknowledgement information.
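Pulling steps S801 to S807 together: recovery reduces to finding the copy with the largest version identifier and pushing its data to every node that lags behind. The sketch below is an interpretive illustration; the Replica structure is invented, and the final re-marking of chunk validity is noted in a comment rather than implemented.

```python
from dataclasses import dataclass

@dataclass
class Replica:
    node: str       # node identifier ("master", "slave-A", ...)
    version: int    # monotonically increasing version identifier
    data: bytes

def recover(replicas: list[Replica]) -> list[str]:
    """Bring lagging replicas up to the latest version; return the repaired nodes."""
    latest = max(replicas, key=lambda r: r.version)   # the newest update wins (S802)
    repaired = []
    for r in replicas:
        if r.version < latest.version:                # a "third slave node" (S806)
            r.data, r.version = latest.data, latest.version   # push latest data (S807)
            repaired.append(r.node)
    # after each repaired node acknowledges, the master would mark its chunk valid again
    return repaired
```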
Therefore, with the data processing method provided by this embodiment of the application, the write flow is simplified while eventual consistency is still guaranteed: no WAL operation is needed when a user writes data, the extra performance cost of log writing is avoided, and the data is written directly to the persistent storage medium; when a system failure occurs, eventual consistency of the data is ensured through full recovery. In a distributed storage environment this model can maximize disk performance and improve write efficiency, suiting scenarios with very strict write-latency requirements. Moreover, the log-based replica model has complex logic, its state transitions are hard to follow, and it places high demands on those writing the code, whereas the data processing method provided by this embodiment has simple, clear logic and is easy to translate into a practical implementation.
Fig. 8 is a schematic diagram of an alternative structure of a master node according to an embodiment of the present application, which will be described according to various parts.
In some embodiments, master node 900 includes: a receiving unit 901, a transmitting unit 902, and an acknowledgement unit 903. The master node 900 may be a data processing device corresponding to a master node in a cluster, a data processing device corresponding to a cluster, or a data processing device outside the cluster.
The receiving unit 901 is configured to receive and write data to be written sent by a client;
the sending unit 902 is configured to send the data to be written to at least one slave node in the cluster;
The confirmation unit 903 is configured to confirm that the data to be written is successfully written into the cluster if first confirmation information sent by a first slave node in all the slave nodes is received; the first confirmation information is used for indicating the first slave node to successfully write the data to be written.
In some embodiments, the master node 900 may also include a write unit 904.
The writing unit 904 is configured to write a version identifier corresponding to the data to be written.
The confirmation unit 903 is specifically configured to confirm that the data block corresponding to the data to be written is valid at the first slave node; and/or confirming that the data block corresponding to the data to be written is invalid in a second slave node, wherein the second slave node is a slave node except the first slave node in all the slave nodes.
The confirmation unit 903 is further configured to obtain a data block corresponding to each slave node; and the master node determines the validity of each data block according to the attribute information of the data block corresponding to each slave node, and obtains a confirmation result.
In some embodiments, if the validation result indicates that at least one of the data blocks fails, the validation unit 903 is further configured to validate version identifiers corresponding to the data blocks in all nodes included in the cluster, respectively; determining a first node corresponding to a data block corresponding to a version identifier with the latest update time; if the first node is the master node, determining a third slave node, wherein the version identifier of the data block corresponding to the third slave node is not the version identifier with the latest update time; and sending the data corresponding to the version identifier with the latest update time to the third slave node.
In some embodiments, if the validation result indicates that at least one of the data blocks fails, the validation unit 903 is further configured to validate version identifiers corresponding to the data blocks in all nodes included in the cluster, respectively; determining a first node corresponding to a data block corresponding to a version identifier with the latest update time; if the first node is not the master node, the writing unit 904 is further configured to write a data block corresponding to the version identifier with the latest update time; the master node determines a third slave node, wherein the version identifier of the data block corresponding to the third slave node is not the version identifier with the latest update time; and sending the data corresponding to the version identifier with the latest update time to the third slave node.
The sending unit 902 is further configured to send second confirmation information to the client after confirming that the data to be written is successfully written into the cluster; the second confirmation information is used for indicating that the cluster successfully writes the data to be written.
Fig. 9 is a schematic diagram of the hardware composition of a master node according to an embodiment of the present application. The master node includes: at least one processor 701, a memory 702, and at least one network interface 704. The various components in the master node are coupled together by a bus system 705, which is used to enable communication among these components. In addition to a data bus, the bus system 705 includes a power bus, a control bus, and a status signal bus; for clarity of illustration, however, the various buses are all labeled as bus system 705 in Fig. 9.
It is to be appreciated that the memory 702 can be either volatile or nonvolatile memory, and can include both. The nonvolatile memory may be read-only memory (ROM), programmable read-only memory (PROM), erasable programmable read-only memory (EPROM), electrically erasable programmable read-only memory (EEPROM), ferromagnetic random access memory (FRAM), flash memory, magnetic surface memory, an optical disk, or a compact disc read-only memory (CD-ROM); the magnetic surface memory may be disk or tape storage. The volatile memory may be random access memory (RAM), which serves as an external cache. By way of example and not limitation, many forms of RAM are available, such as static RAM (SRAM), synchronous static RAM (SSRAM), dynamic RAM (DRAM), synchronous dynamic RAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), SyncLink DRAM (SLDRAM), and direct Rambus RAM (DRRAM). The memory 702 described in embodiments of the present application is intended to comprise, without being limited to, these and any other suitable types of memory.
The memory 702 in embodiments of the present application is used to store various types of data to support the operation of the master node. Examples of such data include any computer program for operating on the master node, such as the application 722. A program implementing the method of the embodiments of the present application may be included in the application 722.
The method disclosed in the embodiments of the present application may be applied to, or implemented by, the processor 701. The processor 701 may be an integrated circuit chip with signal processing capability. In implementation, the steps of the method may be completed by integrated logic circuits in hardware or by software instructions in the processor 701. The processor 701 may be a general-purpose processor, a digital signal processor (DSP), another programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or the like. The processor 701 can implement or execute the methods, steps, and logic blocks disclosed in the embodiments of the present application. A general-purpose processor may be a microprocessor or any conventional processor. The steps of the method disclosed in the embodiments of the present application may be executed directly by a hardware decoding processor, or by a combination of hardware and software modules in a decoding processor. The software modules may be located in a storage medium located in the memory 702; the processor 701 reads the information in the memory 702 and, in combination with its hardware, completes the steps of the method described above.
In an exemplary embodiment, the master node may be implemented by one or more application-specific integrated circuits (ASICs), DSPs, programmable logic devices (PLDs), complex programmable logic devices (CPLDs), field-programmable gate arrays (FPGAs), general-purpose processors, controllers, micro-controller units (MCUs), microprocessor units (MPUs), or other electronic components for performing the aforementioned methods.
In addition to the methods and apparatus described above, embodiments of the present application may also take the form of a computer program product comprising computer program instructions which, when executed by a processor, cause the processor to perform the steps of the method according to the various embodiments of the present application described in the "exemplary methods" section of this specification.
The computer program product may carry program code for performing the operations of embodiments of the present application, written in any combination of one or more programming languages, including object-oriented programming languages such as Java and C++, as well as conventional procedural programming languages such as the "C" language or similar languages. The program code may execute entirely on the user's computing device, partly on the user's device as a stand-alone software package, partly on the user's computing device and partly on a remote computing device, or entirely on the remote computing device or server.
Furthermore, embodiments of the present application may also take the form of a computer-readable storage medium having stored thereon computer program instructions which, when executed by a processor, cause the processor to perform the steps of the method according to the various embodiments of the present application described in the "exemplary methods" section above.
The computer-readable storage medium may employ any combination of one or more readable media. The readable medium may be a readable signal medium or a readable storage medium. The readable storage medium may be, for example, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples (a non-exhaustive list) of the readable storage medium include: an electrical connection having one or more wires, a portable disk, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
The basic principles of the present application have been described above in connection with specific embodiments. It should be noted, however, that the advantages, benefits, and effects mentioned in the present application are merely examples and are not limiting; they should not be construed as necessarily possessed by every embodiment of the application. Furthermore, the specific details disclosed above are provided only for the purposes of illustration and ease of understanding, and the application is not limited to practice with these specific details.
The block diagrams of the devices, apparatuses, and systems referred to in the present application are only illustrative examples and are not intended to require or imply that the connections, arrangements, and configurations must be made in the manner shown in the block diagrams. As will be appreciated by those skilled in the art, these devices, apparatuses, and systems may be connected, arranged, and configured in any manner. Words such as "including", "comprising", and "having" are open-ended, mean "including but not limited to", and may be used interchangeably with that phrase. The terms "or" and "and" as used herein refer to, and may be used interchangeably with, the term "and/or", unless the context clearly indicates otherwise. The term "such as" as used herein refers to, and may be used interchangeably with, the phrase "such as, but not limited to".
It is also noted that, in the apparatuses, devices, and methods of the present application, the components or steps may be decomposed and/or recombined. Such decomposition and/or recombination should be regarded as equivalent solutions of the present application.
The previous description of the disclosed aspects is provided to enable any person skilled in the art to make or use the present application. Various modifications to these aspects will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other aspects without departing from the scope of the application. Thus, the present application is not intended to be limited to the aspects shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.
The foregoing description has been presented for purposes of illustration and description. Furthermore, this description is not intended to limit embodiments of the application to the form disclosed herein. Although a number of example aspects and embodiments have been discussed above, a person of ordinary skill in the art will recognize certain variations, modifications, alterations, additions, and subcombinations thereof.
Claims (6)
1. A method of data processing, the method comprising:
a master node in a cluster receives, and writes, data to be written sent by a client;
the master node sends the data to be written to at least one slave node in the cluster;
if the master node receives first confirmation information sent by a first slave node among all the slave nodes, the master node confirms that the data to be written has been successfully written into the cluster, the first confirmation information being used for indicating that the first slave node has successfully written the data to be written;
wherein, after the master node receives the first confirmation information sent by the first slave node among all the slave nodes, the method further comprises: the master node confirms that a data block corresponding to the data to be written is valid at the first slave node; and/or the master node confirms that the data block corresponding to the data to be written is invalid at a second slave node, the second slave node being a slave node other than the first slave node among all the slave nodes;
the method further comprises: the master node acquires the data blocks corresponding to the respective slave nodes; and the master node determines the validity of each data block according to attribute information of the data block corresponding to each slave node, to obtain a confirmation result;
wherein, if the confirmation result characterizes at least one of the data blocks as invalid, the method further comprises: the master node confirms the version identifiers respectively corresponding to the data blocks in all nodes included in the cluster, and determines a first node corresponding to the data block whose version identifier has the latest update time; if the first node is the master node, the master node determines a third slave node, the version identifier of the data block corresponding to the third slave node not being the version identifier with the latest update time, and sends the data corresponding to the version identifier with the latest update time to the third slave node; if the first node is not the master node, the master node writes the data corresponding to the version identifier with the latest update time, determines a third slave node that does not include the data block corresponding to the version identifier with the latest update time (the version identifier of the data block corresponding to the third slave node not being the version identifier with the latest update time), and sends the data corresponding to the version identifier with the latest update time to the third slave node.
2. The method of claim 1, wherein the master node and/or the first slave node writes a version identifier corresponding to the data to be written.
3. The method of claim 1, wherein after confirming that the data to be written has been successfully written into the cluster, the method further comprises:
the master node sends second confirmation information to the client, the second confirmation information being used for indicating that the cluster has successfully written the data to be written.
4. A master node, the master node comprising:
a receiving unit, configured to receive and write data to be written sent by a client;
a sending unit, configured to send the data to be written to at least one slave node in a cluster;
a confirmation unit, configured to confirm that the data to be written has been successfully written into the cluster if first confirmation information sent by a first slave node among all the slave nodes is received, the first confirmation information being used for indicating that the first slave node has successfully written the data to be written;
a writing unit, configured to write a version identifier corresponding to the data to be written;
wherein the confirmation unit is specifically configured to confirm that a data block corresponding to the data to be written is valid at the first slave node; and/or to confirm that the data block corresponding to the data to be written is invalid at a second slave node, the second slave node being a slave node other than the first slave node among all the slave nodes;
the confirmation unit is further configured to acquire the data blocks corresponding to the respective slave nodes, and to determine the validity of each data block according to attribute information of the data block corresponding to each slave node, to obtain a confirmation result;
if the confirmation result characterizes at least one of the data blocks as invalid, the confirmation unit is further configured to confirm the version identifiers respectively corresponding to the data blocks in all nodes included in the cluster, and to determine a first node corresponding to the data block whose version identifier has the latest update time; if the first node is the master node, to determine a third slave node, the version identifier of the data block corresponding to the third slave node not being the version identifier with the latest update time, and to send the data corresponding to the version identifier with the latest update time to the third slave node;
if the first node is not the master node, the writing unit is further configured to write the data corresponding to the version identifier with the latest update time; the master node determines a third slave node, the version identifier of the data block corresponding to the third slave node not being the version identifier with the latest update time, and sends the data corresponding to the version identifier with the latest update time to the third slave node.
5. An electronic device, comprising:
a memory for storing executable instructions;
a processor, configured to carry out the method steps of any one of claims 1 to 3 when executing the executable instructions stored in the memory.
6. A computer readable and writable medium, wherein the computer readable and writable medium has a computer program written therein which, when executed by a processor, implements the method steps of any one of claims 1 to 3.
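To make the claimed flow concrete, the following is a minimal, illustrative Python sketch of the write path of claims 1 and 3: the master persists the data, fans it out to the slaves, treats the first slave confirmation as a successful cluster write, marks the block valid on that slave and invalid on the others, and returns a second confirmation toward the client. All class and method names (MasterNode, SlaveNode, handle_client_write, and so on) are hypothetical assumptions of this sketch; the patent discloses no source code, and a real system would fan out asynchronously rather than in the sequential loop shown here.

```python
class SlaveNode:
    """Hypothetical slave replica; holds data blocks keyed by block id."""

    def __init__(self, node_id):
        self.node_id = node_id
        self.blocks = {}  # block_id -> {"data", "version", "valid"}

    def write(self, block_id, data, version):
        # Persist the block; returning True models the
        # "first confirmation information" sent back to the master.
        self.blocks[block_id] = {"data": data, "version": version, "valid": False}
        return True


class MasterNode:
    def __init__(self, slaves, node_id="master"):
        self.node_id = node_id
        self.slaves = slaves
        self.blocks = {}

    def handle_client_write(self, block_id, data, version):
        # The master receives and writes the data to be written.
        self.blocks[block_id] = {"data": data, "version": version, "valid": True}

        # The master sends the data to every slave in the cluster;
        # the first successful confirmation already completes the write.
        first_slave = None
        for slave in self.slaves:
            if slave.write(block_id, data, version) and first_slave is None:
                first_slave = slave

        if first_slave is None:
            return False  # no confirmation: the cluster write is not acknowledged

        # Mark the block valid on the first-confirming slave and invalid
        # on every other (second) slave, per the "wherein" clause of claim 1.
        for slave in self.slaves:
            entry = slave.blocks.get(block_id)
            if entry is not None:
                entry["valid"] = slave is first_slave

        # Returning True models the "second confirmation information"
        # sent back to the client (claim 3).
        return True


# Hypothetical usage:
master = MasterNode([SlaveNode("s1"), SlaveNode("s2")])
assert master.handle_client_write("blk-1", b"payload", version=1)
```

The design point worth noting is that a single confirmation, not a quorum, completes the write, which is what lets the claimed method acknowledge the client quickly; the validity flags then record which replica is currently trusted.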
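The last limitation of claim 1, the repair path, can be sketched the same way: when the confirmation result shows an invalid data block, the master collects each node's version identifier for that block, finds the node holding the latest version, pulls that version onto itself if it is stale, and pushes it to every "third slave node" whose version is out of date. This sketch reuses the classes above and assumes version identifiers are totally ordered (for example, monotonically increasing integers or timestamps); repair_block and the field names are hypothetical.

```python
def repair_block(master, block_id):
    """Sketch of the version-based repair in claim 1 (uses the classes above)."""
    nodes = [master] + list(master.slaves)

    # Confirm the version identifier of the block on every node in the cluster.
    versions = {node.node_id: node.blocks[block_id]["version"]
                for node in nodes if block_id in node.blocks}
    if not versions:
        return  # no node holds the block; nothing to repair

    # "First node": the node whose copy carries the latest version identifier.
    latest_id = max(versions, key=versions.get)
    latest_version = versions[latest_id]
    first_node = next(node for node in nodes if node.node_id == latest_id)
    latest_data = first_node.blocks[block_id]["data"]

    # If the first node is not the master, the master first writes the
    # latest data itself before propagating it.
    if first_node is not master:
        master.blocks[block_id] = {"data": latest_data,
                                   "version": latest_version, "valid": True}

    # Push the latest data to every "third slave node": any slave whose
    # copy is missing or whose version identifier is not the latest.
    for slave in master.slaves:
        entry = slave.blocks.get(block_id)
        if entry is None or entry["version"] != latest_version:
            slave.write(block_id, latest_data, latest_version)
            slave.blocks[block_id]["valid"] = True
```

In this sketch the master both repairs itself and drives the slaves, matching the claim's two branches (the first node is, or is not, the master); a real implementation would additionally have to define what "latest update time" means under concurrent writes, which the claim delegates to the version identifier scheme.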
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110925774.5A CN113778331B (en) | 2021-08-12 | 2021-08-12 | Data processing method, master node and storage medium |
Publications (2)
Publication Number | Publication Date |
---|---|
CN113778331A (en) | 2021-12-10 |
CN113778331B (en) | 2024-06-07 |
Family
ID=78837815
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202110925774.5A Active CN113778331B (en) | 2021-08-12 | 2021-08-12 | Data processing method, master node and storage medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN113778331B (en) |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN116910158A (en) * | 2023-08-17 | 2023-10-20 | 深圳计算科学研究院 | Data processing, query methods, devices, equipment and media based on replication groups |
Citations (14)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102937964A (en) * | 2012-09-28 | 2013-02-20 | 无锡江南计算技术研究所 | Intelligent data service method based on distributed system |
CN105426439A (en) * | 2015-11-05 | 2016-03-23 | 腾讯科技(深圳)有限公司 | Metadata processing method and device |
CN106569729A (en) * | 2015-10-09 | 2017-04-19 | 阿里巴巴集团控股有限公司 | Method and device for writing in data in distributed system |
CN107295080A (en) * | 2017-06-19 | 2017-10-24 | 北京百度网讯科技有限公司 | Date storage method and server applied to distributed server cluster |
CN110231915A (en) * | 2019-05-29 | 2019-09-13 | 南昌大学 | Data managing method, system, device, computer equipment and storage medium |
WO2019184164A1 (en) * | 2018-03-30 | 2019-10-03 | 平安科技(深圳)有限公司 | Method for automatically deploying kubernetes worker node, device, terminal apparatus, and readable storage medium |
CN110825309A (en) * | 2018-08-08 | 2020-02-21 | 华为技术有限公司 | Data reading method, device and system and distributed system |
CN110933137A (en) * | 2019-10-31 | 2020-03-27 | 北京浪潮数据技术有限公司 | Data synchronization method, system, equipment and readable storage medium |
CN111368002A (en) * | 2020-03-05 | 2020-07-03 | 广东小天才科技有限公司 | Data processing method, system, computer equipment and storage medium |
CN112134887A (en) * | 2020-09-23 | 2020-12-25 | 哈尔滨海能达科技有限公司 | Data synchronization method and device for nodes in distributed cluster |
CN112148798A (en) * | 2020-10-10 | 2020-12-29 | 腾讯科技(深圳)有限公司 | Data processing method and device applied to distributed system |
CN112486932A (en) * | 2020-12-09 | 2021-03-12 | 北京金山云网络技术有限公司 | Data concurrent writing method and distributed data concurrent writing system |
CN113220693A (en) * | 2021-06-02 | 2021-08-06 | 北京字节跳动网络技术有限公司 | Computing storage separation system, data access method, medium and electronic device thereof |
CN113220236A (en) * | 2021-05-17 | 2021-08-06 | 北京青云科技股份有限公司 | Data management method, system and equipment |
Family Cites Families (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
TWI235299B (en) * | 2004-04-22 | 2005-07-01 | Univ Nat Cheng Kung | Method for providing application cluster service with fault-detection and failure-recovery capabilities |
KR101335101B1 (en) * | 2009-03-19 | 2013-12-03 | 가부시키가이샤 무라쿠모 | Method and system for data replication management |
US8977703B2 (en) * | 2011-08-08 | 2015-03-10 | Adobe Systems Incorporated | Clustering without shared storage |
US10761946B2 (en) * | 2017-02-10 | 2020-09-01 | Sap Se | Transaction commit protocol with recoverable commit identifier |
- 2021-08-12 | CN | CN202110925774.5A | CN113778331B (en) | Active |
Non-Patent Citations (1)
Title |
---|
Research on a Distributed Remote Replication System (基于分布式的远程复制系统研究); Ye Bin et al.; Microprocessors (微处理机), No. 04, pp. 56-60 *
Also Published As
Publication number | Publication date |
---|---|
CN113778331A (en) | 2021-12-10 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US8463746B2 (en) | Method and system for replicating data | |
WO2017162056A1 (en) | Service processing method, device, and system | |
US8335766B2 (en) | Flash-copying within asynchronous mirroring environment | |
US10831402B2 (en) | Method and apparatus for ensuring data consistency | |
RU2653254C1 (en) | Method, node and system for managing data for database cluster | |
CN108334561A (en) | Cross-site remote copy implementation method | |
CN112749123A (en) | Method, apparatus and computer program product for managing a file system | |
CN113778331B (en) | Data processing method, master node and storage medium | |
US9043283B2 (en) | Opportunistic database duplex operations | |
WO2019109256A1 (en) | Log management method, server and database system | |
WO2019109257A1 (en) | Log management method, server and database system | |
US20150213102A1 (en) | Synchronous data replication in a content management system | |
CN113032477B (en) | GTID-based long-distance data synchronization method, device and computing equipment | |
CN111274258A (en) | Block chain data uplink method | |
CN113760862B (en) | A method, device, equipment and storage medium for incremental data breakpoint resume transmission | |
CN113742034B (en) | Event processing method and device, computer readable storage medium, and electronic device | |
US11307964B2 (en) | Multi-level debugger | |
CN115698955A (en) | Fault tolerance of transaction images | |
CN112783688A (en) | Erasure code data recovery method and device based on available partition level | |
CN115904253B (en) | Data transmission method, device, storage system, equipment and medium | |
US20110314183A1 (en) | System and method for managing dataflow in a temporary memory | |
CN110609845A (en) | Big data redundancy disaster recovery method, big data service system and query method | |
CN110413202B (en) | Data replication method, apparatus and computer program product | |
Chandel | NOBLER: Non Blocking Reconfiguration Protocol For Linearizable Distributed Data Store | |
CN117056294A (en) | WAL processing method and device, electronic equipment and storage medium |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||