CN119248734A

CN119248734A - A data consistency processing method and device based on distributed file system

Info

Publication number: CN119248734A
Application number: CN202411373941.XA
Authority: CN
Inventors: 柳海源; 李婉琪
Original assignee: Du Xiaoman Technology Beijing Co Ltd
Current assignee: Du Xiaoman Technology Beijing Co Ltd
Priority date: 2024-09-29
Filing date: 2024-09-29
Publication date: 2025-01-03

Abstract

The present application provides a data consistency processing method and device based on a distributed file system. The method locks the path to which target data is to be written by creating a lock file for the target data under the target path. Whether the temporary lock file can be created successfully indicates whether other users are competing for the same path in parallel. If they are, a competition detection strategy is started to determine whether a conflicting fixed lock file exists. A conflicting fixed lock file indicates that another user has already created a temporary lock file and renamed it to a fixed lock file. To avoid a concurrent operation conflict with that user, a new temporary lock file is created, and this is repeated until a temporary lock file without any conflict is obtained. That temporary lock file is then renamed to a fixed lock file, and finally the target data is updated under the path corresponding to the renamed fixed lock file, so as to effectively ensure data consistency.

Description

Data consistency processing method and device based on distributed file system

Technical Field

The present application relates to the field of data processing, and in particular, to a data consistency processing method and apparatus based on a distributed file system.

Background

A distributed file system is a data storage system that stores data in a distributed manner across multiple servers. HDFS (Hadoop Distributed File System) is a common distributed file system suited to the Hadoop big data computing engine. HDFS splits data into data blocks and stores them across multiple nodes of the data cluster.

In actual use of HDFS, concurrent operation conflicts easily lead to inconsistent data, so that different users accessing the same folder obtain inconsistent data, which affects user experience.

Disclosure of Invention

In view of the above, the embodiment of the application provides a data consistency processing method and device based on a distributed file system, so as to reduce the problem of inconsistent data in the distributed file system and ensure user experience.

In a first aspect, an embodiment of the present application provides a data consistency processing method based on a distributed file system, where the method includes:

receiving a data request transmitted by a user, wherein the data request carries target data to be written;

creating a temporary lock file under a target path for the target data, and if the temporary lock file is successfully created, renaming the temporary lock file into a fixed lock file based on the target data by utilizing an atomic naming rule, wherein the lock file is used for locking a path corresponding to the lock file;

if creation of the temporary lock file fails, starting a competition detection strategy to determine whether a conflicting fixed lock file exists;

if so, creating a new temporary lock file for the target data under the target path again, judging again whether the new temporary lock file is created successfully, and if so, restarting the competition detection strategy until the new temporary lock file has no conflicting fixed lock file;

if not, renaming the temporary lock file into a fixed lock file based on the target data by utilizing an atomic naming rule;

and updating the target data under the path corresponding to the fixed lock file obtained by renaming the temporary lock file.

In some possible embodiments, the temporary lock file and the fixed lock file are file locks in the distributed file system, the temporary lock file is used for locking a path corresponding to the temporary lock file for a first time period, and the fixed lock file is used for locking a path corresponding to the fixed lock file for a second time period, where the second time period is longer than the first time period.

In some possible embodiments, after the step of updating the target data to the path corresponding to the renamed fixed lock file, the method further includes:

and cleaning the temporary lock file and deleting the fixed lock file.

In some possible embodiments, the method further comprises:

if renaming the temporary lock file into the fixed lock file based on the target data by utilizing the atomic naming rule succeeds, updating the target data under the path corresponding to the renamed fixed lock file;

and if renaming the temporary lock file into the fixed lock file based on the target data by utilizing the atomic naming rule fails, cleaning the temporary lock file.

In a second aspect, an embodiment of the present application provides a data consistency processing apparatus based on a distributed file system, where the apparatus includes:

the receiving module is used for receiving a data request transmitted by a user, wherein the data request carries target data to be written;

the file naming processing module is used for creating a temporary lock file for the target data under a target path, and renaming the temporary lock file into a fixed lock file based on the target data by utilizing an atomic naming rule if the temporary lock file is successfully created, wherein the lock file is used for locking a path corresponding to the lock file;

the competition detection module is used for starting a competition detection strategy if creation of the temporary lock file fails, and determining whether a conflicting fixed lock file exists; if so, creating a new temporary lock file for the target data under the target path again, judging again whether the new temporary lock file is created successfully, and if so, restarting the competition detection strategy until the new temporary lock file has no conflicting fixed lock file;

the file naming processing module is further configured to rename the temporary lock file to a fixed lock file based on the target data by utilizing an atomic naming rule if there is no conflicting fixed lock file;

and the data processing module is used for updating the target data under the path corresponding to the fixed lock file obtained by renaming the temporary lock file.

In some possible embodiments, after the step of updating the target data to the path corresponding to the renamed fixed lock file, the data processing module is further configured to:

and cleaning the temporary lock file and deleting the fixed lock file.

In some possible embodiments, the data processing module is specifically configured to:

In a third aspect, an embodiment of the present application provides an electronic device, where the electronic device includes:

a processor; and

a memory in which a program is stored,

wherein the program comprises instructions which, when executed by the processor, cause the processor to perform the distributed file system based data consistency processing method of the first aspect.

In a fourth aspect, an embodiment of the present application provides a non-transitory computer readable storage medium storing computer instructions for causing a computer to execute the data consistency processing method based on the distributed file system according to the first aspect.

The application has the beneficial effects that:

The application provides a data consistency processing method and device based on a distributed file system. The method locks the path to which target data is written by creating a lock file under the target path, and judges from whether the temporary lock file can be created successfully whether other users are competing for the same path in parallel. If they are, a competition detection strategy is started to determine whether a conflicting fixed lock file exists. If one exists, another user has already created a temporary lock file and renamed it to a fixed lock file; to avoid a concurrent operation conflict with that user, new temporary lock files are created repeatedly until one that does not conflict at all is obtained. That temporary lock file is renamed to a fixed lock file, and finally the target data is updated under the path corresponding to the renamed fixed lock file.

Therefore, through competition detection on the temporary lock file and the fixed lock file, different users operating with high concurrency are prevented from producing different write results for the same data. When different users access the path corresponding to the same fixed lock file, they can only see the same data, so data consistency from the users' perspective is effectively ensured and user experience is guaranteed.

Drawings

Further details, features and advantages of the application are disclosed in the following description of exemplary embodiments with reference to the following drawings, in which:

FIG. 1 is a schematic flow chart of a data consistency processing method based on a distributed file system according to an embodiment of the present application;

FIG. 2 is a schematic flow chart of another method for processing data consistency based on a distributed file system according to an embodiment of the present application;

FIG. 3 is a schematic diagram of a logic structure of a data consistency processing apparatus based on a distributed file system according to an embodiment of the present application;

FIG. 4 shows a block diagram of an exemplary electronic device that can be used to implement embodiments of the present application.

Detailed Description

Embodiments of the present application will be described in more detail below with reference to the accompanying drawings. Although certain embodiments of the application are shown in the drawings, it should be understood that the application may be embodied in various forms and should not be construed as limited to the embodiments set forth herein; rather, these embodiments are provided for a more thorough and complete understanding of the application. It should be understood that the drawings and embodiments of the application are for illustration purposes only and are not intended to limit the scope of the present application.

It should be understood that the various steps recited in the method embodiments of the present application may be performed in a different order and/or performed in parallel. Furthermore, method embodiments may include additional steps and/or omit performing the illustrated steps. The scope of the application is not limited in this respect.

The term "including" and variations thereof as used herein are open-ended, i.e., "including, but not limited to". The term "based on" means "based at least in part on". The term "one embodiment" means "at least one embodiment"; the term "another embodiment" means "at least one additional embodiment"; and the term "some embodiments" means "at least some embodiments". Related definitions of other terms will be given in the description below. It should be noted that the terms "first," "second," and the like herein are merely used for distinguishing between different devices, modules, or units and not for limiting the order or interdependence of the functions performed by such devices, modules, or units.

It should be noted that the modifiers "a" or "one" and "a plurality" in this disclosure are illustrative rather than limiting, and those skilled in the art will appreciate that they should be understood as "one or more" unless the context clearly indicates otherwise.

As described in the background, data in the HDFS distributed file system may become inconsistent during use due to concurrent operation conflicts. To address this, the prior art relies on tools such as ZooKeeper (an open source distributed coordination service), Google Spanner (a database system) and Chubby (a distributed lock service) to ensure data consistency in the distributed file system, so that all users or clients see consistent data.

However, ZooKeeper is complex: it requires running an independent service cluster, and its configuration and management are complicated. Google Spanner requires complex infrastructure, is costly, and is less versatile. Chubby is an internal technology that is not open to external users, and implementing and maintaining Chubby requires more computing resources and is more difficult.

In view of this, in a first aspect, the present application provides a data consistency processing method based on a distributed file system. The method is applicable to any electronic device having a data consistency processing function based on a distributed file system, including but not limited to a personal mobile terminal, a computer, or a server. As shown in FIG. 1, the method comprises the following steps:

S11, receiving a data request transmitted by a user, wherein the data request carries target data to be written;

S12, creating a temporary lock file for the target data under a target path;

S13, judging whether the temporary lock file is successfully created; if so, executing step S15, and if not, executing step S14;

S14, starting a competition detection strategy to determine whether a conflicting fixed lock file exists; if so, returning to step S12 until the created temporary lock file has no conflicting fixed lock file, and if not, executing step S15;

S15, renaming the temporary lock file into a fixed lock file based on the target data by utilizing an atomic naming rule, and then executing step S16, wherein the lock file is used for locking a path corresponding to the lock file;

S16, updating the target data under the path corresponding to the fixed lock file obtained by renaming the temporary lock file.
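The branching among steps S11 to S16 can be read as a small transition table. The following Python sketch encodes only which step follows which; the names `next_step` and `trace` and the two boolean parameters are illustrative and do not come from the application.

```python
def next_step(step, temp_created=False, fixed_conflict=False):
    """Return the label of the step that follows `step`, or None after S16.

    temp_created   -- outcome of the check in S13
    fixed_conflict -- outcome of the competition detection in S14
    """
    if step == "S13":
        return "S15" if temp_created else "S14"
    if step == "S14":
        return "S12" if fixed_conflict else "S15"
    linear = {"S11": "S12", "S12": "S13", "S15": "S16", "S16": None}
    if step not in linear:
        raise ValueError(f"unknown step: {step}")
    return linear[step]


def trace(outcomes):
    """Walk the flow from S11.

    `outcomes` supplies one (temp_created, fixed_conflict) pair for each
    pass through S13.
    """
    outcomes = iter(outcomes)
    step, path = "S11", []
    created = conflict = False
    while step is not None:
        path.append(step)
        if step == "S13":
            created, conflict = next(outcomes)
        step = next_step(step, created, conflict)
    return path
```

For instance, `trace([(False, True), (True, False)])` models one failed creation with a conflicting fixed lock file followed by a successful retry.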

With the embodiment of the application, the path to which the target data is written is locked by creating a lock file under the target path, and whether other users are competing for the same path in parallel is judged from whether the temporary lock file can be created successfully. If they are, a competition detection strategy is started to determine whether a conflicting fixed lock file exists. If one exists, another user has already created a temporary lock file and renamed it to a fixed lock file; to avoid a concurrent operation conflict with that user, new temporary lock files are created repeatedly until one that does not conflict at all is obtained. That temporary lock file is renamed to a fixed lock file, and finally the target data is updated under the path corresponding to the renamed fixed lock file.

The following describes the steps S11 to S16 in detail with reference to specific examples:

In the embodiment of the application, data consistency processing is performed on data in a distributed file system. Unlike an ordinary application program or a single server, a distributed file system is a large data cluster. Taking HDFS as an example, the HDFS distributed file system is a core component of the Hadoop big data computing engine, and can reliably store, process and compute big data on multiple hardware devices of the cluster.

As one implementation, the data consistency processing method provided by the application can be applied to any storage node in the HDFS cluster; as another implementation, it can be applied to any client in a data system built on the HDFS cluster. Accordingly, when step S11 is performed, a data request entered by a user according to the user's own needs may be received from any node or client. Since the application aims to guarantee data consistency, the data requests considered here are mainly data write requests, although the above nodes or clients may also receive other requests such as read requests and query requests.

As an example, suppose a user needs every client in the HDFS distributed file system to keep the same system time, that is, the system time must be consistent. Each client can achieve this by obtaining a preset reference time. A user can preset a reference time path, and by agreement each client reads the reference time stored under that path and sets it as its system time, so that the system time of all clients is consistent. In this case, the user may send a data write request to the reference time path, where the request carries the reference time to be set, and the reference time is the target data.

Based on this, as an embodiment, when step S12 is performed, a temporary lock file is created for the target data under the target path. In the embodiment of the application, a lock file is a mechanism for protecting data in a multithreaded or multi-process environment, and is used specifically to lock the path corresponding to the lock file. Once a lock file is successfully created, the path corresponding to it is occupied, and other threads or processes can no longer apply to use that path. Two types of lock files are used in total: a temporary lock file (temlockfile) and a fixed lock file (lockfile). The temporary lock file locks its corresponding path for a first time period, and the fixed lock file locks its corresponding path for a second time period, where the second time period is longer than the first time period.
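The relationship between the two lock periods can be illustrated with a small Python sketch. The application does not state how the periods are enforced, so the timestamp comparison, the concrete durations, and the names `LockFile` and `holds_path` are all assumptions made for illustration.

```python
from dataclasses import dataclass

# Hypothetical durations in seconds; the text only requires SECOND > FIRST.
FIRST_PERIOD = 30.0    # temporary lock file
SECOND_PERIOD = 300.0  # fixed lock file


@dataclass(frozen=True)
class LockFile:
    path: str          # path occupied by the lock
    kind: str          # "temp" or "fixed"
    created_at: float  # creation time in seconds


def holds_path(lock, now):
    """Return True while `lock` still occupies its path at time `now`."""
    period = FIRST_PERIOD if lock.kind == "temp" else SECOND_PERIOD
    return now - lock.created_at < period
```

Under these assumed values, a temporary lock created at time 0 no longer holds its path at time 60, while a fixed lock created at the same moment still does.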

The temporary lock file locks its corresponding path for the first time period; that is, if a thread or process successfully creates a first temporary lock file at a certain point in time, other threads or processes cannot apply for the path corresponding to the first temporary lock file during the first time period following that point. The temporary lock file for the target data is not created under the target path with an arbitrary name, but with a name based on the file name of the target data.

For example, if the target data is a song named "sun-seed", a temporary lock file is created based on the song name, for example "sun-seed.temp", where the suffix "temp" indicates that the file is a temporary lock file. Similarly, in the system time example, if the file name is "reference time", a temporary lock file "reference time.temp" may be created for the reference time. Based on this, when step S13 is performed, whether the temporary lock file has been created successfully may be determined by detecting whether the created temporary lock file exists under the target path; if it was created successfully, the temporary lock file exists under the path.
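Steps S12 and S13 can be sketched in Python against a local directory standing in for the distributed file system. The sketch uses exclusive file creation (`open` with mode "x") rather than a separate existence check, which is a substitution made here so that creation and detection happen in one call; the function names and the ".temp" spelling of the suffix are assumptions.

```python
import os
import tempfile


def temp_lock_path(target_dir, data_name, suffix=".temp"):
    """Derive the temporary lock file path from the target data's file name."""
    return os.path.join(target_dir, data_name + suffix)


def create_temp_lock(target_dir, data_name):
    """Try to create the temporary lock file.

    Returns its path, or None if a file with that name already exists,
    meaning another writer got there first.
    """
    path = temp_lock_path(target_dir, data_name)
    try:
        # Mode "x" raises FileExistsError instead of overwriting.
        with open(path, "x"):
            pass
    except FileExistsError:
        return None
    return path


# Two writers ask for the same lock name; only the first succeeds.
with tempfile.TemporaryDirectory() as d:
    first = create_temp_lock(d, "reference_time")
    second = create_temp_lock(d, "reference_time")
```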

In the embodiment of the application, both creating the lock file with the file name of the target data as the reference and naming the lock file after the file name of the target data are carried out on the basis of atomicity with respect to the target data. Atomicity of the target data means that the target data itself is not changed arbitrarily. An atomic operation here refers to the series of operations that the application performs based on the target data; the whole operation is characterized in that it is either performed successfully in its entirety or not performed at all.

When step S13 is executed, if creation of the temporary lock file is determined to have failed, step S14 is executed. The competition detection strategy can be set flexibly by the operation and maintenance personnel of the HDFS distributed file system. Specifically, detection is performed by file name: it is checked whether a temporary lock file with the same file name already exists. If such a temporary lock file exists, it is occupied by another thread or process, that is, the temporary lock file conflicts and cannot be created successfully.
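A name-based competition check of this kind could look like the following Python sketch, again using a local directory in place of the distributed file system. The function name `detect_conflicts` and both suffixes are hypothetical; in particular, the ".lock" suffix for the fixed lock file is chosen here only to make the two kinds of lock file easy to tell apart.

```python
import os
import tempfile


def detect_conflicts(target_dir, data_name,
                     temp_suffix=".temp", fixed_suffix=".lock"):
    """Report which lock files for `data_name` already exist in `target_dir`."""
    temp = os.path.join(target_dir, data_name + temp_suffix)
    fixed = os.path.join(target_dir, data_name + fixed_suffix)
    return {"temp": os.path.exists(temp), "fixed": os.path.exists(fixed)}


with tempfile.TemporaryDirectory() as d:
    before = detect_conflicts(d, "sun-seed")
    # Simulate another user holding the temporary lock file.
    open(os.path.join(d, "sun-seed.temp"), "x").close()
    after = detect_conflicts(d, "sun-seed")
```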

For example, suppose user A is about to write an MP3 file under path X, and the file name of the MP3 file is "sun-seed". A temporary lock file named after "sun-seed" is created for user A. If user B is also about to write an MP3 file whose file name is "sun-seed", the temporary lock file for user B would have the same name, so the two temporary lock files conflict.

As another implementation, because the embodiment of the application also renames the temporary lock file to a fixed lock file, the temporary lock file is readily converted into a fixed lock file, and a conflict may then arise between a temporary lock file and a fixed lock file. The conversion from temporary lock file to fixed lock file is achieved by renaming the file in step S15. In one embodiment, when step S15 is performed, the temporary lock file is renamed to a fixed lock file using an atomic naming rule, which may be, for example, removing the part of the name that marks the file as temporary. As described above, the temporary lock file is identified by the file suffix "temp", so renaming it to a fixed lock file can be done by simply deleting that suffix. Taking the "sun-seed" temporary lock file as an example, the file name is changed directly from "sun-seed.temp" to "sun-seed".
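The suffix-dropping rename can be sketched in Python as follows. The local `os.rename` call stands in for the file system's rename operation, and `fixed_lock_name`, `promote` and the ".temp" spelling are illustrative assumptions.

```python
import os
import tempfile

TEMP_SUFFIX = ".temp"  # assumed spelling of the "temp" suffix


def fixed_lock_name(temp_name):
    """Drop the temporary suffix to obtain the fixed lock file name."""
    if not temp_name.endswith(TEMP_SUFFIX):
        raise ValueError(f"not a temporary lock file name: {temp_name}")
    return temp_name[: -len(TEMP_SUFFIX)]


def promote(target_dir, temp_name):
    """Rename the temporary lock file to its fixed name; return the new path."""
    src = os.path.join(target_dir, temp_name)
    dst = os.path.join(target_dir, fixed_lock_name(temp_name))
    os.rename(src, dst)  # a single rename call; the file is not copied
    return dst


with tempfile.TemporaryDirectory() as d:
    open(os.path.join(d, "sun-seed.temp"), "x").close()
    promoted = os.path.basename(promote(d, "sun-seed.temp"))
    remaining = sorted(os.listdir(d))
```

Note that on POSIX systems `os.rename` silently replaces an existing destination file, so this sketch alone does not detect a conflicting fixed lock file.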

Because the fixed lock file is converted directly from the temporary lock file, the two may differ only in their suffix. One reason the temporary lock file cannot be created successfully is therefore that a conflicting fixed lock file exists, so step S14 is executed to start the competition detection strategy and determine whether a conflicting fixed lock file exists. If one exists, the temporary lock file currently being created cannot be used, and the process returns to step S12 to create a new temporary lock file for the target data. The new temporary lock file is still created based on the file name of the target data itself. For example, to get around the conflict on "sun-seed.temp" described above, a new temporary lock file may be created with the name "sun-seed v1.temp".
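The retry with a new name can be sketched in Python as a loop over candidate names. The versioning scheme ("_v1", "_v2", and so on), the retry limit, and the function names are assumptions; the text only gives a single "v1" example.

```python
import os
import tempfile


def candidate_names(data_name, limit=100):
    """Yield 'name.temp', then 'name_v1.temp', 'name_v2.temp', ..."""
    yield f"{data_name}.temp"
    for n in range(1, limit + 1):
        yield f"{data_name}_v{n}.temp"


def create_non_conflicting_temp(target_dir, data_name):
    """Create the first temporary lock file whose own name is free and whose
    fixed counterpart (the name without '.temp') does not exist."""
    for temp_name in candidate_names(data_name):
        fixed = os.path.join(target_dir, temp_name[: -len(".temp")])
        if os.path.exists(fixed):
            continue  # conflicting fixed lock file: try the next name
        try:
            with open(os.path.join(target_dir, temp_name), "x"):
                return temp_name
        except FileExistsError:
            continue  # temporary lock file already taken
    raise RuntimeError("no free temporary lock file name found")


with tempfile.TemporaryDirectory() as d:
    # Simulate another user's fixed lock file named "sun-seed".
    open(os.path.join(d, "sun-seed"), "x").close()
    chosen = create_non_conflicting_temp(d, "sun-seed")
```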

In the embodiment of the application, the core idea for handling concurrency conflicts is that, for the same data path, whoever updates first wins. Taking the writing of "sun-seed" as an example, if nobody has previously written "sun-seed" under path X, the temporary lock file "sun-seed.temp" is created directly. If, however, another user is creating it at the same time and has already created "sun-seed.temp" successfully, the current user will not be able to create "sun-seed.temp". The temporary lock file is then renamed to a fixed lock file to change the lock duration for the thread or process. Based on this, step S16 is executed to update the target data under the path corresponding to the fixed lock file obtained by renaming the temporary lock file; specifically, the target data may be uploaded to that path.

In the embodiment of the application, atomic renaming is a transactional guarantee provided by HDFS itself: the operation has only two outcomes, complete success, or complete failure that leaves the system state unaffected. The purpose of atomic renaming is to ensure that renaming the temporary lock file to a fixed lock file ends in exactly one state, success or failure. Thus, if user A's attempt to rename the temporary lock file does not succeed, it is certain that user B has preempted the same temporary lock file name; user A's rename then fails, rolls back automatically and waits for the next round, and user A cannot mistakenly believe that the rename succeeded.
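The all-or-nothing outcome of the rename can be demonstrated locally with several writers racing to promote the same temporary lock file. This Python sketch is not HDFS code: it emulates a rename that refuses to overwrite by using `os.link`, which creates the destination atomically and fails if it already exists, followed by `os.unlink`. The helper names and file names are assumptions.

```python
import os
import tempfile
import threading


def rename_if_absent(src, dst):
    """Move src to dst only if dst does not exist; return True on success."""
    try:
        os.link(src, dst)  # atomic claim of dst; raises if dst exists
    except (FileExistsError, FileNotFoundError):
        return False
    try:
        os.unlink(src)
    except FileNotFoundError:
        pass  # src was already removed
    return True


def race(target_dir, writers=8):
    """Let several writers try to promote the same temporary lock file."""
    temp = os.path.join(target_dir, "sun-seed.temp")
    fixed = os.path.join(target_dir, "sun-seed")
    open(temp, "x").close()
    results = []
    threads = [
        threading.Thread(
            target=lambda: results.append(rename_if_absent(temp, fixed))
        )
        for _ in range(writers)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


with tempfile.TemporaryDirectory() as d:
    outcomes = race(d)
    left = sorted(os.listdir(d))
```

Exactly one writer sees success; every other writer gets an unambiguous failure and can clean up or retry.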

For example, with "sun-seed" as the target data, after "sun-seed.temp" has been renamed to "sun-seed", the "sun-seed" MP3 may be uploaded to the path corresponding to the "sun-seed" fixed lock file. If the upload does not complete atomically, it is rolled back automatically, and the path corresponding to the "sun-seed" fixed lock file remains empty. If the MP3 upload succeeds, the "sun-seed" temporary lock file becomes the corresponding "sun-seed" fixed lock file; if the upload does not succeed, the "sun-seed" temporary lock file is deleted. Meanwhile, other users may keep trying to create the "sun-seed.temp" temporary lock file. If the previous user fails to upload the "sun-seed" MP3, another user's attempt to create "sun-seed.temp" succeeds immediately, that is, the temporary lock file is preempted. Otherwise, the other users can only wait until the current user has uploaded successfully and the temporary lock file has been changed into the fixed lock file, after which "sun-seed.temp" can be created again.

In one embodiment, while the target data is being updated under the path corresponding to the renamed fixed lock file, the temporary lock file is changed into the fixed lock file if the update succeeds, and the lock file is invalidated automatically if the update does not succeed. In the latter case, the temporary lock file can be deleted to end its lock on the thread or process and release the thread or process resources.

In some embodiments, after step S16, the method further comprises:

and cleaning the temporary lock file and deleting the fixed lock file.

In some embodiments, the method further comprises:

Specifically, as shown in FIG. 2, the data consistency processing method based on the distributed file system according to the embodiment of the present application includes the following steps:

Step S21 is executed to try to create the temporary lock file temlockfile. If creation succeeds, step S22 is executed to rename the temporary lock file to the fixed lock file lockfile. If creation fails, which indicates that another node is also updating, step S23 is executed to start the competition detection strategy and check whether a conflicting fixed lock file lockfile exists. If so, step S24 is executed to try to create another temporary lock file to compete with; if not, step S25 is executed to rename the temporary lock file to a fixed lock file as normal. After another temporary lock file has been created successfully in step S24, step S26 is executed to rename the new temporary lock file to a fixed lock file. After the temporary lock file has been successfully renamed to the fixed lock file in step S22 or step S26, step S27 is executed to write the target data to the path corresponding to the fixed lock file, and step S28 is executed to complete the data update and delete the fixed lock file. If renaming the temporary lock file to a fixed lock file fails in step S22 or step S26, step S29 is executed to clean up the corresponding temporary lock file.
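One possible reading of steps S21 to S29 is sketched below in Python, with a local directory standing in for the distributed file system. Several choices are assumptions rather than statements from the application: the ".temp" and ".lock" suffixes (the latter chosen so that lock and data files have distinct names), the "_vN" scheme for alternative names, the retry limit, writing the data to a file named after the lock's stem, and the use of `os.link` plus `os.unlink` to emulate a rename that fails when the destination exists.

```python
import os
import tempfile


def _create(path):
    """Create path exclusively; return True on success."""
    try:
        with open(path, "x"):
            return True
    except FileExistsError:
        return False


def _rename_if_absent(src, dst):
    """Emulated no-overwrite rename; fails if dst exists or src is missing."""
    try:
        os.link(src, dst)
    except (FileExistsError, FileNotFoundError):
        return False
    _remove(src)
    return True


def _remove(path):
    try:
        os.unlink(path)
    except FileNotFoundError:
        pass


def update(target_dir, name, data, max_retries=10):
    """Return the path written, or None if the lock could not be obtained."""
    stem = name
    for attempt in range(max_retries + 1):
        temp = os.path.join(target_dir, stem + ".temp")
        fixed = os.path.join(target_dir, stem + ".lock")
        created = _create(temp)                       # S21 / S24
        if created or not os.path.exists(fixed):      # S23: no conflicting fixed lock
            break
        stem = f"{name}_v{attempt + 1}"               # conflict: pick another name
    else:
        return None
    if not _rename_if_absent(temp, fixed):            # S22 / S25 / S26
        if created:
            _remove(temp)                             # S29
        return None
    data_path = os.path.join(target_dir, stem)
    try:
        with open(data_path, "w") as f:               # S27
            f.write(data)
    finally:
        _remove(fixed)                                # S28
    return data_path


with tempfile.TemporaryDirectory() as d:
    written = update(d, "reference_time", "2024-09-29T00:00:00")
    files_after = sorted(os.listdir(d))
```

In the uncontended case shown at the end, only the data file remains in the directory once the update completes, since both lock files have been removed.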

When a conflicting fixed lock file exists, the attempt to create another temporary lock file specifically creates the temporary lock file according to a global uniqueness principle. Under this principle, no other temporary lock file is allowed to be identical to the temporary lock file being created. For example, suppose there are three user groups, user A, user B and user C, whose resource paths are /root/a, /root/B and /root/C respectively. If two temporary lock files are identical, the temporary lock file has been preempted by another user of the same user group.
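The application does not say how global uniqueness is achieved. One common way to obtain names that do not collide in practice is to embed a random UUID, as in the following Python sketch; the function name and the name layout are assumptions.

```python
import uuid


def unique_temp_name(data_name):
    """Build a temporary lock file name that is unique with very high probability."""
    return f"{data_name}.{uuid.uuid4().hex}.temp"
```

A name built this way keeps the target data's file name as its prefix, so it can still be matched to the data it protects.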

With the data consistency processing method based on a distributed file system provided by the application, the atomic naming rule ensures that when different users compete concurrently, whoever creates the temporary lock file first goes first. This avoids data inconsistency caused by users' concurrent operations and guarantees the operation order and consistency across different users. In addition, when creation of the temporary lock file fails, the state of the existing files is checked by starting the competition detection strategy, a new temporary lock file is then attempted, and the atomicity and validity of the operation are ensured through the rename operation.

In a second aspect, an embodiment of the present application provides a data consistency processing apparatus based on a distributed file system, where, as shown in FIG. 3, the apparatus 30 includes:

A receiving module 301, configured to receive a data request sent by a user, where the data request carries target data to be written;

the file naming processing module 302 is configured to create a temporary lock file for the target data under a target path, and rename the temporary lock file to a fixed lock file based on the target data by using an atomic naming rule if the temporary lock file is created successfully, where the lock file is used to lock a path corresponding to the lock file;

the contention detection module 303 is configured to start a contention detection policy if creation of the temporary lock file fails, and determine whether a conflicting fixed lock file exists; if so, to create a new temporary lock file for the target data under the target path again, judge again whether the new temporary lock file is created successfully, and if so, restart the contention detection policy until the new temporary lock file has no conflicting fixed lock file;

The file naming processing module 302 is further configured to rename the temporary lock file to a fixed lock file based on the target data by using an atomic naming rule if there is no conflicting fixed lock file;

and the data processing module 304 is configured to update the target data under the path corresponding to the fixed lock file obtained by renaming the temporary lock file.

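In some embodiments, after updating the target data under the path corresponding to the renamed fixed lock file, the data processing module 304 is further configured to clean the temporary lock file and delete the fixed lock file.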

The collection, storage, use, processing, transmission, provision and disclosure of users' personal information involved in the application comply with relevant laws and regulations and do not violate public order and good customs.

The names of messages or information interacted between the devices in the embodiments of the present application are for illustrative purposes only and are not intended to limit the scope of such messages or information.

In a third aspect, the exemplary embodiments of this application also provide an electronic device comprising at least one processor and a memory communicatively coupled to the at least one processor. The memory stores a computer program executable by the at least one processor for causing the electronic device to perform a method according to an embodiment of the application when executed by the at least one processor.

The exemplary embodiments of the present application also provide a non-transitory computer readable storage medium storing a computer program, wherein the computer program, when executed by a processor of a computer, is for causing the computer to perform a method according to an embodiment of the present application.

The exemplary embodiments of the application also provide a computer program product comprising a computer program, wherein the computer program, when being executed by a processor of a computer, is for causing the computer to perform a method according to an embodiment of the application.

Referring to fig. 4, a block diagram of an electronic device 400 that may serve as a server or a client of the present application will now be described; the device is an example of a hardware device that may be applied to aspects of the present application. Electronic devices are intended to represent various forms of digital electronic computer devices, such as laptops, desktops, workstations, personal digital assistants, servers, blade servers, mainframes, and other suitable computers. The electronic device may also represent various forms of mobile devices, such as personal digital assistants, cellular telephones, smartphones, wearable devices, and other similar computing devices. The components shown herein, their connections and relationships, and their functions, are meant to be exemplary only, and are not meant to limit implementations of the applications described and/or claimed herein.

As shown in fig. 4, the electronic device 400 includes a computing unit 401 that can perform various suitable actions and processes according to a computer program stored in a Read Only Memory (ROM) 402 or a computer program loaded from a storage unit 408 into a Random Access Memory (RAM) 403. In the RAM 403, various programs and data required for the operation of the electronic device 400 may also be stored. The computing unit 401, ROM 402, and RAM 403 are connected to each other by a bus 404. An input/output (I/O) interface 405 is also connected to bus 404.

Various components in the electronic device 400 are connected to the I/O interface 405, including an input unit 406, an output unit 407, a storage unit 408, and a communication unit 409. The input unit 406 may be any type of device capable of inputting information to the electronic device 400; it may receive input numeric or character information and generate key signal inputs related to user settings and/or function controls of the electronic device. The output unit 407 may be any type of device capable of presenting information and may include, but is not limited to, a display, speakers, video/audio output terminals, vibrators, and/or printers. The storage unit 408 may include, but is not limited to, magnetic disks and optical disks. The communication unit 409 allows the electronic device 400 to exchange information/data with other devices via a computer network, such as the Internet, and/or various telecommunications networks, and may include, but is not limited to, modems, network cards, infrared communication devices, wireless communication transceivers and/or chipsets, such as Bluetooth(TM) devices, WiFi devices, WiMax devices, cellular communication devices, and/or the like.

The computing unit 401 may be a variety of general purpose and/or special purpose processing components having processing and computing capabilities. Some examples of computing unit 401 include, but are not limited to, a Central Processing Unit (CPU), a Graphics Processing Unit (GPU), various specialized Artificial Intelligence (AI) computing chips, various computing units running machine learning model algorithms, a Digital Signal Processor (DSP), and any suitable processor, controller, microcontroller, etc. The computing unit 401 performs the respective methods and processes described above. For example, in some embodiments, the foregoing distributed file system based data consistency processing method may be implemented as a computer software program tangibly embodied on a machine-readable medium, such as the storage unit 408. In some embodiments, part or all of the computer program may be loaded and/or installed onto the electronic device 400 via the ROM 402 and/or the communication unit 409. In some embodiments, the computing unit 401 may be configured by any other suitable means (e.g., by means of firmware) to perform the aforementioned distributed file system based data consistency processing method.

Program code for carrying out methods of the present application may be written in any combination of one or more programming languages. Such program code may be provided to a processor or controller of a general purpose computer, special purpose computer, or other programmable data processing apparatus such that the program code, when executed by the processor or controller, causes the functions/operations specified in the flowchart and/or block diagram to be implemented. The program code may execute entirely on the machine, partly on the machine, as a stand-alone software package, partly on the machine and partly on a remote machine, or entirely on the remote machine or server.

In the context of the present application, a machine-readable medium may be a tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. The machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium. The machine-readable medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. More specific examples of a machine-readable storage medium would include an electrical connection based on one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.

As used herein, the terms "machine-readable medium" and "computer-readable medium" refer to any computer program product, apparatus, and/or device (e.g., magnetic disks, optical disks, memory, Programmable Logic Devices (PLDs)) used to provide machine instructions and/or data to a programmable processor, including a machine-readable medium that receives machine instructions as a machine-readable signal. The term "machine-readable signal" refers to any signal used to provide machine instructions and/or data to a programmable processor.

To provide for interaction with a user, the systems and techniques described here can be implemented on a computer having a display device (e.g., a CRT (cathode ray tube) or LCD (liquid crystal display) monitor) for displaying information to the user and a keyboard and a pointing device (e.g., a mouse or a trackball) by which the user can provide input to the computer. Other kinds of devices may also be used to provide for interaction with a user, for example, feedback provided to the user may be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback), and input from the user may be received in any form, including acoustic input, speech input, or tactile input.

The systems and techniques described here can be implemented in a computing system that includes a back-end component (e.g., a data server), or that includes a middleware component (e.g., an application server), or that includes a front-end component (e.g., a user computer having a graphical user interface or a web browser through which a user can interact with an implementation of the systems and techniques described here), or any combination of such back-end, middleware, or front-end components. The components of the system can be interconnected by any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include a Local Area Network (LAN), a Wide Area Network (WAN), and the Internet.

The computer system may include a client and a server. The client and server are typically remote from each other and typically interact through a communication network. The relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other.

Claims

1. A data consistency processing method based on a distributed file system, characterized in that the method comprises:

Receiving a data request from a user, wherein the data request carries target data to be written;

Creating a temporary lock file for the target data under the target path, and if the temporary lock file is successfully created, renaming the temporary lock file to a fixed lock file based on the target data using an atomic naming rule, wherein the lock file is used to lock the path corresponding to the lock file;

If the creation of the temporary lock file fails, a contention detection strategy is initiated to determine whether there is a conflicting fixed lock file;

If it exists, a new temporary lock file is created for the target data under the target path again, and it is determined again whether the new temporary lock file is created successfully; if the new temporary lock file fails to be created, the contention detection strategy is started again until the new temporary lock file does not have a conflicting fixed lock file;

If it does not exist, rename the temporary lock file to a fixed lock file based on the target data using an atomic naming rule;

The target data is updated to the path corresponding to the fixed lock file after the temporary lock file is renamed.

2. The method according to claim 1 is characterized in that the temporary lock file and the fixed lock file are file locks in the distributed file system, the temporary lock file is used to lock the path corresponding to the temporary lock file for a first time period, and the fixed lock file is used to lock the path corresponding to the fixed lock file for a second time period, wherein the second time period is longer than the first time period.

3. The method according to claim 2, characterized in that after the step of updating the target data to the path corresponding to the fixed lock file after the temporary lock file is renamed, the method further comprises:

Clean up the temporary lock file and delete the fixed lock file.

4. The method according to claim 1, characterized in that the method further comprises:

If renaming the temporary lock file to a fixed lock file based on the target data using the atomic naming rule succeeds, the target data is updated to the path corresponding to the fixed lock file after the temporary lock file is renamed;

If renaming the temporary lock file to a fixed lock file based on the target data using the atomic naming rule fails, the temporary lock file is cleared.

5. A data consistency processing device based on a distributed file system, characterized in that the device comprises:

A receiving module, used for receiving a data request from a user, wherein the data request carries target data to be written;

A file naming processing module, used to create a temporary lock file for the target data under the target path, and if the temporary lock file is successfully created, the temporary lock file is renamed to a fixed lock file based on the target data using an atomic naming rule, wherein the lock file is used to lock the path corresponding to the lock file;

A contention detection module is used to start a contention detection strategy if the creation of the temporary lock file fails, and determine whether there is a conflicting fixed lock file; if so, create a new temporary lock file for the target data under the target path again, and determine again whether the new temporary lock file is successfully created; if the creation of the new temporary lock file fails, start the contention detection strategy again until the new temporary lock file does not have a conflicting fixed lock file;

The file naming processing module is further used to rename the temporary lock file to a fixed lock file based on the target data by using an atomic naming rule if there is no conflicting fixed lock file;

The data processing module is used to update the target data to the path corresponding to the fixed lock file after the temporary lock file is renamed.

6. The device according to claim 5 is characterized in that the temporary lock file and the fixed lock file are file locks in the distributed file system, the temporary lock file is used to lock the path corresponding to the temporary lock file for a first time period, and the fixed lock file is used to lock the path corresponding to the fixed lock file for a second time period, wherein the second time period is longer than the first time period.

7. The device according to claim 6, characterized in that after the step of updating the target data to the path corresponding to the fixed lock file after the temporary lock file is renamed, the data processing module is further used to:

Clean up the temporary lock file and delete the fixed lock file.

8. The device according to claim 5, characterized in that the data processing module is specifically used to:

9. An electronic device, characterized in that the electronic device comprises:

A processor; and

A memory for storing a program,

wherein the program includes instructions which, when executed by the processor, cause the processor to perform the method according to any one of claims 1 to 4.

10. A non-transitory computer-readable storage medium storing computer instructions, wherein the computer instructions are used to cause a computer to execute the method according to any one of claims 1 to 4.