CN114116672B - Data synchronization method and related device

Info

Publication number: CN114116672B
Application number: CN202111439656.XA
Authority: CN
Inventors: 范东; 陶周天; 罗剑锋
Original assignee: Smartsteps Data Technology Co ltd
Current assignee: Smartsteps Data Technology Co ltd
Priority date: 2021-11-30
Filing date: 2021-11-30
Publication date: 2022-11-08
Anticipated expiration: 2041-11-30
Also published as: CN114116672A

Abstract

In the data synchronization method and related apparatus, the target device locally generates a data maintenance instruction for at least one piece of target data, sends the data maintenance instruction to the second cluster, and controls the second cluster to run the instruction so as to generate maintenance information for the at least one piece of target data. As a result, when data is synchronized between two clusters, the maintenance information of the synchronized data can be constructed by operating in only one of the clusters.

Description

Data synchronization method and related device

Technical Field

The present application relates to the field of computers, and in particular, to a data synchronization method and a related apparatus.

Background

A large amount of data is generated in the course of internet services, so it has been proposed to analyze the generated data with big data clusters. However, research has found that, in practice, data sometimes needs to be synchronized among multiple clusters, and conventional data synchronization tools are overly complicated to operate when constructing maintenance information for the synchronized data.

Disclosure of Invention

In order to overcome at least one of the deficiencies in the prior art, the present application provides a data synchronization method and a related apparatus, including:

In a first aspect, the present application provides a data synchronization method applied to a target device in a first cluster, where the method includes:

generating data maintenance instructions for at least one piece of target data, wherein the at least one piece of target data represents data that has been synchronized from the first cluster to a second cluster;

sending the data maintenance instruction to the second cluster;

and controlling the second cluster to operate the data maintenance instruction, and generating the maintenance information of the at least one piece of target data in the second cluster.

In a second aspect, the present application provides a data synchronization apparatus applied to a target device in a first cluster, the data synchronization apparatus including:

an instruction module, configured to generate a data maintenance instruction for at least one piece of target data, wherein the at least one piece of target data represents data that has been synchronized from the first cluster to a second cluster;

a transmission module, configured to send the data maintenance instruction to the second cluster;

and a maintenance module, configured to control the second cluster to run the data maintenance instruction and to generate the maintenance information of the at least one piece of target data in the second cluster.

In a third aspect, the present application provides a target device, the target device belonging to a first cluster, the target device comprising a processor and a memory, the memory storing a computer program, the computer program, when executed by the processor, implementing the data synchronization method.

In a fourth aspect, the present application provides a computer-readable storage medium storing a computer program which, when executed by a processor, implements the data synchronization method.

Compared with the prior art, the method has the following beneficial effects:

In the data synchronization method and related apparatus, the target device locally generates a data maintenance instruction for at least one piece of target data, sends the data maintenance instruction to the second cluster, and controls the second cluster to run the instruction so as to generate maintenance information for the at least one piece of target data. As a result, when data is synchronized between two clusters, the maintenance information of the synchronized data can be constructed by operating in only one of the clusters.

Drawings

In order to illustrate the technical solutions of the embodiments of the present application more clearly, the drawings required by the embodiments are briefly described below. It should be understood that the following drawings illustrate only some embodiments of the present application and therefore should not be considered as limiting its scope; those skilled in the art can obtain other related drawings from these drawings without inventive effort.

Fig. 1 is a schematic structural diagram of a target device according to an embodiment of the present application;

Fig. 2 is a first flowchart of a data synchronization method according to an embodiment of the present application;

Fig. 3 is a second flowchart of a data synchronization method according to an embodiment of the present application;

Fig. 4 is a third flowchart of a data synchronization method according to an embodiment of the present application;

Fig. 5 is a schematic structural diagram of a data synchronization apparatus according to an embodiment of the present application.

Reference numerals: 120 - memory; 130 - processor; 140 - communication unit; 201 - instruction module; 202 - transmission module; 203 - maintenance module.

Detailed Description

In order to make the objects, technical solutions and advantages of the embodiments of the present application clearer, the technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are some embodiments of the present application, but not all embodiments. The components of the embodiments of the present application, generally described and illustrated in the figures herein, can be arranged and designed in a wide variety of different configurations.

Thus, the following detailed description of the embodiments of the present application, presented in the accompanying drawings, is not intended to limit the scope of the claimed application, but is merely representative of selected embodiments of the application. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.

It should be noted that: like reference numbers and letters refer to like items in the following figures, and thus, once an item is defined in one figure, it need not be further defined and explained in subsequent figures.

Research has found that existing data synchronization tools are complicated to operate when building maintenance information for synchronized data. Taking Hive clusters as an example, tools on the market for synchronizing Hive cluster data generally need to be deployed in both Hive clusters (hereinafter denoted cluster A and cluster B). The user then has to enter an IP address on the provided operation page to access cluster A and execute a synchronization program in cluster A; after synchronization is completed, the user has to enter an IP address to access cluster B and execute code for maintaining the Hive library in cluster B. In addition, a third-party database is needed to store the synchronization state of the data throughout the synchronization process.

In view of the complex operation of existing data synchronization methods and their dependence on a third-party database, the present embodiment provides a data synchronization method that at least partially solves these technical problems. In this method, the target device in the first cluster locally generates a data maintenance instruction for at least one piece of target data, sends the data maintenance instruction to the second cluster, and controls the second cluster to run the instruction so as to generate maintenance information for the at least one piece of target data. As a result, when data is synchronized between the two clusters, the maintenance information of the synchronized data can be constructed by operating in only one of the clusters.

The target data in this embodiment represents data that has been synchronized from a first cluster to a second cluster, and the target data may originate from a user terminal. The user terminal may be, but is not limited to, a mobile terminal, a tablet computer, a laptop computer, or a built-in device in a motor vehicle, etc., or any combination thereof. In some embodiments, the mobile terminal may include a smart home device, a wearable device, a smart mobile device, a virtual reality device, an augmented reality device, or the like, or any combination thereof. In some embodiments, the smart home devices may include smart lighting devices, control devices for smart electrical devices, smart monitoring devices, smart televisions, smart cameras, or walkie-talkies, or the like, or any combination thereof. In some embodiments, the wearable device may include a smart bracelet, smart shoelaces, smart glasses, a smart helmet, a smart watch, a smart garment, a smart backpack, a smart accessory, and the like, or any combination thereof. In some embodiments, the smart mobile device may include a smartphone, a Personal Digital Assistant (PDA), a gaming device, a navigation device, or a point of sale (POS) device, or the like, or any combination thereof.

It should be noted that the first cluster in this embodiment includes a plurality of devices, and the target device may be any device in the first cluster, and is determined by the scheduling method of the first cluster. In addition, the embodiment also provides a structural schematic diagram of the target device. As shown in fig. 1, the target device includes a memory 120, a processor 130, and a communication unit 140. The memory 120, the processor 130 and the communication unit 140 are electrically connected to each other directly or indirectly, so as to realize data transmission or interaction.

The memory 120 may be, but is not limited to, a Random Access Memory (RAM), a Read-Only Memory (ROM), a Programmable Read-Only Memory (PROM), an Erasable Programmable Read-Only Memory (EPROM), an Electrically Erasable Programmable Read-Only Memory (EEPROM), and the like. The memory 120 is used for storing a program, and the processor 130 executes the program after receiving an execution instruction.

The communication unit 140 is used for establishing a communication connection between the server and the user terminal through a network, and for transceiving data through the network. The network may include a wired network, a wireless network, a fiber optic network, a telecommunications network, an intranet, the internet, a Local Area Network (LAN), a Wide Area Network (WAN), a Wireless Local Area Network (WLAN), a Metropolitan Area Network (MAN), a Public Switched Telephone Network (PSTN), a Bluetooth network, a ZigBee network, or a Near Field Communication (NFC) network, or the like, or any combination thereof. In some embodiments, the network may include one or more network access points. For example, the network may include wired or wireless network access points, such as base stations and/or network switching nodes, through which one or more components of the service request processing system may connect to the network to exchange data and/or information.

The processor 130 may be an integrated circuit chip having signal processing capabilities, and may include one or more processing cores (e.g., a single-core processor or a multi-core processor). Merely by way of example, the processor may include a Central Processing Unit (CPU), an Application Specific Integrated Circuit (ASIC), an Application Specific Instruction Set Processor (ASIP), a Graphics Processing Unit (GPU), a Physics Processing Unit (PPU), a Digital Signal Processor (DSP), a Field Programmable Gate Array (FPGA), a Programmable Logic Device (PLD), a controller, a microcontroller unit, a Reduced Instruction Set Computer (RISC), a microprocessor, or the like, or any combination thereof.

Based on the above description, the data synchronization method in the present embodiment is described in detail below with reference to fig. 2. It should be understood that the operations of the flowcharts may be performed out of order, and steps that have no logical dependency on one another may be performed in reverse order or concurrently. One skilled in the art, under the guidance of this application, may add one or more other operations to, or remove one or more operations from, the flowchart. As shown in fig. 2, the method includes:

S104, generating a data maintenance instruction for at least one piece of target data.

Wherein the at least one piece of target data represents data that has been synchronized from the first cluster to the second cluster. However, it should be understood that after the target data is synchronized from the first cluster to the second cluster, only the target data is stored in the second cluster, and the target data itself has a specific format and organization, so that the maintenance information of the target data is also needed to enable the second cluster to access the target data.

Therefore, in an alternative embodiment, the target device may obtain a maintenance parameter of each of the at least one piece of target data and a storage location of each of the at least one piece of target data in the second cluster; and then, aiming at each piece of target data, generating a data maintenance instruction of the target data according to the maintenance parameter of the target data and the storage position of the target data.

The maintenance parameter is related to the type of the cluster, and assuming that the first cluster and the second cluster are Hive clusters and the database and the data table are already established by the second cluster, the maintenance parameter may be a partition field in the Hive cluster, and the corresponding data maintenance instruction is an HQL statement in the Hive cluster.
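As a minimal sketch of step S104 for the Hive case, the snippet below builds one HQL statement from partition values (the maintenance parameter) and a storage location. The use of `ALTER TABLE ... ADD PARTITION ... LOCATION` is an assumption about what the maintenance instruction could look like; the patent only says it is an HQL statement. The function name, the province code `32` and the path are hypothetical.

```python
def build_partition_hql(database, table, partition_values, location):
    """Return an HQL statement that registers one synchronized directory as a partition.

    Assumption: the maintenance instruction is an ADD PARTITION statement.
    """
    spec = ", ".join(f"{field}='{value}'" for field, value in partition_values.items())
    return (
        f"ALTER TABLE {database}.{table} "
        f"ADD IF NOT EXISTS PARTITION ({spec}) "
        f"LOCATION '{location}';"
    )


# Hypothetical values: province code 32, one daily partition, illustrative path.
statement = build_partition_hql(
    "cip",
    "dewell_single",
    {"provience": "32", "date_dt": "20210901"},
    "/warehouse/cip/dewell_single/32/20210901",
)
print(statement)
```

Because the statement is generated in the first cluster from information it already holds, the second cluster only has to execute it.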

S105, sending the data maintenance instruction to the second cluster.

S106, controlling the second cluster to run the data maintenance instruction, and generating the maintenance information of the at least one piece of target data in the second cluster.

Continuing with the HQL statement as an example, after the HQL statement is sent to the second cluster, the target device may run the HQL statement in the second cluster through a remote access instruction, so that the second cluster establishes the maintenance information of the target data, which facilitates access to the target data during big data computation.

Therefore, the target device locally generates a data maintenance instruction for at least one piece of target data, sends the data maintenance instruction to the second cluster, and controls the second cluster to run the instruction so as to generate maintenance information for the at least one piece of target data. As a result, when data is synchronized between the two clusters, the maintenance information of the synchronized data can be constructed by operating in only one of the clusters.

In order to facilitate the user to configure target data to be synchronized, the target device is configured with a synchronization configuration file, the synchronization configuration file records respective synchronization parameters of a plurality of candidate data, and the user screens out data to be synchronized from the candidate data by deleting the annotation identifier. That is, before step S104, the data synchronization method further includes the following embodiments:

The target device receives a configuration operation performed by the user on the synchronization configuration file and, in response to the configuration operation, deletes the annotation identifier selected by the user, so as to screen out the data to be synchronized from the candidate data.

For example, assume that the contents of a synchronization profile record for configuring candidate cities and corresponding city data dates are as follows:

# start date end date city list

#20210901 20210930 Wuxi Zhengzhou

#20210901 20210901 Tianjin Shandong Hebei

In this example, the "#" at the head of a synchronization parameter is the annotation identifier. If the user deletes the "#" before the Wuxi and Zhengzhou entry in the "# start date end date city list", this indicates that the data of the two cities Wuxi and Zhengzhou for the period from September 1, 2021 to September 30, 2021 needs to be synchronized.

However, it has been found that, when the user uses the synchronization configuration file multiple times, the current data to be synchronized is easily confused with the previous data to be synchronized, so that the wrong data is selected for synchronization.

For example, when the user filters the data to be synchronized that is needed this time, the "#" deleted in the last synchronization needs to be restored, and then the current data to be synchronized is screened out from the synchronization configuration file again, however, part of the "#" is easily missed and is not restored in the actual operation. Therefore, as shown in fig. 3, the data synchronization method further includes the following embodiments:

S101, determining at least one parameter to be synchronized from a synchronization configuration file.

Each parameter to be synchronized represents a synchronization parameter of which a preset position is not configured with an annotation identifier.

S102, synchronizing the data to be synchronized to the second cluster according to the at least one parameter to be synchronized.

The data to be synchronized represents the candidate data corresponding to the at least one parameter to be synchronized.

S103, adding the annotation identifier at the preset position of each parameter to be synchronized.

Illustratively, continuing with the synchronization configuration file example described above, assume the user deletes the "#" before the Wuxi and Zhengzhou entry in the "# start date end date city list". After the target device has synchronized the data of the two cities Wuxi and Zhengzhou for the period from September 1, 2021 to September 30, 2021 from the first cluster to the second cluster, it re-adds the "#" identifier at the head of that synchronization parameter.
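Steps S101 and S103 can be sketched as two small functions over the lines of the synchronization configuration file. This is only an illustration: it assumes the preset position is the first character of the line, and the function names are hypothetical.

```python
COMMENT = "#"  # annotation identifier, assumed to sit at the head of the line


def parameters_to_sync(lines):
    """S101: return the non-empty lines that do not start with the annotation identifier."""
    return [ln for ln in lines if ln.strip() and not ln.startswith(COMMENT)]


def restore_annotations(lines):
    """S103: put the annotation identifier back at the head of every active line."""
    return [
        ln if (not ln.strip() or ln.startswith(COMMENT)) else COMMENT + ln
        for ln in lines
    ]


config = [
    "# start date end date city list",
    "20210901 20210930 Wuxi Zhengzhou",  # the user removed the leading "#"
    "#20210901 20210901 Tianjin Shandong Hebei",
]

active = parameters_to_sync(config)
# ... S102 would synchronize the data described by `active` here ...
config_after = restore_annotations(config)
```

After `restore_annotations` runs, no line is active any more, so a later run cannot pick up the previous selection by accident.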

In addition, the target device keeps a synchronization log of the synchronization of the data to be synchronized to the second cluster. The synchronization log includes a synchronization result for each piece of data to be synchronized, and the synchronization result indicates whether that piece of data was synchronized successfully. Therefore, the data synchronization method further includes the following embodiment:

The target device determines the at least one piece of target data from the data to be synchronized according to the synchronization log.

In order to facilitate the user to analyze the synchronization log, the synchronization log in this embodiment includes a primary log and a secondary log, which are respectively used for recording synchronization results of different levels.

The relationship between the primary log and the secondary log in this embodiment is described below, taking the synchronization parameter "#20210901 20210930 Wuxi Zhengzhou" recorded in the synchronization configuration file as an example. The primary log records whether the data of Wuxi or Zhengzhou for the period from September 1 to September 30, 2021 was synchronized successfully.

If the synchronization result in the primary log shows that synchronization of the Wuxi data failed, the secondary log for Wuxi is searched to find on which of the 30 days from September 1 to September 30, 2021 the data synchronization failed.
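The two-level lookup can be illustrated as follows. The patent does not specify a log format, so the dictionaries below are purely hypothetical stand-ins: the primary log maps a city to an overall result, and the secondary log maps a city to per-day results.

```python
def failed_days(primary_log, secondary_log):
    """Return {city: [failed dates]} for every city the primary log marks as failed."""
    result = {}
    for city, succeeded in primary_log.items():
        if not succeeded:
            per_day = secondary_log.get(city, {})
            result[city] = sorted(day for day, ok in per_day.items() if not ok)
    return result


# Hypothetical log contents for the Wuxi / Zhengzhou example.
primary = {"Wuxi": False, "Zhengzhou": True}
secondary = {
    "Wuxi": {"20210901": True, "20210902": False, "20210903": True},
    "Zhengzhou": {"20210901": True},
}

print(failed_days(primary, secondary))
```

The secondary log is consulted only for cities that the primary log reports as failed, which is the point of keeping two levels.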

To migrate the selected data to be synchronized from the first cluster to the second cluster concurrently, migration scripts are generated for all the data to be synchronized and then executed concurrently. However, in consideration of device performance, the number of concurrently executing scripts needs to be managed. As shown in fig. 4, the present embodiment manages the number of concurrently executed scripts through the following implementation of step S102:

S102-1, generating at least one migration script for the data to be synchronized in a first directory according to the at least one parameter to be synchronized.

Each migration script is used for migrating the corresponding data to be synchronized from the first cluster to the second cluster.

S102-2, obtaining a first number, which is the number of migration scripts remaining in a second directory.

After a migration script in the second directory finishes running, it is deleted from the second directory.

S102-3, judging whether the first number is smaller than a first parallel threshold; if so, executing step S102-4; if not, executing step S102-2.

S102-4, transferring a second number of migration scripts from the first directory to the second directory.

The sum of the first number and the second number equals a second parallel threshold.

S102-5, obtaining a third number, which is the number of migration scripts remaining in the first directory.

S102-6, judging whether the third number is larger than 0; if so, executing step S102-2; if not, executing step S103.

If the number of concurrent migration scripts were controlled within the first directory alone, the migration scripts that are running would have to be distinguished from those that are not running within the same directory. In the above embodiment, a first directory and a second directory are constructed, where the second directory stores the migration scripts that are to be run in parallel. When the number of migration scripts in the second directory (the first number) is less than the first parallel threshold, a second number of migration scripts, determined from the first number, is transferred from the first directory to the second directory. Therefore, only the number of scripts in the second directory needs to be tracked, which reduces the maintenance cost of executing the migration scripts concurrently.
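One pass of steps S102-2 to S102-5 can be sketched as below. This is an illustrative reading, not the patented implementation: the function name and threshold values are hypothetical, the scripts are empty placeholder files, and actually launching a script and deleting it when it finishes is left out.

```python
import os
import shutil
import tempfile


def refill(wait_dir, run_dir, first_threshold, second_threshold):
    """Run S102-2 to S102-5 once and return the number of scripts left in wait_dir."""
    running = len(os.listdir(run_dir))                    # S102-2: first number
    if running < first_threshold:                         # S102-3
        quota = max(0, second_threshold - running)        # S102-4: second number
        for name in sorted(os.listdir(wait_dir))[:quota]:
            shutil.move(os.path.join(wait_dir, name), os.path.join(run_dir, name))
    return len(os.listdir(wait_dir))                      # S102-5: third number


with tempfile.TemporaryDirectory() as root:
    wait_dir = os.path.join(root, "wait_task")
    run_dir = os.path.join(root, "run_task")
    os.makedirs(wait_dir)
    os.makedirs(run_dir)
    for i in range(5):
        open(os.path.join(wait_dir, f"job_{i}.sh"), "w").close()

    left = refill(wait_dir, run_dir, first_threshold=2, second_threshold=3)
    moved = sorted(os.listdir(run_dir))
    # Nothing has finished yet, so a second pass moves nothing.
    left_again = refill(wait_dir, run_dir, first_threshold=2, second_threshold=3)
```

A caller would repeat `refill` until it returns 0 (step S102-6). The only state it inspects is the file count in each directory, so no per-script running flag is needed.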

To make the objects, technical solutions and advantages of the present embodiments more obvious for those skilled in the art, a detailed description of the present embodiments is provided below.

In this example, it is assumed that the first cluster and the second cluster are both Hive clusters, and the target device in the first cluster provides two data synchronization scripts for implementing the data synchronization method. In addition, the target device provides six directories, batch, etl_log, etl_sub_log, hive_hql, wait_task and run_task, whose roles are as follows:

batch: stores the data synchronization scripts create_script.sh and run_script.sh and the two configuration files they require, hive_table.conf and hive_city.conf. create_script.sh generates the migration scripts; run_script.sh monitors the execution of the migration scripts and dynamically controls the number of running scripts; hive_table.conf records the tables that need to be synchronized; and hive_city.conf records the regions (province, city, etc.) that need to be synchronized.

etl_log: records the primary log.

etl_sub_log: records the secondary log.

hive_hql: records the data maintenance instructions, i.e. the HQL statements in this example.

wait_task: stores the migration scripts of all the data to be synchronized.

run_task: stores the migration scripts that are running concurrently.

First, the user modifies the hive_table.conf file to determine the tables to synchronize. Specifically, for a table that is already recorded, the user only needs to delete the annotation identifier (remove the "#"). If the table to be synchronized is not in the configuration file, the user needs to add a new record specifying the data granularity (province/city/other), the data synchronization frequency (daily table/monthly table), the Hive database, the table, the data source location, the data synchronization target location, and the partition field of the Hive table. After the data synchronization is finished, the annotation identifier deleted from the hive_table.conf file is added back to prevent errors in the next synchronization. The content of hive_table.conf may be, for example:

“#Provience_name day cip dewell_single partition_table /xxxx/xxxx /xxxx/xxxx provience=#PROVIENCE_CODE#|date_dt=#YYMMDD#”

The fields represent, in order: city parameter, synchronization frequency, database, table name, table type, source directory, target directory, partition field.
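A record of this shape could be parsed as sketched below. Several things here are assumptions: that the eight fields are separated by whitespace, that `|` separates partition fields, and that the `#...#` tokens are placeholders to be substituted. The paths `/src/path` and `/dst/path`, the province code `32` and all Python names are hypothetical.

```python
from typing import NamedTuple


class TableRule(NamedTuple):
    region_param: str
    frequency: str
    database: str
    table: str
    table_type: str
    source_dir: str
    target_dir: str
    partition_spec: str


def parse_table_rule(line):
    """Split one un-annotated hive_table.conf record into its eight fields."""
    fields = line.split()
    if len(fields) != 8:
        raise ValueError(f"expected 8 fields, got {len(fields)}")
    return TableRule(*fields)


def fill_partition(spec, province_code, day):
    """Substitute the placeholders and return {partition field: value}."""
    filled = spec.replace("#PROVIENCE_CODE#", province_code).replace("#YYMMDD#", day)
    return dict(item.split("=", 1) for item in filled.split("|"))


rule = parse_table_rule(
    "Provience_name day cip dewell_single partition_table "
    "/src/path /dst/path provience=#PROVIENCE_CODE#|date_dt=#YYMMDD#"
)
partition = fill_partition(rule.partition_spec, "32", "20210901")
```

The resulting `partition` mapping is the kind of maintenance parameter from which an HQL statement for one synchronized directory could be generated.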

The user then modifies the hive_city.conf file to determine the province/city/other region to be synchronized, filling in the start date, end date and region of the data to synchronize. Likewise, after create_script.sh has finished and the data synchronization is complete, the deleted annotation identifier is automatically added back to prevent errors in the next synchronization. For example, the content recorded in hive_city.conf is as follows:

"#20210901 20210930 Wuxi Zhengzhou"

The fields represent, in order: start date, end date, region.

Based on the above configuration, when the create_script.sh script is executed, it generates city-granularity scripts according to the start date, end date and city list of the un-annotated lines in hive_city.conf, together with the data granularity (day/month) and other information of the un-annotated tables in hive_table.conf.

Taking "#20210901 20210930 Wuxi Zhengzhou" as an example, the create_script.sh script generates two scripts, one for Wuxi and one for Zhengzhou.

Continuing with the Wuxi script as an example, assuming that two tables, a monthly table and a daily table, are configured in hive_table.conf, the create_script.sh script generates a migration script that synchronizes the daily table for September 1 to September 30, 2021 and the monthly table for September 2021, for a total of 31 synchronized directories.
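The count of 31 follows from 30 daily partitions plus 1 monthly partition. A small sketch, with a hypothetical function name, that enumerates them for a date range:

```python
from datetime import date, timedelta


def sync_units(start, end):
    """Return (daily partitions, monthly partitions) covered by [start, end]."""
    days, months = [], []
    current = start
    while current <= end:
        days.append(current.strftime("%Y%m%d"))
        month = current.strftime("%Y%m")
        if month not in months:
            months.append(month)
        current += timedelta(days=1)
    return days, months


days, months = sync_units(date(2021, 9, 1), date(2021, 9, 30))
total = len(days) + len(months)  # one directory per daily and per monthly partition
```

Here `days` runs from `20210901` to `20210930` and `months` holds the single entry `202109`.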

The generated migration scripts are placed in the wait_task directory to wait for run_script.sh to control their execution. run_script.sh determines how many migration scripts can run at one time according to the configured maximum execution number, transfers the migration scripts from the wait_task directory to the run_task directory for execution in the order in which they were generated, and destroys each migration script after its execution finishes.

Then, according to the execution results of the migration scripts, a corresponding HQL statement is generated for each piece of successfully migrated target data and written to the corresponding HQL file in the hive_hql directory. The HQL file is then pushed to the second cluster, and the second cluster is controlled to execute the HQL statements in the HQL file, so that the maintenance information of the target data is generated in the second cluster.
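The write, push and execute sequence might look like the sketch below. The patent does not name the transfer or remote-access mechanism, so `scp` and `ssh` are assumptions here, as are the host name `cluster-b`, the remote directory, the file name and the sample statement; `hive -f` is the Hive CLI option for running a script file. The commands are only constructed, not executed.

```python
import os
import tempfile


def write_hql_file(directory, name, statements):
    """Write one HQL statement per line and return the file path."""
    path = os.path.join(directory, name)
    with open(path, "w", encoding="utf-8") as fh:
        fh.write("\n".join(statements) + "\n")
    return path


def remote_commands(local_path, remote_host, remote_dir):
    """Build (push, run) argument lists; scp/ssh are assumed, not mandated by the text."""
    remote_path = f"{remote_dir}/{os.path.basename(local_path)}"
    push = ["scp", local_path, f"{remote_host}:{remote_path}"]
    run = ["ssh", remote_host, "hive", "-f", remote_path]
    return push, run


with tempfile.TemporaryDirectory() as hive_hql:
    path = write_hql_file(
        hive_hql,
        "wuxi.hql",
        ["ALTER TABLE cip.dewell_single ADD IF NOT EXISTS PARTITION (date_dt='20210901');"],
    )
    with open(path, encoding="utf-8") as fh:
        content = fh.read()
    push, run = remote_commands(path, "cluster-b", "/tmp/hql")
```

In a real deployment the two argument lists could be passed to `subprocess.run`, so that every step is driven from the first cluster.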

Based on the same inventive concept as the data synchronization method, the present embodiment also provides apparatuses related to the method, as described below.

The present embodiment provides a data synchronization apparatus applied to the target device in the first cluster. The data synchronization apparatus is a software functional module stored in a memory. As shown in fig. 5, the data synchronization apparatus includes:

An instruction module 201, configured to generate a data maintenance instruction for at least one piece of target data, where the at least one piece of target data represents data that has been synchronized from the first cluster to a second cluster.

In this embodiment, the instruction module 201 is configured to implement step S104 in fig. 2, and for a detailed description of the instruction module 201, reference may be made to a detailed description of step S104.

A transmission module 202, configured to send the data maintenance instruction to the second cluster.

In this embodiment, the transmission module 202 is configured to implement step S105 in fig. 2, and for a detailed description of the transmission module 202, reference may be made to a detailed description of step S105.

A maintenance module 203, configured to control the second cluster to run the data maintenance instruction and to generate the maintenance information of the at least one piece of target data in the second cluster.

In this embodiment, the maintenance module 203 is configured to implement step S106 in fig. 2, and for a detailed description of the maintenance module 203, reference may be made to a detailed description of step S106.

In an optional embodiment, the way for the instruction module 201 to generate the data maintenance instruction includes:

obtaining the maintenance parameter of each of at least one piece of target data and the storage position of each of at least one piece of target data in the second cluster;

and aiming at each piece of target data, generating a data maintenance instruction of the target data according to the maintenance parameter of the target data and the storage position of the target data.

The present embodiment further provides a target device, where the target device belongs to the first cluster, and the target device includes a processor and a memory, where the memory stores a computer program, and the computer program is executed by the processor to implement the data synchronization method.

The present embodiment also provides a computer-readable storage medium, which stores a computer program, and when the computer program is executed by a processor, the data synchronization method is implemented.

It should be noted that the terms "first," "second," "third," and the like are used merely to distinguish one description from another, and are not intended to indicate or imply relative importance. Furthermore, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of another identical element in a process, method, article, or apparatus that comprises the element.

In the embodiments provided in the present application, it should be understood that the disclosed apparatus and method may be implemented in other ways. The apparatus embodiments described above are merely illustrative, and for example, the flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of apparatus, methods and computer program products according to various embodiments of the present application. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.

In addition, functional modules in the embodiments of the present application may be integrated together to form an independent part, or each module may exist alone, or two or more modules may be integrated to form an independent part.

The functions, if implemented in the form of software functional modules and sold or used as a stand-alone product, may be stored in a computer-readable storage medium. Based on such understanding, the technical solution of the present application, or the portion thereof that contributes over the prior art, may be embodied in the form of a software product that is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present application. The aforementioned storage medium includes a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, an optical disk, or any other medium capable of storing program code.

The above description covers only some embodiments of the present application, but the scope of protection of the present application is not limited thereto. Any change or substitution that a person skilled in the art could readily conceive within the technical scope disclosed in the present application falls within the scope of protection of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.

Claims

1. A data synchronization method applied to a target device in a first cluster, wherein the target device is configured with a synchronization configuration file, and the synchronization configuration file records respective synchronization parameters of a plurality of candidate data, and the method comprises:

determining at least one parameter to be synchronized from the synchronization configuration file, wherein each parameter to be synchronized is a synchronization parameter whose preset position is not configured with an annotation identifier;

generating, under a first directory, at least one migration script of data to be synchronized according to the at least one parameter to be synchronized, wherein the data to be synchronized represents the candidate data corresponding to the at least one parameter to be synchronized, and each migration script is used for migrating the corresponding data to be synchronized from the first cluster to a second cluster;

acquiring a first quantity of migration scripts remaining in a second directory, wherein each migration script in the second directory is deleted from the second directory after it has been run;

if the first quantity is less than a first parallel threshold, transferring a second quantity of migration scripts from the first directory to the second directory, wherein the sum of the first quantity and the second quantity is equal to a second parallel threshold;

acquiring a third quantity of migration scripts remaining in the first directory;

if the third quantity is greater than 0, returning to the step of acquiring the first quantity of migration scripts remaining in the second directory, until the third quantity of migration scripts remaining in the first directory is equal to 0;

adding the annotation identifier at the preset position of each parameter to be synchronized;

sending the data maintenance instruction to the second cluster;
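
By way of non-limiting illustration only, the directory-based throttling recited in claim 1 may be sketched as follows. The function names, the polling interval, and the choice to move fewer scripts when the first directory holds fewer than the second quantity are assumptions made for this sketch and are not recited in the claims; a separate worker, not shown, is assumed to run each script in the second directory and delete it afterwards.

```python
import os
import shutil
import time


def refill_running_dir(pending_dir, running_dir, low_threshold, high_threshold):
    """Top up running_dir from pending_dir; return how many scripts were moved.

    pending_dir stands in for the "first directory" and running_dir for the
    "second directory"; low_threshold and high_threshold stand in for the
    first and second parallel thresholds (hypothetical parameter names).
    """
    first_quantity = len(os.listdir(running_dir))
    if first_quantity >= low_threshold:
        return 0
    # Second quantity: chosen so that first + second == high_threshold.
    second_quantity = high_threshold - first_quantity
    # Assumption: if fewer scripts are pending, move whatever is left.
    batch = sorted(os.listdir(pending_dir))[:second_quantity]
    for name in batch:
        shutil.move(os.path.join(pending_dir, name),
                    os.path.join(running_dir, name))
    return len(batch)


def drain(pending_dir, running_dir, low_threshold, high_threshold,
          poll_seconds=1.0):
    """Repeat the refill step until no script remains in pending_dir.

    Returns the total number of scripts moved. This loop only terminates if
    something else keeps deleting finished scripts from running_dir.
    """
    moved = 0
    while True:
        moved += refill_running_dir(pending_dir, running_dir,
                                    low_threshold, high_threshold)
        if not os.listdir(pending_dir):  # third quantity == 0
            return moved
        time.sleep(poll_seconds)
```

Using two thresholds rather than one means the second directory is refilled in batches only after it has drained below the lower bound, which limits how many migration scripts run concurrently.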

2. The data synchronization method of claim 1, wherein the generating of the data maintenance instruction of the at least one piece of target data comprises:

obtaining a maintenance parameter of each of the at least one piece of target data and a storage location of each of the at least one piece of target data in the second cluster;

and, for each piece of target data, generating a data maintenance instruction of the target data according to the maintenance parameter of the target data and the storage location of the target data.
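
As a hedged illustration of claim 2, the following sketch builds one instruction per piece of target data from its maintenance parameter and its storage location. The claims do not specify the form of the instruction; the sketch assumes, purely for illustration, that the maintenance information is Hive-style partition metadata and that the instruction is an `ALTER TABLE ... ADD PARTITION` statement. The `TargetData` type and its fields are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class TargetData:
    """Hypothetical record for one piece of target data."""
    table: str        # table the data belongs to (assumed)
    partition: dict   # maintenance parameter, e.g. {"dt": "2021-11-30"}
    location: str     # storage location in the second cluster


def build_maintenance_instruction(item):
    """Return one maintenance instruction for one piece of target data.

    Values are interpolated without escaping, so this sketch is only
    suitable for trusted, well-formed parameters.
    """
    spec = ", ".join("%s='%s'" % (k, v) for k, v in item.partition.items())
    return ("ALTER TABLE %s ADD IF NOT EXISTS PARTITION (%s) LOCATION '%s';"
            % (item.table, spec, item.location))


def build_all(items):
    """Generate an instruction for each piece of target data, in order."""
    return [build_maintenance_instruction(item) for item in items]
```

Because the instructions are generated locally and only executed in the second cluster, the same list can be sent in a single batch.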

3. The data synchronization method according to claim 1, wherein the target device is configured with a synchronization log when the data to be synchronized is synchronized to the second cluster, the synchronization log including a synchronization result of each piece of the data to be synchronized, and the method further comprises:

and determining the at least one piece of target data from the data to be synchronized according to the synchronization log.
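
Claim 3 may be illustrated, under stated assumptions, by the sketch below. The log format (one tab-separated `name` and `status` pair per line) and the rule that target data are the pieces whose status is `SUCCESS` are assumptions of this sketch; the claim only requires that the target data be determined according to the synchronization log.

```python
def select_target_data(log_lines, ok_status="SUCCESS"):
    """Return names of data whose logged synchronization result is ok_status.

    Each log line is assumed to look like "<name>\t<status>"; malformed or
    empty lines are skipped rather than treated as failures.
    """
    targets = []
    for line in log_lines:
        parts = line.rstrip("\n").split("\t")
        if len(parts) != 2:
            continue
        name, status = parts
        if status == ok_status:
            targets.append(name)
    return targets
```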

4. The data synchronization method of claim 1, wherein before determining at least one parameter to be synchronized from the synchronization profile, the method further comprises:

receiving a configuration operation performed by a user on the synchronization configuration file;

and deleting, in response to the configuration operation, the annotation identifier selected by the user.
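
The handling of the annotation identifier in claims 1 and 4 may be sketched as follows, for illustration only. The sketch assumes that each synchronization parameter occupies one line of the synchronization configuration file, that the preset position is the start of the line, and that the annotation identifier is the character `#`; none of these specifics is recited in the claims, and the function names are hypothetical.

```python
MARKER = "#"  # assumed annotation identifier


def pending_parameters(lines, marker=MARKER):
    """Parameters to be synchronized: non-empty lines without the marker."""
    return [ln for ln in lines if ln.strip() and not ln.startswith(marker)]


def mark_synchronized(lines, marker=MARKER):
    """Add the marker to every line that does not carry it yet."""
    return [ln if (not ln.strip() or ln.startswith(marker)) else marker + ln
            for ln in lines]


def reenable(lines, selected, marker=MARKER):
    """Delete the marker from the lines the user selected (claim 4)."""
    result = []
    for ln in lines:
        body = ln[len(marker):]
        if ln.startswith(marker) and body in selected:
            result.append(body)
        else:
            result.append(ln)
    return result
```

Marking a parameter after synchronization keeps it in the file for later reuse while excluding it from the next run, so the user re-enables a data set simply by removing its marker.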

5. A data synchronization apparatus applied to a target device in a first cluster, wherein the target device is configured with a synchronization configuration file, and the synchronization configuration file records respective synchronization parameters of a plurality of candidate data;

the target device determines at least one parameter to be synchronized from the synchronization configuration file, wherein each parameter to be synchronized is a synchronization parameter whose preset position is not configured with an annotation identifier;

the data synchronization apparatus includes:

6. The data synchronization apparatus of claim 5, wherein the manner in which the instruction module generates the data maintenance instruction comprises:

7. A target device belonging to a first cluster, the target device comprising a processor and a memory, the memory storing a computer program which, when executed by the processor, implements the data synchronization method of any of claims 1-4.

8. A computer-readable storage medium, characterized in that the computer-readable storage medium stores a computer program which, when executed by a processor, implements the data synchronization method of any one of claims 1-4.