CN116521692A - Data synchronization method and device, electronic equipment and storage medium - Google Patents

Info

Publication number
CN116521692A
Authority
CN
China
Prior art keywords
data
synchronization
trigger
target table
target
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310450106.0A
Other languages
Chinese (zh)
Inventor
康宏博
钱在晨
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Oceanbase Technology Co Ltd
Original Assignee
Beijing Oceanbase Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Oceanbase Technology Co Ltd
Priority to CN202310450106.0A
Publication of CN116521692A

Classifications

    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/23 Updating
    • G06F16/2379 Updates performed during online database operations; commit processing
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/23 Updating
    • G06F16/2365 Ensuring data consistency and integrity
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/27 Replication, distribution or synchronisation of data between databases or within a distributed database system; Distributed database system architectures therefor

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Security & Cryptography (AREA)
  • Computing Systems (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

One or more embodiments of the present disclosure provide a data synchronization method and apparatus, an electronic device, and a storage medium. The method includes: creating a trigger, where the trigger synchronizes data changes of a source table to a target table and records, in a deletion log, the identifiers of data deleted during synchronization; synchronizing the data in the source table to the target table; and processing the data in the target table according to the identifiers in the deletion log. The trigger performs incremental synchronization between the source table and the target table, and the source table is not locked during full synchronization from the source table to the target table, so full synchronization does not affect business processing. Because the deletion log records the data deleted during incremental synchronization, full synchronization is prevented from overwriting the result of incremental synchronization (for example, data deleted during incremental synchronization being added back to the target table during full synchronization), which ensures that the data in the target table is valid and reflects the latest business processing result.

Description

Data synchronization method and device, electronic equipment and storage medium

Technical Field

One or more embodiments of this specification relate to the technical field of databases, and in particular to a data synchronization method and device, electronic equipment, and a storage medium.

Background

With the rapid development of the Internet and information technology, data is being generated at an explosive rate, which places ever higher demands on databases and their management. A database stores a large number of data tables, and the structure of those tables often changes when applications are upgraded or the business changes. When a table structure changes, the data needs to be synchronized, and the synchronization process usually requires locking the table, that is, concurrent DML (Data Manipulation Language) is not allowed. This affects business processing and causes problems such as business request timeouts.

Summary of the Invention

In view of this, one or more embodiments of this specification provide a data synchronization method and device, electronic equipment, and a storage medium.

To achieve the above purpose, one or more embodiments of this specification provide the following technical solutions:

According to a first aspect of one or more embodiments of this specification, a data synchronization method is proposed, the method comprising:

creating a trigger, where the trigger is used to synchronize data changes of a source table to a target table and to record, in a deletion log, the identifiers of data deleted during synchronization;

synchronizing the data in the source table to the target table;

processing the data in the target table according to the identifiers in the deletion log.

In one embodiment of this specification, creating the trigger includes:

creating an Insert trigger, a Delete trigger, and an Update trigger, where the Insert trigger is used to synchronize data insertion events occurring in the source table to the target table, the Delete trigger is used to synchronize data deletion events occurring in the source table to the target table, and the Update trigger is used to synchronize data update events occurring in the source table to the target table.

In one embodiment of this specification, the Insert trigger is further used to delete, from the deletion log, the identifier of the data inserted in the data insertion event.

In one embodiment of this specification, the Delete trigger is further used to add, to the deletion log, the identifier of the data deleted in the data deletion event.

In one embodiment of this specification, the Update trigger is further used to add, to the deletion log, the identifier of the pre-update data in the data update event, and to delete, from the deletion log, the identifier of the post-update data in the data update event.
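The bookkeeping that the three triggers perform on the deletion log can be sketched with a small in-memory model. This is only an illustration under stated assumptions: the dictionaries, the set, and the function names `on_insert`, `on_delete`, and `on_update` are hypothetical stand-ins for tables and triggers, not the patent's code.

```python
# Hypothetical in-memory model: tables are dicts keyed by row identifier,
# and the deletion log is a set of identifiers.
source: dict[int, str] = {}
target: dict[int, str] = {}
delete_log: set[int] = set()


def on_insert(row_id: int, value: str) -> None:
    """Insert trigger: mirror the row and clear any stale log entry."""
    source[row_id] = value
    target[row_id] = value
    delete_log.discard(row_id)


def on_delete(row_id: int) -> None:
    """Delete trigger: mirror the deletion and log the identifier."""
    source.pop(row_id, None)
    target.pop(row_id, None)
    delete_log.add(row_id)


def on_update(old_id: int, new_id: int, value: str) -> None:
    """Update trigger: log the pre-update identifier, unlog the post-update one."""
    source.pop(old_id, None)
    target.pop(old_id, None)
    delete_log.add(old_id)
    source[new_id] = value
    target[new_id] = value
    delete_log.discard(new_id)


on_insert(1, "a")
on_delete(1)          # the log now holds identifier 1
on_insert(1, "a2")    # re-inserting clears the entry
on_update(1, 2, "b")  # the identifier changes from 1 to 2
print(sorted(delete_log))  # [1]
```

When an update leaves the identifier unchanged, the add followed by the discard cancels out, so the row is not marked as deleted.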

In one embodiment of this specification, processing the data in the target table according to the identifiers in the deletion log includes:

deleting, from the target table, the data whose identifier appears in the deletion log.
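As a rough illustration of this post-processing step, the sketch below uses SQLite through Python's `sqlite3` module. The table names `A_GHO` and `DELETE_LOG` follow the naming used later in the description; the `id` and `name` columns and the use of SQLite are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE A_GHO (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE DELETE_LOG (id INTEGER PRIMARY KEY);
    INSERT INTO A_GHO VALUES (1, 'a'), (2, 'b'), (3, 'c');
    INSERT INTO DELETE_LOG VALUES (2), (9);
""")

# Remove every target row whose identifier is recorded in the deletion log.
cur = conn.execute("DELETE FROM A_GHO WHERE id IN (SELECT id FROM DELETE_LOG)")
conn.commit()

remaining = [row[0] for row in conn.execute("SELECT id FROM A_GHO ORDER BY id")]
print(cur.rowcount, remaining)  # 1 [1, 3]
```

Identifier 9 is in the log but not in the target table, so it simply matches nothing.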

In one embodiment of this specification, synchronizing the data in the source table to the target table includes:

dividing the data in the source table according to a preset size of a data synchronization block to obtain multiple data synchronization blocks;

constructing, for each of the multiple data synchronization blocks, a synchronization task for synchronizing that block to the target table, and generating a task queue from the resulting synchronization tasks.
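The division into blocks and the task queue can be sketched as follows. The function name `build_task_queue`, the use of sorted integer identifiers, and a block size measured in rows are assumptions for illustration only.

```python
from collections import deque


def build_task_queue(row_ids: list[int], block_size: int) -> deque[list[int]]:
    """Split identifiers into blocks of at most block_size rows, one task per block."""
    if block_size <= 0:
        raise ValueError("block_size must be positive")
    ordered = sorted(row_ids)
    return deque(
        ordered[i:i + block_size] for i in range(0, len(ordered), block_size)
    )


queue = build_task_queue([5, 1, 4, 2, 3], block_size=2)
print(list(queue))  # [[1, 2], [3, 4], [5]]
```

A worker would then pop tasks from the left of the queue and copy each block to the target table.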

In one embodiment of this specification, the source table includes multiple partition tables, and each partition table is stored in one partition of a distributed database;

dividing the data in the source table according to the preset size of the data synchronization block to obtain multiple data synchronization blocks includes:

determining the upper and lower identifier bounds of the data in each partition table;

for each partition table, dividing the data in the partition table according to its upper and lower identifier bounds and the preset size, to obtain at least one data synchronization block corresponding to the partition table.
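A minimal sketch of per-partition division, assuming integer identifiers and a preset size expressed as an identifier span (both assumptions, as are the function name `split_partition` and the partition names):

```python
def split_partition(lower: int, upper: int, block_size: int) -> list[tuple[int, int]]:
    """Cover the inclusive identifier range [lower, upper] with ranges of width block_size."""
    if block_size <= 0:
        raise ValueError("block_size must be positive")
    blocks = []
    start = lower
    while start <= upper:
        end = min(start + block_size - 1, upper)
        blocks.append((start, end))
        start = end + 1
    return blocks


# Hypothetical (lower, upper) identifier bounds for two source partitions.
partitions = {"p0": (1, 250), "p1": (1000, 1099)}
plan = {name: split_partition(lo, hi, 100) for name, (lo, hi) in partitions.items()}
print(plan)
# {'p0': [(1, 100), (101, 200), (201, 250)], 'p1': [(1000, 1099)]}
```

Each partition yields at least one block as long as its lower bound does not exceed its upper bound.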

In one embodiment of this specification, the target table includes multiple partition tables, and each partition table is stored in one partition of a distributed database;

the method further includes:

for each data synchronization block, dividing the block according to the partition table of the target table to which each piece of data in the block corresponds, to obtain at least one data synchronization sub-block corresponding to the block;

constructing, for each of the multiple data synchronization blocks, a synchronization task for synchronizing that block to the target table includes:

constructing, for each of the multiple data synchronization sub-blocks, a synchronization task for synchronizing that sub-block to the target table.
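Splitting a block into sub-blocks by target partition can be sketched as a grouping step. The function name `split_by_target_partition` and the modulo rule used as the partitioning function are hypothetical; the text does not specify how rows map to target partitions.

```python
from collections import defaultdict
from typing import Callable, Hashable


def split_by_target_partition(
    block: list[int], partition_of: Callable[[int], Hashable]
) -> dict[Hashable, list[int]]:
    """Group the identifiers of one block by the target partition each row maps to."""
    groups: defaultdict[Hashable, list[int]] = defaultdict(list)
    for row_id in block:
        groups[partition_of(row_id)].append(row_id)
    return dict(groups)


# Assumed partitioning rule for the target table: identifier modulo 2.
sub_blocks = split_by_target_partition([1, 2, 3, 4, 5], lambda row_id: row_id % 2)
print(sub_blocks)  # {1: [1, 3, 5], 0: [2, 4]}
```

Each resulting sub-block touches a single target partition, so one synchronization task writes to one partition.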

In one embodiment of this specification, the method further includes:

creating, according to a table structure change instruction, a shadow table of the source table and modifying the structure of the shadow table to obtain the target table.
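A small sketch of this step using SQLite through Python's `sqlite3` module. The names `A` and `A_GHO` follow the naming used later in the description; the columns and the added `email` column are invented for the example, and SQLite is only a stand-in for the actual database.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE A (id INTEGER PRIMARY KEY, name TEXT)")

# Shadow table: same structure as the source table, initially empty.
conn.execute("CREATE TABLE A_GHO (id INTEGER PRIMARY KEY, name TEXT)")

# Apply the requested structural change to the shadow table only.
conn.execute("ALTER TABLE A_GHO ADD COLUMN email TEXT")

columns = [row[1] for row in conn.execute("PRAGMA table_info(A_GHO)")]
print(columns)  # ['id', 'name', 'email']
```

The source table is left untouched, so the business layer keeps reading and writing it while the modified shadow table is filled.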

In one embodiment of this specification, the method further includes:

in response to the data in the target table satisfying a synchronization condition with respect to the data in the source table, switching the table names of the source table and the target table.
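The name switch can be illustrated with three renames in SQLite. The synchronization condition used here (equal identifier sets) and the temporary name `A_OLD` are assumptions; the text does not define the condition or the renaming mechanism at this point.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE A (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE A_GHO (id INTEGER PRIMARY KEY, name TEXT, email TEXT);
    INSERT INTO A VALUES (1, 'x');
    INSERT INTO A_GHO VALUES (1, 'x', NULL);
""")


def in_sync(db: sqlite3.Connection) -> bool:
    """Assumed synchronization condition: both tables hold the same identifiers."""
    src = {row[0] for row in db.execute("SELECT id FROM A")}
    dst = {row[0] for row in db.execute("SELECT id FROM A_GHO")}
    return src == dst


def column_names(db: sqlite3.Connection, table: str) -> list[str]:
    return [row[1] for row in db.execute(f"PRAGMA table_info({table})")]


if in_sync(conn):
    # Swap the two names through a temporary name.
    conn.execute("ALTER TABLE A RENAME TO A_OLD")
    conn.execute("ALTER TABLE A_GHO RENAME TO A")
    conn.execute("ALTER TABLE A_OLD RENAME TO A_GHO")

print(column_names(conn, "A"))      # ['id', 'name', 'email']
print(column_names(conn, "A_GHO"))  # ['id', 'name']
```

After the swap, the name `A` refers to the table with the new structure, so the business layer needs no change to the table name it uses.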

In one embodiment of this specification, switching the table names of the source table and the target table includes:

in response to the time elapsed since an initial moment or since the previous session count reaching a preset duration, determining the number of sessions between the business layer and the source table, where the initial moment includes the moment at which the data in the target table satisfies the synchronization condition with respect to the data in the source table;

in response to the number of sessions between the business layer and the source table being less than a quantity threshold, killing all sessions between the business layer and the source table and switching the table names of the source table and the target table.

In one embodiment of this specification, the method further includes:

in response to the number of session counts reaching a count threshold, with the number of sessions between the business layer and the source table in every session count being not less than the quantity threshold, determining that the table structure change has failed.
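The periodic session check and the failure rule can be sketched as a bounded retry loop. Everything named here (`try_switch`, the callbacks, the sample session counts) is hypothetical; the callbacks stand in for counting sessions, waiting the preset duration, and killing sessions plus swapping names.

```python
from typing import Callable


def try_switch(
    count_sessions: Callable[[], int],
    session_limit: int,
    max_checks: int,
    switch: Callable[[], None],
    wait: Callable[[], None] = lambda: None,
) -> bool:
    """Return True if the names were switched, False if the change is judged failed."""
    for _ in range(max_checks):
        wait()  # stands in for the preset interval between session counts
        if count_sessions() < session_limit:
            switch()  # stands in for killing sessions and swapping the names
            return True
    # Every count was at or above the limit: the structure change fails.
    return False


samples = iter([12, 9, 3])
events: list[str] = []
ok = try_switch(
    lambda: next(samples),
    session_limit=5,
    max_checks=3,
    switch=lambda: events.append("switched"),
)
print(ok, events)  # True ['switched']
```

Waiting for a quiet moment keeps the number of killed sessions small, while the cap on checks prevents the change from waiting indefinitely on a busy table.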

According to a second aspect of one or more embodiments of this specification, a data synchronization device is proposed, the device comprising:

an incremental synchronization module, configured to create a trigger, where the trigger is used to synchronize data changes of a source table to a target table and to record, in a deletion log, the identifiers of data deleted during synchronization;

a full synchronization module, configured to synchronize the data in the source table to the target table;

a processing module, configured to process the data in the target table according to the identifiers in the deletion log.

In one embodiment of this specification, the incremental synchronization module is configured to:

create an Insert trigger, a Delete trigger, and an Update trigger, where the Insert trigger is used to synchronize data insertion events occurring in the source table to the target table, the Delete trigger is used to synchronize data deletion events occurring in the source table to the target table, and the Update trigger is used to synchronize data update events occurring in the source table to the target table.

In one embodiment of this specification, the Insert trigger is further used to delete, from the deletion log, the identifier of the data inserted in the data insertion event.

In one embodiment of this specification, the Delete trigger is further used to add, to the deletion log, the identifier of the data deleted in the data deletion event.

In one embodiment of this specification, the Update trigger is further used to add, to the deletion log, the identifier of the pre-update data in the data update event, and to delete, from the deletion log, the identifier of the post-update data in the data update event.

In one embodiment of this specification, the processing module is configured to:

delete, from the target table, the data whose identifier appears in the deletion log.

In one embodiment of this specification, the full synchronization module is configured to:

divide the data in the source table according to a preset size of a data synchronization block to obtain multiple data synchronization blocks;

construct, for each of the multiple data synchronization blocks, a synchronization task for synchronizing that block to the target table, and generate a task queue from the resulting synchronization tasks.

In one embodiment of this specification, the source table includes multiple partition tables, and each partition table is stored in one partition of a distributed database;

when dividing the data in the source table according to the preset size of the data synchronization block to obtain multiple data synchronization blocks, the full synchronization module is specifically configured to:

determine the upper and lower identifier bounds of the data in each partition table;

for each partition table, divide the data in the partition table according to its upper and lower identifier bounds and the preset size, to obtain at least one data synchronization block corresponding to the partition table.

In one embodiment of this specification, the target table includes multiple partition tables, and each partition table is stored in one partition of a distributed database;

the full synchronization module is further configured to:

for each data synchronization block, divide the block according to the partition table of the target table to which each piece of data in the block corresponds, to obtain at least one data synchronization sub-block corresponding to the block;

when constructing, for each of the multiple data synchronization blocks, a synchronization task for synchronizing that block to the target table, the full synchronization module is specifically configured to:

construct, for each of the multiple data synchronization sub-blocks, a synchronization task for synchronizing that sub-block to the target table.

In one embodiment of this specification, the device further includes a structure change module, configured to:

create, according to a table structure change instruction, a shadow table of the source table and modify the structure of the shadow table to obtain the target table.

In one embodiment of this specification, the device further includes a table name switching module, configured to:

in response to the data in the target table satisfying a synchronization condition with respect to the data in the source table, switch the table names of the source table and the target table.

In one embodiment of this specification, the table name switching module is specifically configured to:

in response to the time elapsed since an initial moment or since the previous session count reaching a preset duration, determine the number of sessions between the business layer and the source table, where the initial moment includes the moment at which the data in the target table satisfies the synchronization condition with respect to the data in the source table;

in response to the number of sessions between the business layer and the source table being less than a quantity threshold, kill all sessions between the business layer and the source table and switch the table names of the source table and the target table.

In one embodiment of this specification, the table name switching module is further configured to:

in response to the number of session counts reaching a count threshold, with the number of sessions between the business layer and the source table in every session count being not less than the quantity threshold, determine that the table structure change has failed.

According to a third aspect of one or more embodiments of this specification, an electronic device is proposed, including:

a processor;

a memory for storing processor-executable instructions;

where the processor implements the method according to the first aspect by running the executable instructions.

According to a fourth aspect of one or more embodiments of this specification, a computer-readable storage medium is proposed, on which computer instructions are stored; when executed by a processor, the instructions implement the steps of the method according to the first aspect.

The technical solutions provided by the embodiments of this specification may include the following beneficial effects:

The data synchronization method provided by the embodiments of this specification first creates a trigger, which synchronizes data changes to the target table whenever the data in the source table changes and records, in a deletion log, the identifiers of data deleted during synchronization. The data in the source table is then synchronized to the target table, and finally the data in the target table is processed according to the identifiers in the deletion log. In this method, the trigger performs incremental synchronization between the source table and the target table, and the source table is not locked during full synchronization from the source table to the target table, so full synchronization does not affect business processing. Moreover, because the deletion log records the data deleted during incremental synchronization, full synchronization is prevented from overwriting the result of incremental synchronization (for example, data deleted during incremental synchronization being added back to the target table during full synchronization), which ensures that the data in the target table is valid and reflects the latest business processing result.

Brief Description of the Drawings

Fig. 1 is a flowchart of a data synchronization method provided by an exemplary embodiment.

Fig. 2 is a schematic diagram of the operations performed by the triggers on the deletion log in a data synchronization method provided by an exemplary embodiment.

Fig. 3 is a schematic diagram of the full synchronization process in a data synchronization method provided by an exemplary embodiment.

Fig. 4 is a schematic diagram of a table structure change method provided by an exemplary embodiment.

Fig. 5 is a schematic structural diagram of a device provided by an exemplary embodiment.

Fig. 6 is a block diagram of a data synchronization device provided by an exemplary embodiment.

Detailed Description

Exemplary embodiments are described in detail here, with examples illustrated in the accompanying drawings. Where the following description refers to the drawings, the same numerals in different drawings denote the same or similar elements unless otherwise indicated. The implementations described in the following exemplary embodiments do not represent all implementations consistent with one or more embodiments of this specification. Rather, they are merely examples of devices and methods consistent with some aspects of one or more embodiments of this specification, as detailed in the appended claims.

It should be noted that, in other embodiments, the steps of the corresponding method are not necessarily performed in the order shown and described in this specification. In some other embodiments, the method may include more or fewer steps than described in this specification. In addition, a single step described in this specification may be split into multiple steps in other embodiments, and multiple steps described in this specification may be combined into a single step in other embodiments.

With the rapid development of the Internet and information technology, data is being generated at an explosive rate, which places ever higher demands on databases and their management. A database stores a large number of data tables, and the structure of those tables often changes when applications are upgraded, the business changes, or the data is distributed very unevenly across partition tables. When a table structure changes, the data needs to be synchronized, and the synchronization process usually requires locking the table, that is, concurrent DML (Data Manipulation Language) is not allowed. This affects business processing and causes problems such as business request timeouts.

Based on this, in a first aspect, at least one embodiment of this specification provides a data synchronization method that can synchronize data between a source table and a target table when the structure of a database table changes. The source table, which has the structure before the change, may be stored in a centralized database or a distributed database, and the target table, which has the structure after the change, may likewise be stored in a centralized database or a distributed database. Data synchronization may include full synchronization and incremental synchronization: full synchronization means synchronizing all data in the source table to the target table, and incremental synchronization means synchronizing to the target table the data changes that occur on the source table during full synchronization. The method does not need to lock the source table during full synchronization, so the business layer can continue to operate on business data in the source table. Incremental synchronization is immediate: as soon as the data in the source table changes, the change is synchronized to the target table. Furthermore, although the relative timing of full synchronization and incremental synchronization is not constrained, the data in the target table after synchronization completes is the latest business processing result, and the problem of full synchronization overwriting incremental synchronization and leaving stale data does not arise.

Referring to Fig. 1, which shows an example flow of the method, the method includes steps S101 to S103.

In step S101, a trigger is created, where the trigger is used to synchronize data changes of the source table to the target table and to record, in a deletion log, the identifiers of data deleted during synchronization.

A trigger is a business rule for ensuring data integrity. It is a special stored procedure associated with table events: it is not called by a program or started manually, but is fired by an event. For example, it is activated when an operation on a table causes the table's data to change.

The trigger created in this step is used to ensure the data integrity of the target table. Because the target table is the table that receives the data synchronized from the source table, and the business layer operates on the source table, the trigger in this step can also be said to ensure data consistency between the target table and the source table. The trigger fires when the data in the source table changes as a result of a business-layer operation and synchronizes that change to the target table, so that the target table undergoes the same data change as the source table, which keeps the two consistent. While synchronizing data changes from the source table to the target table, the trigger may cause some data in the target table to be deleted. If the incremental synchronization performed by the trigger happens before full synchronization, full synchronization may add the deleted data back to the target table. To avoid this, the trigger records the identifiers of the deleted data in advance, so that the target table can be checked further after full synchronization.
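The interleaving described above can be made concrete with a small simulation. It assumes, purely for illustration, that full synchronization reads a block from the unlocked source table and writes it to the target table slightly later; tables are modeled as dictionaries keyed by identifier, and the deletion log as a set.

```python
source = {1: "a", 2: "b"}
target: dict[int, str] = {}
delete_log: set[int] = set()

# Full synchronization reads a block from the unlocked source table.
block = dict(source)

# Before the block is written, the business layer deletes row 2; the Delete
# trigger mirrors the deletion to the target and logs the identifier.
del source[2]
target.pop(2, None)
delete_log.add(2)

# Full synchronization now writes the block it read earlier, which
# brings the already deleted row 2 back into the target table.
for row_id, value in block.items():
    target.setdefault(row_id, value)
stale_target = dict(target)

# Post-processing: drop every target row listed in the deletion log.
for row_id in delete_log:
    target.pop(row_id, None)

print(stale_target)  # {1: 'a', 2: 'b'}
print(target)        # {1: 'a'}
```

Without the deletion log, the target table would keep row 2 even though the source table no longer contains it.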

It can be understood that the data in the source table is organized as multiple rows and columns, where each column is called a field and each row is called a data row. The fields of the source table may include a primary key and/or unique keys, and the primary key or one of the unique keys can be predefined as the identifier of a data row. Data changes in the source table may occur at the granularity of data rows, and the identifier of each data row may be the predefined primary key or unique key of that row.

For example, the data changes of the source table may include data insertion events (such as inserting a data row into the source table), data deletion events (such as deleting a data row from the source table), and data update events (such as changing the identifier of a data row in the source table to another identifier). Therefore, creating the trigger in this step may include creating an Insert trigger, a Delete trigger, and an Update trigger.

The Delete trigger is used to synchronize a data deletion event occurring in the source table to the target table, that is, to delete from the target table the data deleted in the data deletion event. In addition, the Delete trigger is used to add the identifier of the data deleted in the data deletion event to the deletion log. For example, the code of the Delete trigger may be:

In the above code, A is the source table, A_GHO is the target table, and DELETE_LOG is the deletion log.

The Insert trigger is used to synchronize a data insertion event occurring in the source table to the target table, that is, to insert into the target table the data inserted in the data insertion event. In addition, the Insert trigger is used to delete from the deletion log the identifier of the data inserted in the data insertion event: if that identifier exists in the deletion log, it is deleted; if it does not exist, no deletion is performed. In other words, if the inserted data previously existed in the source table and was deleted, its identifier was recorded in the deletion log. Now that the data has been re-inserted, it exists in the source table according to the latest business processing result, so its identifier no longer needs to be kept in the deletion log. For example, the code of the Insert trigger may be:

In the above code, A is the source table and A_GHO is the target table.

The Update trigger is used to synchronize a data update event occurring in the source table to the target table, that is, to delete the pre-update data from the target table and insert the post-update data into the target table. In addition, the Update trigger is used to add the identifier of the pre-update data to the deletion log and to delete the identifier of the post-update data from the deletion log. It can be understood that a data update event is equivalent to a deletion event for the pre-update data followed by an insertion event for the post-update data. Following the principles of the Delete trigger and the Insert trigger respectively, the identifier of the pre-update data is therefore added to the deletion log and the identifier of the post-update data is deleted from it. For example, the code of the Update trigger may be:

In the above code, A is the source table, A_GHO is the target table, and DELETE_LOG is the deletion log.
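The trigger listings themselves do not appear in this text. By way of illustration only, the following sketch reproduces the behavior described above for the three triggers using Python's built-in sqlite3 module. It is not the syntax of any particular production database: the column names `id` and `val`, the trigger names, and the function name `create_sync_triggers` are assumptions introduced for the example, and SQLite's `INSERT OR REPLACE` stands in for the Replace into statement mentioned in this description.

```python
import sqlite3


def create_sync_triggers(conn: sqlite3.Connection) -> None:
    """Create source table A, target table A_GHO, DELETE_LOG and three triggers.

    The schema (id, val) is hypothetical; only the table names come from the text.
    """
    conn.executescript(
        """
        CREATE TABLE A (id INTEGER PRIMARY KEY, val TEXT);
        CREATE TABLE A_GHO (id INTEGER PRIMARY KEY, val TEXT);
        CREATE TABLE DELETE_LOG (id INTEGER PRIMARY KEY);

        -- Delete trigger: remove the row from the target and log its identifier.
        CREATE TRIGGER a_del AFTER DELETE ON A BEGIN
            DELETE FROM A_GHO WHERE id = OLD.id;
            INSERT OR REPLACE INTO DELETE_LOG (id) VALUES (OLD.id);
        END;

        -- Insert trigger: write the row to the target and clear its log entry.
        CREATE TRIGGER a_ins AFTER INSERT ON A BEGIN
            INSERT OR REPLACE INTO A_GHO (id, val) VALUES (NEW.id, NEW.val);
            DELETE FROM DELETE_LOG WHERE id = NEW.id;
        END;

        -- Update trigger: a delete of the old row plus an insert of the new row.
        -- The old identifier is logged only when the identifier actually changed.
        CREATE TRIGGER a_upd AFTER UPDATE ON A BEGIN
            DELETE FROM A_GHO WHERE id = OLD.id AND OLD.id <> NEW.id;
            INSERT OR REPLACE INTO DELETE_LOG (id)
                SELECT OLD.id WHERE OLD.id <> NEW.id;
            INSERT OR REPLACE INTO A_GHO (id, val) VALUES (NEW.id, NEW.val);
            DELETE FROM DELETE_LOG WHERE id = NEW.id;
        END;
        """
    )
```

With these triggers in place, deleting a row from A removes it from A_GHO and leaves its identifier in DELETE_LOG, while re-inserting the same identifier clears the log entry again.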

Of the three triggers above, when the Delete trigger or the Insert trigger fires, the identifier (that is, the primary key or unique key) of some data row in the source table has necessarily changed. For example, when the Delete trigger fires, a data row is deleted and its identifier is deleted with it; when the Insert trigger fires, a data row is inserted and its identifier is inserted with it.

When the Update trigger fires, by contrast, some data row in the source table has necessarily changed, but its identifier (that is, the primary key or unique key) has not necessarily changed. Only when the identifier of the data row changes does the trigger need to add the identifier of the pre-update data to the deletion log and delete the identifier of the post-update data from it. When the identifier does not change, no deletion-log entry is needed, for the following reason. The incremental synchronization performed by the trigger writes the updated row data (which includes the identifier and the contents of the other fields) to the target table with Replace into, whereas the full synchronization writes the pre-update row data with Insert ignore, which does not insert a row whose identifier already exists in the target table. Consequently, if the full synchronization of the row occurs first and the incremental synchronization second, the incremental result overwrites the full-synchronization result. If the incremental synchronization occurs first and the full synchronization second, the full-synchronization insert is skipped, so the incremental result is not overwritten and is retained.
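The order-independence argued above can be checked with a small experiment. The sketch below is illustrative only: it uses Python's sqlite3 module, where `INSERT OR REPLACE` and `INSERT OR IGNORE` play the roles of Replace into and Insert ignore, and the function name `final_value` and the column `val` are assumptions made for the example.

```python
import sqlite3


def final_value(incremental_first: bool) -> str:
    """Return the target-table value of row 1 after both writes, in either order."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE A_GHO (id INTEGER PRIMARY KEY, val TEXT)")

    def full_sync() -> None:
        # Full synchronization carries the pre-update row and never overwrites.
        conn.execute("INSERT OR IGNORE INTO A_GHO VALUES (1, 'old')")

    def incremental_sync() -> None:
        # Incremental synchronization carries the updated row and always wins.
        conn.execute("INSERT OR REPLACE INTO A_GHO VALUES (1, 'new')")

    steps = (incremental_sync, full_sync) if incremental_first else (full_sync, incremental_sync)
    for step in steps:
        step()
    return conn.execute("SELECT val FROM A_GHO WHERE id = 1").fetchone()[0]
```

In both orders the target table ends up holding the updated value, which is why an update that leaves the identifier unchanged needs no entry in the deletion log.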

Please refer to FIG. 2, which illustrates the operations of the three triggers on the deletion log (DELETE_LOG). The deletion log makes it unnecessary to lock the source table. In particular, data synchronization during a table structure change in a distributed database can proceed without locking the source table, so that normal business processing is not affected while the table structure is changed.

In step S102, the data in the source table is synchronized to the target table.

This step completes the full synchronization from the source table to the target table. Because the source table is not locked during the full synchronization (that is, no shared lock is added), a business-layer operation on the source table during the full synchronization changes the data in the source table and fires the trigger created in step S101, which performs incremental synchronization. It can be understood that the order of full synchronization and incremental synchronization for a given data row is not deterministic. The full synchronization may occur first and the incremental synchronization second: for example, after a data row of the source table has been fully synchronized to the target table, a business-layer operation changes that row, and the trigger then performs incremental synchronization on it. Alternatively, the incremental synchronization may occur first and the full synchronization second: for example, a data row of the source table has been read into memory by the full synchronization but not yet written to the target table when a business-layer operation changes that row; the trigger performs incremental synchronization on the row, and only afterwards is the pre-update data held in memory written to the target table.

Next, the full synchronization process in this step is described in detail by way of example with reference to FIG. 3.

First, the data in the source table is divided according to a preset size of a data synchronization block to obtain a plurality of data synchronization blocks.

The preset size of a data synchronization block may be expressed as a number N of data rows.

Optionally, if the source table is stored in a centralized database, the upper and lower identifier bounds of the source table may be determined first, for example by sorting the primary key or unique key that serves as the identifier in the source table and determining the bounds from the sorted result. All data rows in the source table may then be divided according to the upper and lower identifier bounds and the number N of data rows per data synchronization block, yielding a plurality of data synchronization blocks each containing N data rows. For example, suppose the primary key id serves as the identifier of a data row, the sorted primary key ids in the source table run from 1 to 100, and the preset size of a data synchronization block is 10 data rows. The division then yields ten data synchronization blocks: block 1 with ids 1 to 10, block 2 with ids 11 to 20, block 3 with ids 21 to 30, block 4 with ids 31 to 40, block 5 with ids 41 to 50, block 6 with ids 51 to 60, block 7 with ids 61 to 70, block 8 with ids 71 to 80, block 9 with ids 81 to 90, and block 10 with ids 91 to 100.
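The division by identifier bounds can be sketched as follows. This is a minimal illustration in Python, assuming contiguous integer identifiers as in the example above; the function name `chunk_ranges` and its parameters are hypothetical and not part of the described method.

```python
from typing import List, Tuple


def chunk_ranges(lower: int, upper: int, n: int) -> List[Tuple[int, int]]:
    """Split the inclusive identifier range [lower, upper] into blocks of n ids.

    The last block may be shorter when the range is not a multiple of n.
    """
    if n <= 0:
        raise ValueError("block size must be positive")
    return [
        (start, min(start + n - 1, upper))
        for start in range(lower, upper + 1, n)
    ]
```

Calling `chunk_ranges(1, 100, 10)` reproduces the ten blocks of the example, from `(1, 10)` through `(91, 100)`. If identifiers are sparse, a range-based split of this kind yields at most N rows per block rather than exactly N.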

Optionally, if the source table is stored in a distributed database, that is, the source table includes a plurality of partition tables each stored in one partition of the distributed database, the upper and lower identifier bounds of the data in each partition table may be determined separately, for example by sorting the primary key or unique key that serves as the identifier in the partition table and determining the bounds from the sorted result. Then, for each partition table, the data in the partition table may be divided according to the upper and lower identifier bounds of that partition table and the preset size, yielding at least one data synchronization block corresponding to the partition table. For example, suppose the primary key id serves as the identifier of a data row and the source table includes partition table 1 and partition table 2, where the sorted primary key ids of partition table 1 run from 1 to 100 and those of partition table 2 run from 101 to 200, and the preset size of a data synchronization block is 10 data rows. The division then yields ten data synchronization blocks for partition table 1 (block 1 with ids 1 to 10, block 2 with ids 11 to 20, block 3 with ids 21 to 30, block 4 with ids 31 to 40, block 5 with ids 41 to 50, block 6 with ids 51 to 60, block 7 with ids 61 to 70, block 8 with ids 71 to 80, block 9 with ids 81 to 90, and block 10 with ids 91 to 100) and ten data synchronization blocks for partition table 2 (block 11 with ids 101 to 110, block 12 with ids 111 to 120, block 13 with ids 121 to 130, block 14 with ids 131 to 140, block 15 with ids 141 to 150, block 16 with ids 151 to 160, block 17 with ids 161 to 170, block 18 with ids 171 to 180, block 19 with ids 181 to 190, and block 20 with ids 191 to 200).

It can further be understood that if the source table is stored in a distributed database, the target table may also be stored in a distributed database, that is, the target table includes a plurality of partition tables each stored in one partition of the distributed database. In this case, each data synchronization block may be further divided according to the partition table of the target table to which each piece of data in the block corresponds, yielding at least one data sub-synchronization block for the data synchronization block. For example, in a data synchronization block with primary key ids 1 to 10, ids 1, 2, 3, and 6 correspond to partition table 1 of the target table, ids 4, 5, and 7 correspond to partition table 2 of the target table, and ids 8, 9, and 10 correspond to partition table 3 of the target table. The block is therefore further divided into three data sub-synchronization blocks: sub-block 1 consisting of ids 1, 2, 3, and 6, sub-block 2 consisting of ids 4, 5, and 7, and sub-block 3 consisting of ids 8, 9, and 10.
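The further division into data sub-synchronization blocks amounts to grouping the identifiers of one block by their target partition. The following Python sketch illustrates this under the assumption that a lookup function from identifier to target partition is available; `split_by_target_partition` and `partition_of` are hypothetical names introduced for the example.

```python
from collections import defaultdict
from typing import Callable, Dict, Iterable, List


def split_by_target_partition(
    block_ids: Iterable[int],
    partition_of: Callable[[int], int],
) -> Dict[int, List[int]]:
    """Group the identifiers of one synchronization block by target partition."""
    sub_blocks: Dict[int, List[int]] = defaultdict(list)
    for row_id in block_ids:
        sub_blocks[partition_of(row_id)].append(row_id)
    return dict(sub_blocks)
```

Applied to ids 1 to 10 with the partition assignment of the example above, the function returns three groups: `[1, 2, 3, 6]` for partition 1, `[4, 5, 7]` for partition 2, and `[8, 9, 10]` for partition 3.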

Next, for each of the plurality of data synchronization blocks, a synchronization task for synchronizing the block to the target table is constructed, and a task queue is generated from the resulting synchronization tasks.

Exemplarily, the producer side, which puts data synchronization blocks from the source table into the task queue, and the consumer side, which takes data synchronization blocks out of the queue and adds them to the target table, may run at the same time, which improves the efficiency of the full synchronization. When the task queue is executed, multiple synchronization tasks may also be executed simultaneously, which further improves the efficiency of the full synchronization.
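One possible shape of such a producer and consumer arrangement is sketched below with Python's standard queue and threading modules. It is an illustration under stated assumptions rather than the implementation of this embodiment: `run_full_sync`, `copy_block`, and `workers` are hypothetical names, and `copy_block` stands for whatever routine writes one block to the target table.

```python
import queue
import threading
from typing import Callable, Iterable, Tuple

Block = Tuple[int, int]


def run_full_sync(
    blocks: Iterable[Block],
    copy_block: Callable[[Block], None],
    workers: int = 4,
) -> None:
    """Feed blocks into a task queue while worker threads consume them."""
    tasks: "queue.Queue" = queue.Queue()

    def consumer() -> None:
        while True:
            block = tasks.get()
            if block is None:  # sentinel: no more work for this worker
                return
            copy_block(block)

    threads = [threading.Thread(target=consumer) for _ in range(workers)]
    for thread in threads:
        thread.start()
    # Producer side: enqueue blocks while the consumers are already running.
    for block in blocks:
        tasks.put(block)
    for _ in threads:
        tasks.put(None)
    for thread in threads:
        thread.join()
```

Because the workers start before the producer loop, enqueuing and copying overlap, and several blocks are copied at once. Error handling inside `copy_block` is left out of the sketch.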

It can be understood that if the target table is stored in a distributed database and the data synchronization blocks have been further divided into data sub-synchronization blocks, then in this step a synchronization task for synchronizing to the target table may be constructed for each of the data sub-synchronization blocks, and a task queue is generated from the resulting synchronization tasks.

In the full synchronization process described above, the source table is divided, and possibly further divided, into a plurality of data synchronization blocks or data sub-synchronization blocks, and a synchronization task is constructed for each of them to generate the task queue. Different data in the source table can therefore be synchronized in parallel, and the production and consumption of the task queue also proceed in parallel, which improves the efficiency of the full synchronization. The improvement is especially significant for full data synchronization during a table structure change in a distributed database.

In step S103, the data in the target table is processed according to the identifiers in the deletion log.

This step corrects the data rows in which the full-synchronization result has overwritten the incremental-synchronization result, restoring those rows to the incremental-synchronization result. The latest business processing result is thereby preserved and the target table is kept consistent with the source table.

Exemplarily, the data in the target table whose identifier appears in the deletion log may be deleted. It can be understood that if an identifier in the deletion log exists in the target table, the data row with that identifier is deleted from the target table; if an identifier in the deletion log does not exist in the target table, no deletion is needed for that identifier. For example, suppose the data row with primary key id n in the source table is deleted or has its identifier changed, so that identifier n is recorded in the deletion log, and suppose that for this row the incremental synchronization was performed first and the full synchronization second. At the time of the incremental synchronization there was no data row with identifier n in the target table, so the incremental synchronization had no effect; the subsequent full synchronization then added the data row with identifier n to the target table, overwriting the result of the incremental synchronization. In this step, because identifier n is recorded in the deletion log, the data row with identifier n is deleted from the target table, so that the incremental-synchronization result is restored and is not overwritten by the full-synchronization result.
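The correction pass can be expressed as a single delete against the target table. The sketch below illustrates it with Python's sqlite3 module; the function name `replay_delete_log` and the single-column layout of DELETE_LOG are assumptions made for the example, while the table names A_GHO and DELETE_LOG follow the text.

```python
import sqlite3


def replay_delete_log(conn: sqlite3.Connection) -> int:
    """Delete every target row whose identifier is recorded in DELETE_LOG.

    Returns the number of rows removed; identifiers that are not present in
    the target table simply match nothing.
    """
    cursor = conn.execute(
        "DELETE FROM A_GHO WHERE id IN (SELECT id FROM DELETE_LOG)"
    )
    return cursor.rowcount
```

If the target table holds rows 1 and 2 and the deletion log holds identifiers 2 and 3, the call removes row 2 only and leaves row 1 in place.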

Please refer to FIG. 2, which illustrates the operations performed by the replay thread pool on the target table (A_GHO) according to the deletion log.

In the data synchronization method provided by the embodiments of this specification, a trigger is created first; the trigger synchronizes data changes to the target table when the data in the source table changes, and records in a deletion log the identifiers of the data deleted during synchronization. Next, the data in the source table is synchronized to the target table. Finally, the data in the target table is processed according to the identifiers in the deletion log. In this method, the trigger performs incremental synchronization between the source table and the target table, and the source table is not locked while it is fully synchronized to the target table, so the full synchronization does not affect business processing. Furthermore, recording in the deletion log the data deleted during incremental synchronization prevents the full synchronization from overwriting the incremental-synchronization result (for example, data deleted during incremental synchronization being added back to the target table during full synchronization), which ensures that the data in the target table is valid and reflects the latest business processing result.

The data synchronization in the above embodiments may be the synchronization of data from an old table (the source table) to a new table (the target table) when the table structure of a database is changed. Referring to the table structure change flow shown in FIG. 4, in some embodiments of the present disclosure, before the data synchronization, a shadow table of the source table may be created according to a table structure change instruction and the structure of the shadow table may be modified to obtain the target table. After the data synchronization, it may be determined whether the data in the target table satisfies a synchronization condition relative to the data in the source table, and, in response to the condition being satisfied, the table names of the source table and the target table are switched, thereby completing the change of the table structure in the database.

Exemplarily, the structural modification of the shadow table may include adding fields, deleting fields, repartitioning the partition tables, and so on. Repartitioning may be based on the identifiers of the row data and may aim to minimize the amount of row data that crosses partitions during data synchronization. For example, the data rows of partition table 1 of the source table have primary keys 1-100 and those of partition table 2 have primary keys 101-200, while the data rows of partition table 1 of the target table have primary keys 1-50 and those of partition table 2 have primary keys 151-200.

Exemplarily, to determine whether the data in the target table satisfies the synchronization condition relative to the data in the source table, it may be determined whether the data in the target table is consistent with the data in the source table, and the inconsistent data rows may be filtered out. If the inconsistent data rows are rows on which incremental synchronization occurred, and their contents are consistent with the incremental synchronization, it is determined that the data in the target table satisfies the synchronization condition relative to the data in the source table.

Exemplarily, the table names of the source table and the target table may be switched as follows. First, in response to the time elapsed since the initial moment or since the previous session-count moment reaching a preset duration (the current moment then being the present session-count moment), the number of sessions between the business layer and the source table is determined, where the initial moment includes the moment at which the data in the target table satisfies the synchronization condition relative to the data in the source table. Then, in response to the number of sessions between the business layer and the source table being less than a number threshold, all sessions between the business layer and the source table are killed (kill session) and the table names of the source table and the target table are switched. The business layer may continue to write data to the source table while the table names are being switched, and this is all the more likely in a distributed database, where table structure synchronization is not atomic. Killing all sessions between the business layer and the source table at switch time therefore prevents the data loss that would result from the business layer continuing to write to the source table. Moreover, the switch is performed only when the number of sessions between the business layer and the source table is below the number threshold, which minimizes the impact of the switch on business processing. Finally, this approach does not lock the source table, which reduces the use of shared locks and further reduces the impact of the switch on business processing.

In addition, a count threshold may be set in advance. In response to the number of session counts performed reaching the count threshold, with the number of sessions between the business layer and the source table being not less than the number threshold in every count, it is determined that the table structure change has failed. The reason is that if the number of sessions is found to be not less than the number threshold in repeated counts, the business layer is currently busy, and killing sessions to switch the table names at this time would seriously affect business processing. The user may therefore be notified that the table structure change has failed, and the change may be restarted manually once the business load is lower.
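The polling logic for the table-name switch, including the failure case, can be summarized in a short loop. The following Python sketch is illustrative only: `try_cutover` and all of its parameters are hypothetical names, `count_sessions` stands for a query of the current number of sessions on the source table, and `kill_sessions_and_swap` stands for the kill-session and rename operations, neither of which is specified here.

```python
import time
from typing import Callable


def try_cutover(
    count_sessions: Callable[[], int],
    kill_sessions_and_swap: Callable[[], None],
    session_threshold: int,
    max_checks: int,
    interval_s: float,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll the session count; switch table names once it drops below the threshold.

    Returns True if the switch was performed, or False if every one of the
    max_checks counts was at or above the threshold (the change has failed).
    """
    for _ in range(max_checks):
        sleep(interval_s)  # wait the preset duration before each count
        if count_sessions() < session_threshold:
            kill_sessions_and_swap()
            return True
    return False
```

A return value of False corresponds to the case in which the user is notified that the table structure change has failed and may retry when the business load is lower.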

With the table structure change method shown in FIG. 4, no table lock occurs when the table structure is changed, full data synchronization is more efficient, and no data is lost during incremental synchronization and table-name switching, all without affecting production business. The method is particularly suitable for table structure changes in distributed databases: no table lock occurs during the change, the full synchronization is efficient, and no data is lost during incremental synchronization and table-name switching, without affecting production business.

FIG. 5 is a schematic structural diagram of a device provided by an exemplary embodiment. Referring to FIG. 5, at the hardware level the device includes a processor 502, an internal bus 504, a network interface 506, a memory 508, and a non-volatile memory 510, and may of course include other hardware required for other tasks. One or more embodiments of this specification may be implemented in software, for example by the processor 502 reading the corresponding computer program from the non-volatile memory 510 into the memory 508 and then running it. Of course, in addition to software implementations, one or more embodiments of this specification do not exclude other implementations, such as logic devices or a combination of software and hardware. In other words, the execution subject of the following processing flow is not limited to logic units and may also be hardware or a logic device.

请参考图6,数据同步装置可以应用于如图5所示的设备中,以实现本说明书的技术方案。其中,该数据同步装置可以包括:Please refer to FIG. 6 , the data synchronization device can be applied to the device shown in FIG. 5 to realize the technical solution of this specification. Wherein, the data synchronization device may include:

增量同步模块601,用于创建触发器,其中,所述触发器用于将源表的数据变化同步至目标表,并将同步过程中已删除的数据的标识记录在删除日志中;The incremental synchronization module 601 is used to create a trigger, wherein the trigger is used to synchronize the data changes of the source table to the target table, and record the identifier of the deleted data in the synchronization process in the deletion log;

全量同步模块602,用于将所述源表内的数据同步至所述目标表;A full synchronization module 602, configured to synchronize the data in the source table to the target table;

处理模块603,用于根据所述删除日志内的标识对所述目标表内的数据进行处理。The processing module 603 is configured to process the data in the target table according to the identifier in the deletion log.

在本说明书的一个实施例中,所述增量同步模块用于:In one embodiment of this specification, the incremental synchronization module is used for:

创建Insert触发器、Delete触发器和Update触发器,其中,所述Insert触发器用于将所述源表发生的数据插入事件同步至所述目标表,所述Delete触发器用于将所述源表发生的数据删除事件同步至所述目标表,所述Update触发器用于将所述源表发生的数据更新事件同步至所述目标表。Create an Insert trigger, a Delete trigger, and an Update trigger, wherein the Insert trigger is used to synchronize the data insertion event that occurs in the source table to the target table, and the Delete trigger is used to synchronize the data insertion event that occurs in the source table. The data deletion event of the source table is synchronized to the target table, and the Update trigger is used to synchronize the data update event of the source table to the target table.

在本说明书的一个实施例中,所述Insert触发器还用于在所述删除日志内,删除所述数据插入事件中插入的数据的标识。In an embodiment of this specification, the Insert trigger is also used to delete the identifier of the data inserted in the data insertion event in the deletion log.

在本说明书的一个实施例中,所述Delete触发器还用于在所述删除日志内,添加所述数据删除事件中删除的数据的标识。In an embodiment of this specification, the Delete trigger is further used to add an identifier of the data deleted in the data deletion event to the deletion log.

在本说明书的一个实施例中,所述Update触发器还用于在所述删除日志内,添加所述数据更新事件中更新前的数据的标识,并删除所述数据更新事件中更新后的数据的标识。In an embodiment of this specification, the Update trigger is also used to add the identifier of the data before updating in the data update event to the delete log, and delete the data after updating in the data update event logo.

In one embodiment of this specification, the processing module is configured to:

delete, from the target table, the data whose identifiers appear in the deletion log.
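As a minimal sketch of this cleanup step, the following Python snippet deletes from a target table every row whose identifier is in the deletion log. The table names `dst` and `delete_log` and the scenario (a row that reached the target although it had already been deleted from the source) are assumptions made for illustration.

```python
import sqlite3


def purge_logged_rows(conn: sqlite3.Connection) -> int:
    """Delete every row of `dst` whose id is recorded in `delete_log`
    and return the number of rows removed (table names are hypothetical)."""
    cur = conn.execute(
        "DELETE FROM dst WHERE id IN (SELECT id FROM delete_log)"
    )
    return cur.rowcount


def demo_purge():
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
    CREATE TABLE dst(id INTEGER PRIMARY KEY, val TEXT);
    CREATE TABLE delete_log(id INTEGER PRIMARY KEY);
    -- Row 2 stands for data that was copied to the target even though
    -- it was deleted from the source during synchronization.
    INSERT INTO dst VALUES (1, 'a'), (2, 'b'), (3, 'c');
    INSERT INTO delete_log VALUES (2);
    """)
    removed = purge_logged_rows(conn)
    remaining = [r[0] for r in conn.execute("SELECT id FROM dst ORDER BY id")]
    return removed, remaining
```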

In one embodiment of this specification, the full synchronization module is configured to:

divide the data in the source table according to a preset size of a data synchronization block to obtain a plurality of data synchronization blocks;

construct, for each of the plurality of data synchronization blocks, a synchronization task for synchronizing that block to the target table, and generate a task queue from the resulting synchronization tasks.
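The division into fixed-size blocks and the resulting task queue might look like the following Python sketch. Representing a synchronization task simply as the list of row identifiers in its block, and measuring the preset size in rows, are assumptions of this example rather than details given in the specification.

```python
from collections import deque
from typing import Deque, List, Sequence


def build_task_queue(row_ids: Sequence[int], block_size: int) -> Deque[List[int]]:
    """Split row ids into blocks of at most `block_size` rows and queue
    one (hypothetical) synchronization task per block."""
    if block_size <= 0:
        raise ValueError("block_size must be positive")
    ordered = sorted(row_ids)
    queue: Deque[List[int]] = deque()
    for start in range(0, len(ordered), block_size):
        # Each queued item stands for one task that copies these rows.
        queue.append(ordered[start:start + block_size])
    return queue
```

A worker could then pop tasks from the left of the queue and copy each block to the target table independently.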

In one embodiment of this specification, the source table comprises a plurality of partition tables, each of which is stored in one partition of a distributed database;

when dividing the data in the source table according to the preset size of the data synchronization block to obtain the plurality of data synchronization blocks, the full synchronization module is specifically configured to:

determine, for each partition table, the upper identifier limit and the lower identifier limit of the data in that partition table;

for each partition table, divide the data in the partition table according to its upper and lower identifier limits and the preset size, to obtain at least one data synchronization block corresponding to the partition table.
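The per-partition division can be illustrated as follows. The sketch assumes integer identifiers and treats the preset size as a width of the identifier range; the partition names and the function `split_partitions` are hypothetical.

```python
from typing import Dict, List, Tuple


def split_partitions(
    bounds: Dict[str, Tuple[int, int]], block_size: int
) -> Dict[str, List[Tuple[int, int]]]:
    """For each partition, cut its [lower, upper] id range into
    consecutive inclusive ranges of at most `block_size` ids."""
    if block_size <= 0:
        raise ValueError("block_size must be positive")
    blocks: Dict[str, List[Tuple[int, int]]] = {}
    for name, (lower, upper) in bounds.items():
        ranges: List[Tuple[int, int]] = []
        start = lower
        while start <= upper:
            end = min(start + block_size - 1, upper)
            ranges.append((start, end))
            start = end + 1
        blocks[name] = ranges
    return blocks
```

Because each block is bounded by the limits of a single partition, no block spans two source partitions in this sketch.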

In one embodiment of this specification, the target table comprises a plurality of partition tables, each of which is stored in one partition of a distributed database;

the full synchronization module is further configured to:

for each data synchronization block, divide the block according to the partition table of the target table to which each piece of data in the block corresponds, to obtain at least one data sub-synchronization block corresponding to the data synchronization block;

when constructing, for each of the plurality of data synchronization blocks, a synchronization task for synchronizing to the target table, the full synchronization module is specifically configured to:

construct, for each of the plurality of data sub-synchronization blocks, a synchronization task for synchronizing that sub-block to the target table.
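A short sketch of splitting one synchronization block into sub-blocks by target partition follows. The mapping `partition_of` from a row identifier to a target partition is a stand-in; the specification does not say how the target table assigns rows to partitions.

```python
from collections import defaultdict
from typing import Callable, Dict, Hashable, Iterable, List


def split_by_target_partition(
    block: Iterable[int], partition_of: Callable[[int], Hashable]
) -> Dict[Hashable, List[int]]:
    """Group the rows of one synchronization block by the target
    partition each row maps to; each group is one sub-block."""
    sub_blocks: Dict[Hashable, List[int]] = defaultdict(list)
    for row_id in block:
        sub_blocks[partition_of(row_id)].append(row_id)
    return dict(sub_blocks)
```

For example, with a hypothetical two-partition target that places rows by `id % 2`, the block `[1, 2, 3, 4, 5]` yields one sub-block per partition, and each sub-block becomes its own synchronization task.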

In one embodiment of this specification, the apparatus further comprises a structure change module configured to:

create a shadow table of the source table according to a table structure change instruction, and modify the structure of the shadow table to obtain the target table.
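The shadow-table step could be sketched as below, again using SQLite through Python's `sqlite3` module purely for illustration. The table names and the example change (adding a `note` column) are hypothetical, and `CREATE TABLE ... AS SELECT` copies only column names and declared types, not constraints or indexes, so a real system would need a more faithful copy of the schema.

```python
import sqlite3
from typing import List


def create_target_via_shadow(
    conn: sqlite3.Connection, source: str, shadow: str, alter_sql: str
) -> List[str]:
    """Create an empty shadow copy of `source`, apply the structure
    change to it, and return the resulting column names."""
    # WHERE 0 copies the column layout without copying any rows.
    conn.execute(f"CREATE TABLE {shadow} AS SELECT * FROM {source} WHERE 0")
    conn.execute(alter_sql)
    return [row[1] for row in conn.execute(f"PRAGMA table_info({shadow})")]


def demo_shadow() -> List[str]:
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE src(id INTEGER, val TEXT)")
    return create_target_via_shadow(
        conn, "src", "src_shadow",
        "ALTER TABLE src_shadow ADD COLUMN note TEXT",
    )
```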

In one embodiment of this specification, the apparatus further comprises a table name switching module configured to:

switch the table names of the source table and the target table in response to the data in the target table satisfying a synchronization condition relative to the data in the source table.
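One way to picture the table name switch is a three-step rename inside a single transaction, sketched here with SQLite. The temporary name `swap_tmp` and the table names are assumptions, and other database systems provide their own rename statements.

```python
import sqlite3


def swap_table_names(
    conn: sqlite3.Connection, source: str, target: str, tmp: str = "swap_tmp"
) -> None:
    """Exchange the names of `source` and `target` in one transaction."""
    conn.executescript(f"""
        BEGIN;
        ALTER TABLE {source} RENAME TO {tmp};
        ALTER TABLE {target} RENAME TO {source};
        ALTER TABLE {tmp} RENAME TO {target};
        COMMIT;
    """)


def demo_swap():
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
    CREATE TABLE src(val TEXT);
    INSERT INTO src VALUES ('old');
    CREATE TABLE dst(val TEXT);
    INSERT INTO dst VALUES ('new');
    """)
    swap_table_names(conn, "src", "dst")
    return (
        conn.execute("SELECT val FROM src").fetchone()[0],
        conn.execute("SELECT val FROM dst").fetchone()[0],
    )
```

After the swap, queries addressed to the original source name read the restructured data, while the old data remains available under the other name.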

In one embodiment of this specification, the table name switching module is specifically configured to:

determine the number of sessions between the service layer and the source table in response to the time elapsed since an initial time, or since the last session count, reaching a preset duration, wherein the initial time comprises the time at which the data in the target table satisfies the synchronization condition relative to the data in the source table;

in response to the number of sessions between the service layer and the source table being less than a number threshold, kill all sessions between the service layer and the source table and switch the table names of the source table and the target table.

In one embodiment of this specification, the table name switching module is further configured to:

in response to the number of session counts reaching a count threshold, and the number of sessions between the service layer and the source table being not less than the number threshold in the result of every session count, determine that the table structure change has failed.
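The session-counting logic of the table name switching module can be summarized in the following Python sketch. The callables `count_sessions`, `kill_and_swap`, and `wait` are hypothetical hooks standing in for database operations and for waiting out the preset duration.

```python
from typing import Callable


def try_switch(
    count_sessions: Callable[[], int],
    kill_and_swap: Callable[[], None],
    session_threshold: int,
    max_checks: int,
    wait: Callable[[], None],
) -> bool:
    """Count sessions once per preset interval; switch as soon as the
    count drops below the threshold, or report failure after
    `max_checks` counts that all stay at or above it."""
    for _ in range(max_checks):
        wait()  # let the preset duration elapse before each count
        if count_sessions() < session_threshold:
            kill_and_swap()  # kill remaining sessions, then swap names
            return True
    return False  # the table structure change is treated as failed


def demo_switch():
    counts = iter([5, 4, 1])
    swapped = []
    ok = try_switch(lambda: next(counts), lambda: swapped.append(True),
                    session_threshold=3, max_checks=3, wait=lambda: None)
    busy = try_switch(lambda: 9, lambda: swapped.append(True),
                      session_threshold=3, max_checks=2, wait=lambda: None)
    return ok, busy, swapped
```

In a real deployment `wait` might call `time.sleep` with the preset duration; here it is a no-op so the example runs instantly.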

The systems, apparatuses, modules, or units described in the above embodiments may be implemented by a computer chip or an entity, or by a product having a certain function. A typical implementing device is a computer, which may take the form of a personal computer, a laptop computer, a cellular phone, a camera phone, a smartphone, a personal digital assistant, a media player, a navigation device, an e-mail sending and receiving device, a game console, a tablet computer, a wearable device, or a combination of any of these devices.

In a typical configuration, a computer includes one or more processors (CPUs), input/output interfaces, network interfaces, and memory.

Memory may include non-permanent memory in computer-readable media, random access memory (RAM), and/or non-volatile memory, such as read-only memory (ROM) or flash memory (flash RAM). Memory is an example of a computer-readable medium.

Computer-readable media include permanent and non-permanent, removable and non-removable media, and may store information by any method or technology. The information may be computer-readable instructions, data structures, program modules, or other data. Examples of computer storage media include, but are not limited to, phase-change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technology, compact disc read-only memory (CD-ROM), digital versatile disc (DVD) or other optical storage, magnetic cassettes, magnetic disk storage, quantum memory, graphene-based storage media or other magnetic storage devices, or any other non-transmission medium that can be used to store information accessible by a computing device. As defined herein, computer-readable media do not include transitory computer-readable media (transitory media), such as modulated data signals and carrier waves.

It should also be noted that the terms "comprise", "include", or any other variant thereof are intended to cover a non-exclusive inclusion, such that a process, method, article, or device comprising a series of elements includes not only those elements but also other elements not expressly listed, or elements inherent to such a process, method, article, or device. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of additional identical elements in the process, method, article, or device comprising that element.

Specific embodiments of this specification have been described above. Other embodiments are within the scope of the appended claims. In some cases, the actions or steps recited in the claims may be performed in an order different from that in the embodiments and still achieve the desired results. In addition, the processes depicted in the accompanying drawings do not necessarily require the particular order shown, or a sequential order, to achieve the desired results. In certain embodiments, multitasking and parallel processing are also possible or may be advantageous.

The terms used in one or more embodiments of this specification are for the purpose of describing particular embodiments only and are not intended to limit one or more embodiments of this specification. The singular forms "a", "said", and "the" used in one or more embodiments of this specification and in the appended claims are also intended to include the plural forms, unless the context clearly indicates otherwise. It should also be understood that the term "and/or" as used herein refers to and encompasses any and all possible combinations of one or more of the associated listed items.

The user information (including but not limited to user device information and user personal information) and data (including but not limited to data used for analysis, stored data, and displayed data) involved in this application are information and data authorized by the user or fully authorized by all parties. The collection, use, and processing of the relevant data must comply with the relevant laws, regulations, and standards of the relevant countries and regions, and a corresponding operation entry is provided for users to choose to grant or refuse authorization.

It should be understood that although the terms first, second, third, and so on may be used in one or more embodiments of this specification to describe various information, the information should not be limited by these terms. These terms are only used to distinguish information of the same type from one another. For example, without departing from the scope of one or more embodiments of this specification, first information may also be referred to as second information, and similarly, second information may also be referred to as first information. Depending on the context, the word "if" as used herein may be interpreted as "when", "upon", or "in response to determining".

The above are merely preferred embodiments of one or more embodiments of this specification and are not intended to limit one or more embodiments of this specification. Any modification, equivalent replacement, improvement, or the like made within the spirit and principles of one or more embodiments of this specification shall fall within the scope of protection of one or more embodiments of this specification.

Claims (16)

1. A method of data synchronization, the method comprising:
creating a trigger, wherein the trigger is used for synchronizing data changes of a source table to a target table and recording, in a deletion log, identifiers of data deleted during the synchronization process;
synchronizing the data in the source table to the target table;
and processing the data in the target table according to the identifiers in the deletion log.
2. The data synchronization method of claim 1, the creating a trigger comprising:
creating an Insert trigger, a Delete trigger, and an Update trigger, wherein the Insert trigger is used for synchronizing a data insertion event occurring in the source table to the target table, the Delete trigger is used for synchronizing a data deletion event occurring in the source table to the target table, and the Update trigger is used for synchronizing a data update event occurring in the source table to the target table.
3. The data synchronization method of claim 2, the Insert trigger being further used for deleting, from the deletion log, the identifier of the data inserted in the data insertion event.
4. The data synchronization method of claim 2, the Delete trigger being further used for adding, to the deletion log, the identifier of the data deleted in the data deletion event.
5. The data synchronization method of claim 2, the Update trigger being further used for adding, to the deletion log, the identifier of the pre-update data in the data update event, and deleting, from the deletion log, the identifier of the post-update data in the data update event.
6. The data synchronization method of claim 1, the processing the data in the target table according to the identifiers in the deletion log comprising:
deleting, from the target table, the data whose identifiers appear in the deletion log.
7. The data synchronization method of claim 1, the synchronizing the data in the source table to the target table comprising:
dividing the data in the source table according to a preset size of a data synchronization block to obtain a plurality of data synchronization blocks;
and constructing, for each of the plurality of data synchronization blocks, a synchronization task for synchronizing that block to the target table, and generating a task queue from the resulting synchronization tasks.
8. The data synchronization method of claim 7, the source table comprising a plurality of partition tables, each partition table being stored in one partition of a distributed database;
the dividing the data in the source table according to the preset size of the data synchronization block to obtain a plurality of data synchronization blocks comprising:
determining, for each partition table, an upper identifier limit and a lower identifier limit of the data in that partition table;
and, for each partition table, dividing the data in the partition table according to its upper and lower identifier limits and the preset size to obtain at least one data synchronization block corresponding to the partition table.
9. The data synchronization method of claim 7, the target table comprising a plurality of partition tables, each partition table being stored in one partition of a distributed database;
the method further comprising:
for each data synchronization block, dividing the block according to the partition table of the target table to which each piece of data in the block corresponds, to obtain at least one data sub-synchronization block corresponding to the data synchronization block;
the constructing, for each of the plurality of data synchronization blocks, a synchronization task for synchronizing to the target table comprising:
constructing, for each of the plurality of data sub-synchronization blocks, a synchronization task for synchronizing that sub-block to the target table.
10. The data synchronization method of claim 1, the method further comprising:
creating a shadow table of the source table according to a table structure change instruction, and modifying the structure of the shadow table to obtain the target table.
11. The data synchronization method according to claim 1 or 10, the method further comprising:
switching the table names of the source table and the target table in response to the data in the target table satisfying a synchronization condition relative to the data in the source table.
12. The data synchronization method of claim 11, the switching the table names of the source table and the target table comprising:
determining the number of sessions between a service layer and the source table in response to the time elapsed since an initial time, or since the last session count, reaching a preset duration, wherein the initial time comprises the time at which the data in the target table satisfies the synchronization condition relative to the data in the source table;
and, in response to the number of sessions between the service layer and the source table being less than a number threshold, killing all sessions between the service layer and the source table and switching the table names of the source table and the target table.
13. The data synchronization method of claim 12, the method further comprising:
in response to the number of session counts reaching a count threshold, and the number of sessions between the service layer and the source table being not less than the number threshold in the result of every session count, determining that the table structure change has failed.
14. A data synchronization apparatus, the apparatus comprising:
an incremental synchronization module configured to create a trigger, wherein the trigger is used for synchronizing data changes of a source table to a target table and recording, in a deletion log, identifiers of data deleted during the synchronization process;
a full synchronization module configured to synchronize the data in the source table to the target table;
and a processing module configured to process the data in the target table according to the identifiers in the deletion log.
15. An electronic device, comprising:
a processor;
a memory for storing processor-executable instructions;
wherein the processor is configured to implement the method of any one of claims 1-13 by executing the executable instructions.
16. A computer readable storage medium having stored thereon computer instructions which, when executed by a processor, implement the steps of the method of any of claims 1-13.
Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310450106.0A CN116521692A (en) 2023-04-24 2023-04-24 Data synchronization method and device, electronic equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310450106.0A CN116521692A (en) 2023-04-24 2023-04-24 Data synchronization method and device, electronic equipment and storage medium

Publications (1)

Publication Number Publication Date
CN116521692A true CN116521692A (en) 2023-08-01

Family

ID=87407638

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310450106.0A Pending CN116521692A (en) 2023-04-24 2023-04-24 Data synchronization method and device, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN116521692A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117171272A (en) * 2023-09-21 2023-12-05 湖北天融信网络安全技术有限公司 Data synchronization method and device
CN117235028A (en) * 2023-09-15 2023-12-15 中国建设银行股份有限公司 A data query method and device based on log files

Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20140258234A1 (en) * 2013-03-11 2014-09-11 AppGlu, Inc. Synchronization of cms data to mobile device storage
CN107590277A (en) * 2017-09-28 2018-01-16 泰康保险集团股份有限公司 Data synchronization method, device, electronic device and storage medium
CN108664659A (en) * 2018-05-21 2018-10-16 四川中电启明星信息技术有限公司 A kind of method of data synchronization and device of Distributed Heterogeneous Database
CN109284312A (en) * 2018-08-27 2019-01-29 山东威尔数据股份有限公司 A kind of heterogeneous database change real-time informing method
CN114297289A (en) * 2021-12-07 2022-04-08 北京天融信网络安全技术有限公司 Incremental data acquisition method and device based on optimized trigger
CN114490554A (en) * 2022-02-14 2022-05-13 中国工商银行股份有限公司 Data synchronization method and device, electronic equipment and storage medium
CN114579530A (en) * 2020-11-30 2022-06-03 亚信科技(中国)有限公司 Table space migration method, apparatus, electronic device, and computer-readable storage medium
CN115017233A (en) * 2022-06-17 2022-09-06 上海明胜品智人工智能科技有限公司 A data synchronization method, device, electronic device and medium
CN115964138A (en) * 2022-04-15 2023-04-14 中国电力科学研究院有限公司 Power static data trans-regional synchronization method and system based on data dynamic blocking

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
黄晓微;陈玲;魏玮;徐世莲;: "基于快照日志分析的数据同步方法", 后勤工程学院学报, no. 02, 30 June 2006 (2006-06-30) *

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination