CN103763368B - A kind of method of data synchronization across data center - Google Patents

Info

Publication number
CN103763368B
Authority
CN
China
Prior art keywords
data
data center
log
module
center
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201410023373.0A
Other languages
Chinese (zh)
Other versions
CN103763368A (en)
Inventor
王恩东
文中领
张立强
袁冬
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
IEIT Systems Co Ltd
Original Assignee
Inspur Electronic Information Industry Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Inspur Electronic Information Industry Co Ltd
Priority to CN201410023373.0A
Publication of CN103763368A
Priority to PCT/CN2015/070416 (WO2015106656A1)
Application granted
Publication of CN103763368B
Legal status: Active
Anticipated expiration

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20: Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/27: Replication, distribution or synchronisation of data between databases or within a distributed database system; Distributed database system architectures therefor
    • G06F16/273: Asynchronous replication or reconciliation

Landscapes

  • Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Computing Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Information Transfer Between Computers (AREA)

Abstract

The present invention provides a method for synchronizing data across data centers. The method comprises: writing data and recording logs; scheduling and pushing the synchronization; replaying the logs to complete data synchronization; and accessing data across data centers, thereby achieving asynchronous data synchronization. Compared with the prior art, the method achieves asynchronous data synchronization across data centers and improves data security. It makes effective use of the IO resources inside each data center and of the network resources between data centers, is practical, and is easy to deploy widely.

Description

A data synchronization method across data centers

Technical field

The invention relates to the technical field of computer data storage, and in particular to a data synchronization method across data centers.

Background

The Internet era has arrived. Interactive websites aimed at ordinary Internet users, such as social networks, microblogs, and location services, are flourishing. Sites such as Google, Facebook, and Twitter, and in China Renren and Weibo, provide interactive services over the Internet and wireless networks to hundreds of millions of users. Internet users around the world interact in many ways every day and generate data of all kinds around the clock; the volume of this data is several times that of the stand-alone era.

To store this data, Internet companies have built huge data centers around the world, with the number of hosts in a single data center ranging from several hundred to tens of thousands. According to information from Google, Google operates dozens of data centers and more than ten million servers worldwide to store the massive data its global users generate every day. Managing and using this data poses huge challenges, including reading and storing data, indexing and addressing, configuration and management interfaces, and data replication between data centers. Among these, the need for support for, and research into, data synchronization between multiple data centers is especially urgent.

Research on massive data storage is still in its early stages, and data synchronization between data centers leaves much to study and improve. Take HBase as an example: HBase replication relies on a Master/Slave architecture, and a simple feature for replicating data between two data centers was only added in version 0.90.0. Replication tasks have no priority queue, and there is no unified scheduling based on data center load. In addition, traditional cross-data-center synchronization algorithms mainly transmit and overwrite whole blocks of data, which consumes large amounts of network and IO resources.

To address this situation, this patent proposes a cross-data-center data synchronization method based on log replay.

Summary of the invention

The technical task of the present invention is to remedy the deficiencies of the prior art by providing a data synchronization method across data centers.

The technical solution of the present invention is realized as follows. The cross-data-center data synchronization method is implemented by the following process:

1. Data writing and log recording: a logging module runs in the primary data center. When the primary data center receives a data request from a client, the module records the operation the request calls for as a log in the primary data center. The module is integrated into the business flow of the primary data center as an embedded component or a plug-in.

2. Synchronization scheduling and pushing: a scheduling module runs in the primary data center and is responsible for scheduling data replay operations. Based on the load of the primary data center, the load of the backup data center, and the scheduling policy, it triggers log push and replay operations. The push operations requested by the scheduling module are carried out by a log push module, which runs in the primary data center and transmits the data operation logs to the backup data center.

3. Log replay to complete data synchronization: the data operation logs pushed by the log push module of the primary data center are received by a log replay module. The log replay module runs in the backup data center and replays the data operation logs there, synchronizing the data of the two data centers.

4. Data access across data centers, completing the asynchronous data synchronization operation.

The detailed process of step 1 is as follows. The client identifies, from its local configuration, the primary data center that holds the customer's data and sends all data operations to the primary data center, where they are handled by the data nodes of the primary data center. After receiving a client request, the primary data center executes the requested operation according to the operation and its content. During this process, the logging module captures the requested operation and its associated data by intercepting the request. The logging module determines whether the client's operation modifies data in the data center. If it does, the operation must be included in cross-data-center synchronization, and the logging module saves the requested operation and its associated data, in a proprietary log format, to the asynchronous log recording area of the primary data center. Everything in this area is content that must be replayed across data centers.

The detailed process of step 2 is as follows. The scheduling module running in the primary data center first monitors the following conditions:

1) the number of logs in the asynchronous log recording area and the amount of data they involve;

2) the load of the primary data center, including network IO and disk IO;

3) the load of the backup data center, including network IO and disk IO.

When these three conditions satisfy the scheduling policy set by the configuration administrator, a log push operation is triggered. The log push operation is performed by the log push module, which writes the data operation logs in the asynchronous log recording area of the primary data center to the asynchronous log execution area of the backup data center. When the log push module has finished transmitting the logs across data centers, it notifies the scheduling module, and the scheduling module then drives the log replay module in the backup data center to replay the logs.

The detailed process of step 3 is as follows. After receiving the notification from the scheduling module, the log replay module running in the backup data center begins replaying the logs. It reads the data logs stored in the asynchronous log execution area, decodes their contents to obtain each logged operation and its associated data, and then executes that operation again on the relevant nodes of the backup data center. This makes the data in the backup data center consistent with the data in the primary data center, achieving cross-data-center synchronization.

In step 4, the client obtains data by accessing the backup data center in the following two cases: when the client cannot connect to the primary data center; and when the client can connect to the primary data center but the primary data center is busy.

Compared with the prior art, the present invention has the following beneficial effects:

The cross-data-center data synchronization method of the present invention achieves asynchronous data synchronization across data centers and improves data security. When users cannot access the primary data center, they can still obtain data from the backup data center. Because replay only requires transmitting the differences in the data rather than the data itself, the method also reduces the amount of data transmitted and the bandwidth that synchronization consumes between data centers. In addition, the scheduling module schedules according to data center load, making effective use of the IO resources inside each data center and the network resources between data centers and thereby balancing load. The method is practical and easy to deploy widely.

Description of drawings

Figure 1 is a schematic diagram of the implementation process of the present invention.

Detailed description

The cross-data-center data synchronization method of the present invention is described in detail below with reference to the accompanying drawing.

As shown in Figure 1, the cross-data-center data synchronization method achieves asynchronous data synchronization between data centers by replaying data operation logs. It is implemented as follows.

First, the following modules are implemented in software:

(1) Logging module. Runs in the primary data center. When the primary data center receives a data request from a client, this module records the operation the request calls for as a log in the primary data center. The module is integrated into the business flow of the primary data center as an embedded component or a plug-in.

(2) Scheduling module. Runs in the primary data center and is responsible for scheduling data replay operations. Based on information such as the load of the primary data center, the load of the backup data center, and the scheduling policy, it triggers log push and replay operations.

(3) Log push module. Runs in the primary data center. It carries out the push operations requested by the scheduling module, transmitting the data operation logs to the backup data center.

(4) Log replay module. Runs in the backup data center. It receives the data operation logs pushed by the log push module of the primary data center and replays them in the backup data center, synchronizing the data of the two data centers.

These modules carry out the following operations:

1. Data writing and log recording.

Under normal conditions, the client identifies, from its local configuration, the primary data center that holds the customer's data and sends all data operations, including reads, writes, and deletes, to the primary data center, where they are handled by the data nodes of the primary data center.

After receiving a client request, the primary data center executes the requested operation according to the operation and its content. During this process, the logging module captures the requested operation and its associated data by intercepting the request.

The logging module determines whether the client's operation modifies data in the data center. If it does, the operation must be included in cross-data-center synchronization. In that case the logging module saves the requested operation and its associated data, in a proprietary log format, to the asynchronous log recording area of the primary data center. Everything in this area is content that must be replayed across data centers.
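The logging step can be sketched in Python as follows. This is a minimal illustration, not the patented implementation: the patent does not specify the proprietary log format, so JSON is used as a stand-in, and the names `AsyncLogArea`, `LoggingModule`, `encode_entry`, and `MUTATING_OPS` are hypothetical. The set of modifying operations (write and delete) is an assumption based on the operations the text mentions.

```python
import json
from dataclasses import dataclass, field
from typing import Any, List, Optional

# Assumption: writes and deletes modify data, reads do not.
MUTATING_OPS = {"write", "delete"}


@dataclass
class AsyncLogArea:
    """Stand-in for the primary data center's asynchronous log recording area."""
    entries: List[bytes] = field(default_factory=list)

    def append(self, record: bytes) -> None:
        self.entries.append(record)


def encode_entry(seq: int, op: str, key: str, value: Optional[Any] = None) -> bytes:
    # JSON is a placeholder for the unspecified proprietary log format.
    return json.dumps({"seq": seq, "op": op, "key": key, "value": value}).encode("utf-8")


class LoggingModule:
    """Intercepts client requests and records only those that modify data."""

    def __init__(self, log_area: AsyncLogArea) -> None:
        self.log_area = log_area
        self.next_seq = 0  # preserves the order in which operations must be replayed

    def intercept(self, op: str, key: str, value: Optional[Any] = None) -> bool:
        """Return True if the request was logged for cross-data-center replay."""
        if op not in MUTATING_OPS:
            return False  # read-only requests need no synchronization
        self.log_area.append(encode_entry(self.next_seq, op, key, value))
        self.next_seq += 1
        return True
```

In this sketch the module only observes the request; executing it against the primary data center's data nodes remains the job of the normal request path, which matches the embedded or plug-in role described above.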

2. Synchronization scheduling and log pushing.

The scheduling module running in the primary data center monitors the following conditions:

1) The number of logs in the asynchronous log recording area and the amount of data they involve.

2) The load of the primary data center, including network IO and disk IO.

3) The load of the backup data center, mainly including network IO and disk IO.

When these three conditions satisfy the scheduling policy set by the configuration administrator, a log push operation is triggered. Typically the trigger requires condition 1) to be high while conditions 2) and 3) are low, that is, a large log backlog while both data centers are lightly loaded.
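One way to express such a trigger is a simple threshold check, sketched below in Python. The patent leaves the scheduling policy to the administrator, so the threshold fields, their default values, the utilization scale of 0 to 1, and the names `Load`, `SchedulingPolicy`, and `should_push` are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Load:
    network_io: float  # assumed utilization in the range 0.0 to 1.0
    disk_io: float     # assumed utilization in the range 0.0 to 1.0


@dataclass
class SchedulingPolicy:
    # Hypothetical administrator-set thresholds.
    min_pending_logs: int = 1000
    min_pending_bytes: int = 64 * 1024 * 1024
    max_io_utilization: float = 0.6


def is_lightly_loaded(load: Load, policy: SchedulingPolicy) -> bool:
    """A data center counts as lightly loaded when both IO figures are under the cap."""
    return max(load.network_io, load.disk_io) <= policy.max_io_utilization


def should_push(pending_logs: int, pending_bytes: int,
                primary: Load, backup: Load, policy: SchedulingPolicy) -> bool:
    """Trigger a push when the backlog is high and both data centers are lightly loaded."""
    backlog_high = (pending_logs >= policy.min_pending_logs
                    or pending_bytes >= policy.min_pending_bytes)
    return (backlog_high
            and is_lightly_loaded(primary, policy)
            and is_lightly_loaded(backup, policy))
```

A real policy might also force a push once the backlog exceeds a hard limit regardless of load, so that the backup never falls arbitrarily far behind; the text does not describe such a rule, so it is omitted here.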

The log push operation is performed by the log push module, which writes the data operation logs in the asynchronous log recording area of the primary data center to the asynchronous log execution area of the backup data center.

When the log push module has finished transmitting the logs across data centers, it notifies the scheduling module. The scheduling module then drives the log replay module in the backup data center to replay the logs.
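The push-then-notify handoff can be sketched in Python as follows. Two in-memory lists stand in for the recording area and the execution area, and a callback stands in for the notification to the scheduling module; the function name `push_logs` and its parameters are hypothetical, and the real cross-data-center transfer is not modeled.

```python
from typing import Callable, List


def push_logs(recording_area: List[bytes],
              execution_area: List[bytes],
              notify_scheduler: Callable[[int], None]) -> int:
    """Move pending log entries to the backup side, then notify the scheduler.

    Returns the number of entries pushed.
    """
    batch = list(recording_area)  # snapshot of the entries pending at push time
    if not batch:
        return 0  # nothing to push, so no notification is sent
    execution_area.extend(batch)      # placeholder for the network transfer
    del recording_area[:len(batch)]   # drop only the entries that were pushed
    notify_scheduler(len(batch))      # scheduler can now start the replay
    return len(batch)
```

Entries are removed from the recording area only after they have been appended to the execution area, so a failure during the transfer in this sketch would leave them pending rather than lose them.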

3. Log replay.

After receiving the notification from the scheduling module, the log replay module running in the backup data center begins replaying the logs.

The log replay module reads the data logs stored in the asynchronous log execution area, decodes their contents to obtain each logged operation and its associated data, and then executes that operation again on the relevant nodes of the backup data center, making the data in the backup data center consistent with the data in the primary data center. This achieves cross-data-center synchronization of the data.
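The replay step can be sketched in Python as follows. A dictionary stands in for the backup data center's storage, and JSON stands in for the proprietary log format, which the patent does not specify; the entry fields `op`, `key`, and `value` and the function name `replay` are assumptions made for illustration.

```python
import json
from typing import Any, Dict, List


def replay(execution_area: List[bytes], store: Dict[str, Any]) -> int:
    """Decode each logged operation and re-execute it against the backup store.

    Entries are applied in the order they were logged. Returns the number applied.
    """
    applied = 0
    for raw in execution_area:
        entry = json.loads(raw)  # decode the log entry
        if entry["op"] == "write":
            store[entry["key"]] = entry["value"]
        elif entry["op"] == "delete":
            store.pop(entry["key"], None)  # deleting a missing key is a no-op
        else:
            raise ValueError(f"unexpected operation in log: {entry['op']}")
        applied += 1
    execution_area.clear()  # replayed entries are no longer pending
    return applied
```

Because only the operations travel between data centers, the backup reaches the same state as the primary by re-executing them in order, without the primary shipping whole blocks of data.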

4. Data access across data centers.

The client may obtain data from the backup data center in two cases:

First, when the client cannot connect to the primary data center. This may be because the primary data center has failed, or because the network between the primary data center and the client is down.

In this case the client attempts to read from the backup data center, and only read operations are permitted. Because it is uncertain at this point whether synchronization between the backup and primary data centers has completed, the client displays a notice informing the user that the data comes from the backup data center and may not be consistent with the primary.

Second, when the client can connect to the primary data center but the primary data center is busy.

In this case, if the client determines that the requested operation is read-only, it asks the primary data center whether the data involved has already been synchronized to the backup data center. If the primary data center reports that synchronization is complete, the client can read the requested data from the backup data center. Here the backup data center serves to balance load.
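The two cases can be summarized as a routing decision, sketched below in Python. The names `Target` and `route`, the boolean inputs, and the `synced_to_backup` callback (standing in for the client's query to the primary data center) are hypothetical. The text does not say what happens to a write when the primary is unreachable beyond noting that only reads are allowed, so rejecting it is an assumption.

```python
from enum import Enum
from typing import Callable, Tuple


class Target(Enum):
    PRIMARY = "primary"
    BACKUP = "backup"
    REJECT = "reject"


def route(op: str, primary_reachable: bool, primary_busy: bool,
          synced_to_backup: Callable[[], bool]) -> Tuple[Target, bool]:
    """Return (target, show_consistency_notice) for a client operation."""
    read_only = op == "read"
    if not primary_reachable:
        # Case 1: only reads are served, with a notice that the data may be stale.
        if read_only:
            return Target.BACKUP, True
        return Target.REJECT, False  # assumption: writes are refused, not queued
    if primary_busy and read_only and synced_to_backup():
        # Case 2: the primary confirmed the data is synchronized, so no notice is needed.
        return Target.BACKUP, False
    return Target.PRIMARY, False
```

The callback is only invoked when the primary is reachable and busy and the operation is a read, which mirrors the text: the client consults the primary about synchronization status only in the second case.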

The cross-data-center data synchronization method of the present invention achieves asynchronous data synchronization across data centers and improves data security. When users cannot access the primary data center, they can still obtain data from the backup data center. The method effectively reduces the amount of data generated by synchronization between data centers: because replay only requires transmitting the differences in the data rather than the data itself, less data is transmitted and synchronization consumes less bandwidth between data centers. The method also schedules according to data center load, making effective use of the IO resources inside each data center and the network resources between data centers, and thereby balances load.

The above is merely an embodiment of the present invention. Any modification, equivalent replacement, or improvement made within the spirit and principles of the present invention shall fall within the scope of protection of the present invention.

Claims (3)

1. A data synchronization method across data centers, characterized in that it is implemented by the following process:

Step 1, data writing and log recording: a logging module runs in the primary data center; when the primary data center receives a data request from a client, the module records the operation the request calls for as a log in the primary data center; the module is integrated into the business flow of the primary data center as an embedded component or a plug-in;

the detailed process of step 1 being: the client identifies, from its local configuration, the primary data center that holds the customer's data and sends all data operations to the primary data center, where they are handled by the data nodes of the primary data center; after receiving a client request, the primary data center executes the requested operation according to the operation and its content, and during this process the logging module captures the requested operation and its associated data by intercepting the request; the logging module determines whether the client's operation modifies data in the data center, and if it does, the operation must be included in cross-data-center synchronization, and the logging module saves the requested operation and its associated data, in a proprietary log format, to the asynchronous log recording area of the primary data center, everything in this area being content that must be replayed across data centers;

Step 2, synchronization scheduling and pushing: a scheduling module runs in the primary data center and is responsible for scheduling data replay operations; based on the load of the primary data center, the load of the backup data center, and the scheduling policy, it triggers log push and replay operations; the push operations requested by the scheduling module are carried out by a log push module, which runs in the primary data center and transmits the data operation logs to the backup data center;

Step 3, log replay to complete data synchronization: the data operation logs pushed by the log push module of the primary data center are received by a log replay module, which runs in the backup data center and replays the data operation logs there, synchronizing the data of the two data centers;

the detailed process of step 3 being: after receiving the notification from the scheduling module, the log replay module running in the backup data center begins replaying the logs; it reads the data logs stored in the asynchronous log execution area, decodes their contents to obtain each logged operation and its associated data, and then executes that operation again on the relevant nodes of the backup data center, making the data in the backup data center consistent with the data in the primary data center and achieving cross-data-center synchronization;

Step 4, data access across data centers, completing the asynchronous data synchronization operation.

2. The data synchronization method across data centers according to claim 1 or 2, characterized in that the detailed process of step 2 is: the scheduling module running in the primary data center first monitors the following conditions:

1) the number of logs in the asynchronous log recording area and the amount of data they involve;

2) the load of the primary data center, including network IO and disk IO;

3) the load of the backup data center, including network IO and disk IO;

when these three conditions satisfy the scheduling policy set by the configuration administrator, a log push operation is triggered; the log push operation is performed by the log push module, which writes the data operation logs in the asynchronous log recording area of the primary data center to the asynchronous log execution area of the backup data center; when the log push module has finished transmitting the logs across data centers, it notifies the scheduling module, and the scheduling module then drives the log replay module in the backup data center to replay the logs.

3. The data synchronization method across data centers according to claim 2, characterized in that in step 4 the client obtains data by accessing the backup data center in the following two cases: when the client cannot connect to the primary data center; and when the client can connect to the primary data center but the primary data center is busy.
CN201410023373.0A 2014-01-20 2014-01-20 A kind of method of data synchronization across data center Active CN103763368B (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN201410023373.0A CN103763368B (en) 2014-01-20 2014-01-20 A kind of method of data synchronization across data center
PCT/CN2015/070416 WO2015106656A1 (en) 2014-01-20 2015-01-09 Cross-data-center data synchronization method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201410023373.0A CN103763368B (en) 2014-01-20 2014-01-20 A kind of method of data synchronization across data center

Publications (2)

Publication Number Publication Date
CN103763368A CN103763368A (en) 2014-04-30
CN103763368B true CN103763368B (en) 2016-07-06

Family

ID=50530527

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201410023373.0A Active CN103763368B (en) 2014-01-20 2014-01-20 A kind of method of data synchronization across data center

Country Status (2)

Country Link
CN (1) CN103763368B (en)
WO (1) WO2015106656A1 (en)

Families Citing this family (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103763368B (en) * 2014-01-20 2016-07-06 浪潮电子信息产业股份有限公司 A kind of method of data synchronization across data center
CN104219288B (en) * 2014-08-14 2018-03-23 中国南方电网有限责任公司超高压输电公司 Distributed Data Synchronization method and its system based on multithreading
CN104519130B (en) * 2014-12-16 2018-02-27 北京中交兴路车联网科技有限公司 A kind of data sharing caching method across IDC
CN104899278B (en) * 2015-05-29 2019-05-03 北京京东尚科信息技术有限公司 A kind of generation method and device of Hbase database data operation log
CN106557530B (en) * 2015-09-30 2019-10-11 腾讯科技(深圳)有限公司 Operation system, data recovery method and device
CN105610917B (en) * 2015-12-22 2019-12-20 腾讯科技(深圳)有限公司 Method and system for realizing synchronous data repair in system
CN110290214A (en) * 2019-06-28 2019-09-27 苏州浪潮智能科技有限公司 A kind of transmitting data file method and system
CN110750594B (en) * 2019-09-30 2023-05-30 上海视云网络科技有限公司 Real-time cross-network database synchronization method based on mysql incremental log

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1677931A (en) * 2004-04-02 2005-10-05 鸿富锦精密工业(深圳)有限公司 Network daily-record data management system and method
CN101043375A (en) * 2007-03-15 2007-09-26 华为技术有限公司 Distributed system journal collecting method and system

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8214329B2 (en) * 2008-08-26 2012-07-03 Zeewise, Inc. Remote data collection systems and methods
CN102075556B (en) * 2009-11-19 2014-11-26 北京明朝万达科技有限公司 Method for designing service architecture with large-scale loading capacity
JP5452765B2 (en) * 2010-12-14 2014-03-26 株式会社日立製作所 Failure recovery method in information processing system and information processing system
CN103500229B (en) * 2013-10-24 2017-04-19 北京奇虎科技有限公司 Database synchronization method and database system
CN103763368B (en) * 2014-01-20 2016-07-06 浪潮电子信息产业股份有限公司 A kind of method of data synchronization across data center

Also Published As

Publication number Publication date
CN103763368A (en) 2014-04-30
WO2015106656A1 (en) 2015-07-23

Similar Documents

Publication Publication Date Title
CN103763368B (en) A kind of method of data synchronization across data center
US9405757B2 (en) File storage system, apparatus, and file access method
US9477517B2 (en) Service broker systems, methods, and apparatus
TWI533213B (en) Storing and resuming application runtime state
CN105393220B (en) System and method for disposing dotted virtual server in group system
US20150331775A1 (en) Estimating data storage device lifespan
CN103488546B (en) A kind of support multi-level data and the online concurrent backup of database and restoration methods
WO2015192661A1 (en) Method, device, and system for data synchronization in distributed storage system
CN103761141A (en) Method and device for realizing message queue
CN103118073B (en) Virtual machine data persistence storage system and method in cloud environment
US20120278429A1 (en) Cluster system, synchronization controlling method, server, and synchronization controlling program
US8527454B2 (en) Data replication using a shared resource
CN103916421A (en) Cloud storage data service device, data transmission system, server and method
CN104866528B (en) Multi-platform data acquisition method and system
CN109783018A (en) A kind of method and device of data storage
JP2012510094A5 (en)
CN110633046A (en) Storage method and device of distributed system, storage equipment and storage medium
JP2016535908A (en) How to queue email web client notifications
CN107426288A (en) A kind of resource-sharing schedule method and apparatus based on storage network
WO2014190622A1 (en) Off-line message storage method and server
CN108205468A (en) A kind of distributed system and implementation method towards massive video image
US20140189055A1 (en) Migration of usage sessions between devices
CN102868739B (en) Be applied to the switching equipment of IP SAN cluster storage system
EP3349416B1 (en) Relationship chain processing method and system, and storage medium
CN107357922A (en) A kind of NFS of distributed file system accesses auditing method and system

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant