
CN107729541A - Data processing method, device and computer-readable storage medium - Google Patents


Info

Publication number
CN107729541A
Authority
CN
China
Prior art keywords
data
repaired
audited
metadata
preset
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201711049815.9A
Other languages
Chinese (zh)
Inventor
龚超
唐凯
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China Mobile Communications Group Co Ltd
MIGU Digital Media Co Ltd
Original Assignee
China Mobile Communications Group Co Ltd
MIGU Digital Media Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by China Mobile Communications Group Co Ltd and MIGU Digital Media Co Ltd
Priority to CN201711049815.9A
Publication of CN107729541A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/27 Replication, distribution or synchronisation of data between databases or within a distributed database system; Distributed database system architectures therefor
    • G06F16/275 Synchronous replication
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/14 Error detection or correction of the data by redundancy in operation
    • G06F11/1402 Saving, restoring, recovering or retrying
    • G06F11/1446 Point-in-time backing up or restoration of persistent data
    • G06F11/1448 Management of the data involved in backup or backup restore
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/23 Updating
    • G06F16/2365 Ensuring data consistency and integrity

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Computer Security & Cryptography (AREA)
  • Computing Systems (AREA)
  • Quality & Reliability (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a data processing method, comprising: acquiring data to be audited online from at least two data sources; auditing the data to be audited based on a preset audit rule; determining the data that fails the audit as data to be repaired; and repairing the data to be repaired according to a preset repair order. The invention also discloses a data processing device and a computer-readable storage medium.

Description

A data processing method, device and computer-readable storage medium

Technical Field

The present invention relates to the field of computer technology, and in particular to a data processing method, a data processing device, and a computer-readable storage medium.

Background

When multiple data sources must cooperate to implement a service, certain data in those cooperating data sources needs to stay consistent. To ensure data integrity and consistency, the data in the cooperating data sources has to be audited and repaired.

At present, data in multiple data sources is audited and repaired as follows: a backup database is established for each data source; the data in each data source is synchronously backed up to its corresponding backup database at a preset period (for example, one day); the data is audited based on the backup databases; and the data in the data sources is then repaired according to the audit results.

However, because the audit runs in the backup environment on a daily basis, that is, offline, it has poor real-time performance, which affects its accuracy. The audit results obtained from the backup databases are therefore inaccurate and do not truly reflect the data consistency of the data sources. Consequently, even after the data in the data sources is repaired according to those audit results, the consistency and accuracy of the data in the data sources still cannot be ensured.

Summary of the Invention

In view of this, embodiments of the present invention are intended to provide a data processing method, device, and computer-readable storage medium that can ensure the consistency and accuracy of data across at least two data sources.

The technical solutions of the embodiments of the present invention are implemented as follows:

An embodiment of the present invention provides a data processing method, the method comprising:

acquiring data to be audited online from at least two data sources;

auditing the data to be audited based on a preset audit rule;

determining the data that fails the audit as data to be repaired, and repairing the data to be repaired according to a preset repair order.

In the above solution, acquiring the data to be audited online from at least two data sources comprises:

determining pre-stored metadata corresponding to the data to be audited;

acquiring, according to the metadata, the data to be audited corresponding to the metadata from the at least two data sources.

In the above solution, the method further comprises:

acquiring the metadata from the at least two data sources according to a preset collection rule;

and/or receiving the metadata pushed by the at least two data sources.

In the above solution, auditing the data to be audited based on the preset audit rule comprises:

judging, according to the preset audit rule, whether the data to be audited is consistent across the at least two data sources;

if it is consistent, determining that the data to be audited passes the audit; otherwise, determining that the data to be audited fails the audit.

In the above solution, the method further comprises:

judging whether the data to be repaired has been repaired successfully;

when it is determined that the data to be repaired has not been repaired successfully, repairing the data to be repaired again;

when the number of repairs of the data to be repaired is greater than a preset number of repairs, generating an alarm message, the alarm message indicating that the data to be repaired cannot be repaired.

In the above solution, judging whether the data to be repaired has been repaired successfully comprises:

judging whether the repaired data is consistent across the at least two data sources;

if it is consistent, determining that the data to be repaired has been repaired successfully; otherwise, determining that the data to be repaired has not been repaired successfully.

An embodiment of the present invention provides a data processing device, the device comprising:

an acquisition module, configured to acquire data to be audited online from at least two data sources;

an audit module, configured to audit the data to be audited based on a preset audit rule;

a repair module, configured to determine the data that fails the audit as data to be repaired, and to repair the data to be repaired according to a preset repair order.

In the above solution, the acquisition module is specifically configured to determine pre-stored metadata corresponding to the data to be audited, and to acquire, according to the metadata, the data to be audited corresponding to the metadata from the at least two data sources.

In the above solution, the device further comprises:

a collection module, configured to acquire the metadata from the at least two data sources according to a preset collection rule, and/or to receive the metadata pushed by the at least two data sources.

In the above solution, the audit module is specifically configured to judge, according to the preset audit rule, whether the data to be audited is consistent across the at least two data sources; if it is consistent, to determine that the data to be audited passes the audit; otherwise, to determine that the data to be audited fails the audit.

In the above solution, the device further comprises:

an alarm module, configured to judge whether the data to be repaired has been repaired successfully; when it is determined that the data to be repaired has not been repaired successfully, to repair the data to be repaired again; and when the number of repairs of the data to be repaired is greater than a preset number of repairs, to generate an alarm message, the alarm message indicating that the data to be repaired cannot be repaired.

In the above solution, the alarm module is specifically configured to judge whether the repaired data is consistent across the at least two data sources; if it is consistent, to determine that the data to be repaired has been repaired successfully; otherwise, to determine that the data to be repaired has not been repaired successfully.

An embodiment of the present invention provides a computer-readable storage medium on which a computer program is stored; when the computer program is executed by a processor, the steps of any of the data processing methods described above are implemented.

An embodiment of the present invention provides a data processing device, comprising: a memory, a processor, and a computer program stored in the memory and executable on the processor;

wherein the processor, when running the computer program, executes the steps of any of the data processing methods described above.

With the data processing method, device, and computer-readable storage medium provided by the embodiments of the present invention, data to be audited is acquired online from at least two data sources; the data to be audited is audited based on a preset audit rule; the data that fails the audit is determined as data to be repaired; and the data to be repaired is repaired according to a preset repair order. Data is thus audited online and repaired promptly, which improves the efficiency and accuracy of both data audit and data repair.

Brief Description of the Drawings

FIG. 1 is a schematic flowchart of the data processing method according to an embodiment of the present invention;

FIG. 2 is a first schematic structural diagram of the data processing device according to an embodiment of the present invention;

FIG. 3 is a second schematic structural diagram of the data processing device according to an embodiment of the present invention;

FIG. 4 is a schematic flowchart of a specific implementation of online audit and repair according to an embodiment of the present invention;

FIG. 5 is a third schematic structural diagram of the data processing device according to an embodiment of the present invention.

Detailed Description

In the related art, when multiple data sources must cooperate to implement a service, certain data in those cooperating data sources needs to stay consistent. To ensure data integrity and consistency, the data in the multiple data sources usually has to be audited and repaired.

For example, in the Migu Reading application (APP), book data is stored in a small library, a main library, and a content library; the small library and the main library serve book authors and editorial reviewers, while the content library serves users. After a reviewer approves book A in the small library, the value of book A's status data in the small library is changed to 1 (the book is "on the shelf"), and the value of book A's status data in the main library and the content library is synchronously changed to 1 so that users can read book A. When the copyright of book A expires, the reviewer changes the value of book A's status data in the small library to 0 (the book is "off the shelf"), and the value in the main library and the content library is synchronously changed to 0 so that users can no longer read book A. Throughout this process, book A's status data in the three libraries has to be audited to determine whether it is consistent; when it is not, the data has to be repaired to restore the consistency of the status data across the small library, the main library, and the content library.

At present, to avoid affecting the normal operation of the data sources, a backup database is established for each data source, the data in each data source is synchronously backed up to its corresponding backup database at a preset period (generally measured in days), the data is audited based on the backup databases, and the data in the data sources is then repaired according to the audit results.

However, because the audit runs in the backup environment on a daily basis, that is, offline, it has poor real-time performance, which affects its accuracy. For example, by the time data a in a backup database is audited, data a in the data source may already have changed. The audit results obtained from the backup databases are therefore inaccurate and do not truly reflect the data consistency of the data sources, and even after the data in the data sources is repaired according to those results, the consistency of the data in the data sources still cannot be ensured.

Continuing the Migu Reading APP example, backup databases a, b, and c are established for the small library, the main library, and the content library respectively, and once a day the book data in the small library is synchronized to backup database a, the book data in the main library to backup database b, and the book data in the content library to backup database c. Suppose that when the small library is synchronized to backup database a, the value of book A's status data in the small library is "1", so the value in backup database a is also "1". While the audit is running against the backup databases, however, the value of book A's status data in the small library is changed to "0". The audit based on backup databases a, b, and c finds that the value of book A's status data is "1" in backup database a and "0" in backup database b, so the data is inconsistent. According to this audit result, the value of book A's status data in the main library, which corresponds to backup database b, has to be changed to "1". But because the value in the small library was already changed to "0" during the audit, repairing the main library according to the audit result leaves book A's status data inconsistent between the small library and the main library. In addition, establishing a backup database for each data source requires additional resources and cost.
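
The race described above can be reproduced in a few lines. The following is only an illustrative sketch: the dictionaries standing in for the small library, the main library and their backups, and the helper `repair_from_backup_audit`, are assumptions made for this example and do not appear in the patent.

```python
# Hypothetical in-memory stand-ins for two live data sources (status of book A).
small = {"book_A": 1}
main = {"book_A": 0}

# Daily snapshot: each backup captures the live value at synchronization time.
backup_a = dict(small)
backup_b = dict(main)

# While the offline audit is running, the live small library changes.
small["book_A"] = 0


def repair_from_backup_audit(reference_backup, target_backup, target_live):
    """Repair the live target using values audited in the backups only."""
    for key, ref_value in reference_backup.items():
        if target_backup.get(key) != ref_value:
            target_live[key] = ref_value


repair_from_backup_audit(backup_a, backup_b, main)

# The repair used the stale value 1, so the live sources now disagree.
still_inconsistent = small["book_A"] != main["book_A"]
print(small, main, still_inconsistent)
```

In this toy run the live libraries actually agreed before the repair; it is the repair based on stale backup data that introduces the inconsistency.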

Based on this, in the embodiments of the present invention, data to be audited is acquired online from at least two data sources; the data to be audited is audited based on a preset audit rule; the data that fails the audit is determined as data to be repaired; and the data to be repaired is repaired according to a preset repair order.

To give a fuller understanding of the features and technical content of the embodiments of the present invention, their implementation is described in detail below with reference to the accompanying drawings. The drawings are for reference and illustration only and are not intended to limit the invention.

As shown in FIG. 1, the data processing method of the embodiment of the present invention comprises the following steps:

Step 101: acquire data to be audited online from at least two data sources.

In an embodiment, acquiring the data to be audited online from at least two data sources comprises: determining pre-stored metadata corresponding to the data to be audited; and acquiring, according to the metadata, the data to be audited corresponding to the metadata from the at least two data sources.

In practice, acquiring the data to be audited according to the metadata may specifically be: determining the identification information of the metadata from the pre-stored metadata; and acquiring, according to the determined identification information, the data to be audited corresponding to the identification information from the at least two data sources.

For example, when the data to be audited is book data, suppose the identification information of the pre-stored metadata is: "ID=123456789 (unique identifier), TYPE=1 (book), UPLOADDATE=2017-05-07 17:06:00 (collection time), DOFLAG=0 (processing status: to be audited)". According to this identification information, the data to be audited corresponding to "ID=123456789" is extracted from the small library, the main library, and the content library respectively. The data to be audited may include status data, volume data, chapter data, and so on.
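
As a rough illustration of the lookup just described, the sketch below fetches the data to be audited for one metadata record from three sources. The `Metadata` class, the `fetch_to_be_audited` helper, and the dictionary-backed sources are hypothetical stand-ins; the patent does not specify how the data sources are queried.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Metadata:
    """Stored metadata record; fields mirror ID, TYPE, UPLOADDATE, DOFLAG."""
    key: str          # unique identifier (ID)
    data_type: int    # 1 means book (TYPE)
    upload_date: str  # collection time (UPLOADDATE)
    do_flag: int      # 0 means waiting for audit (DOFLAG)


def fetch_to_be_audited(meta, sources):
    """Return {source_name: record or None} for the metadata's identifier."""
    return {name: table.get(meta.key) for name, table in sources.items()}


# Hypothetical live data sources, each keyed by the unique identifier.
sources = {
    "small": {"123456789": {"status": 0, "chapters": 12}},
    "main": {"123456789": {"status": 0, "chapters": 12}},
    "content": {"123456789": {"status": 1, "chapters": 12}},
}

meta = Metadata("123456789", 1, "2017-05-07 17:06:00", 0)
records = fetch_to_be_audited(meta, sources)
print(records["content"])
```

A source that has no record for the identifier yields `None`, which a later audit step can treat as a mismatch.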

In an embodiment, the method further comprises: acquiring the metadata from the at least two data sources according to a preset collection rule; and/or receiving the metadata pushed by the at least two data sources.

The preset collection rule may be to check the data logs of the at least two data sources at a preset time period and to acquire the metadata from the at least two data sources according to those data logs.

It should be noted that the preset collection rule can be set according to the actual situation and is not specifically limited here.

In practice, acquiring the metadata according to the preset collection rule while also receiving the metadata pushed by the data sources yields more data to be audited, which widens the audit scope and can, to some extent, improve audit accuracy.

In practice, the collected metadata may also be deduplicated. Removing duplicate entries can, to some extent, improve the efficiency of the subsequent audit.

In practice, the collected metadata may be converted to a standard format before it is stored. The standard format may consist of a series of parameters such as the audit KEY value, data type, collection time, and processing status, where the audit KEY value is the unique identifier of the data to be audited. The standard format of the metadata may differ between business scenarios.
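
The normalization and deduplication steps above might look like the following sketch. The raw item layout (`id`, `type`, `ts`), the output field names, and the choice to keep the most recently collected duplicate are all assumptions made for illustration; the patent only names the kinds of parameters the standard format contains.

```python
def to_standard(raw):
    """Map a raw collected item to a (hypothetical) standard metadata format."""
    return {
        "audit_key": str(raw["id"]),      # unique identifier of the data to be audited
        "data_type": raw.get("type", 1),  # 1 is assumed to mean "book"
        "collected_at": raw["ts"],        # collection time, "YYYY-MM-DD HH:MM:SS"
        "state": 0,                       # 0: waiting for audit
    }


def deduplicate(items):
    """Keep one record per (audit_key, data_type), preferring the latest one."""
    latest = {}
    for item in items:
        k = (item["audit_key"], item["data_type"])
        # Fixed-width timestamps compare correctly as plain strings.
        if k not in latest or item["collected_at"] > latest[k]["collected_at"]:
            latest[k] = item
    return list(latest.values())


raw_items = [
    {"id": 123456789, "type": 1, "ts": "2017-05-07 17:06:00"},    # collected from a log
    {"id": "123456789", "type": 1, "ts": "2017-05-07 17:09:30"},  # pushed by a source
    {"id": 987654321, "type": 1, "ts": "2017-05-07 17:07:10"},
]
standard = deduplicate([to_standard(r) for r in raw_items])
print(len(standard), standard[0]["collected_at"])
```

Normalizing before deduplicating matters here: the first two items only collide once their identifiers share one type.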

Step 102: audit the data to be audited based on a preset audit rule.

In an embodiment, auditing the data to be audited based on the preset audit rule comprises: judging, according to the preset audit rule, whether the data to be audited is consistent across the at least two data sources; if it is consistent, determining that the data to be audited passes the audit; otherwise, determining that the data to be audited fails the audit.

In practice, different business scenarios may correspond to different preset audit rules.

For example, when the data to be audited is book data in the main library, the content library, and the small library, and the book data in the main library and the content library is synchronized from the book data in the small library, the preset audit rule may be: audit the book data in the main library and in the content library against the book data in the small library, that is, determine whether the book data to be audited in the main library and the content library is consistent with that in the small library.

The audit may proceed as follows. Suppose that in the small library the value of the status data of the book data to be audited (book A) is "0" (off the shelf, not readable), and the value of the status data of the book data to be audited (chapter a) is also "0" (off the shelf, not readable). Auditing book A and chapter a in the small library and the content library shows that, in the content library, the value of the status data of book A is "1" (on the shelf, readable), which is inconsistent with book A in the small library, and the value of the status data of chapter a is also "1" (on the shelf, readable), which is likewise inconsistent with chapter a in the small library.
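
A reference-based audit like the one in this example can be sketched as below. The function name, the record layout, and the mismatch tuple format are assumptions for illustration; the only idea taken from the text is that every other source is compared against the source the data was synchronized from.

```python
def audit_against_reference(records, reference, fields):
    """Compare each non-reference source with the reference source.

    Returns a list of (source, field, expected, actual) mismatches; an
    empty list means the data passed the audit.
    """
    expected = records[reference]
    mismatches = []
    for source, record in records.items():
        if source == reference:
            continue
        for field in fields:
            # A missing record counts as a mismatch on every audited field.
            actual = None if record is None else record.get(field)
            if actual != expected.get(field):
                mismatches.append((source, field, expected.get(field), actual))
    return mismatches


# Status data of book A in each (hypothetical) data source.
book_a = {
    "small": {"status": 0},
    "main": {"status": 0},
    "content": {"status": 1},
}
result = audit_against_reference(book_a, "small", ["status"])
print(result)
```

Here the content library fails the audit for book A, so its record would be handed to the repair step.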

As another example, in a bank transfer service, the preset audit rule may be: for a single transfer, audit the charging data in the charging data source against the deduction data in the deduction data source, and judge whether the two are consistent.
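
For the bank transfer scenario, a rule of this kind could be as simple as the sketch below. The record fields `transfer_id` and `amount` are hypothetical, and comparing amounts with `Decimal` is a choice made for this example rather than something the patent specifies.

```python
from decimal import Decimal


def audit_transfer(charge, deduction):
    """Return True when the charge and deduction records of one transfer agree."""
    same_transfer = charge["transfer_id"] == deduction["transfer_id"]
    # Decimal avoids float rounding and treats "100.00" and "100.0" as equal.
    same_amount = Decimal(charge["amount"]) == Decimal(deduction["amount"])
    return same_transfer and same_amount


charge = {"transfer_id": "T001", "amount": "100.00"}
ok_deduction = {"transfer_id": "T001", "amount": "100.0"}
bad_deduction = {"transfer_id": "T001", "amount": "99.00"}

print(audit_transfer(charge, ok_deduction))
print(audit_transfer(charge, bad_deduction))
```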

Step 103: determine the data that fails the audit as data to be repaired, and repair the data to be repaired according to a preset repair order.

In practice, the preset repair order may be derived from the association between multiple pieces of data to be audited.

For example, the association may be that the data to be audited (chapter a) is a chapter of the data to be audited (book A). From this association, the preset repair order may be: first repair the data to be audited (book A), then repair the data to be audited (chapter a).

The association may also be that the data to be audited (episode b) is an episode of the data to be audited (TV series B). From this association, the preset repair order may be: first repair the data to be audited (TV series B), then repair the data to be audited (episode b).

Continuing the example in which the data to be audited is book data in the main library, the content library, and the small library, the repair may proceed as follows. First repair the book data to be audited (book A): change the value of the status data of book A in the content library to "0", which ensures that none of book A's data in the user-facing content library can be offered to users for reading. Then repair the book data to be audited (chapter a): change the value of the status data of chapter a in the content library to "0", so that the book data to be audited is consistent between the small library and the content library.
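
One way to derive such a parent-first order is to sort the items by their depth in the association hierarchy, as in this sketch. The `parent_of` mapping, the item names, and the helper functions are assumptions for illustration, and the mapping is assumed to contain no cycles.

```python
def order_repairs(items, parent_of):
    """Order items so that every parent is repaired before its children."""
    def depth(item):
        d = 0
        while item in parent_of:  # walk up to the root; assumes no cycles
            item = parent_of[item]
            d += 1
        return d
    return sorted(items, key=depth)


def repair(items, parent_of, reference, target):
    """Copy reference values into the target in parent-first order."""
    applied = []
    for item in order_repairs(items, parent_of):
        target[item] = reference[item]
        applied.append(item)
    return applied


parent_of = {"chapter_a": "book_A"}      # chapter a belongs to book A
small = {"book_A": 0, "chapter_a": 0}    # reference source
content = {"book_A": 1, "chapter_a": 1}  # source that failed the audit

order = repair(["chapter_a", "book_A"], parent_of, small, content)
print(order, content)
```

Repairing the book first means that, even if the chapter repair is delayed, the user-facing library already treats the whole book as off the shelf.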

In an embodiment, the method further comprises: judging whether the data to be repaired has been repaired successfully; when it is determined that the data to be repaired has not been repaired successfully, repairing the data to be repaired again; and when the number of repairs of the data to be repaired is greater than a preset number of repairs, generating an alarm message, the alarm message indicating that the data to be repaired cannot be repaired.

In an embodiment, judging whether the data to be repaired has been repaired successfully comprises: judging whether the repaired data is consistent across the at least two data sources; if it is consistent, determining that the data to be repaired has been repaired successfully; otherwise, determining that the data to be repaired has not been repaired successfully.
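
The repair, verify, and retry loop can be sketched as follows. The function and its callback parameters are hypothetical, and the sketch interprets the threshold as "raise the alarm once the preset number of repairs has been used up without success", which is an assumption about how the comparison is applied.

```python
def repair_with_retry(apply_repair, is_consistent, max_repairs):
    """Repair, verify, and retry; return (attempts, alarm).

    alarm is None when a repair was verified as successful.
    """
    attempts = 0
    while attempts < max_repairs:
        apply_repair()
        attempts += 1
        if is_consistent():  # success means the sources agree again
            return attempts, None
    return attempts, "ALARM: data to be repaired cannot be repaired"


small = {"book_A": 0}
content = {"book_A": 1}
calls = {"n": 0}


def flaky_repair():
    """Simulated repair whose first write is lost and second succeeds."""
    calls["n"] += 1
    if calls["n"] >= 2:
        content["book_A"] = small["book_A"]


def consistent():
    return content["book_A"] == small["book_A"]


first = repair_with_retry(flaky_repair, consistent, max_repairs=3)
second = repair_with_retry(lambda: None, lambda: False, max_repairs=3)
print(first)
print(second)
```

The first call succeeds on the second attempt; the second call never converges and ends with the alarm message.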

In the data processing method provided by the embodiments of the present invention, data to be audited is acquired online from at least two data sources; the data to be audited is audited based on a preset audit rule; data that fails the audit is determined as data to be repaired; and the data to be repaired is repaired according to a preset repair order. Data can thus be audited online and repaired promptly, which improves the efficiency and accuracy of both data audit and data repair.

Furthermore, the embodiments of the present invention are applicable to different business scenarios: only the preset audit rule and the preset repair order need to be determined for each scenario, so the method generalizes well.

Based on the data processing method provided by the embodiments of the present application, the present application further provides a data processing apparatus. As shown in FIG. 2, the apparatus includes an acquisition module 21, an audit module 22, and a repair module 23, where:

the acquisition module 21 is configured to acquire data to be audited online from at least two data sources;

the audit module 22 is configured to audit the data to be audited based on a preset audit rule;

the repair module 23 is configured to determine data that fails the audit as data to be repaired, and to repair the data to be repaired according to a preset repair order.

In an embodiment, the acquisition module 21 is specifically configured to determine pre-stored metadata corresponding to the data to be audited, and to acquire, according to the metadata, the data to be audited corresponding to the metadata from the at least two data sources.

In practice, acquiring the data to be audited corresponding to the metadata from the at least two data sources may specifically be: determining identification information of the metadata from the pre-stored metadata; and acquiring, from the at least two data sources, the data to be audited corresponding to the determined identification information.
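The lookup described above can be sketched as follows. This is an illustration under stated assumptions rather than the patented implementation: each data source is modeled as an in-memory dictionary keyed by record ID, the metadata is assumed to carry its identification information under the key `"id"`, and the name `fetch_records` is hypothetical.

```python
from typing import Any, Dict, Optional

# Hypothetical stand-in for a data source: record ID -> record fields.
Source = Dict[str, Dict[str, Any]]


def fetch_records(
    metadata: Dict[str, Any],
    sources: Dict[str, Source],
) -> Dict[str, Optional[Dict[str, Any]]]:
    """Fetch the record named by the metadata from every data source.

    A source that does not hold the record contributes None, so a missing
    copy can later be reported as an inconsistency by the audit.
    """
    record_id = metadata["id"]  # identification information of the metadata
    return {name: source.get(record_id) for name, source in sources.items()}
```

Returning one entry per source, including `None` for absent records, keeps the later consistency check simple because it always compares the same set of sources.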

In an embodiment, the apparatus further includes a collection module, where:

the collection module is configured to obtain the metadata from the at least two data sources according to a preset collection rule; and/or to receive the metadata pushed by the at least two data sources.

The preset collection rule may be to check the data logs of the at least two data sources at a preset time interval and to obtain the metadata from the at least two data sources according to the data logs.

In an embodiment, the audit module 22 is specifically configured to determine, according to the preset audit rule, whether the data to be audited is consistent across the at least two data sources; if it is consistent, to determine that the data to be audited passes the audit; otherwise, to determine that the data to be audited fails the audit.

In practice, different business scenarios may correspond to different preset audit rules.
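One way to picture a scenario-specific audit rule is as a pluggable predicate over the copies of a record held by each data source. The sketch below is an assumption-laden illustration: the names `audit` and `status_matches`, and the `"status"` field, are hypothetical and only mirror the book example used elsewhere in this description.

```python
from typing import Any, Callable, Dict, Optional

Record = Optional[Dict[str, Any]]
AuditRule = Callable[[Dict[str, Record]], bool]


def status_matches(copies: Dict[str, Record]) -> bool:
    """Hypothetical rule: every source holds the record with the same status."""
    present = [copy for copy in copies.values() if copy is not None]
    if len(present) != len(copies):
        return False  # a source is missing the record entirely
    return len({copy["status"] for copy in present}) == 1


def audit(copies: Dict[str, Record], rule: AuditRule = status_matches) -> bool:
    """Return True when the data passes the audit under the given rule."""
    return rule(copies)
```

A different business scenario would supply its own `rule` function, for example one that compares prices or timestamps, without changing `audit` itself.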

In an embodiment, the apparatus further includes an alarm module, where:

the alarm module is configured to determine whether the data to be repaired has been repaired successfully; when it has not, to repair the data to be repaired again; and when the number of repairs performed on the data to be repaired exceeds a preset number of repairs, to generate an alarm message, where the alarm message indicates that the data to be repaired cannot be repaired.

In an embodiment, the alarm module is specifically configured to determine whether the repaired data is consistent across the at least two data sources; if it is consistent, to determine that the repair succeeded; otherwise, to determine that the repair did not succeed.

It should be noted that the division into program modules described above is only an example of how the data processing apparatus of the foregoing embodiments performs data processing. In practical applications, the processing may be allocated to different program modules as needed; that is, the internal structure of the apparatus may be divided into different program modules to complete all or part of the processing described above. In addition, the data processing apparatus and the data processing method provided in the foregoing embodiments belong to the same concept; for the specific implementation, see the method embodiments, which are not repeated here.

In practical applications, the acquisition module 21 is implemented by a network interface of the data processing apparatus; the audit module 22, the repair module 23, the collection module, and the alarm module may be implemented by a central processing unit (CPU), a microprocessor unit (MPU), a digital signal processor (DSP), a field-programmable gate array (FPGA), or the like, located on the data processing apparatus.

FIG. 3 is a schematic structural diagram of the data processing apparatus of the present invention. The data processing apparatus 300 shown in FIG. 3 includes at least one processor 301, a memory 302, a user interface 303, and at least one network interface 304. The components of the data processing apparatus 300 are coupled together through a bus system 305, which provides the connections and communication between these components. In addition to a data bus, the bus system 305 includes a power bus, a control bus, and a status signal bus. For clarity, however, all of these buses are labeled as the bus system 305 in FIG. 3.

The user interface 303 may include a display, a keyboard, a mouse, a trackball, a click wheel, keys, buttons, a touch pad, a touch screen, or the like.

It can be understood that the memory 302 may be a volatile memory or a non-volatile memory, or may include both. The non-volatile memory may be a read-only memory (ROM), a programmable read-only memory (PROM), an erasable programmable read-only memory (EPROM), an electrically erasable programmable read-only memory (EEPROM), a ferromagnetic random access memory (FRAM), a flash memory, a magnetic surface memory, an optical disc, or a compact disc read-only memory (CD-ROM); the magnetic surface memory may be a magnetic disk memory or a magnetic tape memory. The volatile memory may be a random access memory (RAM), which is used as an external cache. By way of example and not limitation, many forms of RAM are available, such as static random access memory (SRAM), synchronous static random access memory (SSRAM), dynamic random access memory (DRAM), synchronous dynamic random access memory (SDRAM), double data rate synchronous dynamic random access memory (DDRSDRAM), enhanced synchronous dynamic random access memory (ESDRAM), SyncLink dynamic random access memory (SLDRAM), and direct Rambus random access memory (DRRAM). The memory 302 described in the embodiments of the present invention is intended to include, without being limited to, these and any other suitable types of memory.

The memory 302 in the embodiments of the present invention stores various types of data to support the operation of the data processing apparatus 300. Examples of such data include any computer program that runs on the data processing apparatus 300, such as an operating system 3021 and application programs 3022. The operating system 3021 contains various system programs, such as a framework layer, a core library layer, and a driver layer, which implement basic services and handle hardware-based tasks. The application programs 3022 may contain various applications that implement application services. A program implementing the method of the embodiments of the present invention may be included in the application programs 3022.

The method disclosed in the foregoing embodiments of the present invention may be applied to, or implemented by, the processor 301. The processor 301 may be an integrated circuit chip with signal processing capability. During implementation, the steps of the method may be completed by integrated logic circuits of hardware in the processor 301 or by instructions in the form of software. The processor 301 may be a general-purpose processor, a digital signal processor, another programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or the like. The processor 301 may implement or execute the methods, steps, and logic block diagrams disclosed in the embodiments of the present invention. The general-purpose processor may be a microprocessor or any conventional processor. The steps of the method disclosed in the embodiments of the present invention may be executed directly by a hardware decoding processor, or by a combination of hardware and software modules in a decoding processor. The software modules may reside in a storage medium located in the memory 302; the processor 301 reads the information in the memory 302 and completes the steps of the method in combination with its hardware.

Specifically, an embodiment of the present invention further provides a data processing apparatus. Referring to FIG. 3, the data processing apparatus includes a memory 302, a processor 301, and a computer program stored in the memory and executable on the processor,

where the processor 301, when running the computer program, performs the following operations:

acquiring data to be audited online from at least two data sources; auditing the data to be audited based on a preset audit rule; determining data that fails the audit as data to be repaired; and repairing the data to be repaired according to a preset repair order.

In an embodiment, the processor 301, when running the computer program, further performs the following operations: determining pre-stored metadata corresponding to the data to be audited; and acquiring, according to the metadata, the data to be audited corresponding to the metadata from the at least two data sources.

In an embodiment, the processor 301, when running the computer program, further performs the following operations: obtaining the metadata from the at least two data sources according to a preset collection rule; and/or receiving the metadata pushed by the at least two data sources.

In an embodiment, the processor 301, when running the computer program, further performs the following operations: determining, according to the preset audit rule, whether the data to be audited is consistent across the at least two data sources; if it is consistent, determining that the data to be audited passes the audit; otherwise, determining that the data to be audited fails the audit.

In an embodiment, the processor 301, when running the computer program, further performs the following operations: determining whether the data to be repaired has been repaired successfully; when it has not, repairing the data to be repaired again; and when the number of repairs exceeds a preset number of repairs, generating an alarm message, where the alarm message indicates that the data to be repaired cannot be repaired.

In an embodiment, the processor 301, when running the computer program, further performs the following operations: determining whether the repaired data is consistent across the at least two data sources; if it is consistent, determining that the repair succeeded; otherwise, determining that the repair did not succeed.

Based on the data processing method provided by the embodiments of the present application, the present application further provides a computer-readable storage medium. Referring to FIG. 3, the computer-readable storage medium may include the memory 302 for storing a computer program, and the computer program can be executed by the processor 301 of the data processing apparatus 300 to complete the steps of the foregoing method. The computer-readable storage medium may be a memory such as an FRAM, a ROM, a PROM, an EPROM, an EEPROM, a flash memory, a magnetic surface memory, an optical disc, or a CD-ROM.

Specifically, the computer-readable storage medium provided by the embodiments of the present invention stores a computer program which, when run by a processor, performs the following operations:

acquiring data to be audited online from at least two data sources; auditing the data to be audited based on a preset audit rule; determining data that fails the audit as data to be repaired; and repairing the data to be repaired according to a preset repair order.

In an embodiment, the computer program, when run by the processor, further performs the following operations: determining pre-stored metadata corresponding to the data to be audited; and acquiring, according to the metadata, the data to be audited corresponding to the metadata from the at least two data sources.

In an embodiment, the computer program, when run by the processor, further performs the following operations: obtaining the metadata from the at least two data sources according to a preset collection rule; and/or receiving the metadata pushed by the at least two data sources.

In an embodiment, the computer program, when run by the processor, further performs the following operations: determining, according to the preset audit rule, whether the data to be audited is consistent across the at least two data sources; if it is consistent, determining that the data to be audited passes the audit; otherwise, determining that the data to be audited fails the audit.

In an embodiment, the computer program, when run by the processor, further performs the following operations: determining whether the data to be repaired has been repaired successfully; when it has not, repairing the data to be repaired again; and when the number of repairs performed on the data to be repaired exceeds a preset number of repairs, generating an alarm message, where the alarm message indicates that the data to be repaired cannot be repaired.

In an embodiment, the computer program, when run by the processor, further performs the following operations: determining whether the repaired data is consistent across the at least two data sources; if it is consistent, determining that the repair succeeded; otherwise, determining that the repair did not succeed.

The implementation process and principles of the present invention in practical applications are described in detail below, using a concrete data processing example.

FIG. 4 is a schematic flowchart of a specific implementation of online audit and repair according to an embodiment of the present invention. With reference to the schematic diagram of the data processing apparatus shown in FIG. 5, the implementation includes the following steps:

Step 401: The collection module collects metadata online from the at least two data sources.

Collection may be active or passive.

Active collection may be: checking the data logs of the at least two data sources at a preset time interval, and obtaining the metadata from the at least two data sources according to the data logs.

Passive collection may be: the collection module exposes an application programming interface (API) for the data sources to call, and each data source, by calling the API, actively pushes its own metadata corresponding to the data to be audited to the collection module.

The collection module may also deduplicate the collected metadata corresponding to the data to be audited, removing duplicate entries so as to improve the efficiency of the subsequent audit to some extent.
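The deduplication performed by the collection module can be sketched as an order-preserving filter. This is only an illustration: the description does not say how duplicates are identified, so the sketch assumes that each metadata item carries hypothetical `"type"` and `"id"` fields that together identify the data to be audited.

```python
from typing import Any, Dict, Iterable, List, Set, Tuple


def deduplicate(metadata_items: Iterable[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Drop repeated metadata items, keeping the first occurrence of each."""
    seen: Set[Tuple[Any, Any]] = set()
    unique: List[Dict[str, Any]] = []
    for item in metadata_items:
        key = (item["type"], item["id"])  # assumed identity of a metadata item
        if key not in seen:
            seen.add(key)
            unique.append(item)
    return unique
```

Keeping the first occurrence preserves the order in which changes were collected, so each record is still audited once, in arrival order.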

Step 402: The collection module pushes the metadata to the acquisition module.

The collection module may push the metadata to the acquisition module for processing immediately after storing it; alternatively, at every preset interval, it may push the metadata stored during that interval to the acquisition module in a batch for preprocessing.

Step 403: The acquisition module acquires, according to the pre-stored metadata, the data to be audited corresponding to the metadata from the at least two data sources, and pushes the data to be audited to the audit module.

Step 404: The audit module audits the data to be audited based on the preset audit rule, and pushes the data to be repaired to the repair module.

When the data to be audited is determined to be consistent across the at least two data sources, the audit module pushes the metadata corresponding to the data that passed the audit to the collection module, so that the collection module deletes the previously stored metadata or changes the processing status in the metadata to "audit completed".

When the data to be audited is determined to be inconsistent across the at least two data sources, the audit module determines the data that fails the audit as data to be repaired and pushes it to the repair module for real-time repair.

The audit may proceed as follows. Assume that in the small library the status value of the data to be audited (for example, book A) is "0" (off the shelf, not readable), and that the status value of the data to be audited (for example, chapter a) in the small library is also "0" (off the shelf, not readable). Auditing book A and chapter a across the small library and the content library shows that, in the content library, the status value of book A is "1" (on the shelf, readable), which is inconsistent with the small library; and the status value of chapter a in the content library is also "1" (on the shelf, readable), which is likewise inconsistent with the small library.

Step 405: The repair module repairs the data to be repaired according to the preset repair order.

The repair may proceed as follows. First, repair the book-level data (for example, book A): set the status value of book A in the content library to "0", so that none of the data of book A can be served to users from the user-facing content library. Then repair the chapter-level data (for example, chapter a): set the status value of chapter a in the content library to "0", so that the data to be audited is consistent between the small library and the content library.

If the repair does not follow the preset repair order and chapter a is repaired first, then, because a repair takes some time, the status value of book A in the content library is still "1" while chapter a is being repaired. During that period the user-facing content library can still present all data of book A to users as readable, which is a data error.
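The effect of the preset repair order can be sketched with the book example above. The sketch makes several assumptions that are not stated in the description: records are keyed by a `(type, id)` tuple, the small library is treated as the reference whose status values are copied into the content library, and the names `REPAIR_ORDER` and `repair_in_order` are hypothetical.

```python
from typing import Dict, List, Tuple

Key = Tuple[str, str]  # (record type, record ID), e.g. ("book", "A")

# Hypothetical preset repair order: a lower rank is repaired first,
# so a book is taken off the shelf before any of its chapters.
REPAIR_ORDER: Dict[str, int] = {"book": 0, "chapter": 1}


def repair_in_order(
    pending: List[Key],
    reference: Dict[Key, int],
    target: Dict[Key, int],
) -> List[Key]:
    """Copy status values from reference to target in the preset order.

    Returns the keys in the order in which they were repaired.
    """
    ordered = sorted(pending, key=lambda key: REPAIR_ORDER[key[0]])
    for key in ordered:
        target[key] = reference[key]
    return ordered
```

Even if the audit reports chapter a before book A, sorting by the preset order repairs book A first, which closes the window in which the book would still appear readable.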

Step 406: Determine whether the data to be repaired has been repaired successfully; when it has not, repair the data to be repaired again. When the number of repairs performed on the data to be repaired exceeds the preset number of repairs, generate an alarm message.

The alarm message indicates that the data to be repaired cannot be repaired, so that the responsible personnel can promptly repair it by other means and avoid service failures caused by inconsistent data across data sources.

The foregoing describes only preferred embodiments of the present invention and is not intended to limit the protection scope of the present invention.

Claims (14)

1. A data processing method, the method comprising:
acquiring data to be audited online from at least two data sources;
auditing the data to be audited based on a preset audit rule;
determining data that fails the audit as data to be repaired, and repairing the data to be repaired according to a preset repair order.
2. The method of claim 1, wherein acquiring the data to be audited online from the at least two data sources comprises:
determining pre-stored metadata corresponding to the data to be audited;
and acquiring, according to the metadata, the data to be audited corresponding to the metadata from the at least two data sources.
3. The method of claim 2, further comprising:
obtaining the metadata from the at least two data sources according to a preset collection rule;
and/or receiving the metadata pushed by the at least two data sources.
4. The method of claim 1, wherein auditing the data to be audited based on the preset audit rule comprises:
determining, according to the preset audit rule, whether the data to be audited is consistent across the at least two data sources;
if the data to be audited is consistent, determining that the data to be audited passes the audit; otherwise, determining that the data to be audited fails the audit.
5. The method of claim 1, further comprising:
determining whether the data to be repaired has been repaired successfully;
when the data to be repaired is determined not to have been repaired successfully, repairing the data to be repaired again;
when the number of repairs performed on the data to be repaired is greater than a preset number of repairs, generating an alarm message, wherein the alarm message indicates that the data to be repaired cannot be repaired.
6. The method of claim 5, wherein determining whether the data to be repaired has been repaired successfully comprises:
determining whether the repaired data to be repaired is consistent across the at least two data sources;
if it is consistent, determining that the data to be repaired has been repaired successfully; otherwise, determining that the data to be repaired has not been repaired successfully.
7. A data processing apparatus, characterized in that the apparatus comprises:
an acquisition module, configured to acquire data to be audited online from at least two data sources;
an audit module, configured to audit the data to be audited based on a preset audit rule;
and a repair module, configured to determine data that fails the audit as data to be repaired, and to repair the data to be repaired according to a preset repair order.
8. The apparatus of claim 7,
wherein the acquisition module is specifically configured to determine pre-stored metadata corresponding to the data to be audited, and to acquire, according to the metadata, the data to be audited corresponding to the metadata from the at least two data sources.
9. The apparatus of claim 8, further comprising:
a collection module, configured to obtain the metadata from the at least two data sources according to a preset collection rule; and/or to receive the metadata pushed by the at least two data sources.
10. The apparatus of claim 7,
wherein the audit module is specifically configured to determine, according to the preset audit rule, whether the data to be audited is consistent across the at least two data sources; if the data to be audited is consistent, to determine that the data to be audited passes the audit; otherwise, to determine that the data to be audited fails the audit.
11. The apparatus of claim 7, further comprising:
an alarm module, configured to determine whether the data to be repaired has been repaired successfully; when the data to be repaired is determined not to have been repaired successfully, to repair the data to be repaired again; and when the number of repairs performed on the data to be repaired is greater than a preset number of repairs, to generate an alarm message, wherein the alarm message indicates that the data to be repaired cannot be repaired.
12. The apparatus of claim 11,
wherein the alarm module is specifically configured to determine whether the repaired data to be repaired is consistent across the at least two data sources; if it is consistent, to determine that the data to be repaired has been repaired successfully; otherwise, to determine that the data to be repaired has not been repaired successfully.
13. A computer-readable storage medium on which a computer program is stored, wherein the computer program, when executed by a processor, carries out the steps of the method of any one of claims 1 to 6.
14. A data processing apparatus, comprising: a memory, a processor, and a computer program stored in the memory and executable on the processor;
wherein the processor is configured to perform the steps of the method of any one of claims 1 to 6 when running the computer program.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201711049815.9A CN107729541A (en) 2017-10-31 2017-10-31 A kind of data processing method, device and computer-readable recording medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201711049815.9A CN107729541A (en) 2017-10-31 2017-10-31 A kind of data processing method, device and computer-readable recording medium

Publications (1)

Publication Number Publication Date
CN107729541A true CN107729541A (en) 2018-02-23

Family

ID=61202977

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201711049815.9A Pending CN107729541A (en) 2017-10-31 2017-10-31 A kind of data processing method, device and computer-readable recording medium

Country Status (1)

Country Link
CN (1) CN107729541A (en)

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101625686A (en) * 2008-07-08 2010-01-13 阿里巴巴集团控股有限公司 Method and system for monitoring data consistency between plurality of databases
US20100082542A1 (en) * 2008-09-30 2010-04-01 Yahoo! Inc. Comparison of online advertising data consistency
CN102970159A (en) * 2012-11-05 2013-03-13 华为软件技术有限公司 Method and device for data audit and repairing treatment
CN105306585A (en) * 2015-11-12 2016-02-03 焦点科技股份有限公司 Data synchronization method for plurality of data centers
US9323799B2 (en) * 2013-09-21 2016-04-26 Oracle International Corporation Mechanism to run OLTP workload on in-memory database under memory pressure
CN105678434A (en) * 2014-11-18 2016-06-15 金蝶软件(中国)有限公司 Verification information publishing method and system in ERP system
CN107247749A (en) * 2017-05-25 2017-10-13 阿里巴巴集团控股有限公司 A kind of database positioning determines method, consistency verification method and device

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108829534A (en) * 2018-05-31 2018-11-16 康键信息技术(深圳)有限公司 Data problem restorative procedure, device, computer equipment and storage medium
CN108829534B (en) * 2018-05-31 2024-04-05 康键信息技术(深圳)有限公司 Data problem repairing method, device, computer equipment and storage medium
CN109254893A (en) * 2018-08-20 2019-01-22 彩讯科技股份有限公司 A kind of business datum auditing method, device, server and storage medium
CN109254893B (en) * 2018-08-20 2021-10-15 彩讯科技股份有限公司 Service data auditing method, device, server and storage medium
CN109254893B8 (en) * 2018-08-20 2021-11-19 彩讯科技股份有限公司 Service data auditing method, device, server and storage medium
CN114003579A (en) * 2020-07-28 2022-02-01 中国移动通信集团山东有限公司 Method, device, equipment and storage medium for auditing data
CN112967417A (en) * 2021-02-01 2021-06-15 南京盛航海运股份有限公司 Intelligent ship data acquisition networking method, device, equipment and storage medium
CN115878601A (en) * 2022-12-14 2023-03-31 中电信数智科技有限公司 Data auditing method, system and device and electronic equipment

Similar Documents

Publication Publication Date Title
CN107729541A (en) A kind of data processing method, device and computer-readable recording medium
CN110737594B (en) Database standard conformance testing method and device for automatically generating test cases
WO2017219678A1 (en) Data recovery method and device, and cloud storage system
US20130275369A1 (en) Data record collapse and split functionality
US20140222766A1 (en) System and method for database migration and validation
CN106951345A (en) A kind of conformance test method and device of magnetic disk of virtual machine data
CN113064859B (en) Metadata processing method, device, electronic device and storage medium
US20070220481A1 (en) Limited source code regeneration based on model modification
CN111176887A (en) MySQL misoperation rollback method, equipment and system
CN112416710A (en) User-operated recording method, device, electronic device, and storage medium
CN112256672B (en) Database change approval method and device
CN108874611A (en) A kind of construction method and device of test data
CN112699129A (en) Data processing system, method and device
CN104615948A (en) Method for automatically recognizing file completeness and restoring
CN114077428B (en) Data processing method, device, electronic equipment and storage medium
US11768855B1 (en) Replicating data across databases by utilizing validation functions for data completeness and sequencing
CN111880964A (en) Method and system for provenance-based data backup
CN111221817B (en) Service information data storage method, device, computer equipment and storage medium
WO2019085354A1 (en) Excel system interface-based database linkage method, electronic device, and storage medium
CN110895531A (en) Data writing method of data storage table, partition server and electronic device
CN116108035A (en) Data recovery method and device, processor and electronic equipment
CN115658391A (en) Backup recovery method of WAL mechanism based on QianBase MPP database
CN113326268A (en) Data writing and reading method and device
CN110825809A (en) Storage method and device for drug response information
CN119862194B (en) Oracle database full-text retrieval statement restoration synchronization method and device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20180223
