
CN117874002A - A method and system for heterogeneous data migration - Google Patents


Info

Publication number
CN117874002A
Authority
CN
China
Prior art keywords
migration
data
target
migrated
heterogeneous
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202311828822.4A
Other languages
Chinese (zh)
Inventor
胡欣
韩琪
陶勇
孙雪松
陈林林
黄燚
于学沛
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Aisino Corp
Original Assignee
Aisino Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Aisino Corp
Priority to CN202311828822.4A
Publication of CN117874002A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/21 Design, administration or maintenance of databases
    • G06F16/214 Database migration support
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/27 Replication, distribution or synchronisation of data between databases or within a distributed database system; Distributed database system architectures therefor

Landscapes

  • Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computing Systems (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a method and a system for heterogeneous data migration, belonging to the technical field of data migration. The method comprises: establishing a target database and a target data record structure for the heterogeneous data to be migrated, taking the target database as the migration end point and the target data record structure as the migration start point of that data; preprocessing the heterogeneous data to be migrated; writing a migration strategy and a priority into the target migration data as an information tag; and reading the migration strategy and priority of the target migration data and migrating the target migration data to the migration end point through a preset network channel according to that strategy and priority. The invention migrates the target data effectively, and the migration process is simple and efficient.

Description

A method and system for heterogeneous data migration

Technical Field

The present invention relates to the technical field of data migration and, more specifically, to a method and system for heterogeneous data migration.

Background

With the development of information technology, the volume of data in the public-security hotel industry keeps growing, and its sources are increasingly diverse. For various reasons, however, this data is often scattered across different data sources and cannot be integrated and used effectively. How to migrate this heterogeneous data effectively, so as to meet the business needs of the public-security hotel industry, has therefore become an urgent problem.

Existing data migration tools can often handle only specific data sources and data structures, and they perform poorly on heterogeneous data migration in the public-security hotel industry. In addition, these tools tend to consume large amounts of computing and storage resources when processing large volumes of data, which makes them inefficient.

Summary of the Invention

In view of the above problems, the present invention proposes a method for heterogeneous data migration, comprising:

establishing a target database and a target data record structure for the heterogeneous data to be migrated, taking the target database as the migration end point of the heterogeneous data to be migrated, and taking the target data record structure as the migration start point of the heterogeneous data to be migrated;

preprocessing the heterogeneous data to be migrated, mapping the preprocessed heterogeneous data to the migration start point, generating, at the migration start point and from the mapped heterogeneous data, target migration data in a unified data format, and storing the target migration data in an intermediate database;

for the target migration data, generating a migration strategy and a priority according to preset data migration requirements, and writing the migration strategy and the priority into the target migration data as an information tag;

based on a preset migration framework, retrieving the target migration data stored in the intermediate database and checking whether an information tag has been written into it; if so, reading the information tag to obtain the migration strategy and priority of the target migration data, and migrating the target migration data to the migration end point through a preset network channel according to that migration strategy and priority.

Optionally, preprocessing the heterogeneous data to be migrated comprises:

performing deduplication and denoising on the heterogeneous data to be migrated;

checking the deduplicated and denoised heterogeneous data for missing data and, if any data is missing, completing the heterogeneous data to be migrated;

performing authenticity verification on the heterogeneous data after the missing-data check, so as to remove false data;

normalizing the heterogeneous data that remains after the false data has been removed.
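As a minimal illustrative sketch (not part of the disclosure), the preprocessing sequence above could look as follows in Python. The field names `id`, `source`, and `amount`, the `UNKNOWN` placeholder used for completion, the non-negative-amount rule standing in for authenticity verification, and min-max scaling as the normalization are all assumptions made for illustration.

```python
def preprocess(records: list) -> list:
    """Denoise, deduplicate, complete, verify, then normalize a list of dict records."""
    seen = set()
    cleaned = []
    for rec in records:
        # Denoise: strip stray whitespace from string fields.
        rec = {k: v.strip() if isinstance(v, str) else v for k, v in rec.items()}
        # Deduplicate on a hypothetical business key.
        key = rec.get("id")
        if key in seen:
            continue
        seen.add(key)
        # Complete missing data with a placeholder default.
        if not rec.get("source"):
            rec["source"] = "UNKNOWN"
        # Authenticity check (stand-in rule): amount must be a non-negative number.
        amount = rec.get("amount")
        if not isinstance(amount, (int, float)) or amount < 0:
            continue
        cleaned.append(rec)
    # Normalize: min-max scale the amount into [0, 1].
    if cleaned:
        lo = min(r["amount"] for r in cleaned)
        hi = max(r["amount"] for r in cleaned)
        span = (hi - lo) or 1
        for r in cleaned:
            r["amount_norm"] = (r["amount"] - lo) / span
    return cleaned
```

A real deployment would replace each stand-in rule with checks appropriate to the data sources involved; the sketch only shows the ordering of the four stages.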

Optionally, the target data record structure is a multi-dimensional data record structure;

the multiple dimensions comprise a first dimension, a second dimension, and so on up to an Nth dimension.

Optionally, mapping the preprocessed heterogeneous data to be migrated to the migration start point comprises:

mapping the preprocessed heterogeneous data to the migration start point according to the first dimension, then according to the second dimension, and so on until mapping according to the Nth dimension is complete.
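The dimension-by-dimension mapping can be sketched as below. The source does not say what the dimensions are, so the three dimensions (`identity`, `time`, `content`), the target field names, and the source-field aliases are hypothetical; the point is only that records from differently shaped sources end up in one unified structure.

```python
# Hypothetical target data record structure: an ordered list of dimensions,
# each mapping a target field to the source field names that may carry it.
DIMENSIONS = [
    ("identity", {"record_id": ["id", "ID", "uuid"]}),
    ("time", {"created_at": ["created", "create_time"]}),
    ("content", {"payload": ["body", "data"]}),
]


def map_to_start_point(source: dict) -> dict:
    """Map one preprocessed source record, dimension 1 through dimension N in order."""
    target = {}
    for dim_name, fields in DIMENSIONS:
        target[dim_name] = {}
        for target_field, aliases in fields.items():
            # Take the first alias present in the source record, else None.
            target[dim_name][target_field] = next(
                (source[a] for a in aliases if a in source), None
            )
    return target
```

Two records that name their fields differently, such as `{"ID": 7, "body": "x"}` and `{"uuid": 7, "data": "x"}`, map to the same unified record under this scheme.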

Optionally, the intermediate database is a relational database, a NoSQL database, or a big-data storage database.
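For the relational case, a minimal sketch using Python's built-in `sqlite3` module is shown below. The in-memory database, the `staging` table, and its column layout are assumptions chosen for illustration; the source does not specify a schema.

```python
import json
import sqlite3


def open_intermediate_db() -> sqlite3.Connection:
    """Create an in-memory staging table for unified-format records."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE staging ("
        " record_id TEXT PRIMARY KEY,"
        " payload TEXT NOT NULL,"  # unified-format record serialized as JSON
        " tag TEXT)"               # information tag as JSON, NULL until written
    )
    return conn


def stage(conn: sqlite3.Connection, record_id: str, record: dict) -> None:
    """Store (or overwrite) one target migration record."""
    conn.execute(
        "INSERT OR REPLACE INTO staging (record_id, payload) VALUES (?, ?)",
        (record_id, json.dumps(record, sort_keys=True)),
    )


def load(conn: sqlite3.Connection, record_id: str):
    """Return the staged record, or None if it is not present."""
    row = conn.execute(
        "SELECT payload FROM staging WHERE record_id = ?", (record_id,)
    ).fetchone()
    return json.loads(row[0]) if row else None
```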

Optionally, if the check finds that no information tag has been written into the target migration data, it is determined whether the intermediate database holds target migration data with a higher priority. If not, the target migration data is migrated using a general migration strategy. If so, a prompt to write the information tag is issued, and once the information tag has been written, the target migration data is migrated as the next data migration object.
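The branch for untagged data can be sketched as a small decision function. Since untagged data has no priority of its own, this sketch assumes that any tagged item waiting in the intermediate database counts as "higher priority"; that interpretation, the `Item` class, and the returned action strings are illustrative assumptions.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Item:
    name: str
    tag: Optional[dict] = None  # {"strategy": ..., "priority": ...} once written


def decide(item: Item, pending: list) -> str:
    """Return the action to take for one item held in the intermediate database."""
    if item.tag is not None:
        return "migrate:" + item.tag["strategy"]
    # Untagged: is any tagged (hence higher-priority) item still waiting?
    higher_waiting = any(p.tag is not None for p in pending if p is not item)
    if not higher_waiting:
        return "migrate:general"
    # Otherwise prompt for a tag and defer this item to the next migration round.
    return "prompt_and_defer"
```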

Optionally, the migration strategy comprises at least one of the following: incremental migration, full migration, and time-based migration.
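One way to read the three strategies is as different record-selection rules, sketched below. The `updated_at` field, the `last_sync` watermark for incremental migration, and the interpretation of time-based migration as a half-open time window are assumptions; the source names the strategies without defining them.

```python
from datetime import datetime


def select_records(records, strategy, last_sync=None, window=None):
    """Pick the records to migrate under the given (hypothetically named) strategy."""
    if strategy == "full":
        # Full migration: everything.
        return list(records)
    if strategy == "incremental":
        # Incremental migration: only records changed since the last sync.
        return [r for r in records if last_sync is None or r["updated_at"] > last_sync]
    if strategy == "time_based":
        # Time-based migration: records inside the window [start, end).
        start, end = window
        return [r for r in records if start <= r["updated_at"] < end]
    raise ValueError("unknown strategy: " + str(strategy))


records = [
    {"id": 1, "updated_at": datetime(2023, 12, 1)},
    {"id": 2, "updated_at": datetime(2023, 12, 10)},
    {"id": 3, "updated_at": datetime(2023, 12, 20)},
]
```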

Optionally, the migration framework is built on the Spring Boot rapid development framework.

Optionally, the method further comprises: verifying the target migration data at the migration end point to determine its integrity; if it is incomplete, obtaining the data features of the incomplete target migration data, adjusting the parameters of the migration framework based on those data features, and verifying the accuracy of the adjusted migration framework until the verification result meets the data migration requirements.
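The integrity check can be sketched as a digest comparison between what was staged and what arrived. Using SHA-256 over a canonical JSON serialization is an assumption made for illustration; the source does not name a verification technique, nor does it say which framework parameters are adjusted afterwards, so that step is left out.

```python
import hashlib
import json


def digest(record: dict) -> str:
    """SHA-256 of a canonical (key-sorted) JSON serialization of the record."""
    data = json.dumps(record, sort_keys=True).encode("utf-8")
    return hashlib.sha256(data).hexdigest()


def find_incomplete(staged: dict, migrated: dict) -> list:
    """Return ids that are missing at the end point or whose content differs."""
    return sorted(
        rid
        for rid, rec in staged.items()
        if rid not in migrated or digest(migrated[rid]) != digest(rec)
    )
```

The returned ids identify the incomplete data whose features would then drive the parameter adjustment described above.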

Optionally, the method further comprises: recording the entire process of preprocessing, mapping, and migration to generate a data migration log, and storing the data migration log;

the data migration log is used to trace the provenance of the migrated data.
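A minimal sketch of such a log is given below, assuming one JSON entry per processing event held in memory; the `MigrationLog` class, the entry fields, and the stage names are hypothetical. A production system would persist the entries rather than keep them in a list.

```python
import json
import time


class MigrationLog:
    """Append-only log of processing events, one JSON line per event."""

    def __init__(self):
        self._lines = []

    def record(self, stage: str, record_id: str, detail: str = "") -> None:
        entry = {"ts": time.time(), "stage": stage, "record_id": record_id, "detail": detail}
        self._lines.append(json.dumps(entry))

    def trace(self, record_id: str) -> list:
        """Return, in order, the stages a given record passed through."""
        return [
            e["stage"]
            for e in map(json.loads, self._lines)
            if e["record_id"] == record_id
        ]
```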

In another aspect, the present invention further proposes a system for heterogeneous data migration, comprising:

an initialization unit, configured to establish a target database and a target data record structure for the heterogeneous data to be migrated, take the target database as the migration end point of the heterogeneous data to be migrated, and take the target data record structure as the migration start point of the heterogeneous data to be migrated;

a preprocessing unit, configured to preprocess the heterogeneous data to be migrated, map the preprocessed heterogeneous data to the migration start point, generate, at the migration start point and from the mapped heterogeneous data, target migration data in a unified data format, and store the target migration data in an intermediate database;

a tag unit, configured to generate, for the target migration data, a migration strategy and a priority according to preset data migration requirements, and write the migration strategy and the priority into the target migration data as an information tag;

a migration unit, configured to retrieve, based on a preset migration framework, the target migration data stored in the intermediate database and check whether an information tag has been written into it; if so, to read the information tag to obtain the migration strategy and priority of the target migration data, and to migrate the target migration data to the migration end point through a preset network channel according to that migration strategy and priority.

Optionally, preprocessing the heterogeneous data to be migrated comprises:

performing deduplication and denoising on the heterogeneous data to be migrated;

checking the deduplicated and denoised heterogeneous data for missing data and, if any data is missing, completing the heterogeneous data to be migrated;

performing authenticity verification on the heterogeneous data after the missing-data check, so as to remove false data;

normalizing the heterogeneous data that remains after the false data has been removed.

Optionally, the target data record structure is a multi-dimensional data record structure;

the multiple dimensions comprise a first dimension, a second dimension, and so on up to an Nth dimension.

Optionally, mapping the preprocessed heterogeneous data to be migrated to the migration start point comprises:

mapping the preprocessed heterogeneous data to the migration start point according to the first dimension, then according to the second dimension, and so on until mapping according to the Nth dimension is complete.

Optionally, the intermediate database is a relational database, a NoSQL database, or a big-data storage database.

Optionally, if the check finds that no information tag has been written into the target migration data, it is determined whether the intermediate database holds target migration data with a higher priority. If not, the target migration data is migrated using a general migration strategy. If so, a prompt to write the information tag is issued, and once the information tag has been written, the target migration data is migrated as the next data migration object.

Optionally, the migration strategy comprises at least one of the following: incremental migration, full migration, and time-based migration.

Optionally, the migration framework is built on the Spring Boot rapid development framework.

Optionally, the migration unit is further configured to: verify the target migration data at the migration end point to determine its integrity; if it is incomplete, obtain the data features of the incomplete target migration data, adjust the parameters of the migration framework based on those data features, and verify the accuracy of the adjusted migration framework until the verification result meets the data migration requirements.

Optionally, the migration unit is further configured to: record the entire process of preprocessing, mapping, and migration to generate a data migration log, and store the data migration log;

the data migration log is used to trace the provenance of the migrated data.

In yet another aspect, the present invention further provides a computing device, comprising: one or more processors;

the processors being configured to execute one or more programs;

when the one or more programs are executed by the one or more processors, the method described above is implemented.

In yet another aspect, the present invention further provides a computer-readable storage medium storing a computer program which, when executed, implements the method described above.

Compared with the prior art, the present invention has the following beneficial effects:

The present invention proposes a method for heterogeneous data migration, comprising: establishing a target database and a target data record structure for the heterogeneous data to be migrated, taking the target database as the migration end point of the heterogeneous data to be migrated, and taking the target data record structure as the migration start point of the heterogeneous data to be migrated; preprocessing the heterogeneous data to be migrated, mapping the preprocessed heterogeneous data to the migration start point, generating, at the migration start point and from the mapped heterogeneous data, target migration data in a unified data format, and storing the target migration data in an intermediate database; for the target migration data, generating a migration strategy and a priority according to preset data migration requirements, and writing the migration strategy and the priority into the target migration data as an information tag; based on a preset migration framework, retrieving the target migration data stored in the intermediate database and checking whether an information tag has been written into it; if so, reading the information tag to obtain the migration strategy and priority of the target migration data, and migrating the target migration data to the migration end point through a preset network channel according to that migration strategy and priority. The present invention migrates the target data effectively, and the migration process is simple and efficient.

Brief Description of the Drawings

FIG. 1 is a flowchart of the method of the present invention;

FIG. 2 is a structural diagram of the system of the present invention.

Detailed Description

Exemplary embodiments of the present invention are now described with reference to the accompanying drawings. The present invention may, however, be implemented in many different forms and is not limited to the embodiments described here; these embodiments are provided so that the disclosure is thorough and complete and fully conveys the scope of the invention to those skilled in the art. The terms used in the exemplary embodiments shown in the drawings do not limit the present invention. In the drawings, identical units or elements carry identical reference numerals.

Unless otherwise specified, the terms used here (including scientific and technical terms) have the meanings commonly understood by those skilled in the art. In addition, terms defined in commonly used dictionaries should be understood as having meanings consistent with the context of the relevant field, and should not be interpreted in an idealized or overly formal sense.

Embodiment 1:

The present invention proposes a method for heterogeneous data migration which, as shown in FIG. 1, comprises:

Step 1: establishing a target database and a target data record structure for the heterogeneous data to be migrated, taking the target database as the migration end point of the heterogeneous data to be migrated, and taking the target data record structure as the migration start point of the heterogeneous data to be migrated;

Step 2: preprocessing the heterogeneous data to be migrated, mapping the preprocessed heterogeneous data to the migration start point, generating, at the migration start point and from the mapped heterogeneous data, target migration data in a unified data format, and storing the target migration data in an intermediate database;

Step 3: for the target migration data, generating a migration strategy and a priority according to preset data migration requirements, and writing the migration strategy and the priority into the target migration data as an information tag;

Step 4: based on a preset migration framework, retrieving the target migration data stored in the intermediate database and checking whether an information tag has been written into it; if so, reading the information tag to obtain the migration strategy and priority of the target migration data, and migrating the target migration data to the migration end point through a preset network channel according to that migration strategy and priority.
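Steps 3 and 4 can be sketched together as tag writing followed by priority-ordered dispatch. In this illustrative Python sketch the tag is stored under a hypothetical `_tag` key, a lower number is assumed to mean a higher priority, and the `send` callable stands in for the preset network channel; none of these details come from the source.

```python
import heapq


def write_tag(record: dict, strategy: str, priority: int) -> dict:
    """Step 3: return a copy of the record with the information tag written in."""
    tagged = dict(record)
    tagged["_tag"] = {"strategy": strategy, "priority": priority}
    return tagged


def migrate_in_priority_order(staged: list, send) -> list:
    """Step 4: read each tag and send tagged records, highest priority first."""
    heap = []
    for seq, rec in enumerate(staged):
        tag = rec.get("_tag")
        if tag is None:
            continue  # untagged records follow the separate optional branch
        # seq breaks ties so that dicts are never compared with each other.
        heapq.heappush(heap, (tag["priority"], seq, rec))
    order = []
    while heap:
        _, _, rec = heapq.heappop(heap)
        payload = {k: v for k, v in rec.items() if k != "_tag"}
        send(payload, rec["_tag"]["strategy"])
        order.append(payload["id"])
    return order
```

A priority queue keeps dispatch order independent of the order in which records were staged, which is the property the tag is meant to provide.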

Here, preprocessing the heterogeneous data to be migrated comprises:

performing deduplication and denoising on the heterogeneous data to be migrated;

checking the deduplicated and denoised heterogeneous data for missing data and, if any data is missing, completing the heterogeneous data to be migrated;

performing authenticity verification on the heterogeneous data after the missing-data check, so as to remove false data;

normalizing the heterogeneous data that remains after the false data has been removed.

The target data record structure is a multi-dimensional data record structure;

the multiple dimensions comprise a first dimension, a second dimension, and so on up to an Nth dimension.

Mapping the preprocessed heterogeneous data to be migrated to the migration start point comprises:

mapping the preprocessed heterogeneous data to the migration start point according to the first dimension, then according to the second dimension, and so on until mapping according to the Nth dimension is complete.

The intermediate database is a relational database, a NoSQL database, or a big-data storage database.

If the check finds that no information tag has been written into the target migration data, it is determined whether the intermediate database holds target migration data with a higher priority. If not, the target migration data is migrated using a general migration strategy. If so, a prompt to write the information tag is issued, and once the information tag has been written, the target migration data is migrated as the next data migration object.

The migration strategy comprises at least one of the following: incremental migration, full migration, and time-based migration.

The migration framework is built on the Spring Boot rapid development framework.

The method further comprises: verifying the target migration data at the migration end point to determine its integrity; if it is incomplete, obtaining the data features of the incomplete target migration data, adjusting the parameters of the migration framework based on those data features, and verifying the accuracy of the adjusted migration framework until the verification result meets the data migration requirements.

The method further comprises: recording the entire process of preprocessing, mapping, and migration to generate a data migration log, and storing the data migration log;

the data migration log is used to trace the provenance of the migrated data.

Embodiment 2:

The present invention further proposes a system 200 for heterogeneous data migration which, as shown in FIG. 2, comprises:

an initialization unit 201, configured to establish a target database and a target data record structure for the heterogeneous data to be migrated, take the target database as the migration end point of the heterogeneous data to be migrated, and take the target data record structure as the migration start point of the heterogeneous data to be migrated;

a preprocessing unit 202, configured to preprocess the heterogeneous data to be migrated, map the preprocessed heterogeneous data to the migration start point, generate, at the migration start point and from the mapped heterogeneous data, target migration data in a unified data format, and store the target migration data in an intermediate database;

a tag unit 203, configured to generate, for the target migration data, a migration strategy and a priority according to preset data migration requirements, and write the migration strategy and the priority into the target migration data as an information tag;

a migration unit 204, configured to retrieve, based on a preset migration framework, the target migration data stored in the intermediate database and check whether an information tag has been written into it; if so, to read the information tag to obtain the migration strategy and priority of the target migration data, and to migrate the target migration data to the migration end point through a preset network channel according to that migration strategy and priority.
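The way the four units hand data to one another can be sketched as a simple composition. This is only an illustration of the data flow between units 201 to 204: the `MigrationSystem` class, the callable signatures, and the shared context dictionary are assumptions, not the structure shown in FIG. 2.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class MigrationSystem:
    init_unit: Callable[[], dict]                  # unit 201: sets up end point and start point
    preprocess_unit: Callable[[list, dict], list]  # unit 202: clean, map, stage
    tag_unit: Callable[[list], list]               # unit 203: write strategy and priority tags
    migrate_unit: Callable[[list, dict], int]      # unit 204: send to the end point

    def run(self, raw: list) -> int:
        """Run the units in order and return the number of records migrated."""
        ctx = self.init_unit()
        staged = self.preprocess_unit(raw, ctx)
        tagged = self.tag_unit(staged)
        return self.migrate_unit(tagged, ctx)
```

Keeping each unit behind a plain callable makes it possible to swap, for example, the intermediate database or the network channel without touching the other units.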

Here, preprocessing the heterogeneous data to be migrated comprises:

performing deduplication and denoising on the heterogeneous data to be migrated;

checking the deduplicated and denoised heterogeneous data for missing data and, if any data is missing, completing the heterogeneous data to be migrated;

performing authenticity verification on the heterogeneous data after the missing-data check, so as to remove false data;

normalizing the heterogeneous data that remains after the false data has been removed.

The target data record structure is a multi-dimensional data record structure;

the multiple dimensions comprise a first dimension, a second dimension, and so on up to an Nth dimension.

Mapping the preprocessed heterogeneous data to be migrated to the migration start point comprises:

mapping the preprocessed heterogeneous data to the migration start point according to the first dimension, then according to the second dimension, and so on until mapping according to the Nth dimension is complete.

The intermediate database is a relational database, a NoSQL database, or a big-data storage database.

If the check finds that no information tag has been written into the target migration data, it is determined whether the intermediate database holds target migration data with a higher priority. If not, the target migration data is migrated using a general migration strategy. If so, a prompt to write the information tag is issued, and once the information tag has been written, the target migration data is migrated as the next data migration object.

The migration strategy comprises at least one of the following: incremental migration, full migration, and time-based migration.

The migration framework is built on the Spring Boot rapid development framework.

The migration unit 204 is further configured to: verify the target migration data at the migration end point to determine its integrity; if it is incomplete, obtain the data features of the incomplete target migration data, adjust the parameters of the migration framework based on those data features, and verify the accuracy of the adjusted migration framework until the verification result meets the data migration requirements.

The migration unit 204 is further configured to: record the entire process of preprocessing, mapping, and migration to generate a data migration log, and store the data migration log;

the data migration log is used to trace the provenance of the migrated data.

The present invention migrates the target data effectively, and the migration process is simple and efficient.

实施例3:Embodiment 3:

基于同一种发明构思,本发明还提供了一种计算机设备,该计算机设备包括处理器以及存储器,所述存储器用于存储计算机程序,所述计算机程序包括程序指令,所述处理器用于执行所述计算机存储介质存储的程序指令。处理器可能是中央处理单元(CentralProcessing Unit,CPU),还可以是其他通用处理器、数字信号处理器(Digital SignalProcessor、DSP)、专用集成电路(Application SpecificIntegrated Circuit,ASIC)、现成可编程门阵列(Field-Programmable GateArray,FPGA)或者其他可编程逻辑器件、分立门或者晶体管逻辑器件、分立硬件组件等,其是终端的计算核心以及控制核心,其适于实现一条或一条以上指令,具体适于加载并执行计算机存储介质内一条或一条以上指令从而实现相应方法流程或相应功能,以实现上述实施例中方法的步骤。Based on the same inventive concept, the present invention also provides a computer device, which includes a processor and a memory, wherein the memory is used to store a computer program, the computer program includes program instructions, and the processor is used to execute the program instructions stored in the computer storage medium. The processor may be a central processing unit (CPU), or other general-purpose processors, digital signal processors (DSP), application-specific integrated circuits (ASIC), field-programmable gate arrays (FPGA) or other programmable logic devices, discrete gates or transistor logic devices, discrete hardware components, etc. It is the computing core and control core of the terminal, which is suitable for implementing one or more instructions, and is specifically suitable for loading and executing one or more instructions in the computer storage medium to implement the corresponding method flow or corresponding functions, so as to implement the steps of the method in the above embodiment.

Embodiment 4:

Based on the same inventive concept, the present invention also provides a storage medium, specifically a computer-readable storage medium (memory), which is a memory device in a computer device used to store programs and data. The computer-readable storage medium here may include both a built-in storage medium of the computer device and an extended storage medium supported by the computer device. The computer-readable storage medium provides a storage space that stores the operating system of the terminal. The storage space also stores one or more instructions suitable for being loaded and executed by a processor; these instructions may be one or more computer programs (including program code). The computer-readable storage medium here may be a high-speed RAM memory or a non-volatile memory, such as at least one disk memory. The processor can load and execute the one or more instructions stored in the computer-readable storage medium to implement the steps of the method in the above embodiments.

Those skilled in the art will appreciate that embodiments of the present invention may be provided as a method, a system, or a computer program product. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Moreover, the present invention may take the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to disk storage, CD-ROM, and optical storage) containing computer-usable program code. The schemes in the embodiments of the present invention may be implemented in various computer languages, for example the object-oriented programming language Java and the interpreted scripting language JavaScript.

The present invention is described with reference to flowcharts and/or block diagrams of the method, device (system), and computer program product according to embodiments of the present invention. It should be understood that each flow and/or block in the flowcharts and/or block diagrams, and any combination of flows and/or blocks in the flowcharts and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to the processor of a general-purpose computer, a special-purpose computer, an embedded processor, or another programmable data processing device to produce a machine, such that the instructions executed by the processor of the computer or other programmable data processing device produce means for implementing the functions specified in one or more flows of the flowchart and/or one or more blocks of the block diagram.

These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing device to operate in a specific manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means that implement the functions specified in one or more flows of the flowchart and/or one or more blocks of the block diagram.

These computer program instructions may also be loaded onto a computer or other programmable data processing device, so that a series of operational steps is executed on the computer or other programmable device to produce computer-implemented processing, whereby the instructions executed on the computer or other programmable device provide steps for implementing the functions specified in one or more flows of the flowchart and/or one or more blocks of the block diagram.

Although preferred embodiments of the present invention have been described, those skilled in the art may make further changes and modifications to these embodiments once they grasp the basic inventive concept. The appended claims are therefore intended to be interpreted as covering the preferred embodiments and all changes and modifications that fall within the scope of the present invention.

Obviously, those skilled in the art can make various changes and variations to the present invention without departing from its spirit and scope. Thus, if these modifications and variations fall within the scope of the claims of the present invention and their technical equivalents, the present invention is intended to include them as well.

Claims (10)

1. A method for heterogeneous data migration, the method comprising:
establishing a target database and a target data record structure for heterogeneous data to be migrated, taking the target database as the migration end point of the heterogeneous data to be migrated, and taking the target data record structure as the migration starting point of the heterogeneous data to be migrated;
preprocessing the heterogeneous data to be migrated, mapping the preprocessed heterogeneous data to the migration starting point, generating, from the mapped heterogeneous data, target migration data in a unified data format based on the migration starting point, and storing the target migration data in an intermediate database;
for the target migration data, generating a migration strategy and a priority of the target migration data according to a preset data migration requirement, and writing the migration strategy and the priority into the target migration data as an information tag;
and, based on a preset migration framework, retrieving the target migration data stored in the intermediate database, checking whether the target migration data carries an information tag, and if so, reading the information tag to obtain the migration strategy and the priority of the target migration data, and migrating the target migration data to the migration end point through a preset network channel according to the migration strategy and the priority.
2. The method of claim 1, wherein preprocessing the heterogeneous data to be migrated comprises:
deduplicating and denoising the heterogeneous data to be migrated;
checking the deduplicated and denoised heterogeneous data for missing data, and if data is missing, filling in the missing data;
verifying the authenticity of the heterogeneous data after the missing-data check, so as to remove false data;
and normalizing the heterogeneous data after the false data has been removed.
3. The method of claim 1, wherein the target data record structure is a multi-dimensional data record structure;
the multiple dimensions comprise: a first dimension, a second dimension, up to an Nth dimension.
4. The method of claim 1, wherein mapping the preprocessed heterogeneous data to be migrated to the migration starting point comprises:
mapping the preprocessed heterogeneous data to the migration starting point according to the first dimension, then according to the second dimension, and so on until mapping according to the Nth dimension is complete.
5. The method of claim 1, wherein the intermediate database is a relational database, a NoSQL database, or a big data storage database.
6. The method of claim 1, wherein, if the check finds that no information tag has been written to the target migration data, it is determined whether target migration data with a higher priority exists in the intermediate database; if not, the target migration data is migrated using a general migration strategy; if so, a prompt message requesting that the information tag be written is sent, and after the information tag has been written to the target migration data, the target migration data is migrated as the next data migration object.
7. The method of claim 1, wherein the migration strategy comprises at least one of: incremental migration, full migration, and timed migration.
8. A system for heterogeneous data migration, the system comprising:
an initial unit, configured to establish a target database and a target data record structure for heterogeneous data to be migrated, take the target database as the migration end point of the heterogeneous data to be migrated, and take the target data record structure as the migration starting point of the heterogeneous data to be migrated;
a preprocessing unit, configured to preprocess the heterogeneous data to be migrated, map the preprocessed heterogeneous data to the migration starting point, generate, from the mapped heterogeneous data, target migration data in a unified data format based on the migration starting point, and store the target migration data in an intermediate database;
a tag unit, configured to generate, for the target migration data, a migration strategy and a priority of the target migration data according to a preset data migration requirement, and write the migration strategy and the priority into the target migration data as an information tag;
and a migration unit, configured to retrieve the target migration data stored in the intermediate database based on a preset migration framework, check whether the target migration data carries an information tag, and if so, read the information tag to obtain the migration strategy and the priority of the target migration data, and migrate the target migration data to the migration end point through a preset network channel according to the migration strategy and the priority.
9. A computer device, comprising:
one or more processors;
a processor for executing one or more programs;
wherein the method of any one of claims 1-7 is implemented when the one or more programs are executed by the one or more processors.
10. A computer readable storage medium, characterized in that a computer program is stored thereon, which computer program, when executed, implements the method according to any of claims 1-7.
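The tag check and priority-ordered dispatch recited in claims 1 and 6 can be illustrated with a small planning function. This is a hedged sketch under stated assumptions, not the claimed implementation: `InfoTag`, `TargetData`, `GENERAL_STRATEGY`, and `plan_migration` are hypothetical names, a lower `priority` number is assumed to mean more urgent, "higher-priority data exists" is interpreted as "some tagged data is still pending", and the general strategy is assumed to be full migration.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class InfoTag:
    strategy: str  # e.g. "incremental", "full", or "timed"
    priority: int  # assumption: lower number means more urgent


@dataclass
class TargetData:
    key: str
    payload: dict = field(default_factory=dict)
    tag: Optional[InfoTag] = None  # None means no information tag was written


GENERAL_STRATEGY = "full"  # assumed default for untagged data


def plan_migration(items: List[TargetData]) -> Tuple[List[Tuple[str, str]], List[str]]:
    """Return (ordered migration plan, keys that need a tag before migrating).

    The plan is a list of (key, strategy) pairs in the order they would be
    sent to the migration end point.
    """
    tagged = sorted((i for i in items if i.tag is not None),
                    key=lambda i: i.tag.priority)
    untagged = [i for i in items if i.tag is None]

    # Tagged data is migrated first, using the strategy read from its tag.
    plan = [(i.key, i.tag.strategy) for i in tagged]
    needs_tag: List[str] = []
    for item in untagged:
        if tagged:
            # Higher-priority data exists: prompt for a tag instead of migrating.
            needs_tag.append(item.key)
        else:
            # Nothing with higher priority is pending: use the general strategy.
            plan.append((item.key, GENERAL_STRATEGY))
    return plan, needs_tag
```

Items returned in `needs_tag` would be re-queued once their tag has been written, at which point they become the next migration object; that re-queue step is left out of the sketch.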
CN202311828822.4A 2023-12-27 2023-12-27 A method and system for heterogeneous data migration Pending CN117874002A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311828822.4A CN117874002A (en) 2023-12-27 2023-12-27 A method and system for heterogeneous data migration

Publications (1)

Publication Number Publication Date
CN117874002A true CN117874002A (en) 2024-04-12

Family

ID=90584078

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311828822.4A Pending CN117874002A (en) 2023-12-27 2023-12-27 A method and system for heterogeneous data migration

Country Status (1)

Country Link
CN (1) CN117874002A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN118708545A (en) * 2024-08-30 2024-09-27 四川福摩数字科技有限公司 File data migration method and system based on multi-source heterogeneous data
CN118708545B (en) * 2024-08-30 2024-11-15 四川福摩数字科技有限公司 File data migration method and system based on multi-source heterogeneous data

Similar Documents

Publication Publication Date Title
CN111104392B (en) Database migration method and device, electronic equipment and storage medium
WO2022095520A1 (en) Document editing method and device, server, terminal, and storage medium
US20150261511A1 (en) Handling Pointers in Program Code in a System that Supports Multiple Address Spaces
CN111367890A (en) A method, apparatus, computer equipment and readable storage medium for data migration
CN113434734A (en) Method, device, equipment and storage medium for generating file and reading file
CN107992492A (en) A kind of storage method of data block, read method, its device and block chain
CN107247767B (en) Method and device for importing formatted data file into database
CN117874002A (en) A method and system for heterogeneous data migration
CN116166629A (en) A file format conversion method, device, equipment and readable storage medium
US9600562B2 (en) Method and apparatus for asynchronized de-serialization of E-R model in a huge data trunk
CN109542860B (en) Service data management method based on HDFS and terminal equipment
CN110928941A (en) Data fragment extraction method and device
CN110019347B (en) Data processing method and device of block chain and terminal equipment
CN112559444B (en) SQL file migration method, device, storage medium and equipment
CN113176877B (en) Entity class generation method, device and storage medium
CN114968725B (en) Task dependency correction method, device, computer equipment and storage medium
CN117009464A (en) Project management method and device
CN109558549B (en) Method for eliminating CSS style redundancy and related product
CN105701158A (en) File system read-write optimization method and framework
US10509659B1 (en) Input processing logic to produce outputs for downstream systems using configurations
CN105243020A (en) Automatic test method applicable for global distributed real-time database
CN108228735A (en) A kind of data processing method, apparatus and system
KR20240090928A (en) Artificial intelligence-based integration framework
WO2015154683A1 (en) File publication system, file release method, and network server
CN112035161B (en) A small program release verification method and parallel release method

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination