CN116894023A - Data migration method and device, storage medium and electronic device - Google Patents
Data migration method and device, storage medium and electronic device
- Publication number
- CN116894023A CN202310869292.1A
- Authority
- CN
- China
- Prior art keywords
- data set
- target
- instruction
- source
- target data
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/21—Design, administration or maintenance of databases
- G06F16/214—Database migration support
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/10—File systems; File servers
- G06F16/11—File system administration, e.g. details of archiving or snapshots
- G06F16/119—Details of migration of file systems
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/10—File systems; File servers
- G06F16/18—File system types
- G06F16/182—Distributed file systems
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/25—Integrating or interfacing systems involving database management systems
- G06F16/252—Integrating or interfacing systems involving database management systems between a Database Management System and a front-end application
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/28—Databases characterised by their database models, e.g. relational or object models
- G06F16/283—Multi-dimensional databases or data warehouses, e.g. MOLAP or ROLAP
Landscapes
- Engineering & Computer Science (AREA)
- Databases & Information Systems (AREA)
- Theoretical Computer Science (AREA)
- Data Mining & Analysis (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
Embodiments of the present application provide a data migration method and device, a storage medium, and an electronic device. The method includes: receiving a first instruction triggered by touching a first button on a target display interface; in response to the first instruction, calling a target interface for managing a source data set and a target data set; receiving a second instruction triggered by touching a second button on the target display interface; and, in response to the second instruction, migrating target data in the source data set to the target data set. The application addresses the low efficiency and poor versatility of data migration in the related art.
Description
Technical field
Embodiments of the present application relate to the field of computers and, specifically, to a data migration method and device, a storage medium, and an electronic device.
Background
With the development of data technology, and of big data technology in particular, the era of big data has arrived against the backdrop of the explosive growth of data in the Internet age, and big data has become a trend in future technology development. The research firm Gartner defines "big data" as follows: big data is a high-volume, high-growth-rate, and diverse information asset that requires new processing models to deliver stronger decision-making, insight discovery, and process optimization capabilities.
In a big data environment, the growth of a user's business, changes in business relationships, and similar factors can require big data migration, that is, moving data from one big data cluster to another. At present, however, some vendors perform the migration with low-level commands through open-source migration tools such as distcp. The steps are cumbersome and inefficient, and the operation has to be performed by specialists, so the approach suits only a limited group of users and has poor versatility.
No effective solution has yet been proposed for the low efficiency and poor versatility of data migration in the related art.
Summary of the invention
Embodiments of the present application provide a data migration method and device, a storage medium, and an electronic device, to solve at least the low efficiency and poor versatility of data migration in the related art.
According to one embodiment of the present application, a data migration method is provided, including: receiving a first instruction triggered by touching a first button on a target display interface; in response to the first instruction, calling a target interface for managing a source data set and a target data set; receiving a second instruction triggered by touching a second button on the target display interface; and, in response to the second instruction, migrating target data in the source data set to the target data set.
In an optional embodiment, after the data in the source data set is migrated to the target data set, the method further includes: receiving a third instruction triggered by touching a third button on the target display interface; and, in response to the third instruction, displaying on the target display interface information about the data migrated to the target data set.
In an optional embodiment, when HADOOP data is stored in the source data set: calling, in response to the first instruction, the target interface for managing the source data set and the target data set includes calling, in response to the first instruction, a HADOOP interface for managing the source data set and the target data set; and migrating, in response to the second instruction, the data in the source data set to the target data set includes creating, in response to the second instruction, a first migration task, where the first migration task is used to select the target data in the source data set and transfer the target data to the target data set.
In an optional embodiment, when HIVE data is stored in the source data set: calling, in response to the first instruction, the target interface for managing the source data set and the target data set includes calling, in response to the first instruction, HDFS and HIVE interfaces for managing the source data set and the target data set; and migrating, in response to the second instruction, the data in the source data set to the target data set includes exporting, in response to the second instruction, the target data in the source data set to the target data set through the HDFS file system.
In an optional embodiment, exporting, in response to the second instruction, the target data in the source data set to the target data set through the HDFS file system includes performing the following operations in response to the second instruction: creating an export task, where the export task is used to export the target data in the source data set to a source HDFS file system; creating a second migration task, where the second migration task is used to migrate the target data on the source HDFS file system to a target HDFS file system; and creating an import task, where the import task is used to import the target data on the target HDFS file system into the target data set.
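The three HIVE tasks above (export, migrate, import) can be sketched as an ordered plan. This is a minimal illustration under stated assumptions, not the application's implementation: the `Task` type, the `plan_hive_migration` function, the staging path, and the NameNode URIs are hypothetical, and mapping the tasks to Hive's `EXPORT TABLE` / `IMPORT TABLE` statements and to `hadoop distcp` is an assumption that the text does not confirm.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Task:
    name: str     # hypothetical task label
    runs_on: str  # which side the task acts on
    command: str  # statement or command line the task would issue


def plan_hive_migration(table, src_nn, dst_nn, staging="/tmp/hive_export"):
    """Return the export -> migrate -> import plan for one Hive table.

    Assumption: the same staging path is used on both HDFS file systems.
    """
    path = f"{staging}/{table}"
    return [
        # Export task: write table data and metadata to the source HDFS.
        Task("export", "source", f"EXPORT TABLE {table} TO '{path}'"),
        # Second migration task: copy the exported files between clusters.
        Task("migrate", "source->target",
             f"hadoop distcp {src_nn}{path} {dst_nn}{path}"),
        # Import task: load the copied files into the target Hive.
        Task("import", "target", f"IMPORT TABLE {table} FROM '{path}'"),
    ]


if __name__ == "__main__":
    for task in plan_hive_migration("sales", "hdfs://nn1:8020", "hdfs://nn2:8020"):
        print(task.name, "|", task.runs_on, "|", task.command)
```

Keeping the plan as data rather than executing it directly makes it easy to display the pending tasks on an interface before a button press runs them.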
In an optional embodiment, when HBASE data is stored in the source data set: calling, in response to the first instruction, the target interface for managing the source data set and the target data set includes calling, in response to the first instruction, HDFS and HBASE interfaces for managing the source data set and the target data set; and migrating, in response to the second instruction, the data in the source data set to the target data set includes exporting, in response to the second instruction, the target data in the source data set to the target data set through the HDFS file system.
In an optional embodiment, exporting, in response to the second instruction, the target data in the source data set to the target data set through the HDFS file system includes performing the following operations in response to the second instruction: creating an HBASE snapshot task, where the HBASE snapshot task is used to create an HBASE snapshot of the data in the source data set; creating an HBASE snapshot export and migration task, where the task is used to export the HBASE snapshot to a source HDFS file system and to migrate the HBASE snapshot on the source HDFS file system to a target HDFS file system; creating a table task, where the table task is used to create an HBASE table on the target data set according to the table attributes in the source data set; and creating a restore snapshot task, where the restore snapshot task is used to restore the HBASE snapshot on the target HDFS file system into the HBASE table created on the target data set.
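The four HBASE tasks above can likewise be laid out as an ordered plan. The sketch below is illustrative only: `plan_hbase_migration` and its parameters are hypothetical names, and the command strings assume the standard HBase shell commands (`snapshot`, `create`, `disable`, `restore_snapshot`, `enable`) and the `ExportSnapshot` tool. The text does not say which commands the implementation uses, and the sketch collapses "export to source HDFS, then migrate to target HDFS" into a single `ExportSnapshot` call that writes to the target root directory.

```python
def plan_hbase_migration(table, snapshot, column_families, dst_root, mappers=4):
    """Return (task name, commands) pairs in the order described above."""
    families = ", ".join(f"'{cf}'" for cf in column_families)
    return [
        # 1. Snapshot task: snapshot the source table.
        ("create_snapshot", [f"snapshot '{table}', '{snapshot}'"]),
        # 2. Export and migration task: copy the snapshot to the target HDFS.
        ("export_and_migrate", [
            "hbase org.apache.hadoop.hbase.snapshot.ExportSnapshot "
            f"-snapshot {snapshot} -copy-to {dst_root} -mappers {mappers}"
        ]),
        # 3. Table task: recreate the table from the source table attributes
        #    (only column families are modelled here, as a simplification).
        ("create_table", [f"create '{table}', {families}"]),
        # 4. Restore task: restore the snapshot into the new table; the HBase
        #    shell requires the table to be disabled during the restore.
        ("restore_snapshot", [
            f"disable '{table}'",
            f"restore_snapshot '{snapshot}'",
            f"enable '{table}'",
        ]),
    ]


if __name__ == "__main__":
    plan = plan_hbase_migration("orders", "orders_snap", ["cf1", "cf2"],
                                "hdfs://nn2:8020/hbase")
    for name, commands in plan:
        print(name, commands)
```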
According to another embodiment of the present application, a data migration device is provided, including: a first receiving module, configured to receive a first instruction triggered by touching a first button on a target display interface; a calling module, configured to call, in response to the first instruction, a target interface for managing a source data set and a target data set; a second receiving module, configured to receive a second instruction triggered by touching a second button on the target display interface; and a migration module, configured to migrate, in response to the second instruction, target data in the source data set to the target data set.
In an optional embodiment, the device further includes a third receiving module, configured to receive, after the data in the source data set is migrated to the target data set, a third instruction triggered by touching a third button on the target display interface, and to display on the target display interface, in response to the third instruction, information about the data migrated to the target data set.
In an optional embodiment, when HADOOP data is stored in the source data set: the calling module includes a first calling unit, configured to call, in response to the first instruction, a HADOOP interface for managing the source data set and the target data set; and the migration module includes a creation unit, configured to create, in response to the second instruction, a first migration task, where the first migration task is used to select the target data in the source data set and transfer the target data to the target data set.
In an optional embodiment, when HIVE data is stored in the source data set: the calling module includes a second calling unit, configured to call, in response to the first instruction, HDFS and HIVE interfaces for managing the source data set and the target data set; and the migration module includes a first export unit, configured to export, in response to the second instruction, the target data in the source data set to the target data set through the HDFS file system.
In an optional embodiment, the first export unit includes a first execution subunit, configured to perform the following operations in response to the second instruction: creating an export task, where the export task is used to export the target data in the source data set to a source HDFS file system; creating a second migration task, where the second migration task is used to migrate the target data on the source HDFS file system to a target HDFS file system; and creating an import task, where the import task is used to import the target data on the target HDFS file system into the target data set.
In an optional embodiment, when HBASE data is stored in the source data set: the calling module includes a third calling unit, configured to call, in response to the first instruction, HDFS and HBASE interfaces for managing the source data set and the target data set; and the migration module includes a second export unit, configured to export, in response to the second instruction, the target data in the source data set to the target data set through the HDFS file system.
In an optional embodiment, the second export unit includes a second execution subunit, configured to perform the following operations in response to the second instruction: creating an HBASE snapshot task, where the HBASE snapshot task is used to create an HBASE snapshot of the data in the source data set; creating an HBASE snapshot export and migration task, where the task is used to export the HBASE snapshot to a source HDFS file system and to migrate the HBASE snapshot on the source HDFS file system to a target HDFS file system; creating a table task, where the table task is used to create an HBASE table on the target data set according to the table attributes in the source data set; and creating a restore snapshot task, where the restore snapshot task is used to restore the HBASE snapshot on the target HDFS file system into the HBASE table created on the target data set.
According to yet another embodiment of the present application, a computer-readable storage medium is also provided. A computer program is stored in the computer-readable storage medium, and the computer program is configured to perform, when run, the steps of any of the above method embodiments.
According to yet another embodiment of the present application, an electronic device is also provided, including a memory and a processor. A computer program is stored in the memory, and the processor is configured to run the computer program to perform the steps of any of the above method embodiments.
With the present application, a first instruction triggered by touching a first button on the target display interface is received; in response to the first instruction, a target interface for managing the source data set and the target data set is called; a second instruction triggered by touching a second button on the target display interface is received; and, in response to the second instruction, target data in the source data set is migrated to the target data set. In other words, the target display interface shows buttons for performing the data migration operations, and each button is linked in advance to its corresponding function. When data needs to be migrated, touching the relevant buttons is enough to carry out the migration. The operation is simpler and more convenient for the user, and data can be migrated without writing specialized low-level commands. This widens the range of users who can perform data migration, improves its versatility, improves migration efficiency to some extent, and effectively addresses the low efficiency and poor versatility of data migration in the related art.
Description of the drawings
Figure 1 is a block diagram of the hardware structure of a mobile terminal for the data migration method according to an embodiment of the present application;
Figure 2 is a flow chart of the data migration method according to an embodiment of the present application;
Figure 3 is a schematic diagram of the HBASE data migration process according to an embodiment of the present application;
Figure 4 is a structural block diagram of the data migration device according to an embodiment of the present application.
Detailed description
The embodiments of the present application are described in detail below with reference to the accompanying drawings.
Note that the terms "first", "second", and so on in the description, the claims, and the above drawings of this application are used to distinguish similar objects and do not necessarily describe a specific order or sequence.
The terms used in the embodiments of the present invention are explained first:
HDFS: short for Hadoop Distributed File System, a distributed file system designed to run on commodity hardware. It has much in common with existing distributed file systems, but the differences are also significant. HDFS is highly fault-tolerant and is designed to be deployed on low-cost hardware. It provides high-throughput access to application data and suits applications with very large data sets. HDFS relaxes some POSIX requirements to enable streaming access to file system data. HDFS was originally built as infrastructure for the Apache Nutch search engine project and is part of the Apache Hadoop Core project.
HIVE: HIVE (also written Hive) is a Hadoop-based data warehouse tool used for data extraction, transformation, and loading; it provides a mechanism for storing, querying, and analyzing large-scale data stored in Hadoop. Hive maps structured data files to database tables, provides SQL query capability, and converts SQL statements into MapReduce tasks for execution. Its advantage is a low learning cost: quick MapReduce statistics can be produced with SQL-like statements, without developing a dedicated MapReduce application. Hive is well suited to statistical analysis in a data warehouse.
HBASE: HBASE (also written HBase) is a distributed, column-oriented open-source database. The technology derives from the paper "Bigtable: A Distributed Storage System for Structured Data" by Fay Chang et al. Just as Bigtable builds on the distributed data storage provided by a file system, HBase provides Bigtable-like capabilities on top of Hadoop. HBase is a subproject of Apache's Hadoop project. Unlike a typical relational database, HBase is suited to storing unstructured data, and it uses a column-based rather than a row-based model.
Distcp: Distcp (Distributed Copy, also written DistCp) is a tool for copying data between or within large clusters. Its effect is comparable to running cp or scp on Linux. The difference is that cp copies files and directories to another location on the same machine and scp copies files or directories from machine A to machine B, whereas Distcp copies data from HDFS cluster A to HDFS cluster B. Because the copy is distributed, the DataNodes (DN) of cluster A can send data to the DataNodes of cluster B in parallel, which overcomes the network card rate limit of a single-machine copy and makes copying more efficient. Distcp uses Map/Reduce tasks for file distribution, error handling and recovery, and report generation. It takes a list of files and directories as the input to map tasks, and each task copies a portion of the files in the source list. (In practice Distcp uses only map tasks and no reduce: each map task reads data from the old cluster and writes it to the new cluster, which completes the data migration.)
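As an illustration of the kind of low-level command that the interface is meant to hide from the user, the sketch below assembles a `hadoop distcp` command line. The function name `build_distcp_command` and the NameNode URIs are hypothetical; `-m` (maximum number of map tasks) and `-update` are standard DistCp options. The function only builds the argument list and does not run anything.

```python
def build_distcp_command(src_uri, dst_uri, max_maps=None, update=False):
    """Build the argv list for an inter-cluster DistCp copy."""
    argv = ["hadoop", "distcp"]
    if update:
        # Copy only files that are missing or differ at the destination.
        argv.append("-update")
    if max_maps is not None:
        # Upper bound on the number of simultaneous map tasks (copiers).
        argv += ["-m", str(max_maps)]
    argv += [src_uri, dst_uri]
    return argv


if __name__ == "__main__":
    cmd = build_distcp_command("hdfs://nn1:8020/data", "hdfs://nn2:8020/data",
                               max_maps=20)
    print(" ".join(cmd))
```

Returning a list instead of a single string lets a caller pass it to `subprocess.run` without shell quoting concerns.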
Snapshot: snapshot technology is the core backup and recovery tool of the HBase database. Snapshots are widely used for backup and were adopted early in storage arrays and hosts. They mainly use a copy-on-write algorithm and are usually volume-based, operating at the block level.
How the present invention solves the problems in the related art is described below with reference to the embodiments:
The method embodiments provided in this application can be executed on a mobile terminal, a computer terminal, or a similar computing device. Taking a mobile terminal as an example, Figure 1 is a block diagram of the hardware structure of a mobile terminal for the data migration method according to an embodiment of the present application. As shown in Figure 1, the mobile terminal may include one or more processors 102 (only one is shown in Figure 1; the processor 102 may include, but is not limited to, a processing device such as a microprocessor (MCU) or a programmable logic device (FPGA)) and a memory 104 for storing data. The mobile terminal may also include a transmission device 106 for communication functions and an input/output device 108. A person of ordinary skill in the art will understand that the structure shown in Figure 1 is only illustrative and does not limit the structure of the mobile terminal. For example, the mobile terminal may include more or fewer components than shown in Figure 1, or have a configuration different from that shown in Figure 1.
The memory 104 can be used to store computer programs, for example, software programs and modules of application software, such as the computer program corresponding to the data migration method in the embodiments of the present application. The processor 102 runs the computer program stored in the memory 104 to perform various functional applications and data processing, that is, to implement the above method. The memory 104 may include high-speed random access memory and may also include non-volatile memory, such as one or more magnetic storage devices, flash memory, or other non-volatile solid-state memory. In some examples, the memory 104 may further include memory located remotely from the processor 102, and such remote memory may be connected to the mobile terminal through a network. Examples of such networks include, but are not limited to, the Internet, intranets, local area networks, mobile communication networks, and combinations thereof.
The transmission device 106 is used to receive or send data over a network. Specific examples of the network include a wireless network provided by the mobile terminal's communication provider. In one example, the transmission device 106 includes a network adapter (Network Interface Controller, NIC), which can connect to other network devices through a base station and thereby communicate with the Internet. In another example, the transmission device 106 may be a radio frequency (RF) module, which communicates with the Internet wirelessly.
This embodiment provides a data migration method that runs on the above terminal. Figure 2 is a flow chart of the data migration method according to an embodiment of the present application. As shown in Figure 2, the process includes the following steps:
Step S202: receive a first instruction triggered by touching a first button on the target display interface;
Step S204: in response to the first instruction, call a target interface for managing the source data set and the target data set;
Step S206: receive a second instruction triggered by touching a second button on the target display interface;
Step S208: in response to the second instruction, migrate the target data in the source data set to the target data set.
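Steps S202 to S208 amount to binding two buttons to two pre-linked handlers. The following sketch is one hypothetical way to model that flow; the class `MigrationUI` and its callback parameters are illustrative names, not part of the application, and the callbacks stand in for whatever cluster interfaces a real system would call.

```python
class MigrationUI:
    """Toy model of the two-button flow in steps S202 to S208."""

    def __init__(self, open_interface, migrate):
        self._open_interface = open_interface  # called for the first button
        self._migrate = migrate                # called for the second button
        self._interface = None

    def on_first_button(self):
        # S202/S204: the first instruction calls the management interface
        # for the source and target data sets.
        self._interface = self._open_interface()

    def on_second_button(self, selection):
        # S206/S208: the second instruction migrates the selected data.
        if self._interface is None:
            raise RuntimeError("press the first button before migrating")
        return self._migrate(self._interface, selection)


if __name__ == "__main__":
    ui = MigrationUI(
        open_interface=lambda: "hdfs-interface",
        migrate=lambda iface, sel: f"migrated {sel} via {iface}",
    )
    ui.on_first_button()
    print(ui.on_second_button(["/data/logs"]))
```

The guard in `on_second_button` reflects the order of the steps: the target interface has to be available before a migration can be started.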
上述的目标显示界面可以是特定终端上的显示界面,或者安装于特定终端上的应用的显示界面,当然,也可以是其他类型的显示界面,只要是可视化的界面即可。上述步骤的执行主题可以是显示有上述目标显示界面的终端,或者,具备控制其他设备显示上述目标显示界面的终端或者控制设备或者服务器等等,或者,还可以是其他的具备类似处理能力的设备或系统。The above-mentioned target display interface can be a display interface on a specific terminal, or a display interface of an application installed on a specific terminal. Of course, it can also be other types of display interfaces, as long as it is a visual interface. The execution subject of the above steps may be a terminal displaying the above target display interface, or a terminal or control device or server that controls other devices to display the above target display interface, or other devices with similar processing capabilities. or system.
In the above embodiment, the target display interface is configured with multiple buttons (also called keys; these are, for example, touch buttons, though physical buttons are also possible, depending on the actual situation), and each button may be configured with a different function. Touching a button (or pressing it, for a physical button) triggers the corresponding function, which is a function related to data migration. Which function a given button corresponds to can be configured as needed. In addition, the position of a touch button on the display interface can be adjusted, so users can arrange the buttons flexibly according to their own habits.
In the above embodiment, different types of data can be migrated through the target display interface. For example, HDFS data, HIVE data, and HBASE data can be migrated across big data clusters, and other types of data can be migrated as well. Note that during migration either all of the data in the source data set or only part of it can be migrated to the target data set; the user selects which data to migrate through the target display interface. There may be one target data set or several, and the number is determined by the amount of data actually migrated and the free storage space of the target data sets. A switch button can also be configured in the target display interface; touching it selects the migration mode that corresponds to the type of data to be migrated. There may be one switch button or several. With a single switch button, the mode switched to may depend on the number of presses, on the pressing force, or on the pressing duration, and the specific switching method can be set and adjusted flexibly according to the user's habits. With multiple switch buttons, touching different switch buttons switches the current mode to other modes.
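For the single-switch-button variant in which the number of presses selects the mode, one possible reading is a simple cycle through the supported data types. This is an illustrative sketch; the mode list, its order, and the function name `mode_after_presses` are assumptions, not part of the application.

```python
from typing import List

# Assumed mode order; the application does not fix one.
MODES: List[str] = ["HDFS", "HIVE", "HBASE"]


def mode_after_presses(presses: int, start: str = "HDFS") -> str:
    """Return the migration mode reached after pressing the switch button
    `presses` times, cycling through MODES from `start`."""
    if presses < 0:
        raise ValueError("presses must be non-negative")
    index = (MODES.index(start) + presses) % len(MODES)
    return MODES[index]
```

Selecting by pressing force or duration would replace the press count with a threshold lookup; the cycling logic would not apply in that case.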
The above embodiment receives a first instruction triggered by touching the first button on the target display interface; calls, triggered by the first instruction, the target interface for managing the source data set and the target data set; receives a second instruction triggered by touching the second button on the target display interface; and, triggered by the second instruction, migrates the target data in the source data set to the target data set. In other words, the target display interface displays buttons for performing the relevant data migration operations, and each button is linked in advance to its corresponding function. When data needs to be migrated, touching the relevant buttons is therefore sufficient. For the user, the operation is simpler and more convenient, and data can be migrated without writing specialized low-level commands. This broadens the range of users who can perform data migration, improves its generality, improves its efficiency to a certain extent, and effectively addresses the low efficiency and poor generality of data migration in the related art.
In an optional embodiment, after the data in the source data set has been migrated to the target data set, the method further includes: receiving a third instruction triggered by touching a third button on the target display interface; and, triggered by the third instruction, displaying on the target display interface information about the data migrated to the target data set. In this embodiment, the target display interface can also be used to verify the migration, that is, to check whether the data migration has completed. After the third button is touched, a specific area of the target display interface can display information about the data whose migration has completed, for example a list of the migrated data or the folder in which the data is located. The display operation can also be performed in real time: rather than being triggered by the third instruction, the corresponding migration progress or the information about the migrated data can be shown in the specific area of the target display interface as soon as a migration operation is detected. This embodiment makes it possible to verify the integrity of a migration and lets users see directly how the migration is proceeding and whether it has finished, which reduces unnecessary waiting time.
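The information shown after the third button is touched could be derived by comparing what was selected in the source with what is present in the target. The sketch below is an assumption about how such a report might be computed; the function name `migration_report` and the dictionary keys are hypothetical.

```python
from typing import Dict, Iterable, List, Union


def migration_report(
    source_items: Iterable[str], target_items: Iterable[str]
) -> Dict[str, Union[List[str], bool, float]]:
    """Summarise which selected source items already exist in the target."""
    source_set, target_set = set(source_items), set(target_items)
    migrated = sorted(source_set & target_set)
    missing = sorted(source_set - target_set)
    # An empty selection counts as fully migrated.
    progress = len(migrated) / len(source_set) if source_set else 1.0
    return {
        "migrated": migrated,
        "missing": missing,
        "complete": not missing,
        "progress": progress,
    }
```

The `progress` value could drive the real-time display, while `migrated` corresponds to the list of data whose migration has completed.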
As the description of the foregoing embodiments shows, embodiments of the present invention can migrate multiple types of data. The migration process for each type is described case by case below:
Case 1: the source data set stores HADOOP data
Calling, triggered by the first instruction, the target interface for managing the source data set and the target data set includes: calling, triggered by the first instruction, a HADOOP interface for managing the source data set (in this case, the source data set includes a source HDFS file system) and the target data set (in this case, the target data set includes a target HDFS file system, also called a destination HDFS file system). Migrating, triggered by the second instruction, the data in the source data set to the target data set includes: triggered by the second instruction, creating a first migration task, where the first migration task selects the target data in the source data set and transfers the target data to the target data set. In addition, after the migration is complete, the integrity of the data can be verified through the visual UI (that is, the target display interface), for example by checking the total number of files.
In Case 1, the migration can be performed with the open source migration tool DistCp. DistCp is a tool for copying data within and between large clusters. It uses MapReduce to implement file distribution, error handling and recovery, and report generation. It takes a list of files and directories as input to map tasks, and each task copies a portion of the files in the source list. Because it uses MapReduce, the tool has some particularities in its semantics and execution.
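A first migration task of this kind could amount to assembling a DistCp command line behind the second button. The following sketch only builds the argument list and does not run it; the function name and the example NameNode addresses are hypothetical, while `-update` and `-m` are standard DistCp options.

```python
from typing import List, Optional


def build_distcp_command(
    source_uri: str,
    target_uri: str,
    max_maps: Optional[int] = None,
    update: bool = False,
) -> List[str]:
    """Build a `hadoop distcp` argument list for an inter-cluster copy."""
    cmd = ["hadoop", "distcp"]
    if update:
        cmd.append("-update")           # copy only missing or changed files
    if max_maps is not None:
        cmd += ["-m", str(max_maps)]    # cap the number of simultaneous map tasks
    cmd += [source_uri, target_uri]
    return cmd


# Example with placeholder cluster addresses (assumptions):
example = build_distcp_command(
    "hdfs://nn1:8020/data/logs", "hdfs://nn2:8020/data/logs", max_maps=20
)
```

Returning a list rather than a shell string keeps the command safe to pass to `subprocess.run` without shell quoting, should a caller choose to execute it.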
Case 2: the source data set stores HIVE data
Calling, triggered by the first instruction, the target interface for managing the source data set and the target data set includes: calling, triggered by the first instruction, HDFS and HIVE interfaces for managing the source data set (in this case, the source data set includes a source HIVE data warehouse) and the target data set (in this case, the target data set includes a target HIVE data warehouse). Migrating, triggered by the second instruction, the data in the source data set to the target data set includes: triggered by the second instruction, exporting the target data in the source data set to the target data set through the HDFS file system.
In an optional embodiment, in Case 2, exporting the target data in the source data set to the target data set through the HDFS file system, triggered by the second instruction, includes performing the following operations when triggered by the second instruction: creating an export task, where the export task exports the target data in the source data set to the source HDFS file system; creating a second migration task, where the second migration task migrates the target data on the source HDFS file system to the target HDFS file system; and creating an import task, where the import task imports the target data on the target HDFS file system into the target data set.
In Case 2, HIVE data can be migrated with HIVE's own import and export functions, which import and export both the table structure and the data of a HIVE table. In addition, after the migration is complete, the integrity of the data can be verified through the visual UI (that is, the target display interface), for example by checking the total number of table rows.
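The export task, second migration task, and import task can be sketched as an ordered plan of three commands. This is a hedged illustration: the function name, the staging directory, and the cluster addresses are assumptions, and the plan is only constructed, not executed. `EXPORT TABLE ... TO` and `IMPORT TABLE ... FROM` are Hive statements, and `hive -e` runs a statement from the command line.

```python
from typing import List, Tuple

Step = Tuple[str, List[str]]  # (cluster on which to run, argument list)


def plan_hive_migration(
    table: str,
    source_nn: str,
    target_nn: str,
    staging_dir: str = "/tmp/hive_export",  # hypothetical staging location
) -> List[Step]:
    """Return the export, copy, and import steps for one Hive table."""
    path = f"{staging_dir}/{table}"
    return [
        # Export task: write table structure and data to the source HDFS.
        ("source", ["hive", "-e", f"EXPORT TABLE {table} TO '{path}'"]),
        # Second migration task: copy the exported files between clusters.
        ("source", ["hadoop", "distcp", f"{source_nn}{path}", f"{target_nn}{path}"]),
        # Import task: load the files into the target Hive warehouse.
        ("target", ["hive", "-e", f"IMPORT TABLE {table} FROM '{path}'"]),
    ]
```

Keeping the plan as data makes it straightforward for a display interface to show each task's status as it runs.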
Case 3: the source data set stores HBASE (also written Hbase) data
Calling, triggered by the first instruction, the target interface for managing the source data set and the target data set includes: calling, triggered by the first instruction, HDFS and HBASE interfaces for managing the source data set and the target data set. Migrating, triggered by the second instruction, the data in the source data set to the target data set includes: triggered by the second instruction, exporting the target data in the source data set to the target data set through the HDFS file system.
In an optional embodiment, in Case 3, exporting the target data in the source data set to the target data set through the HDFS file system, triggered by the second instruction, includes performing the following operations when triggered by the second instruction: creating an HBASE snapshot task, where the HBASE snapshot task creates an HBASE snapshot of the data in the source data set; creating an HBASE snapshot export and migration task, where this task exports the HBASE snapshot to the source HDFS file system and migrates the HBASE snapshot on the source HDFS file system to the target HDFS file system; creating a table task, where the table task creates an HBASE table on the target data set according to the table attributes in the source data set; and creating a restore snapshot task, where the restore snapshot task restores the HBASE snapshot on the target HDFS file system into the HBASE table created on the target data set.
In Case 3, HBASE data can be migrated with HBASE's core snapshot, ExportSnapshot, and restore_snapshot functions. For example, the HBASE snapshot can be exported to the source HDFS file system with ExportSnapshot, and the HBASE snapshot on the target HDFS file system can be restored into the HBASE table created on the target data set with restore_snapshot. In addition, after the migration is complete, the integrity of the data can be verified through the visual UI (that is, the target display interface), for example by checking the total number of table rows. The migration process for HBASE data is shown in Figure 3.
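The four HBASE tasks can be sketched as an ordered list of statements. This is an illustration under stated assumptions: the function name, the column family parameter, and the target root path are hypothetical, and the plan is only built, not run. Two points differ from a literal reading of the text and are assumptions of this sketch: `ExportSnapshot -copy-to` is used to write directly to the target cluster, and the table is disabled around `restore_snapshot` because the HBase shell requires a disabled table for a restore.

```python
from typing import List, Sequence, Tuple

Step = Tuple[str, str]  # (where the statement runs, statement text)


def plan_hbase_migration(
    table: str,
    snapshot: str,
    target_hbase_root: str,
    column_families: Sequence[str],
) -> List[Step]:
    """Return snapshot, export, create-table, and restore steps for one table."""
    families = ", ".join(f"'{cf}'" for cf in column_families)
    export = (
        "hbase org.apache.hadoop.hbase.snapshot.ExportSnapshot "
        f"-snapshot {snapshot} -copy-to {target_hbase_root}"
    )
    return [
        ("source hbase shell", f"snapshot '{table}', '{snapshot}'"),
        ("source command line", export),
        # Table task: recreate the table from the source table attributes.
        ("target hbase shell", f"create '{table}', {families}"),
        ("target hbase shell", f"disable '{table}'"),
        ("target hbase shell", f"restore_snapshot '{snapshot}'"),
        ("target hbase shell", f"enable '{table}'"),
    ]
```

A row count on both clusters after the last step would correspond to the integrity check described above.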
Through the cases described in the above embodiments, visual migration of each type of data can be achieved effectively, which meets users' needs, improves migration efficiency, and simplifies the migration steps.
In addition, the data migration cases described above can be carried out both in secure mode (Kerberos) and in non-secure mode (non-Kerberos). In secure mode, the migration operations require security verification; for example, calls to the management interface must be verified, and the subsequent data migration operations are performed only after verification passes.
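The secure-mode rule, that migration proceeds only after verification passes, can be sketched as a gate in front of the task list. The names `run_with_security`, `secure_mode`, and `verify` are hypothetical; in a real Kerberos deployment `verify` might wrap obtaining a valid ticket, but that detail is not specified in the text.

```python
from typing import Callable, List, Sequence, TypeVar

T = TypeVar("T")


def run_with_security(
    tasks: Sequence[Callable[[], T]],
    secure_mode: bool,
    verify: Callable[[], bool],
) -> List[T]:
    """Run migration tasks in order; in secure mode, verify first."""
    if secure_mode and not verify():
        # No task runs if security verification fails.
        raise PermissionError("security verification failed; migration not started")
    return [task() for task in tasks]
```

In non-secure mode the `verify` callable is never invoked, so the same task list can be reused unchanged across both modes.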
From the description of the above implementations, those skilled in the art can clearly understand that the methods of the above embodiments can be implemented by software together with the necessary general-purpose hardware platform, or by hardware, although in many cases the former is the better implementation. Based on this understanding, the technical solution of the present application, in essence or in the part that contributes to the prior art, can be embodied in the form of a software product. The computer software product is stored in a storage medium (such as ROM/RAM, a magnetic disk, or an optical disc) and includes several instructions that cause a terminal device (which may be a mobile phone, a computer, a server, a network device, or the like) to execute the methods described in the embodiments of the present application.
This embodiment also provides a data migration device, which is used to implement the above embodiments and preferred implementations; what has already been described is not repeated. As used below, the term "module" may refer to a combination of software and/or hardware that implements a predetermined function. Although the device described in the following embodiments is preferably implemented in software, implementation in hardware, or in a combination of software and hardware, is also possible and contemplated.
Figure 4 is a structural block diagram of a data migration device according to an embodiment of the present application. As shown in Figure 4, the device includes:
a first receiving module 42, configured to receive a first instruction triggered by touching a first button on a target display interface; a calling module 44, configured to call, triggered by the first instruction, a target interface for managing a source data set and a target data set; a second receiving module 46, configured to receive a second instruction triggered by touching a second button on the target display interface; and a migration module 48, configured to migrate, triggered by the second instruction, target data in the source data set to the target data set.
In an optional embodiment, the device further includes a third receiving module, configured to receive, after the data in the source data set has been migrated to the target data set, a third instruction triggered by touching a third button on the target display interface, and to display on the target display interface, triggered by the third instruction, information about the data migrated to the target data set.
In an optional embodiment, when the source data set stores HADOOP data: the calling module 44 includes a first calling unit, configured to call, triggered by the first instruction, a HADOOP interface for managing the source data set and the target data set; and the migration module 48 includes a creation unit, configured to create, triggered by the second instruction, a first migration task, where the first migration task selects the target data in the source data set and transfers the target data to the target data set.
In an optional embodiment, when the source data set stores HIVE data: the calling module 44 includes a second calling unit, configured to call, triggered by the first instruction, HDFS and HIVE interfaces for managing the source data set and the target data set; and the migration module 48 includes a first export unit, configured to export, triggered by the second instruction, the target data in the source data set to the target data set through the HDFS file system.
In an optional embodiment, the first export unit includes a first execution subunit, configured to perform the following operations when triggered by the second instruction: creating an export task, where the export task exports the target data in the source data set to the source HDFS file system; creating a second migration task, where the second migration task migrates the target data on the source HDFS file system to the target HDFS file system; and creating an import task, where the import task imports the target data on the target HDFS file system into the target data set.
In an optional embodiment, when the source data set stores HBASE data: the calling module 44 includes a third calling unit, configured to call, triggered by the first instruction, HDFS and HBASE interfaces for managing the source data set and the target data set; and the migration module 48 includes a second export unit, configured to export, triggered by the second instruction, the target data in the source data set to the target data set through the HDFS file system.
In an optional embodiment, the second export unit includes a second execution subunit, configured to perform the following operations when triggered by the second instruction: creating an HBASE snapshot task, where the HBASE snapshot task creates an HBASE snapshot of the data in the source data set; creating an HBASE snapshot export and migration task, where this task exports the HBASE snapshot to the source HDFS file system and migrates the HBASE snapshot on the source HDFS file system to the target HDFS file system; creating a table task, where the table task creates an HBASE table on the target data set according to the table attributes in the source data set; and creating a restore snapshot task, where the restore snapshot task restores the HBASE snapshot on the target HDFS file system into the HBASE table created on the target data set.
Note that each of the above modules can be implemented in software or hardware. In the latter case, this can be done in the following ways, among others: the modules are all located in the same processor; or the modules are distributed, in any combination, across different processors.
Embodiments of the present application also provide a computer-readable storage medium that stores a computer program, where the computer program is configured to perform, when run, the steps of any of the above method embodiments.
In this embodiment, the computer-readable storage medium may be configured to store a computer program for performing the following steps:
S1: triggered by the first instruction, call the target interface for managing the source data set and the target data set;
S2: receive the second instruction triggered by touching the second button on the target display interface;
S3: triggered by the second instruction, migrate the target data in the source data set to the target data set.
In an exemplary embodiment, the computer-readable storage medium may include, but is not limited to, a USB flash drive, read-only memory (ROM), random access memory (RAM), a removable hard disk, a magnetic disk, an optical disc, or any other medium that can store a computer program.
Embodiments of the present application also provide an electronic device that includes a memory and a processor. The memory stores a computer program, and the processor is configured to run the computer program to perform the steps of any of the above method embodiments.
In an exemplary embodiment, the processor may be configured to perform the following steps through the computer program:
S1: triggered by the first instruction, call the target interface for managing the source data set and the target data set;
S2: receive the second instruction triggered by touching the second button on the target display interface;
S3: triggered by the second instruction, migrate the target data in the source data set to the target data set.
In an exemplary embodiment, the electronic device may further include a transmission device and an input/output device, each of which is connected to the processor.
For specific examples of this embodiment, refer to the examples described in the above embodiments and exemplary implementations; they are not repeated here.
Those skilled in the art will understand that the modules and steps of the present application described above can be implemented with general-purpose computing devices. They can be concentrated on a single computing device or distributed across a network of multiple computing devices. They can be implemented as program code executable by a computing device, so that they can be stored in a storage device and executed by the computing device. In some cases the steps shown or described can be performed in an order different from the one given here, or the modules and steps can each be made into individual integrated circuit modules, or several of them can be made into a single integrated circuit module. The present application is therefore not limited to any specific combination of hardware and software.
The above are only preferred embodiments of the present application and are not intended to limit it. For those skilled in the art, the present application may be modified and varied in many ways. Any modification, equivalent replacement, or improvement made within the principles of the present application shall fall within its scope of protection.
Claims (10)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310869292.1A CN116894023A (en) | 2023-07-14 | 2023-07-14 | Data migration method and device, storage medium and electronic device |
Publications (1)
Publication Number | Publication Date |
---|---|
CN116894023A (en) | 2023-10-17 |
Family
ID=88314580
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310869292.1A Pending CN116894023A (en) | 2023-07-14 | 2023-07-14 | Data migration method and device, storage medium and electronic device |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN116894023A (en) |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108628874A (en) * | 2017-03-17 | 2018-10-09 | 北京京东尚科信息技术有限公司 | Method, apparatus, electronic equipment and the readable storage medium storing program for executing of migrating data |
CN110209653A (en) * | 2019-06-04 | 2019-09-06 | 中国农业银行股份有限公司 | HBase data migration method and moving apparatus |
CN111367889A (en) * | 2020-03-09 | 2020-07-03 | 中国工商银行股份有限公司 | Cross-cluster data migration method and device based on webpage interface |
CN115658816A (en) * | 2022-11-29 | 2023-01-31 | 贵州易鲸捷信息技术有限公司 | Method for synchronizing HBase data to QianBase MPP in real time |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
TWI740901B (en) | Method and device for performing data recovery operation | |
US9031910B2 (en) | System and method for maintaining a cluster setup | |
US11321291B2 (en) | Persistent version control for data transfer between heterogeneous data stores | |
US20190102087A1 (en) | Remote one-sided persistent writes | |
CN114925084B (en) | Distributed transaction processing method, system, equipment and readable storage medium | |
WO2023197670A1 (en) | Distributed storage system control method and apparatus, and readable storage medium | |
US10628380B2 (en) | Enabling data replication processes between heterogeneous storage systems | |
CN111324610A (en) | Data synchronization method and device | |
CN107656705B (en) | A computer storage medium and a data migration method, device and system | |
US11507277B2 (en) | Key value store using progress verification | |
CN100514331C (en) | Method of converting a filesystem while the filesystem remains in an active state | |
CN106874343B (en) | Data deletion method and system for time sequence database | |
US11044312B2 (en) | Storage segment server covered cache | |
WO2024041433A1 (en) | Data processing method and apparatus | |
WO2025123848A1 (en) | Data import method and apparatus, electronic device, storage medium, and program product | |
WO2025086860A1 (en) | Data table processing method and apparatus, computer device, and readable storage medium | |
CN113094367A (en) | Data processing method and device and server | |
WO2014187216A1 (en) | Method and device for database structure object processing | |
CN107844592A (en) | Method and apparatus for querying metadata | |
CN104572638A (en) | Data reading and writing method and device | |
CN116894023A (en) | Data migration method and device, storage medium and electronic device | |
WO2024078211A1 (en) | Backup method for service cluster instance, recovery method for service cluster instance, and related device | |
CN114138182B (en) | Cross-storage online cloning method, system and device for distributed cloud hard disk | |
EP3082050A1 (en) | Mass data fusion storage method and system | |
CN107632785A (en) | Storage device configuration method, apparatus, and readable storage medium | |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||