CN107679096B - Method and device for sharing indexes among data marts - Google Patents

Method and device for sharing indexes among data marts

Info

Publication number
CN107679096B
Authority
CN
China
Prior art keywords
data
shared
data table
mart
indicator
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201710806219.4A
Other languages
Chinese (zh)
Other versions
CN107679096A (en)
Inventor
孙冬
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Jingdong Century Trading Co Ltd
Beijing Jingdong Shangke Information Technology Co Ltd
Original Assignee
Beijing Jingdong Century Trading Co Ltd
Beijing Jingdong Shangke Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Jingdong Century Trading Co Ltd and Beijing Jingdong Shangke Information Technology Co Ltd
Priority to CN201710806219.4A
Publication of CN107679096A
Application granted
Publication of CN107679096B
Legal status: Active
Anticipated expiration

Classifications

    • G - PHYSICS
    • G06 - COMPUTING OR CALCULATING; COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 - Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 - Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/28 - Databases characterised by their database models, e.g. relational or object models
    • G06F16/283 - Multi-dimensional databases or data warehouses, e.g. MOLAP or ROLAP
    • G - PHYSICS
    • G06 - COMPUTING OR CALCULATING; COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 - Error detection; Error correction; Monitoring
    • G06F11/36 - Prevention of errors by analysis, debugging or testing of software
    • G06F11/3668 - Testing of software
    • G06F11/3672 - Test management
    • G06F11/3684 - Test management for test design, e.g. generating new test cases
    • G - PHYSICS
    • G06 - COMPUTING OR CALCULATING; COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 - Error detection; Error correction; Monitoring
    • G06F11/36 - Prevention of errors by analysis, debugging or testing of software
    • G06F11/3668 - Testing of software
    • G06F11/3672 - Test management
    • G06F11/3688 - Test management for test execution, e.g. scheduling of test suites
    • G - PHYSICS
    • G06 - COMPUTING OR CALCULATING; COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 - Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 - Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/22 - Indexing; Data structures therefor; Storage structures
    • G - PHYSICS
    • G06 - COMPUTING OR CALCULATING; COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 - Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 - Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/27 - Replication, distribution or synchronisation of data between databases or within a distributed database system; Distributed database system architectures therefor
    • G - PHYSICS
    • G06 - COMPUTING OR CALCULATING; COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F2201/00 - Indexing scheme relating to error detection, to error correction, and to monitoring
    • G06F2201/80 - Database-specific techniques

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Computer Hardware Design (AREA)
  • Quality & Reliability (AREA)
  • Computing Systems (AREA)
  • Software Systems (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a method and a device for sharing indicators among data marts, and relates to the field of computers. One specific implementation of the method includes: determining at least one shared indicator data table among the indicator data tables of a plurality of data marts, and creating, from any one of the at least one shared indicator data table, a test case corresponding to that shared indicator data table, where the shared indicator data table contains indicator data to be shared among the plurality of data marts; copying the shared indicator data table and the test case to a shared mart, and using the test case to test the corresponding shared indicator data table in the shared mart; and copying any shared indicator data table that passes the test to every data mart that does not already store it. This implementation allows indicator data to be shared among the data marts, so that the definition of each indicator and the mathematical model it uses are consistent across the data marts.

Figure 201710806219

Description

Method and device for sharing indicators among data marts

Technical field

The present invention relates to the field of computers, and in particular to a method and device for sharing indicators among data marts.

Background

As an enterprise's business grows, it generates a large volume of business data. Enterprises typically build a data warehouse to collect this data and store it by subject. Each department then usually applies its own processing logic to the warehouse data it cares about and pushes the result into the department's own data mart.

In practice, whether an enterprise's data marts can be built successfully depends largely on whether the data marts share a stable, consistent, and comprehensive indicator model. An indicator model comprises the exact definitions of indicators such as order volume, order amount, and inventory level, together with the data model each indicator relies on. Establishing such a model across the data marts keeps their indicator data consistent and thereby supports subsequent analysis applications and data products.

In the prior art, after the data warehouse extracts raw data from the business database, it builds the corresponding data tables and synchronizes them to the relevant data marts for analysts to use. Within any given data mart, analysts define their own indicator model and use it inside that data mart.

In the course of implementing the present invention, the inventor found that the prior art has at least the following problems:

1. The same indicator is defined differently in different data marts, so the corresponding indicator data is easily misunderstood when used across data marts. For example, the order volume indicator of the marketing mart is the total number of orders minus the number of invalid orders, whereas the order volume indicator of the operations mart is the number of orders for which a waybill was generated.

2. The same indicator relies on different data models in different data marts, so the corresponding indicator data cannot be reconciled when used across data marts. For example, the order amount indicator of the marketing mart uses discount data table LA, whereas the order amount indicator of the sales mart uses discount data table LB.
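The first problem can be made concrete with a small sketch. The order records and field names below (`valid`, `has_waybill`) are hypothetical and chosen only for illustration; the two functions follow the marketing-mart and operations-mart definitions of order volume given above and show that the same orders yield different values.

```python
# Hypothetical order records; the field names are illustrative assumptions.
orders = [
    {"id": 1, "valid": True, "has_waybill": True},
    {"id": 2, "valid": True, "has_waybill": False},
    {"id": 3, "valid": False, "has_waybill": False},
    {"id": 4, "valid": True, "has_waybill": True},
]


def order_volume_marketing(rows):
    """Marketing mart definition: total orders minus invalid orders."""
    total = len(rows)
    invalid = sum(1 for row in rows if not row["valid"])
    return total - invalid


def order_volume_operations(rows):
    """Operations mart definition: orders for which a waybill was generated."""
    return sum(1 for row in rows if row["has_waybill"])


marketing = order_volume_marketing(orders)
operations = order_volume_operations(orders)
print(marketing, operations)  # 3 2
```

Both values are labeled "order volume", which is why exchanging them between marts without a shared definition invites misreading.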

Summary of the invention

In view of this, embodiments of the present invention provide a method and device for sharing indicators among data marts, which allow indicator data to be shared among the data marts so that the definition of each indicator and the mathematical model it uses are consistent across the data marts.

To achieve the above object, according to one aspect of the present invention, a method for sharing indicators among data marts is provided.

The method for sharing indicators among data marts according to an embodiment of the present invention includes: determining at least one shared indicator data table among the indicator data tables of a plurality of data marts, and creating, from any one of the at least one shared indicator data table, a test case corresponding to that shared indicator data table, where the shared indicator data table contains indicator data to be shared among the plurality of data marts; copying the shared indicator data table and the test case to a shared mart, and using the test case to test the corresponding shared indicator data table in the shared mart; and copying any shared indicator data table that passes the test to every data mart that does not already store it.
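The overall flow can be sketched as follows. This is a minimal in-memory illustration, not the patented implementation: data marts are modeled as plain dictionaries, and the function name `share_indicator_tables`, the `run_test` callback, and the sample tables are all assumptions introduced here.

```python
import copy


def share_indicator_tables(marts, shared_tables, test_cases, run_test):
    """Stage, test, and distribute shared indicator tables.

    marts:         {mart_name: {table_name: table}}
    shared_tables: {table_name: name of the mart that owns the chosen table}
    test_cases:    {table_name: test_case}
    run_test:      callable(table, test_case) -> bool
    Returns the contents of the shared mart after testing.
    """
    shared_mart = {}
    for table_name, source_mart in shared_tables.items():
        # Copy the table and its test case into the shared mart.
        staged = copy.deepcopy(marts[source_mart][table_name])
        case = copy.deepcopy(test_cases[table_name])
        shared_mart[table_name] = staged
        # Test the staged copy; a failing copy is not distributed.
        if not run_test(staged, case):
            del shared_mart[table_name]
            continue
        # Copy the tested table to every mart that does not store it yet.
        for tables in marts.values():
            if table_name not in tables:
                tables[table_name] = copy.deepcopy(staged)
    return shared_mart


# Hypothetical data: mart A owns table JA, marts B and C do not store it.
marts = {
    "A": {"JA": [{"order_id": 1, "amount": 30}, {"order_id": 2, "amount": 70}]},
    "B": {"JB": [{"order_id": 9, "amount": 5}]},
    "C": {},
}


def total_amount_matches(table, case):
    return sum(row["amount"] for row in table) == case["expected_total"]


shared = share_indicator_tables(
    marts, {"JA": "A"}, {"JA": {"expected_total": 100}}, total_amount_matches
)
print(sorted(marts["B"]), sorted(marts["C"]))  # ['JA', 'JB'] ['JA']
```

Deep copies are used so that each mart receives an independent table, matching the description of copying as duplicating the source and delivering the duplicate.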

Optionally, the test case includes test steps and expected results corresponding to the test steps.

Optionally, using the test case to test the corresponding shared indicator data table in the shared mart includes: for any shared indicator data table in the shared mart, executing the test steps of the corresponding test case; and, when the test results are fully consistent with the expected results in the test case, determining that the shared indicator data table has passed the test.

Optionally, the method further includes, when a test result is inconsistent with the expected result in the test case: removing the shared indicator data table, copying the corresponding shared indicator data table determined among the indicator data tables of the plurality of data marts to the shared mart again, and testing that shared indicator data table with the test case.
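The test-and-recopy behavior can be sketched as a loop. This is an illustration under stated assumptions: the retry limit `max_attempts` is not specified in the source and is added here only so the loop terminates, and `lossy_first_copy` is a hypothetical stand-in that simulates a copy damaged in transit.

```python
import copy


def run_test_case(table, test_case):
    """Pass only if every step's result equals its expected result."""
    return all(step(table) == expected for step, expected in test_case)


def stage_and_test(source_table, test_case, copy_fn=copy.deepcopy, max_attempts=3):
    """Copy to the shared mart and test; on a mismatch, discard and recopy.

    max_attempts is an assumption; the source does not state a retry limit.
    Returns (staged_table, attempts_used), or (None, max_attempts) on failure.
    """
    for attempt in range(1, max_attempts + 1):
        staged = copy_fn(source_table)
        if run_test_case(staged, test_case):
            return staged, attempt
        del staged  # remove the failed copy before copying again
    return None, max_attempts


def qty_for_sku_5512(table):
    return sum(row["qty"] for row in table if row["sku"] == 5512)


# Hypothetical source table and a one-step test case derived from it.
source = [{"sku": 5512, "qty": 2}, {"sku": 5512, "qty": 3}, {"sku": 7, "qty": 1}]
test_case = [(qty_for_sku_5512, qty_for_sku_5512(source))]

attempts_seen = []


def lossy_first_copy(table):
    """Simulated transfer that loses rows on the first attempt only."""
    attempts_seen.append(1)
    copied = copy.deepcopy(table)
    return copied[:1] if len(attempts_seen) == 1 else copied


staged, attempts = stage_and_test(source, test_case, copy_fn=lossy_first_copy)
print(attempts, staged == source)  # 2 True
```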

Optionally, the plurality of data marts are dependent data marts based on the same data warehouse, and the data marts, the shared mart, and the data warehouse all store data using a distributed file system architecture.

To achieve the above object, according to another aspect of the present invention, a device for sharing indicators among data marts is provided.

The device for sharing indicators among data marts according to an embodiment of the present invention includes: an indicator determination unit, which can be used to determine at least one shared indicator data table among the indicator data tables of a plurality of data marts and to create, from any one of the at least one shared indicator data table, a test case corresponding to that shared indicator data table, where the shared indicator data table contains indicator data to be shared among the plurality of data marts; an indicator testing unit, which can be used to copy the shared indicator data table and the test case to a shared mart and to test the corresponding shared indicator data table in the shared mart with the test case; and an indicator sharing unit, which can be used to copy any shared indicator data table that passes the test to every data mart that does not already store it.

Optionally, the test case includes test steps and expected results corresponding to the test steps, and the indicator testing unit can be used to: for any shared indicator data table in the shared mart, execute the test steps of the corresponding test case; and, when the test results are fully consistent with the expected results in the test case, determine that the shared indicator data table has passed the test.

Optionally, the plurality of data marts are dependent data marts based on the same data warehouse, and the data marts, the shared mart, and the data warehouse all store data using a distributed file system architecture.

To achieve the above object, according to yet another aspect of the present invention, an electronic device is provided.

An electronic device of the present invention includes: one or more processors; and a storage device for storing one or more programs which, when executed by the one or more processors, cause the one or more processors to implement the method for sharing indicators among data marts provided by the present invention.

To achieve the above object, according to still another aspect of the present invention, a computer-readable storage medium is provided.

A computer-readable storage medium of the present invention stores a computer program which, when executed by a processor, implements the method for sharing indicators among data marts provided by the present invention.

According to the technical solution of the present invention, an embodiment of the invention has the following advantages or beneficial effects. By determining shared indicator data tables among the data marts, the exact definition of each indicator and the data model it relies on can be determined from those tables. By copying the shared indicator data tables to the shared mart, testing them there, and then distributing them to the data marts, indicator data is shared among the data marts; this resolves the prior-art inconsistency in indicator definitions and indicator data models across data marts, so that each data mart can support the business on the basis of a stable, consistent, and comprehensive indicator model. By testing the shared indicator data tables in the shared mart, the effects of data loss and data corruption caused by node failures, network failures, and the like are avoided, which safeguards the quality of the indicator data shared among the data marts.

Further effects of the above non-conventional optional approaches are described below in conjunction with specific embodiments.

Brief description of the drawings

The accompanying drawings are provided for a better understanding of the present invention and do not unduly limit it. In the drawings:

FIG. 1 is a schematic diagram of the main steps of a method for sharing indicators among data marts according to an embodiment of the present invention;

FIG. 2 is a schematic structural diagram of the data marts and the data warehouse in the method for sharing indicators among data marts according to an embodiment of the present invention;

FIG. 3 is a schematic diagram of the main parts of a device for sharing indicators among data marts according to an embodiment of the present invention;

FIG. 4 is a schematic diagram of the composition of a device for sharing indicators among data marts according to an embodiment of the present invention;

FIG. 5 is a diagram of an exemplary system architecture to which an embodiment of the present invention may be applied;

FIG. 6 is a schematic structural diagram of an electronic device for implementing the method for sharing indicators among data marts according to an embodiment of the present invention.

Detailed description

Exemplary embodiments of the present invention are described below with reference to the accompanying drawings. Various details of the embodiments are included to aid understanding and should be regarded as merely exemplary. Accordingly, those of ordinary skill in the art will recognize that various changes and modifications can be made to the embodiments described here without departing from the scope and spirit of the invention. Likewise, descriptions of well-known functions and structures are omitted below for clarity and conciseness.

In the technical solution of the embodiments of the present invention, shared indicator data tables are determined among the data marts, so that the exact definition of each indicator and the data model it relies on can be determined from those tables. By copying the shared indicator data tables to the shared mart, testing them there, and then distributing them to the data marts, indicator data is shared among the data marts; this resolves the prior-art inconsistency in indicator definitions and indicator data models across data marts, so that each data mart can support the business on the basis of a stable, consistent, and comprehensive indicator model. By testing the shared indicator data tables in the shared mart, the effects of data loss and data corruption caused by node failures, network failures, and the like are avoided, which safeguards the quality of the indicator data shared among the data marts.

FIG. 1 is a schematic diagram of the main steps of the method for sharing indicators among data marts according to this embodiment.

As shown in FIG. 1, the method for sharing indicators among data marts according to an embodiment of the present invention is performed in the following steps:

Step S101: determine at least one shared indicator data table among the indicator data tables of a plurality of data marts, and create, from any shared indicator data table, a test case corresponding to that shared indicator data table.

In embodiments of the present invention, the data marts are dependent data marts based on the same data warehouse. In the computing field, a data warehouse is a subject-oriented, integrated, relatively stable collection of data that reflects historical changes and is used to support management decisions. Enterprises typically build an enterprise-wide data warehouse to obtain data from business databases and perform online analytical processing (OLAP) on that data to obtain information that supports enterprise decisions. A data mart is a collection of data that serves a specific department or a specific need of the enterprise and is oriented toward departmental decision analysis. Enterprises typically build department-wide data marts to store a department's data and analyze it to obtain information that supports that department's decisions. For example, an enterprise's data marts may include a marketing mart, a sales mart, and an operations mart.

In general, data marts fall into two types: independent data marts, which obtain data directly from business databases, and dependent data marts, which obtain data from an enterprise-level data warehouse. In embodiments of the present invention, the data marts are dependent data marts, which have a more stable architecture and better scalability.

FIG. 2 is a schematic structural diagram of the data marts and the data warehouse according to an embodiment of the present invention.

As shown in FIG. 2, the business data stored in the business database 210 undergoes data cleaning and data compression and is then extracted into the data warehouse 220. Data cleaning filters noise out of the business data. Data compression can store the data in formats such as LZO (a lossless data compression algorithm) or ORC (Optimized Row Columnar file).

The data warehouse 220 builds data tables corresponding to the business data and partitions the data by storage time. In practice, the data can be partitioned by day. The data warehouse 220 can then copy the data tables to the individual data marts 250. Note that "copying" in the present invention means duplicating the source data table and sending the duplicate to the target location. Preferably, the data warehouse 220 can copy the data tables to the data marts 250 by means such as a message queue.
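Partitioning by storage time at day granularity can be sketched as below. This is an in-memory illustration only, not how a warehouse engine lays out partitions; the field name `stored_at`, the function `partition_by_day`, and the sample rows are assumptions.

```python
from collections import defaultdict
from datetime import datetime


def partition_by_day(rows, time_field="stored_at"):
    """Group rows into one partition per calendar day of their storage time."""
    partitions = defaultdict(list)
    for row in rows:
        day = row[time_field].strftime("%Y-%m-%d")
        partitions[day].append(row)
    return dict(partitions)


# Hypothetical rows with illustrative timestamps.
rows = [
    {"order_id": 1, "stored_at": datetime(2017, 9, 1, 8, 30)},
    {"order_id": 2, "stored_at": datetime(2017, 9, 1, 21, 5)},
    {"order_id": 3, "stored_at": datetime(2017, 9, 2, 0, 10)},
]

parts = partition_by_day(rows)
print({day: [r["order_id"] for r in day_rows] for day, day_rows in parts.items()})
# {'2017-09-01': [1, 2], '2017-09-02': [3]}
```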

In embodiments of the present invention, a shared mart 260 is set up in advance among the data marts 250 and can communicate with any of them. In a specific application, the shared mart 260 may be an independent, special-purpose data mart, or it may be composed of storage space partitioned from several existing data marts 250. Preferably, the data warehouse 220, the data marts 250, and the shared mart 260 all store data using a distributed file system architecture such as HDFS (Hadoop Distributed File System; Hadoop is a distributed system infrastructure), and the data warehouse 220 can be built on Hive (a Hadoop-based data warehouse tool).

In step S101, an indicator data table is a data table that contains indicator data. A data table is a data structure in table format within a data mart, and the indicator data it contains is stored under the corresponding table directory in HDFS. Indicator data is the data in an indicator data table that relates to indicators such as order volume and order amount. For example, the total order count, invalid order count, and order creation time in an order details table can all serve as indicator data for the order volume indicator, while the original order amount, discount amount, and discount table data can all serve as indicator data for the order amount indicator.

Note that an indicator data table built by a data mart reflects that data mart's definition of the corresponding indicators and the data models those indicators rely on; in other words, the indicator data table represents that data mart's indicator model. For example, the order details table JA of a data mart A contains the total order count, invalid order count, order creation time, original order amount, discount amount, discount table data, and so on. From this indicator data, one can determine data mart A's exact definitions of the order volume and order amount indicators and the data models those indicators rely on. Here, the data model of an indicator is the data model applied when the indicator is used. For example, if the discount table data in the indicator data table shows that discount table k was used to produce the order amount, then k is the data model of the order amount indicator.

In practice, each data mart builds its own indicator data tables according to its department's application environment. These tables generally cause no inconvenience when used inside the data mart, but they often lead to misunderstandings when used across data marts, which makes analysis applications that span multiple data marts difficult to implement.

To address this problem, in embodiments of the present invention, a shared indicator data table is determined among the indicator data tables that the data marts hold for the same indicator. The shared indicator data table contains the indicator data to be shared among the data marts: once the table has been determined and has passed testing, its indicator data is shared with every data mart, which establishes a common indicator model for work across data marts. In other words, one or more shared indicator data tables are selected from the indicator data tables of the data marts, with the constraint that each indicator corresponds to exactly one shared indicator data table. Through the indicator data it contains, a shared indicator data table fixes the definition and the data model of the corresponding indicators. The table, together with the indicator definitions and data models it fixes, is shared with and agreed on by all data marts and is then used consistently whenever work spans multiple data marts.
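The constraint that each indicator maps to exactly one shared indicator data table can be sketched with a small registry. The class `SharedIndicatorRegistry` and the indicator names are hypothetical; the source states the uniqueness rule but does not prescribe a data structure for it.

```python
class SharedIndicatorRegistry:
    """Tracks which (mart, table) pair is the shared table for each indicator."""

    def __init__(self):
        self._tables = {}

    def register(self, indicator, mart, table):
        # Each indicator may correspond to only one shared indicator data table.
        if indicator in self._tables:
            raise ValueError(
                f"{indicator!r} already maps to {self._tables[indicator]}"
            )
        self._tables[indicator] = (mart, table)

    def lookup(self, indicator):
        return self._tables[indicator]


registry = SharedIndicatorRegistry()
# One table may define several indicators.
registry.register("order_volume", "A", "JA")
registry.register("order_amount", "A", "JA")

# A second table for an already registered indicator is rejected.
try:
    registry.register("order_volume", "B", "JB")
    conflict_rejected = False
except ValueError:
    conflict_rejected = True

print(registry.lookup("order_volume"), conflict_rejected)  # ('A', 'JA') True
```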

For example, data marts A, B, and C store order details tables JA, JB, and JC respectively, and each of JA, JB, and JC fixes its own definitions and data models for the order volume and order amount indicators. In this case, depending on specific needs, the order details table JA of data mart A can be chosen as the shared indicator data table and distributed to all data marts, including data marts B and C. Whenever a data mart communicates with other data marts, or the enterprise works across multiple data marts, the relevant data of each data mart must conform to the indicator definitions and indicator data models fixed by the shared indicator data table.

In this step, once the shared indicator data tables have been determined, a test case can be created from any shared indicator data table. A test case is a set of test data and test rules created to accomplish a specific goal. In general, a test case can be configured flexibly according to the application environment and can include test steps and the expected results corresponding to those steps. Specifically, a test case can be created from a shared indicator data table as follows:

1. Determine multiple test steps based on the shared indicator data table;

2. Execute the test steps in sequence on the shared indicator data table to obtain multiple test results;

3. Use the test results as the expected results corresponding to the respective test steps.
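The three steps above can be sketched as follows. The table `ja`, its fields, and the helper names `build_test_case` and `passes` are hypothetical; the sketch only shows expected results being recorded by running the steps on the chosen source table, then reused to check a copy.

```python
def build_test_case(table, steps):
    """Run each step on the shared table and record its result as expected."""
    return [(name, step, step(table)) for name, step in steps]


def passes(table, test_case):
    """A table passes only if every step reproduces its expected result."""
    return all(step(table) == expected for _, step, expected in test_case)


# Hypothetical shared indicator data table; field names are illustrative.
ja = [
    {"sku": 5512, "city": "Beijing", "amount": 120},
    {"sku": 5512, "city": "Shanghai", "amount": 80},
    {"sku": 9001, "city": "Beijing", "amount": 45},
]

# Step 1: determine the test steps.
steps = [
    ("order count for sku 5512",
     lambda t: sum(1 for r in t if r["sku"] == 5512)),
    ("order count for Beijing",
     lambda t: sum(1 for r in t if r["city"] == "Beijing")),
    ("order amount for Beijing",
     lambda t: sum(r["amount"] for r in t if r["city"] == "Beijing")),
]

# Steps 2 and 3: execute on the source table and keep the results.
test_case = build_test_case(ja, steps)
expected_results = [expected for _, _, expected in test_case]
print(expected_results)  # [2, 2, 165]

intact_copy_ok = passes(list(ja), test_case)
truncated_copy_ok = passes(ja[:-1], test_case)
print(intact_copy_ok, truncated_copy_ok)  # True False
```

Because the expected results come from the source table itself, the test case checks that a copy is faithful to the source; it does not judge whether the source data is correct.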

For example, the following test case is created from the order detail table JA, which serves as a shared indicator data table:

Test steps:

1. Query the order volume of the product with product number 5512;

2. Query the order volume of products whose order address is Beijing;

3. Query the order amount of clothing products whose order address is in Northeast China.

Executing the above test steps in sequence on JA yields the following test results, which are taken as the expected results:

1. 1000;

2. 7566986;

3. 659988.

It should be understood that test cases in practice usually consist of a large number of test steps and expected results; the example above is illustrative only and does not limit the test cases of the present invention.
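The test-case creation procedure described above can be sketched in Python as follows. This is a minimal illustration, not the patented implementation: the in-memory table, the column names (`product_id`, `city`, `category`, `amount`), the sample rows, and the names `Step`, `IndicatorTestCase`, and `create_test_case` are all assumptions introduced for the example.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List

Table = List[Dict[str, Any]]  # a table modelled as a list of rows (assumption)


@dataclass
class Step:
    """One test step: a description plus a query to run against a table."""
    description: str
    query: Callable[[Table], Any]


@dataclass
class IndicatorTestCase:
    """Test steps paired, by position, with their expected results."""
    steps: List[Step]
    expected: List[Any]


def create_test_case(source: Table, steps: List[Step]) -> IndicatorTestCase:
    # Run every step on the source table and record each result
    # as the expected result of that step.
    expected = [step.query(source) for step in steps]
    return IndicatorTestCase(steps=list(steps), expected=expected)


# Hypothetical miniature of order detail table JA.
ja: Table = [
    {"product_id": 5512, "city": "Beijing", "category": "clothing", "amount": 120},
    {"product_id": 5512, "city": "Shanghai", "category": "clothing", "amount": 80},
    {"product_id": 7001, "city": "Beijing", "category": "books", "amount": 35},
]

steps = [
    Step("order volume of product 5512",
         lambda t: sum(1 for r in t if r["product_id"] == 5512)),
    Step("order volume for Beijing",
         lambda t: sum(1 for r in t if r["city"] == "Beijing")),
    Step("order amount of clothing products",
         lambda t: sum(r["amount"] for r in t if r["category"] == "clothing")),
]

case = create_test_case(ja, steps)
print(case.expected)  # [2, 2, 200]
```

Because the expected results are computed from the source table itself, the test case checks that a copy still answers the same queries the same way; it does not check that the source table is correct.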

Step S102: Copy the shared indicator data table and the test case to the shared mart, and use the test case to test the corresponding shared indicator data table in the shared mart.

In this step, the shared indicator data tables determined in the data marts, together with the test cases created from them, are copied from the data marts to the pre-launch area of the shared mart. The pre-launch area is the tier of the shared mart that runs in the pre-release environment. In practice, the copy can be implemented with, for example, a message queue. Each shared indicator data table is then tested in the pre-launch area with its corresponding test case, to guard against data being corrupted or lost during copying because of a data storage node failure, a network failure, or similar causes. Here, the corresponding shared indicator data table is the one from which the test case was created.

The test process is as follows:

1. For any shared indicator data table in the shared mart, execute the test steps of the corresponding test case. The corresponding test case is the one created, in the data mart, from the source data table of that shared indicator data table;

2. Determine whether the test results are completely consistent with the expected results in the test case. If so, the shared indicator data table is determined to have passed the test. Otherwise, the shared indicator data table is removed, and the corresponding shared indicator data table determined among the indicator data tables of the data mart is copied to the shared mart again; that is, the source data table of the shared indicator data table is copied to the shared mart again;

3. For the re-copied shared indicator data table, execute the test steps of the test case again and determine whether the test results are completely consistent with the expected results. If so, the shared indicator data table is determined to have passed the test; otherwise, the re-copied shared indicator data table is removed and the test ends.
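The three-step test process above (test the copy, re-copy once on a mismatch, give up after a second mismatch) can be sketched as below. This is a hedged illustration only: tables are modelled as in-memory lists, and `passes`, `publish_to_shared_mart`, and the deliberately faulty `flaky_copy` are hypothetical names, not part of the patent.

```python
import copy
from typing import Any, Callable, Dict, List, Optional

Table = List[Dict[str, Any]]
Query = Callable[[Table], Any]


def passes(table: Table, steps: List[Query], expected: List[Any]) -> bool:
    # A table passes only if every result matches its expected result exactly.
    return [query(table) for query in steps] == expected


def publish_to_shared_mart(
    source: Table,
    steps: List[Query],
    expected: List[Any],
    copy_table: Callable[[Table], Table],
    max_copies: int = 2,  # the first copy plus one re-copy, as in the text
) -> Optional[Table]:
    for _ in range(max_copies):
        candidate = copy_table(source)      # copy into the shared mart
        if passes(candidate, steps, expected):
            return candidate                # passed the test
        # Mismatch: the candidate is discarded and the source is copied again.
    return None                             # both copies failed; the test ends


source: Table = [
    {"product_id": 5512, "amount": 10},
    {"product_id": 5512, "amount": 15},
]
steps: List[Query] = [len, lambda t: sum(r["amount"] for r in t)]
expected = [query(source) for query in steps]  # [2, 25]

attempts = {"n": 0}


def flaky_copy(table: Table) -> Table:
    # Simulates a transfer that loses the last row on the first attempt only.
    attempts["n"] += 1
    copied = copy.deepcopy(table)
    return copied[:-1] if attempts["n"] == 1 else copied


result = publish_to_shared_mart(source, steps, expected, flaky_copy)
print(result == source, attempts["n"])  # True 2
```

Capping the number of copies keeps a persistently failing transfer from looping forever; a table that fails twice is simply not published in that run.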

In practice, the above test process generally yields, in the shared mart, shared indicator data tables that have passed the test.

Step S103: Copy any shared indicator data table that passes the test to each data mart that does not store the shared indicator data table.

In this step, each shared indicator data table that has passed the test is copied to the online area of every data mart that does not store it, for use by analysts. The online area of a data mart is the tier of the data mart that runs in the production environment. In practice, a data mart may use its own indicator definitions and indicator data models for internal work; when multiple data marts interact, or when work is carried out across multiple data marts, the indicator definitions and indicator data models established by the shared indicator data tables must be used, so that indicators are consistent across data marts and ambiguity is avoided.
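Step S103 can be sketched as follows. The representation is an assumption made for illustration: each data mart's online area is modelled as a dictionary from table name to rows, and `distribute` is a hypothetical helper name.

```python
from typing import Any, Dict, List

Table = List[Dict[str, Any]]


def distribute(name: str, table: Table, marts: Dict[str, Dict[str, Table]]) -> List[str]:
    """Copy a tested shared table into every mart that does not store it yet.

    Returns the names of the marts that received a copy.
    """
    updated = []
    for mart_name, online_area in marts.items():
        if name not in online_area:
            # Copy row by row so each mart owns an independent copy.
            online_area[name] = [dict(row) for row in table]
            updated.append(mart_name)
    return updated


ja: Table = [{"product_id": 5512, "amount": 120}]
marts: Dict[str, Dict[str, Table]] = {
    "A": {"JA": ja},   # mart A is the source and already stores JA
    "B": {"JB": []},
    "C": {"JC": []},
}

updated = distribute("JA", ja, marts)
print(updated)  # ['B', 'C']
```

Marts B and C keep their own tables JB and JC for internal work and gain JA for cross-mart work, which mirrors the distinction drawn in the paragraph above.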

In a specific application, the shared indicator data tables determined in the data marts and their test cases can be copied to the shared mart periodically, for example daily, and the shared indicator data tables that pass the test can then be shared with each data mart, so that the shared indicator model is updated regularly.

As can be seen from the method of the embodiment of the present invention, determining shared indicator data tables from the data marts makes it possible to fix the exact definition of each indicator and the data model it uses. Copying the shared indicator data tables to the shared mart, testing them there, and then sharing them with each data mart achieves the technical effect of sharing indicator data among data marts. This solves the prior-art problem of inconsistent indicator definitions and indicator data models across data marts, and lets each data mart support the business on the basis of a stable, consistent, and comprehensive indicator model. Testing the shared indicator data tables in the shared mart avoids the impact of data loss and data corruption caused by node failures, network failures, and the like, and ensures the quality of the indicator data shared among data marts.

FIG. 3 is a schematic diagram of the main parts of an apparatus for sharing indicators among data marts according to an embodiment of the present invention.

As shown in FIG. 3, the apparatus 300 for sharing indicators among data marts according to the embodiment of the present invention may include an indicator determining unit 301, an indicator testing unit 302, and an indicator sharing unit 303, where:

The indicator determining unit 301 may be configured to determine at least one shared indicator data table among the indicator data tables of multiple data marts, and to create, from any shared indicator data table, a test case corresponding to that table, wherein the shared indicator data table includes indicator data to be shared among the multiple data marts.

The indicator testing unit 302 may be configured to copy the shared indicator data table and the test case to the shared mart, and to use the test case to test the corresponding shared indicator data table in the shared mart.

The indicator sharing unit 303 may be configured to copy any shared indicator data table that passes the test to each data mart that does not store the shared indicator data table.

As a preferred solution, the test case may include test steps and expected results corresponding to the test steps. The indicator testing unit 302 may be configured to: for any shared indicator data table in the shared mart, execute the test steps of the corresponding test case; and, when the test results are completely consistent with the expected results in the test case, determine that the shared indicator data table has passed the test.

In addition, in the embodiment of the present invention, the multiple data marts are dependent (subordinate) data marts based on the same data warehouse, and the multiple data marts, the shared mart, and the data warehouse all store data using a distributed file system architecture.

FIG. 4 is a schematic diagram of the components of an apparatus for sharing indicators among data marts according to an embodiment of the present invention.

As shown in FIG. 4, in addition to the indicator determining unit 301, the indicator testing unit 302, and the indicator sharing unit 303, the apparatus for sharing indicators among data marts according to the embodiment of the present invention further includes a data preparation unit 401 and a data import unit 402. The data preparation unit 401 includes a collection module and a storage module, and the data import unit 402 includes an extraction module and a scheduling module. Specifically:

The collection module may be used to collect business data from the business database and to perform data cleaning on the collected data to filter out interference data;

The storage module may be used to compress the cleaned data in formats such as LZO and ORC;

The extraction module may be used to extract the compressed data into the data warehouse via the scheduling module;

After the data warehouse has established data tables corresponding to the business data, the data tables are copied to the data marts; the indicator determining unit 301 can then determine the shared indicator data tables from the data marts.
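The data preparation and import path described above (collect and clean, compress, extract into the warehouse, copy to the data marts) can be sketched as below. Several things here are assumptions made for illustration: `zlib` over JSON stands in for LZO or ORC, which require third-party libraries; "interference data" is taken to mean records with a missing order ID or a negative amount; the scheduling module is omitted; and all function and field names are hypothetical.

```python
import json
import zlib
from typing import Any, Dict, List

Record = Dict[str, Any]


def clean(records: List[Record]) -> List[Record]:
    # Collection module: drop records treated here as interference data
    # (assumed rule: missing order ID or negative amount).
    return [
        r for r in records
        if r.get("order_id") is not None and r.get("amount", 0) >= 0
    ]


def compress(records: List[Record]) -> bytes:
    # Storage module: zlib is a stand-in for LZO/ORC in this sketch.
    return zlib.compress(json.dumps(records).encode("utf-8"))


def extract_to_warehouse(blob: bytes, warehouse: Dict[str, List[Record]], name: str) -> None:
    # Extraction module: decompress and load the data as a warehouse table.
    warehouse[name] = json.loads(zlib.decompress(blob).decode("utf-8"))


raw = [
    {"order_id": 1, "amount": 120},
    {"order_id": None, "amount": 50},   # missing ID: filtered out
    {"order_id": 2, "amount": -5},      # negative amount: filtered out
    {"order_id": 3, "amount": 80},
]

warehouse: Dict[str, List[Record]] = {}
extract_to_warehouse(compress(clean(raw)), warehouse, "order_detail")

# The warehouse table is then copied to each dependent data mart.
marts = {
    mart: {"order_detail": [dict(r) for r in warehouse["order_detail"]]}
    for mart in ("A", "B", "C")
}
print(len(warehouse["order_detail"]), sorted(marts))  # 2 ['A', 'B', 'C']
```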

According to the technical solution of the embodiment of the present invention, determining shared indicator data tables from the data marts makes it possible to fix the exact definition of each indicator and the data model it uses. Copying the shared indicator data tables to the shared mart, testing them there, and then sharing them with each data mart achieves the technical effect of sharing indicator data among data marts, solves the prior-art problem of inconsistent indicator definitions and indicator data models across data marts, and lets each data mart support the business on the basis of a stable, consistent, and comprehensive indicator model. Testing the shared indicator data tables in the shared mart avoids the impact of data loss and data corruption caused by node failures, network failures, and the like, and ensures the quality of the indicator data shared among data marts.

FIG. 5 shows an exemplary system architecture 500 to which the method or the apparatus for sharing indicators among data marts according to an embodiment of the present invention can be applied.

As shown in FIG. 5, the system architecture 500 may include terminal devices 501, 502, and 503, a network 504, and a server 505 (this architecture is only an example; the components of a specific architecture can be adjusted to suit the application). The network 504 is the medium that provides communication links between the terminal devices 501, 502, 503 and the server 505. The network 504 may include various connection types, such as wired links, wireless communication links, or fiber-optic cables.

Users can use the terminal devices 501, 502, 503 to interact with the server 505 through the network 504, for example to receive or send messages. Various communication client applications may be installed on the terminal devices 501, 502, 503, such as shopping applications, web browsers, search applications, instant messaging tools, email clients, and social platform software (examples only).

The terminal devices 501, 502, 503 may be various electronic devices that have a display screen and support web browsing, including but not limited to smartphones, tablet computers, laptop computers, and desktop computers.

The server 505 may be a server that provides various services, for example a back-end management server that supports shopping websites browsed by users with the terminal devices 501, 502, 503 (example only). The back-end management server can analyze and otherwise process received data such as product information query requests, and feed the processing results (for example, targeted push information or product information; examples only) back to the terminal devices.

It should be noted that the method for sharing indicators among data marts provided by the embodiments of the present invention is generally executed by the server 505; accordingly, the apparatus for sharing indicators among data marts is generally provided in the server 505.

It should be understood that the numbers of terminal devices, networks, and servers in FIG. 5 are merely illustrative. There may be any number of terminal devices, networks, and servers, as the implementation requires.

The present invention also provides an electronic device.

The electronic device of an embodiment of the present invention includes: one or more processors; and a storage device for storing one or more programs which, when executed by the one or more processors, cause the one or more processors to implement the method for sharing indicators among data marts provided by the present invention.

Referring now to FIG. 6, a schematic structural diagram is shown of a computer system 600 suitable for implementing an electronic device according to an embodiment of the present invention. The electronic device shown in FIG. 6 is only an example and should not impose any limitation on the function or scope of use of the embodiments of the present invention.

As shown in FIG. 6, the computer system 600 includes a central processing unit (CPU) 601, which can perform various appropriate actions and processes according to a program stored in a read-only memory (ROM) 602 or a program loaded from a storage section 608 into a random access memory (RAM) 603. The RAM 603 also stores various programs and data required for the operation of the computer system 600. The CPU 601, the ROM 602, and the RAM 603 are connected to one another through a bus 604. An input/output (I/O) interface 605 is also connected to the bus 604.

The following components are connected to the I/O interface 605: an input section 606 including a keyboard, a mouse, and the like; an output section 607 including a display such as a cathode ray tube (CRT) or liquid crystal display (LCD), a speaker, and the like; a storage section 608 including a hard disk and the like; and a communication section 609 including a network interface card such as a LAN card or a modem. The communication section 609 performs communication processing via a network such as the Internet. A drive 610 is also connected to the I/O interface 605 as needed. A removable medium 611, such as a magnetic disk, an optical disk, a magneto-optical disk, or a semiconductor memory, is mounted on the drive 610 as needed, so that a computer program read from it can be installed into the storage section 608 as needed.

In particular, according to the disclosed embodiments of the present invention, the process described in the main-step diagram above may be implemented as a computer software program. For example, embodiments of the present invention include a computer program product comprising a computer program carried on a computer-readable medium, the computer program containing program code for executing the method shown in the main-step diagram. In such an embodiment, the computer program can be downloaded and installed from a network through the communication section 609 and/or installed from the removable medium 611. When the computer program is executed by the central processing unit 601, the above-described functions defined in the system of the present invention are performed.

It should be noted that the computer-readable medium of the present invention may be a computer-readable signal medium, a computer-readable storage medium, or any combination of the two. A computer-readable storage medium may be, for example, but is not limited to, an electrical, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the above. More specific examples of computer-readable storage media include, but are not limited to: an electrical connection with one or more wires, a portable computer disk, a hard disk, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), an optical fiber, portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the present invention, a computer-readable storage medium may be any tangible medium that contains or stores a program that can be used by or in conjunction with an instruction execution system, apparatus, or device. In the present invention, a computer-readable signal medium may include a data signal propagated in baseband or as part of a carrier wave and carrying computer-readable program code. Such a propagated data signal may take a variety of forms, including but not limited to electromagnetic signals, optical signals, or any suitable combination of the foregoing. A computer-readable signal medium may also be any computer-readable medium other than a computer-readable storage medium that can send, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. Program code embodied on a computer-readable medium may be transmitted using any suitable medium, including but not limited to wireless, wireline, optical fiber cable, RF, or any suitable combination of the foregoing.

The flowcharts and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods, and computer program products according to various embodiments of the present invention. In this regard, each block in a flowchart or block diagram may represent a module, program segment, or portion of code that contains one or more executable instructions for implementing the specified logical function. It should also be noted that, in some alternative implementations, the functions noted in the blocks may occur in an order different from that noted in the figures. For example, two blocks shown in succession may in fact be executed substantially in parallel, or they may sometimes be executed in the reverse order, depending on the functionality involved. It should also be noted that each block of the block diagrams or flowcharts, and combinations of such blocks, can be implemented by a dedicated hardware-based system that performs the specified functions or operations, or by a combination of dedicated hardware and computer instructions.

The units described in the embodiments of the present invention may be implemented in software or in hardware. The described units may also be provided in a processor; for example, a processor may be described as including an indicator determining unit, an indicator testing unit, and an indicator sharing unit. In some cases the names of these units do not limit the units themselves; for example, the indicator testing unit may also be described as "a unit that sends shared indicator data tables that have passed the test to the indicator sharing unit".

In another aspect, the present invention also provides a computer-readable medium. The computer-readable medium may be included in the device described in the above embodiments, or it may exist separately without being assembled into the device. The computer-readable medium carries one or more programs which, when executed by the device, cause the device to perform steps including: determining at least one shared indicator data table among the indicator data tables of multiple data marts, and creating a test case from any shared indicator data table, wherein the shared indicator data table includes indicator data to be shared among the multiple data marts; copying the shared indicator data table and the test case to a shared mart, and using the test case to test the corresponding shared indicator data table in the shared mart; and copying any shared indicator data table that passes the test to each data mart that does not store the shared indicator data table.

According to the technical solution of the embodiment of the present invention, determining shared indicator data tables from the data marts makes it possible to fix the exact definition of each indicator and the data model it uses. Copying the shared indicator data tables to the shared mart, testing them there, and then sharing them with each data mart achieves the technical effect of sharing indicator data among data marts, solves the prior-art problem of inconsistent indicator definitions and indicator data models across data marts, and lets each data mart support the business on the basis of a stable, consistent, and comprehensive indicator model. Testing the shared indicator data tables in the shared mart avoids the impact of data loss and data corruption caused by node failures, network failures, and the like, and ensures the quality of the indicator data shared among data marts.

The above specific embodiments do not limit the scope of protection of the present invention. Those skilled in the art should understand that various modifications, combinations, sub-combinations, and substitutions may occur depending on design requirements and other factors. Any modification, equivalent replacement, or improvement made within the spirit and principles of the present invention shall fall within the scope of protection of the present invention.

Claims (10)

1. A method for index sharing among data marts is characterized by comprising the following steps:
determining at least one shared index data table in index data tables of a plurality of data marts, and creating a test case corresponding to the shared index data table according to any one of the at least one shared index data table; wherein the shared index data table includes index data to be shared among the plurality of data marts;
copying the shared index data table and the test case to a shared mart, and testing the corresponding shared index data table in the shared mart by using the test case; and
copying any shared index data table that passes the test to each data mart that does not store the shared index data table.
2. The method of claim 1, wherein the test case comprises: a testing step, and an expected result corresponding to the testing step.
3. The method of claim 2, wherein the testing the corresponding shared index data table in the shared mart using the test case comprises:
executing the test steps of the corresponding test case on at least one shared index data table in the shared mart; and
when the test result is completely consistent with the expected result in the test case, determining that the shared index data table passes the test.
4. The method of claim 3, further comprising:
when the test result is inconsistent with the expected result in the test case: removing the shared index data table, copying again to the shared mart the shared index data table that was determined among the index data tables of the plurality of data marts and corresponds to the removed table, and testing the re-copied shared index data table by using the test case, so as to guard against errors or loss of data in the copying process.
5. The method of any of claims 1-4, wherein the plurality of data marts are subordinate data marts based on a same data warehouse, and wherein the plurality of data marts, the shared mart, and the data warehouse all store data using a distributed file system architecture.
6. An apparatus for index sharing between data marts, comprising:
an index determining unit, configured to determine at least one shared index data table among the index data tables of a plurality of data marts, and to create a test case corresponding to the shared index data table according to any one of the at least one shared index data table; wherein the shared index data table includes index data to be shared among the plurality of data marts;
an index testing unit, configured to copy the shared index data table and the test case to a shared mart, and to test the corresponding shared index data table in the shared mart by using the test case; and
an index sharing unit, configured to copy any shared index data table that passes the test to each data mart that does not store the shared index data table.
7. The apparatus of claim 6, wherein the test case comprises: a testing step and an expected result corresponding to the testing step; and
the index testing unit is configured to: execute the test steps of the corresponding test case on at least one shared index data table in the shared mart; and when the test result is completely consistent with the expected result in the test case, determine that the shared index data table passes the test.
8. The apparatus of claim 6 or 7, wherein the plurality of data marts are subordinate data marts based on a same data warehouse, and wherein the plurality of data marts, the shared mart, and the data warehouse all store data using a distributed file system architecture.
9. An electronic device, comprising:
one or more processors; and
a storage device for storing one or more programs,
wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the method of any one of claims 1-5.
10. A computer-readable storage medium, on which a computer program is stored, which, when being executed by a processor, carries out the method according to any one of claims 1-5.
CN201710806219.4A 2017-09-08 2017-09-08 Method and device for sharing indexes among data marts Active CN107679096B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710806219.4A CN107679096B (en) 2017-09-08 2017-09-08 Method and device for sharing indexes among data marts

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201710806219.4A CN107679096B (en) 2017-09-08 2017-09-08 Method and device for sharing indexes among data marts

Publications (2)

Publication Number Publication Date
CN107679096A CN107679096A (en) 2018-02-09
CN107679096B true CN107679096B (en) 2020-06-05

Family

ID=61135253

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710806219.4A Active CN107679096B (en) 2017-09-08 2017-09-08 Method and device for sharing indexes among data marts

Country Status (1)

Country Link
CN (1) CN107679096B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112162928B (en) * 2020-10-15 2024-03-15 网易(杭州)网络有限公司 Game testing method, game testing device, electronic equipment and computer readable medium
CN113656372B (en) * 2021-08-13 2022-06-21 南方电网数字电网研究院有限公司 Standard index database data mart architecture device and method

Citations (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1774710A (en) * 2003-04-18 2006-05-17 国际商业机器公司 Systems and methods in datasheets for exporting and importing recursively scalable templates
CN101197876A (en) * 2006-12-06 2008-06-11 中兴通讯股份有限公司 A method and system for multidimensional analysis of message business data
CN102576363A (en) * 2009-09-29 2012-07-11 渣普控股有限公司 A content based approach to extending the form and function of a business intelligence system
CN103345484A (en) * 2013-06-21 2013-10-09 中国工商银行股份有限公司 Report form processing system based on dynamic domain and method
CN103412853A (en) * 2013-08-05 2013-11-27 北京信息科技大学 Method for automatically generating test cases aiming at document converters
CN103678665A (en) * 2013-12-24 2014-03-26 焦点科技股份有限公司 Heterogeneous large data integration method and system based on data warehouses
CN104731791A (en) * 2013-12-18 2015-06-24 东阳艾维德广告传媒有限公司 Marketing analysis data market system
CN105095392A (en) * 2015-07-02 2015-11-25 北京京东尚科信息技术有限公司 A method and device for sharing data between data marts
CN105335401A (en) * 2014-07-22 2016-02-17 阿里巴巴集团控股有限公司 Data warehouse index management method, apparatus and system
CN106030573A (en) * 2014-02-19 2016-10-12 斯诺弗雷克计算公司 Implementation of semi-structured data as a first-level database element
CN106201886A (en) * 2016-07-18 2016-12-07 合网络技术(北京)有限公司 The Proxy Method of the checking of a kind of real time data task and device

Also Published As

Publication number Publication date
CN107679096A (en) 2018-02-09

Similar Documents

Publication Publication Date Title
US10121169B2 (en) Table level distributed database system for big data storage and query
US11238045B2 (en) Data arrangement management in a distributed data cluster environment of a shared pool of configurable computing resources
CN108647357B (en) Method and device for data query
CN109933514B (en) Data testing method and device
CN111339073A (en) Real-time data processing method and device, electronic equipment and readable storage medium
US20120054146A1 (en) Systems and methods for tracking and reporting provenance of data used in a massively distributed analytics cloud
US11487708B1 (en) Interactive visual data preparation service
EP3076359A1 (en) Implementing retail customer analytics data model in a distributed computing environment
CN111046237A (en) User behavior data processing method and device, electronic equipment and readable medium
WO2019118867A1 (en) Method, apparatus and computer program product for improving data indexing in a group-based communication platform
CN112579673A (en) Multi-source data processing method and device
CN110019214A (en) The method and apparatus that data split result is verified
CN107679096B (en) Method and device for sharing indexes among data marts
CN112148705A (en) Data migration method and device
CN113392076A (en) Method, device, electronic equipment and medium for acquiring metadata quality information
US11308115B2 (en) Method and system for persisting data
CN113934729A (en) A data management method, related equipment and medium based on knowledge graph
CN110688295A (en) Data testing method and device
CN114860821A (en) Data importing method and device of graph database, storage medium and electronic equipment
CN119513205A (en) A method and device for data synchronization
CN112148762A (en) Statistical method and device for real-time data flow
CN115033635A (en) Data extraction method and device, processor and electronic equipment
CN114997838A (en) Method and device for processing approval data, electronic equipment and computer readable medium
CN109669668B (en) Method and device for realizing simulated transaction execution in system performance test
TWI578173B (en) Statistical e-commerce transaction data, e-commerce transaction data statistics system and application server

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant