CN115114319A

CN115114319A - Method, apparatus and device for data query based on a data wide table

Info

Publication number: CN115114319A
Application number: CN202210679262.XA
Authority: CN
Inventors: 刘鑫
Original assignee: Beijing Shareit Information Technology Co Ltd
Current assignee: Shanghai Fupei Technology Co Ltd
Priority date: 2022-06-15
Filing date: 2022-06-15
Publication date: 2022-09-27

Abstract

The disclosure relates to a method, an apparatus, a device and a storage medium for querying data based on a data wide table, which can be applied to the technical field of data processing. The method comprises: receiving a query instruction from a user; querying the data wide table according to a field in the query instruction to obtain query data; and outputting the query data. The data wide table is obtained by aggregating data from a plurality of single business data tables, so querying the data wide table makes it possible to retrieve the required business data in real time and improves query efficiency.

Description

Method, apparatus and device for data query based on a data wide table

Technical Field

The present disclosure relates to the technical field of data processing, and in particular to a method, apparatus, device and storage medium for querying data based on a data wide table.

Background

With the rapid development of the Internet, e-commerce supported by Internet technology has entered a period of rapid business growth. When these services run, they generate multiple tables that record business data, so that the business information is recorded and stored.

Meanwhile, as business volume grows, more and more business tables need to be stored, and the storage design for those tables becomes increasingly complex. A complex storage layout forces a user who queries data in the back end to search multiple databases and multiple data tables one by one, so the overall response time of a data query becomes too long.

Therefore, the prior art lacks a suitable data query method.

Summary of the Invention

The present disclosure provides a method, apparatus, device and storage medium for querying data based on a data wide table, so that the required business data can be queried in real time and query efficiency is improved.

In a first aspect, the present disclosure provides a method for querying data based on a data wide table, including: receiving a query instruction from a user; querying the data wide table according to a field in the query instruction to obtain query data, where the data wide table is obtained by aggregating data from multiple single business data tables; and outputting the query data.

In some possible implementations, querying the data wide table according to the field in the query instruction to obtain the query data includes: looking up, according to the field in the query instruction, the index corresponding to the field in the data wide table; and following the index to the query data.

In some possible implementations, before receiving the query instruction from the user, the method further includes: acquiring multiple single business data tables; and obtaining the data wide table by aggregating the data in the multiple single business data tables.

In some possible implementations, obtaining the data wide table by aggregating the data in the multiple single business data tables includes: determining a first field according to business information in the multiple single business data tables, where the first field is a field used for display in the data wide table; and aggregating the data in the multiple single business data tables according to the first field to obtain the data wide table.

In some possible implementations, aggregating the data in the multiple single business data tables according to the first field to obtain the data wide table includes: parsing a second field in each of the multiple single business data tables, where the second field indicates the association between data in the multiple single business data tables; and aggregating, within the data range corresponding to the first field and using the second field, the associated data in the multiple single business data tables to obtain the data wide table.

In some possible implementations, aggregating, within the data range corresponding to the first field and using the second field, the associated data in the multiple single business data tables to obtain the data wide table includes: processing, within the data range corresponding to the first field, the data in the multiple single business data tables using multi-threading to obtain intermediate data from the multiple single business data tables; obtaining the associated data in the intermediate data using the second field; and aggregating the associated data to obtain the data wide table.

In some possible implementations, the method further includes: detecting data changes in the multiple single business data tables; and, when a change is detected in the data of any single table, updating the data in the data wide table that is associated with that single table.
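
The aggregation and update steps of the first aspect can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the implementation described by the disclosure: the table contents and the names `SINGLE_TABLES`, `FIRST_FIELDS`, `extract`, `build_wide_table` and `apply_change` are hypothetical, as is the choice of one thread-pool task per single table. A column named `serial_no` stands in for the second field.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical single business data tables. "serial_no" plays the role of the
# second field that links rows belonging to the same business.
SINGLE_TABLES = {
    "trade": [{"serial_no": "S1", "amount": 100, "memo": "x"},
              {"serial_no": "S2", "amount": 250, "memo": "y"}],
    "payment": [{"serial_no": "S1", "pay_status": "PAID"},
                {"serial_no": "S2", "pay_status": "PENDING"}],
    "channel": [{"serial_no": "S1", "channel_order_no": "C-9"}],
}

# Hypothetical first fields: the columns the wide table keeps from each table.
FIRST_FIELDS = {
    "trade": ["amount"],
    "payment": ["pay_status"],
    "channel": ["channel_order_no"],
}


def project(table_name, row):
    """Keep only the second field and the first fields of one row."""
    keep = ["serial_no"] + FIRST_FIELDS[table_name]
    return {key: row[key] for key in keep}


def extract(table_name):
    """Produce the intermediate data for one single table."""
    return [project(table_name, row) for row in SINGLE_TABLES[table_name]]


def build_wide_table():
    """Extract the tables in parallel, then merge rows that share serial_no."""
    wide = {}
    with ThreadPoolExecutor(max_workers=len(SINGLE_TABLES)) as pool:
        # The merge runs in the calling thread, so `wide` needs no lock.
        for rows in pool.map(extract, SINGLE_TABLES):
            for row in rows:
                wide.setdefault(row["serial_no"], {}).update(row)
    return wide


def apply_change(wide, table_name, changed_row):
    """Refresh the wide-table row linked to a changed single-table row."""
    projected = project(table_name, changed_row)
    wide.setdefault(projected["serial_no"], {}).update(projected)
```

Merging in a single thread keeps the sketch free of locking; a production system would also need to decide how changes are detected, which the text above leaves open.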

In a second aspect, the present disclosure provides an apparatus for querying data based on a data wide table. The apparatus may be a chip or a system-on-chip in a terminal device, or a functional module in a terminal device for implementing the method of the first aspect and any of its possible implementations. The apparatus can implement the functions performed by the terminal in the first aspect and any of its possible implementations, and these functions may be implemented by hardware executing corresponding software. The hardware or software includes one or more modules corresponding to the above functions. The apparatus includes: an acquisition module, configured to receive a query instruction from a user; a processing module, configured to query the data wide table according to a field in the query instruction to obtain query data, where the data wide table is obtained by aggregating data from multiple single business data tables; and an output module, configured to output the query data.

In some possible implementations, the processing module is further configured to: look up, according to the field in the query instruction, the index corresponding to the field in the data wide table; and follow the index to the query data.

In some possible implementations, the acquisition module is further configured to: before the query instruction is received from the user, acquire multiple single business data tables; and aggregate the data in the multiple single business data tables to obtain the data wide table.

In some possible implementations, the acquisition module is further configured to: determine a first field according to business information in the multiple single business data tables, where the first field is a field used for query in the data wide table; and aggregate the data in the multiple single business data tables according to the first field to obtain the data wide table.

In some possible implementations, the acquisition module is further configured to: parse a second field in each of the multiple single business data tables, where the second field indicates the association between data in the multiple single business data tables; and aggregate, within the data range corresponding to the first field and using the second field, the associated data in the multiple single business data tables to obtain the data wide table.

In some possible implementations, the acquisition module is further configured to: process, within the data range corresponding to the first field, the data in the multiple single business data tables using multi-threading to obtain intermediate data from the multiple single business data tables, where the intermediate data is data corresponding to the fields of the data wide table; obtain the associated data in the intermediate data using the second field; and aggregate the associated data to obtain the data wide table.

In some possible implementations, the acquisition module is further configured to: after the associated data in the multiple single business data tables has been aggregated using the second field to obtain the data wide table, determine the index of the associated data in the data wide table according to the data volume of the associated data.

In some possible implementations, the acquisition module is further configured to: detect data changes in the multiple single business data tables; and, when a change is detected in the data of any single table, update the data in the data wide table that is associated with that single table.

In a third aspect, the present disclosure provides a terminal, including: a memory for storing processor-executable instructions; and a processor, where the processor is configured to execute the executable instructions to implement the method of the first aspect and any of its possible implementations.

In a fourth aspect, the present disclosure provides a computer-readable storage medium storing computer-executable instructions which, when executed by a processor, implement the method of the first aspect and any of its possible implementations.

Compared with the prior art, the technical solution provided by the present disclosure has the following beneficial effects:

In the present disclosure, a query instruction is received from a user, the data wide table is queried according to a field in the query instruction, and the resulting query data is output. Because the data wide table is obtained by aggregating data from multiple single business data tables, querying it makes it possible to retrieve the required business data in real time and improves query efficiency.

It is to be understood that the foregoing general description and the following detailed description are exemplary and explanatory only and do not limit the scope of protection of the present disclosure.

Brief Description of the Drawings

The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the present disclosure and, together with the description, serve to explain its principles.

FIG. 1 is a schematic flowchart of one implementation of the data query method in an embodiment of the present disclosure;

FIG. 2 is a schematic flowchart of another implementation of the data query method in an embodiment of the present disclosure;

FIG. 3 is a schematic structural diagram of a data query apparatus in an embodiment of the present disclosure;

FIG. 4 is a schematic structural diagram of an electronic device in an embodiment of the present disclosure.

Detailed Description

Exemplary embodiments will be described in detail here, examples of which are illustrated in the accompanying drawings. Where the following description refers to the drawings, the same numerals in different drawings denote the same or similar elements unless otherwise indicated. The implementations described in the following exemplary embodiments do not represent all implementations consistent with the present disclosure. Rather, they are merely examples of apparatuses consistent with some aspects of the present disclosure as detailed in the appended claims.

To illustrate the technical solutions of the present disclosure, specific embodiments are described below.

With the rapid development of the Internet, e-commerce supported by Internet technology has entered a period of rapid business growth. Common e-commerce services include consumers' online shopping, online transactions between merchants, and online electronic payment. When these services run, they generate multiple tables that record business data, so that the business information is recorded and stored. Taking consumers' online shopping as an example, the service generates a transaction table representing transaction information, a payment table representing payment information, a channel table representing the business channel, and so on. Taken together, the data recorded in these tables (which may also be called order tables) reflects all the information of the online shopping service. When the order tables hold little data, all of them are usually stored in a relational database, and associated queries are run directly against the database in structured query language (SQL) to obtain all the information of the service.

As business volume grows, more and more business tables need to be stored, and the storage design for those tables becomes increasingly complex. For example, to avoid long query times caused by storing massive data in a single database, business tables are often split across multiple databases and multiple tables (database and table sharding). However, a complex storage design such as sharding makes it impossible to run associated queries directly in SQL. In addition, existing business queries are highly concurrent, and the relational database management system MySQL cannot sustain highly concurrent query requests, which leads to long response times and request timeouts.

Therefore, a suitable data query method is lacking.

To solve the above problems, the embodiments of the present disclosure provide a method for querying data based on a data wide table, so that the required business data can be queried in real time through the data wide table and query efficiency is improved.

FIG. 1 is a schematic flowchart of one implementation of the data query method in an embodiment of the present disclosure. Referring to FIG. 1, the data query method may include S101 to S103.

S101: receive a query instruction from a user.

It should be understood that, by executing S101, the terminal device can receive the user's instruction to query the information stored in real time in the data wide table.

The query instruction may be an instruction obtained by the terminal device when the user operates the terminal device. A query instruction usually includes fields of various kinds, such as text and symbols; a field indicates the information to be queried and can be recognized by the terminal device. For example, the query instruction may include the field "2022.05", which the terminal device recognizes as indicating a query for order information related to 2022.05.

S102: query the data wide table according to the field in the query instruction to obtain query data.

It should be understood that after executing S101 the terminal device has the user's query instruction. The terminal device then executes S102: it identifies the field in the query instruction and queries the data wide table as that field indicates to obtain the query data.

The data wide table is obtained by aggregating data from multiple single business data tables. It may be a database table that includes multiple fields, each field corresponding to at least one data value. The data values may come from multiple single business data tables, which may be stored in the same database or in different databases. By aggregating data values, the data wide table associates the indicators, dimensions and attributes related to a business entity.

In some possible implementations, S102 may include: looking up, according to the field in the query instruction, the index corresponding to the field in the data wide table; and following the index to the query data.

It should be understood that, in executing S102, the terminal device identifies the field in the query instruction, finds the corresponding index as the field indicates, uses the index to locate the data value stored at the specified position in the data wide table, and thereby obtains the query data.

An index provides pointers to the data values stored at specified positions in the data wide table. The index is used to find the specific value corresponding to the field in the instruction, and the pointer is then followed to the storage location containing that value, which gives fast access to the data values in the data wide table.

That is, by first looking up the corresponding index in the data wide table and then following the index to the query data, the terminal device replaces the default full table scan with a single lookup in the index that locates the storage position of the specific value. This greatly reduces the amount of data scanned and therefore noticeably speeds up the query.
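
The difference between a full table scan and an index lookup can be illustrated with a small in-memory sketch. It assumes a wide table held as a Python list and an index held as a dictionary from field value to row positions; the names `WIDE_TABLE`, `full_scan`, `build_index` and `indexed_lookup` and the sample rows are hypothetical, and a real database index is a more elaborate structure.

```python
from collections import defaultdict

# Hypothetical wide-table rows; list positions stand in for storage locations.
WIDE_TABLE = [
    {"serial_no": "S1", "month": "2022.05", "amount": 100},
    {"serial_no": "S2", "month": "2022.06", "amount": 250},
    {"serial_no": "S3", "month": "2022.05", "amount": 75},
]


def full_scan(rows, field, value):
    """Inspect every row: the work grows with the size of the table."""
    return [row for row in rows if row[field] == value]


def build_index(rows, field):
    """Map each value of `field` to the positions of the rows that hold it."""
    index = defaultdict(list)
    for position, row in enumerate(rows):
        index[row[field]].append(position)
    return index


def indexed_lookup(rows, index, value):
    """Jump straight to the matching positions instead of scanning all rows."""
    return [rows[position] for position in index.get(value, [])]


MONTH_INDEX = build_index(WIDE_TABLE, "month")
```

Both functions return the same rows for the field value "2022.05"; the indexed version touches only the two matching positions rather than all three rows.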

S103: output the query data.

It should be understood that after executing S102 the terminal device has the query data. The terminal device then executes S103 and outputs the query data obtained in S102.

The manner of outputting the query data may be set according to actual requirements and is not specifically limited in the present disclosure. For example, when the user needs to see the query data on the page where the query instruction was entered, the output may take the form of displaying the query data on that page; alternatively, when the user needs to store the query data, the output may take the form of a document that stores the query data.

In the above embodiment, the terminal device executes S101 to S103: it receives a query instruction from the user, queries the data wide table according to the field in the query instruction, and obtains and outputs the query data. Because the data wide table is obtained by aggregating data from multiple single business data tables, querying it makes it possible to retrieve the required business data in real time and improves query efficiency.
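
Steps S101 to S103 can be pictured as a three-stage pipeline. The sketch below is illustrative only and rests on assumptions not found in the disclosure: the instruction is taken to be a `field=value` string, the two output modes are named `page` and `document`, and all function names and sample rows are hypothetical.

```python
import json

# Hypothetical wide-table rows.
WIDE_TABLE = [
    {"serial_no": "S1", "month": "2022.05", "pay_status": "PAID"},
    {"serial_no": "S2", "month": "2022.06", "pay_status": "PENDING"},
]


def receive_instruction(raw):
    """S101: split an assumed 'field=value' instruction into its two parts."""
    field, _, value = raw.partition("=")
    return field.strip(), value.strip()


def query_wide_table(rows, field, value):
    """S102: return the wide-table rows whose `field` equals `value`."""
    return [row for row in rows if str(row.get(field)) == value]


def output(rows, mode="page"):
    """S103: render rows as page text, or as a JSON document for storage."""
    if mode == "document":
        return json.dumps(rows, ensure_ascii=False)
    return "\n".join(
        ", ".join(f"{key}={val}" for key, val in row.items()) for row in rows
    )


def handle(raw, mode="page"):
    """Run S101, S102 and S103 in order for one instruction."""
    field, value = receive_instruction(raw)
    return output(query_wide_table(WIDE_TABLE, field, value), mode)
```

Calling `handle("month = 2022.05")` returns the single line `serial_no=S1, month=2022.05, pay_status=PAID`, while passing `mode="document"` yields the same rows serialized as JSON.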

In some possible implementations, FIG. 2 is a schematic flowchart of another implementation of the data query method in an embodiment of the present disclosure. Referring to FIG. 2, S201 may also be performed before S101.

S201: acquire multiple single business data tables, and obtain the data wide table by aggregating the data in the multiple single business data tables.

It should be understood that the terminal device may execute S201 before executing S101. The terminal device acquires the single data tables generated by multiple services, aggregates the data in those tables, and obtains the data wide table from the aggregated data.

The manner of acquiring the single data tables generated by multiple services may be set according to actual requirements and is not specifically limited in the present disclosure. For example, the terminal device may obtain the single data tables by accessing the corresponding databases, or by listening to the business processing flow. When aggregating data from multiple single business data tables, the possibility that these tables are sharded across databases and tables must be fully taken into account, and an approach that works across databases should be used so that all the data can be obtained.

In some possible implementations, S201 may further include: determining a first field according to business information in the multiple single business data tables, where the first field is a field used for query in the data wide table; and aggregating the data in the multiple single business data tables according to the first field to obtain the data wide table.

It should be understood that after acquiring the single data tables generated by multiple services, the terminal device can obtain the business information in them. The terminal device selects from the content of the business information to determine the first field, and then aggregates the data in the multiple single business data tables according to the first field to obtain the data wide table.

The first field is a field used for display in the data wide table. Specifically, after the terminal device retrieves data stored in the data wide table according to the query instruction, it outputs that data. To keep the output tidy, the data may be displayed by category according to the first field. Therefore, when aggregating the single data tables generated by multiple services, the terminal device can determine the first field in advance and aggregate the data related to the first field to obtain the data wide table. In other words, the terminal device aggregates only the data related to the first field, which avoids aggregating useless information and greatly saves storage space.

For example, the core business of company A is payment, and each business domain in the payment link includes multiple order tables, such as a transaction table, a payment table and a channel table. Because the order tables hold a large amount of data, company A stores the order tables of the business domains in four separate databases, each with a different database instance. Further, to reduce the number of records per table, and thereby shorten single-table queries and raise database throughput, each database is split into 64 sub-tables so that the data is evenly distributed across the tables without affecting queries. Company A can provide merchants with a back-end order query page, on which a merchant queries the data in the data wide table and the page displays the data by the categories corresponding to the first field. The first fields shown to the merchant (transaction amount, payment status, third-party channel order number, and so on) come from multiple order tables. That is, the first fields include the transaction amount from the transaction table, the payment status from the payment table, and the third-party channel order number from the channel table.
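
To make the 64-sub-table layout concrete, the following sketch shows one common way records could be routed to sub-tables. The disclosure does not say how company A distributes its data, so the routing rule (a CRC32 hash of the order number modulo 64), the table-name pattern and the function names are assumptions made purely for illustration.

```python
import zlib

TABLES_PER_DATABASE = 64  # sub-table count taken from the example above


def sub_table_for(order_no, base_name):
    """Pick a sub-table from a stable hash of the order number (assumed rule)."""
    shard = zlib.crc32(order_no.encode("utf-8")) % TABLES_PER_DATABASE
    return f"{base_name}_{shard:02d}"


def all_sub_tables(base_name):
    """List every sub-table a query must visit when it cannot be routed."""
    return [f"{base_name}_{i:02d}" for i in range(TABLES_PER_DATABASE)]
```

Under this assumed rule, a lookup by order number reaches exactly one sub-table, whereas a query on any other column has to visit all 64 in each database. A single wide table spares the merchant-facing query that fan-out.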

Table 1 shows the data information of the multiple single business data tables in an embodiment of the present disclosure.

Table 1

As can be seen from Table 1, the first fields (display fields) of the data wide table can be collected from the information in the three single data tables in Table 1. It should be understood that, in practice, the data wide table can be designed according to business requirements. For example, suppose there are three tables A, B and C, each containing 10 fields. The data wide table could then be designed with 30 first fields. However, if the terminal device uses the data wide table to serve a specific business requirement that needs only 3 fields from table A, 4 fields from table B and 3 fields from table C, then, to save space, the data wide table need only be designed with 10 first fields.
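
The field count in this example can be checked with a short sketch. The field names (`a1` to `a10` and so on) and the helper `wide_table_fields` are hypothetical; only the counts of 30 and 10 come from the text.

```python
# Three hypothetical single tables with 10 fields each.
TABLE_FIELDS = {
    "A": [f"a{i}" for i in range(1, 11)],
    "B": [f"b{i}" for i in range(1, 11)],
    "C": [f"c{i}" for i in range(1, 11)],
}

# Fields that one specific business requirement actually needs.
REQUIRED_FIELDS = {
    "A": ["a1", "a2", "a3"],
    "B": ["b1", "b2", "b3", "b4"],
    "C": ["c1", "c2", "c3"],
}


def wide_table_fields(selection):
    """Flatten a per-table field selection into wide-table column names."""
    return [
        f"{table}.{field}"
        for table, fields in selection.items()
        for field in fields
    ]
```

Here `wide_table_fields(TABLE_FIELDS)` yields 30 columns and `wide_table_fields(REQUIRED_FIELDS)` yields 10, matching the 3 + 4 + 3 selection described above.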

Further, Table 1 also gives other information about the transaction table, payment table and channel table, such as the name of the database where each table resides, the table name, the database instance, the table's unique index, and the serial order number. The transaction table, payment table and channel table come from different databases with different database instances (instance A, instance B and instance C). A database instance is the program through which the database is accessed. Any operation the terminal device performs on the data in a database, including data definition, data query, data maintenance and database operation control, is carried out under a database instance. Therefore, aggregating data from multiple single business data tables in the embodiments of the present disclosure involves data from different database instances.

在一些可能的实施方式中，S201还可以包括：解析多个业务数据单表中各个业务数据表单中的第二字段；按照第一字段所对应的数据范围，使用第二字段，对多个业务数据单表中的关联数据进行汇总，以得到数据宽表。终端设备能够通过第二字段汇总相关联的数据，进而使相关联的数据存储在一起，获得数据宽表。进一步地，终端设备只汇总第一字段相关的数据，节省了存储空间。In some possible implementations, S201 may further include: parsing the second field in each business data form in the multiple business data sheet tables; according to the data range corresponding to the first field, using the second field to The associated data in the data sheet table is aggregated to obtain a data wide table. The terminal device can summarize the associated data through the second field, and then store the associated data together to obtain a data wide table. Further, the terminal device only summarizes the data related to the first field, which saves storage space.

应理解的，终端设备执行S201，可以解析出多个业务单表中的第二字段。随后，使用第二字段，查找与第二字段相关的业务数据单表。最后，终端设备在查找到的所有业务数据单表中，获取与第一字段所对应的数据信息，并将所有的数据信息汇总起来。It should be understood that the terminal device executes S201, and can parse out the second fields in the multiple service order tables. Then, using the second field, look up the business data sheet table related to the second field. Finally, the terminal device obtains the data information corresponding to the first field in all the found service data sheets, and summarizes all the data information.

The second field indicates the association between data in the multiple business data single tables. Specifically, the second field may be a field shared by multiple business data single tables. Parsing the second field reveals that multiple services are associated with one another, and therefore that the data in the corresponding business data single tables is associated.

For example, as shown in Table 1, assume that, according to the business logic, a business process generates three business data single tables. The three tables are written in the order transaction table, payment table, channel table, and their association is transaction table : payment table : channel table = 1:1:1, that is, one transaction table corresponds to one payment table, which corresponds to one channel table. Because the design of the business data single tables requires the unique order number generated by the source table (the transaction table) to be passed to the downstream services and stored in the downstream business tables, all three business data single tables contain the [serial order number] (equivalent to the second field). When using the data wide table, the terminal device can therefore obtain the [serial order number] in each single table and, through it, obtain all business data single tables and all data of the same business. All data of the same business is necessarily associated data. In addition, if the association between two business data single tables is 1:n, the table on the n side can be designed as the Nested type in Elasticsearch, and a bulk upsert can be performed on the Nested field by writing a Painless script, so that all data in that business data single table can be found by associated queries.
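By way of a non-limiting illustration, the aggregation by serial order number described above might be sketched in Python as follows. This is a minimal in-memory sketch rather than the implementation of the embodiments: the function name `build_wide_rows`, the sample field names, and the `refund` table standing in for a 1:n table are all assumptions introduced for illustration, and a plain Python list stands in for an Elasticsearch Nested field.

```python
from collections import defaultdict


def build_wide_rows(tables, link_field="trade_order_no", nested=()):
    """Merge rows of several single tables into one wide row per link_field value."""
    wide = defaultdict(dict)
    for table_name, rows in tables.items():
        for row in rows:
            key = row[link_field]
            doc = wide[key]
            doc[link_field] = key
            fields = {k: v for k, v in row.items() if k != link_field}
            if table_name in nested:
                # 1:n side: keep each row as one element of a list (nested objects).
                doc.setdefault(table_name, []).append(fields)
            else:
                # 1:1 side: flatten the fields, prefixed with the table name.
                for k, v in fields.items():
                    doc[f"{table_name}_{k}"] = v
    return dict(wide)


# Hypothetical sample data; "refund" plays the role of a 1:n table.
tables = {
    "trade": [{"trade_order_no": "T1", "amount": 100}],
    "pay": [{"trade_order_no": "T1", "status": "PAID"}],
    "channel": [{"trade_order_no": "T1", "name": "bank_a"}],
    "refund": [
        {"trade_order_no": "T1", "refund_amount": 10},
        {"trade_order_no": "T1", "refund_amount": 5},
    ],
}
wide = build_wide_rows(tables, nested={"refund"})
```

In this sketch every row sharing a serial order number ends up in a single wide row, so one lookup by that number returns the data of all associated tables.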

In some possible implementations, S201 may further include: within the data range corresponding to the first field, processing the data in the multiple business data single tables using multi-threading to obtain intermediate data from the multiple business data single tables; using the second field to obtain the associated data in the intermediate data; and aggregating the associated data to obtain the data wide table.

It should be understood that, to save storage space, the terminal device may process the data in the multiple business data single tables within the data range corresponding to the first field. The terminal device may use multi-threading to obtain intermediate data from the multiple business data single tables (that is, data taken directly from the single tables), then use the second field to obtain the associated data in the intermediate data (that is, data from the same business), and finally aggregate the associated data to obtain the data wide table. Because the first field is the display field of the data wide table, aggregating the associated data within this range avoids storing redundant data. At the same time, the use of multi-threading by the terminal device can greatly improve query speed.
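As a hedged illustration of the multi-threaded extraction of intermediate data described above, the following Python sketch reads several single tables concurrently and keeps only the display fields. The names `DISPLAY_FIELDS`, `extract`, and `intermediate_data`, as well as the sample rows, are assumptions made for this sketch; an actual embodiment would read from the databases rather than from in-memory lists.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical display fields (the "first field" range) plus the linking field.
DISPLAY_FIELDS = {"trade_order_no", "amount", "pay_status"}


def extract(rows):
    """Keep only the fields that fall within the display-field range."""
    return [{k: v for k, v in row.items() if k in DISPLAY_FIELDS} for row in rows]


def intermediate_data(tables):
    """Extract every table on its own worker thread; results keep table order."""
    with ThreadPoolExecutor(max_workers=len(tables)) as pool:
        results = list(pool.map(extract, tables.values()))
    return dict(zip(tables.keys(), results))


tables = {
    "trade": [{"trade_order_no": "T1", "amount": 100, "internal_flag": 1}],
    "pay": [{"trade_order_no": "T1", "pay_status": "PAID", "gateway_trace": "x"}],
}
intermediate = intermediate_data(tables)
```

The intermediate data produced this way can then be grouped by the second field to obtain the associated data.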

Multi-threading here refers to one central processing unit (CPU) executing multiple programs at the same time. Specifically, the terminal device needs to process hundreds of billions or even trillions of data records every day, and it executes multiple programs simultaneously to aggregate this data into the data wide table. In addition, because the database instances of different business domains differ and the databases and tables are sharded, middleware adapted to the different databases must be selected to complete the whole processing flow. For example, parsing middleware first parses the data, message middleware then transports the data, and finally the data is synchronized to storage.

For example, to keep data synchronization latency low, the terminal device may use the middleware canal to listen to the binlog. By configuring the target single tables that the canal instance listens to and the data routing rules, the binlog is parsed into JSON data and written to the partitions of the configured kafka topic. The data routing rule may be as follows: when the canal.mq.partitionHash parameter configured for multiple single tables specifies trade_order_no for all of them, the data of those tables is hashed by trade_order_no when it is routed. That is, data with the same trade_order_no is routed to the same partition of the topic and is processed by the same consumer thread.
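The effect of such a routing rule can be illustrated with a short Python sketch. It is only a sketch under stated assumptions: CRC32 is used here as a stand-in hash and is not claimed to be the hash function that canal applies, and the function name `partition_for`, the table names, and the partition count of 8 are hypothetical.

```python
import zlib


def partition_for(trade_order_no: str, num_partitions: int) -> int:
    """Map an order number to a partition; equal order numbers map to equal partitions."""
    return zlib.crc32(trade_order_no.encode("utf-8")) % num_partitions


# Hypothetical change events from three sharded tables.
events = [
    {"table": "tb_trade_order_0", "trade_order_no": "20220505T0001"},
    {"table": "tb_pay_order_3", "trade_order_no": "20220505T0001"},
    {"table": "tb_channel_order_1", "trade_order_no": "20220506T0002"},
]

partitions = {}
for event in events:
    p = partition_for(event["trade_order_no"], 8)
    partitions.setdefault(p, []).append(event["table"])
```

Because the partition depends only on trade_order_no, the trade and pay events of the same order land in the same partition regardless of which table or shard produced them, which is what allows a single consumer thread to process them in order.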

Further, Canal is a binlog parsing middleware that parses binary binlog files into JSON data. It allows the databases and tables of interest to the business to be configured with regular expressions. It also supports sending the parsed binlog data to the corresponding message queue according to the routing rules set in its configuration file. The Canal parsing middleware can therefore process data held in sharded databases and tables. In one example, Maxwell or Flink CDC may be selected as the parsing middleware instead, according to business requirements.

Further, as message middleware, kafka offers high availability, high throughput, low latency, and high concurrency. The number of partitions of a kafka topic can be set according to the data volume per unit time of the current business and the business's tolerance for data loss. The number of kafka consumers is generally kept equal to the number of topic partitions to sustain highly concurrent, high-performance consumption. In one example, rocketmq or rabbitmq may be selected as the message middleware instead, according to business requirements.

In some possible implementations, when the terminal device executes S201, aggregating the associated data to obtain the data wide table may consist of processing the data uniformly and then storing it in Elasticsearch. In this process the data must be mapped into Elasticsearch by category, so a unified data processing framework needs to be designed.

For example, the terminal device may execute S201 using the following unified framework logic.

1. Filter out data with isDdl=true and logically deleted data with dmlType=DELETE. Data with isDdl=true records DDL operations on a business table, such as changing a field name or type; it is not business data and is not suitable for building the data wide table. dmlType=DELETE denotes a database delete operation, and this data need not be stored either.

2. The data obtained from kafka comes from different databases and different tables. The naming convention for such sharded databases and tables is database/table name + underscore + number, for example tb_trade_order_0. In principle, these tables share the same data structure, so data mapping can ignore their numeric suffixes and treat them as the data of one table. That is, a regular expression matches and strips the suffix (_0) so that data from same-named tables is handled by the same strategy logic, which resolves the data dispersion caused by sharding.

3. Filter fields. For example, business data single tables often carry fields that have no business meaning or that do not need to be displayed externally. Such fields can be filtered out directly in S201 to save storage space.

4. Parse the year-month (yyyyMM) part of the transaction order number to determine the Elasticsearch index to write to.

5. Convert JSON field names from underscore case to camel case, and add a prefix to each field.

6. Write the result set returned by the unified data processing interface to Elasticsearch in batches. Batch writing reduces the number of IO interactions with the storage side and can improve write performance.
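Steps 1 to 5 above can be sketched in Python as follows. This is a minimal sketch and not the framework of the embodiments: the event shape (a flat dictionary with `isDdl`, `dmlType`, `table`, and a single-row `data` dictionary), the `TABLE_RULES` configuration, the choice to leave the linking field unprefixed, and the index prefix `wt_trade_order-` applied to every table are assumptions made for illustration. Step 6 is represented only by collecting the resulting actions in a list; no Elasticsearch client call is shown.

```python
import re

SHARD_SUFFIX = re.compile(r"_\d+$")  # matches a trailing shard number such as "_0"
LINK_FIELD = "trade_order_no"

# Hypothetical per-table rules: fields to keep and the prefix to add.
TABLE_RULES = {
    "tb_trade_order": {"prefix": "trade", "keep": {"trade_order_no", "order_amount"}},
}


def to_camel(name: str) -> str:
    """Convert an underscore-separated name to camel case."""
    head, *rest = name.split("_")
    return head + "".join(part.capitalize() for part in rest)


def process(event: dict):
    """Turn one change event into a write action, or None if it is filtered out."""
    # Step 1: drop DDL events and delete operations.
    if event.get("isDdl") or event.get("dmlType") == "DELETE":
        return None
    # Step 2: strip the shard suffix so all shards share one rule.
    rule = TABLE_RULES.get(SHARD_SUFFIX.sub("", event["table"]))
    if rule is None:
        return None
    row = event["data"]
    # Step 3: keep only the configured fields.
    kept = {k: v for k, v in row.items() if k in rule["keep"]}
    # Step 4: assume the order number starts with yyyyMM.
    index = "wt_trade_order-" + row[LINK_FIELD][:6]
    # Step 5: prefix (except the linking field) and convert to camel case.
    doc = {}
    for k, v in kept.items():
        name = k if k == LINK_FIELD else f"{rule['prefix']}_{k}"
        doc[to_camel(name)] = v
    return {"index": index, "id": row[LINK_FIELD], "doc": doc}


sample = {
    "isDdl": False,
    "dmlType": "INSERT",
    "table": "tb_trade_order_0",
    "data": {"trade_order_no": "20220505Txxxx482953", "order_amount": 100, "internal_flag": 1},
}
# Step 6 (batching): collect the surviving actions for one bulk write.
batch = [a for a in (process(e) for e in [sample, {**sample, "dmlType": "DELETE"}]) if a]
```

Keeping the business-specific parts in a rule table mirrors the idea of separating common processing logic from per-business logic.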

The above is one possible framework in the embodiments of the present disclosure. In a specific implementation, the framework may be designed according to actual business requirements, which is not specifically limited in the embodiments of the present application. In addition, when designing such a framework, the common processing logic can be factored out and the business-specific processing flows abstracted into interfaces. For different businesses, it is then sufficient to switch the interface according to the actual business scenario.

In some possible implementations, S201 may further include: after using the second field to aggregate the associated data in the multiple business data single tables to obtain the data wide table, determining the index of the associated data in the data wide table according to the data volume of the associated data.

It should be understood that the terminal device uses the second field to aggregate the associated data in the multiple business data single tables and stores the aggregated data in Elasticsearch to obtain the data wide table. In addition, the terminal device can plan the Elasticsearch indexes according to the size of the stored data. With an index built, a query against the data wide table first looks up the corresponding index and then follows the index to the query data. This replaces the default full-table scan with a single lookup in the index that locates where a specific value is stored, which greatly reduces the query workload and therefore noticeably increases query speed.

An index is a collection of interrelated documents, and Elasticsearch stores data as JSON documents. Each document relates a set of keys (names of fields or properties) to their corresponding values (strings, numbers, Booleans, dates, arrays of values, geolocations, or other types of data). Elasticsearch uses an inverted index, a data structure designed to allow very fast full-text search. An inverted index lists every unique term that appears in any document and identifies all the documents that contain each term. During indexing, Elasticsearch stores documents and builds the inverted index so that document data can be searched in near real time.
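The idea of an inverted index can be illustrated with a toy Python sketch. It is a simplification for explanation only and does not reflect how Elasticsearch implements its index internally; the function name `build_inverted_index` and the sample documents are hypothetical, and terms are obtained by simple whitespace splitting.

```python
from collections import defaultdict


def build_inverted_index(docs):
    """Map each lower-cased term to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.lower().split():
            index[term].add(doc_id)
    return index


docs = {1: "order paid by card", 2: "order refund paid"}
index = build_inverted_index(docs)
paid_docs = index["paid"]      # ids of documents containing "paid"
refund_docs = index["refund"]  # ids of documents containing "refund"
```

A lookup of a term is a single dictionary access that returns all matching documents, which is why a query does not need to scan every document.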

For example, the terminal device may plan the split granularity and the number of shards of the data wide table index according to the business data volume. Correspondingly, the terminal device may also divide indexes into hot, warm, and cold zones according to users' access patterns. The hot zone stores index documents that are accessed frequently and still need to be modified after creation; the warm zone stores read-only index documents that are accessed less frequently; the cold zone stores read-only index documents that are accessed only occasionally.

Further, the terminal device may also manage operations such as index creation, index alias setting, and index migration among hot/warm/cold nodes by invoking application programming interfaces (APIs) from scheduled tasks, thereby saving machine cost and improving query performance.

For example, the existing design rule for transaction order numbers includes a year-month-day (yyyyMMdd) part; one such transaction order number is 20220505Txxxx482953. When indexes are split by time granularity, they can therefore be divided by month. When the terminal device parses the transaction order number 20220505Txxxx482953, the record can be written to the index wt_trade_order-202205 (that is, one index per month). In addition, assuming the current date is 2022.05.01, scheduled task scheduling creates wt_trade_order-202206 on the hot node and moves wt_trade_order-202105 from the hot node to the warm node.
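The index naming in this example can be sketched in Python as follows. The helper names `index_for_order` and `index_for_offset` are hypothetical, and the sketch assumes that the first six characters of the order number are the yyyyMM part. The offset of 12 months used for the index to migrate is taken only from the arithmetic of the example above (202105 is twelve months before 202205) and is a parameter here, not a stated retention policy.

```python
from datetime import date

INDEX_PREFIX = "wt_trade_order-"


def index_for_order(trade_order_no: str) -> str:
    """Derive the monthly index name from the yyyyMM prefix of the order number."""
    return INDEX_PREFIX + trade_order_no[:6]


def index_for_offset(today: date, months: int) -> str:
    """Name of the monthly index that lies `months` months after (or before) today."""
    total = today.year * 12 + (today.month - 1) + months
    return f"{INDEX_PREFIX}{total // 12:04d}{total % 12 + 1:02d}"


today = date(2022, 5, 1)
write_index = index_for_order("20220505Txxxx482953")
to_create = index_for_offset(today, 1)     # index prepared ahead on the hot node
to_migrate = index_for_offset(today, -12)  # index moved from hot to warm in the example
```

A scheduled task could compute these names on each run and then issue the corresponding index creation and migration requests.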

In some possible implementations, the above method further includes S202. S202 may be executed before S101, S102, S103, and S201, at the same time as any one of S101, S102, S103, and S201, or after S101, S102, S103, and S201.

S202: detect data change information in the multiple business data single tables; when a change is detected in the data of any single table, update the data in the data wide table that is associated with that single table.

It should be understood that the terminal device can obtain the data change information of the multiple business data single tables in real time and, through the components described, store the changed data in the data wide table again. That is, when a change is detected in the data of any single table, the terminal device can update the data in the data wide table that is associated with that single table.

For example, as described above, the terminal device may use the middleware canal to listen to the binlog: by configuring the target single tables that the canal instance listens to and the data routing rules, the binlog is parsed into JSON data and written to the partitions of the configured kafka topic. The binlog is the binary log; it records all changes made to the database and is stored on disk in binary form. By listening to the binlog, canal can therefore obtain the data change information of the single tables in the database and write the changed data to the kafka topic partitions, and the data synchronization framework finally updates the data in the data wide table.
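The final update step can be illustrated with a small Python sketch in which an in-memory dictionary stands in for the data wide table. The function name `apply_change`, the event shape, and the field names are assumptions for illustration; an actual embodiment would issue partial-document upserts to Elasticsearch instead of updating a dictionary.

```python
def apply_change(wide_table: dict, event: dict) -> None:
    """Upsert the changed fields into the wide row keyed by the order number."""
    key = event["trade_order_no"]
    row = wide_table.setdefault(key, {"tradeOrderNo": key})
    row.update(event["fields"])


wide_table = {}
# Hypothetical change events, in the order a consumer thread would see them.
apply_change(wide_table, {"trade_order_no": "T1", "fields": {"tradeAmount": 100}})
apply_change(wide_table, {"trade_order_no": "T1", "fields": {"payStatus": "PENDING"}})
apply_change(wide_table, {"trade_order_no": "T1", "fields": {"payStatus": "PAID"}})
```

Each event touches only the fields that changed, so a change in one single table updates the associated wide row without rewriting the fields contributed by the other tables.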

In the above embodiments, by executing S201 to S202 the terminal device can generate the data wide table from multiple single tables and update the data in the data wide table in real time, so that the required data can be queried in real time through the data wide table and the query efficiency is improved.

Based on the same inventive concept, an embodiment of the present disclosure further provides an apparatus for data query based on a data wide table. The apparatus may be a chip or a system-on-chip in a terminal device, or a functional module in the terminal device for implementing the methods described in the above embodiments. The apparatus can implement the functions performed by the terminal device in the above embodiments, and these functions may be implemented by hardware executing corresponding software. The hardware or software includes one or more modules corresponding to the above functions.

FIG. 3 is a schematic structural diagram of a data query apparatus in an embodiment of the present disclosure. Referring to FIG. 3, the data query apparatus 300 includes: an acquisition module 301, configured to receive a query instruction from a user; a processing module 302, configured to query the data wide table according to a field in the query instruction to obtain query data, where the data wide table is obtained by aggregating the data in multiple business data single tables; and an output module 303, configured to output the query data.

In some possible implementations, the processing module 302 is further configured to: query, according to the field in the query instruction, the index corresponding to the field in the data wide table; and link to the query data according to the index.

In some possible implementations, the acquisition module 301 is further configured to: before the query instruction of the user is received, acquire multiple business data single tables; and aggregate the data in the multiple business data single tables to obtain the data wide table.

In some possible implementations, the acquisition module 301 is further configured to: determine a first field according to the business information in the multiple business data single tables, where the first field is a field used for query in the data wide table; and aggregate the data in the multiple business data single tables according to the first field to obtain the data wide table.

In some possible implementations, the acquisition module 301 is further configured to: parse the second field in each of the multiple business data single tables, where the second field indicates the association between data in the multiple business data single tables; and, within the data range corresponding to the first field, use the second field to aggregate the associated data in the multiple business data single tables to obtain the data wide table.

In some possible implementations, the acquisition module 301 is further configured to: within the data range corresponding to the first field, process the data in the multiple business data single tables using multi-threading to obtain intermediate data from the multiple business data single tables, where the intermediate data is data corresponding to the fields of the data wide table; use the second field to obtain the associated data in the intermediate data; and aggregate the associated data to obtain the data wide table.

In some possible implementations, the acquisition module 301 is further configured to: after using the second field to aggregate the associated data in the multiple business data single tables to obtain the data wide table, determine the index of the associated data in the data wide table according to the data volume of the associated data.

In some possible implementations, the acquisition module 301 is further configured to: detect data change information in the multiple business data single tables; and, when a change is detected in the data of any single table, update the data in the data wide table that is associated with that single table.

It should be noted that, for the specific implementation of the acquisition module 301, the processing module 302, and the output module 303, reference may be made to the detailed description of the embodiments of FIG. 1 and FIG. 2; for brevity, the details are not repeated here.

Based on the same inventive concept, an embodiment of the present disclosure provides a terminal, which may be the terminal described in one or more of the foregoing embodiments. FIG. 4 is a schematic structural diagram of a terminal in an embodiment of the present disclosure. Referring to FIG. 4, the terminal 400 may use general-purpose computer hardware, including a processor 401 and a memory 402.

In some possible implementations, the at least one processor may be any physical device having circuitry that performs logical operations on one or more inputs. For example, the at least one processor may include one or more integrated circuits (ICs), including an application-specific integrated circuit (ASIC), a microchip, a microcontroller, a microprocessor, all or part of a central processing unit (CPU), a graphics processing unit (GPU), a digital signal processor (DSP), a field-programmable gate array (FPGA), or other circuits suitable for executing instructions or performing logical operations. The instructions executed by the at least one processor may, for example, be preloaded into a memory integrated with or embedded in the controller, or may be stored in a separate memory. The memory may include random access memory (RAM), read-only memory (ROM), a hard disk, an optical disk, a magnetic medium, flash memory, other permanent, fixed, or volatile memory, or any other mechanism capable of storing instructions. In some embodiments, the at least one processor may include more than one processor. Each processor may have a similar structure, or the processors may have different structures that are electrically connected to or disconnected from each other. For example, the processors may be separate circuits or integrated in a single circuit. When more than one processor is used, the processors may be configured to operate independently or cooperatively. The processors may be coupled electrically, magnetically, optically, acoustically, mechanically, or by other means that allow them to interact.

According to an embodiment of the present invention, a computer-readable storage medium is further provided, on which computer instructions are stored; when executed by a processor, the instructions implement the steps of the above method for data query based on a data wide table. The memory 402 may include computer storage media in the form of volatile and/or non-volatile memory, such as read-only memory and/or random access memory. The memory 402 may store an operating system, application programs, other program modules, executable code, program data, user data, and the like.

In addition, the memory 402 stores computer-executable instructions for implementing the functions of the modules in FIG. 3. The functions and implementation processes of the modules in FIG. 3 can all be realized by the processor 401 in FIG. 4 invoking the computer-executable instructions stored in the memory 402; for the specific implementation processes and functions, refer to the related embodiments above.

Other embodiments of the present disclosure will readily occur to those skilled in the art upon consideration of the specification and practice of the invention disclosed herein. The present disclosure is intended to cover any variations, uses, or adaptations of the present disclosure that follow its general principles and include common general knowledge or customary technical means in the art that are not disclosed herein. The specification and embodiments are to be regarded as exemplary only, with the true scope and spirit of the present disclosure being indicated by the following claims.

It should be understood that the present disclosure is not limited to the precise structures described above and illustrated in the accompanying drawings, and that various modifications and changes may be made without departing from its scope. The scope of the present disclosure is limited only by the appended claims.

Claims

1. A method for querying data based on a data wide table, characterized by comprising:

receiving a query instruction from a user;

querying a data wide table according to a field in the query instruction to obtain query data, wherein the data wide table is obtained by aggregating data in a plurality of business data single tables; and

outputting the query data.

2. The method of claim 1, wherein querying the data wide table according to the field in the query instruction to obtain the query data comprises:

querying, according to the field in the query instruction, an index corresponding to the field in the data wide table; and

linking to the query data according to the index.

3. The method of claim 1, wherein before receiving the query instruction from the user, the method further comprises:

acquiring the plurality of business data single tables; and

obtaining the data wide table by aggregating the data in the plurality of business data single tables.

4. The method of claim 3, wherein obtaining the data wide table by aggregating the data in the plurality of business data single tables comprises:

determining a first field according to business information in the plurality of business data single tables, wherein the first field is a field used for display in the data wide table; and

aggregating the data in the plurality of business data single tables according to the first field to obtain the data wide table.

5. The method of claim 4, wherein aggregating the data in the plurality of business data single tables according to the first field to obtain the data wide table comprises:

parsing a second field in each of the plurality of business data single tables, wherein the second field indicates the association between data in the plurality of business data single tables; and

aggregating, within the data range corresponding to the first field and using the second field, the associated data in the plurality of business data single tables to obtain the data wide table.

6. The method of claim 5, wherein aggregating, within the data range corresponding to the first field and using the second field, the associated data in the plurality of business data single tables to obtain the data wide table comprises:

processing the data in the plurality of business data single tables using multi-threading, within the data range corresponding to the first field, to obtain intermediate data from the plurality of business data single tables;

obtaining the associated data in the intermediate data by using the second field; and

aggregating the associated data to obtain the data wide table.

7. The method of claim 5, wherein after using the second field to aggregate the associated data in the plurality of business data single tables to obtain the data wide table, the method further comprises:

determining an index of the associated data in the data wide table according to the data volume of the associated data.

8. The method of claim 1, further comprising:

detecting data change information in the plurality of business data single tables; and

when a change is detected in the data of any single table, updating the data in the data wide table that is associated with that single table.

9. An apparatus for querying data based on a data wide table, comprising:

an obtaining module, configured to receive a query instruction of a user;

a processing module, configured to query a data wide table according to a field in the query instruction to obtain query data, wherein the data wide table is obtained by summarizing data in a plurality of business data single tables;

and an output module, configured to output the query data.

10. The apparatus of claim 9, wherein the processing module is further configured to: query, according to the field in the query instruction, an index corresponding to the field in the data wide table; and link to the query data according to the index.

11. The apparatus of claim 9, wherein the obtaining module is further configured to: before the query instruction of the user is received, acquire the plurality of business data single tables; and summarize the data in the plurality of business data single tables to obtain the data wide table.

12. The apparatus of claim 11, wherein the obtaining module is further configured to: determine a first field according to business information in the plurality of business data single tables, wherein the first field is a field used for querying in the data wide table; and summarize the data in the plurality of business data single tables according to the first field to obtain the data wide table.

13. The apparatus of claim 12, wherein the obtaining module is further configured to: analyze a second field in each business data single table of the plurality of business data single tables, wherein the second field indicates the association relationship among data in the plurality of business data single tables; and summarize, by using the second field, the associated data in the plurality of business data single tables according to the data range corresponding to the first field to obtain the data wide table.

14. The apparatus of claim 13, wherein the obtaining module is further configured to: process the data in the plurality of business data single tables by multithreading according to the data range corresponding to the first field to obtain intermediate data from the plurality of business data single tables, wherein the intermediate data is data corresponding to the fields of the data wide table; acquire associated data in the intermediate data by using the second field; and summarize the associated data to obtain the data wide table.

15. The apparatus of claim 13, wherein the obtaining module is further configured to: after the associated data in the plurality of business data single tables is summarized by using the second field to obtain the data wide table, determine an index of the associated data in the data wide table according to the data volume of the associated data.

16. The apparatus of claim 9, wherein the obtaining module is further configured to: detect data change information in the plurality of business data single tables; and when a change of the data in any single table is detected, update the data associated with that single table in the data wide table.

17. A terminal, comprising:

a memory for storing processor-executable instructions;

and a processor, wherein the processor is configured to implement the method of any one of claims 1 to 8 when executing the executable instructions.

18. A computer-readable storage medium, wherein the computer-readable storage medium stores an executable program, and the executable program, when executed by a processor, implements the method of any one of claims 1 to 8.
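
As an illustrative, non-limiting sketch, the flow recited in the method claims (summarizing single tables into a wide table via an association field, indexing the wide table, querying it, and updating it when a single table changes) might look like the following. Everything concrete here is an assumption made for illustration and does not come from the disclosure: the use of SQLite, the table names `users`, `orders` and `wide`, the column names, and the helper names `build_wide_table`, `refresh_user`, `query_wide` and `demo`. The multithreaded processing and the volume-dependent choice of index are not modeled.

```python
import sqlite3

# Hypothetical schema, not taken from the claims: two single tables
# linked by user_id, which plays the role of the "second field".
SCHEMA = """
CREATE TABLE users  (user_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE orders (order_id INTEGER PRIMARY KEY, user_id INTEGER, amount REAL);
CREATE TABLE wide   (order_id INTEGER, user_id INTEGER, name TEXT, amount REAL);
"""

# Join the single tables on the second field; the selected columns play
# the role of the "first fields" carried into the wide table.
WIDE_SELECT = """
SELECT o.order_id, o.user_id, u.name, o.amount
FROM orders AS o JOIN users AS u ON u.user_id = o.user_id
"""


def build_wide_table(conn):
    """Summarize the single tables into the wide table and index it."""
    conn.execute("DELETE FROM wide")
    conn.execute("INSERT INTO wide " + WIDE_SELECT)
    # Assumed choice: a plain index on the column used for lookups.
    conn.execute("CREATE INDEX IF NOT EXISTS idx_wide_name ON wide (name)")
    conn.commit()


def refresh_user(conn, user_id):
    """Re-derive the wide rows tied to one changed single-table record."""
    conn.execute("DELETE FROM wide WHERE user_id = ?", (user_id,))
    conn.execute("INSERT INTO wide " + WIDE_SELECT + " WHERE o.user_id = ?", (user_id,))
    conn.commit()


def query_wide(conn, name):
    """Answer a query from the wide table alone, with no join at query time."""
    cur = conn.execute(
        "SELECT order_id, name, amount FROM wide WHERE name = ? ORDER BY order_id",
        (name,),
    )
    return cur.fetchall()


def demo():
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany("INSERT INTO users VALUES (?, ?)", [(1, "alice"), (2, "bob")])
    conn.executemany(
        "INSERT INTO orders VALUES (?, ?, ?)",
        [(10, 1, 5.0), (11, 1, 7.5), (12, 2, 3.0)],
    )
    build_wide_table(conn)
    before = query_wide(conn, "alice")
    # A change in one single table triggers a targeted wide-table update.
    conn.execute("UPDATE orders SET amount = 9.0 WHERE order_id = 11")
    refresh_user(conn, 1)
    after = query_wide(conn, "alice")
    return before, after
```

Calling `demo()` returns the rows for the hypothetical user `alice` before and after the change to the `orders` table. The join is paid once when the wide table is built or refreshed, so the query itself reads only one table.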