CN103995879A - Data query method, device and system based on OLAP system - Google Patents
Data query method, device and system based on OLAP system
- Publication number: CN103995879A
- Application number: CN201410228109.0A
- Authority: CN (China)
- Prior art keywords: data table, data, partition, partitions, hash
- Legal status: Granted (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/24—Querying
- G06F16/245—Query processing
- G06F16/2453—Query optimisation
- G06F16/24532—Query optimisation of parallel queries
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/22—Indexing; Data structures therefor; Storage structures
- G06F16/2228—Indexing structures
- G06F16/2255—Hash tables
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/27—Replication, distribution or synchronisation of data between databases or within a distributed database system; Distributed database system architectures therefor
- G06F16/278—Data partitioning, e.g. horizontal or vertical partitioning
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/28—Databases characterised by their database models, e.g. relational or object models
- G06F16/283—Multi-dimensional databases or data warehouses, e.g. MOLAP or ROLAP
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Databases & Information Systems (AREA)
- Data Mining & Analysis (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Computational Linguistics (AREA)
- Computing Systems (AREA)
- Software Systems (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
Embodiments of the present invention provide a data query method, device, and system based on an OLAP system. The method includes: receiving a data query request sent by a user terminal; performing logical partition processing on the first data table corresponding to the first data table identifier to obtain at least two first partitions, and establishing a corresponding connection mapping relationship for each first partition; performing partition processing on the second data table corresponding to the second data table identifier to obtain at least two second partitions and, for each second partition, obtaining a first foreign key value from the second partition, the first foreign key value being within the boundary value range of one first partition; querying the connection mapping relationship to obtain the first hash subtable corresponding to the first foreign key value; and scanning the first hash subtable to obtain the corresponding data and returning the data to the user terminal. By establishing the connection mapping relationship, the embodiments of the present invention effectively shorten data query time and reduce server overhead.
Description
Technical Field
Embodiments of the present invention relate to computer technology, and in particular to a data query method, device, and system based on an On-Line Analytical Processing (OLAP) system.
Background
OLAP is a typical application scenario of database systems and is mainly used to query data. Queries often involve joint queries over multiple data tables, and the tables must be associated with one another through join (connection) operations; a common join operation is the hash join.
In the prior art, a data query using a hash join mainly proceeds as follows: the background server of the database system receives a data query message sent by a user; according to the first data table identifier and the second data table identifier carried in the message, it performs a hash operation on the primary key of the first data table, which has fewer rows, to build a shared hash table; it then scans the second data table in parallel, performs a hash operation on the foreign key handled by each query thread to obtain the hash value for that thread, and scans the shared hash table in parallel according to each thread's hash value to obtain the complete data required by the user.
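The prior-art flow described above can be pictured with a minimal Python sketch. It is an illustration only, not code from the patent: the function name `shared_hash_join`, the row layout (lists of dicts), and the sample data are assumptions, and the probe loop is written sequentially, whereas the prior art runs it on many threads against the same shared table.

```python
def shared_hash_join(first_rows, second_rows, pk, fk):
    """Join second_rows to first_rows through one shared hash table."""
    # Build phase: hash every primary key of the smaller (first) table.
    shared = {row[pk]: row for row in first_rows}
    # Probe phase: every foreign key is looked up in the same shared table.
    joined = []
    for row in second_rows:
        match = shared.get(row[fk])
        if match is not None:
            joined.append({**match, **row})
    return joined


# Hypothetical sample data.
students = [{"student_id": 1, "name": "Ann"}, {"student_id": 2, "name": "Bo"}]
grades = [{"sid": 2, "score": 90}, {"sid": 1, "score": 75}, {"sid": 3, "score": 60}]
result = shared_hash_join(students, grades, "student_id", "sid")
```

Because every probe goes through the single `shared` table, all worker threads in a parallel version would contend for the same structure, which is the bottleneck the following sections set out to avoid.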
However, multi-threaded parallel queries suffer from write conflicts, so when many threads run in parallel, data queries take a long time and server overhead is high.
Summary of the Invention
Embodiments of the present invention provide a data query method, device, and system based on an OLAP system to overcome the long multi-threaded query time and high server overhead of the prior art. The first data table and the second data table are partitioned and a connection mapping relationship is established; during a multi-threaded query, each thread uses the connection mapping relationship to obtain and scan a first hash subtable and thereby obtain the data, which effectively shortens data query time and reduces server overhead.
A first aspect of the embodiments of the present invention provides a data query method based on an OLAP system, including:
receiving a data query request sent by a user terminal, where the data query request includes query information, a first data table identifier, and a second data table identifier;
according to the data query request, performing logical partition processing on the first data table corresponding to the first data table identifier to obtain at least two first partitions, and establishing a corresponding connection mapping relationship for each first partition, where the connection mapping relationship includes the boundary value of the first partition and the corresponding hash subtable;
performing partition processing on the second data table corresponding to the second data table identifier to obtain at least two second partitions, and, for each second partition, obtaining a first foreign key value from the second partition, where the first foreign key value is within the boundary value range of one of the first partitions;
querying the connection mapping relationship to obtain the first hash subtable corresponding to the first foreign key value;
scanning the first hash subtable to obtain the data corresponding to the query information, and returning the data to the user terminal.
With reference to the first aspect, in a first possible implementation of the first aspect, the performing logical partition processing on the first data table corresponding to the first data table identifier to obtain at least two first partitions, and establishing the connection mapping relationship, includes:
partitioning the first data table according to the inherent order of the first data table to obtain at least two first partitions;
for each first partition, performing a hash operation on the primary key of the first partition to build a corresponding hash subtable;
for each first partition, establishing a corresponding connection mapping relationship according to the hash subtable corresponding to the first partition.
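The three steps of this implementation can be sketched in Python as follows. This is a hedged illustration rather than the patented implementation: `build_join_mapping`, the equal-size split, and the choice of each partition's largest key as its single boundary value are assumptions made for the example.

```python
def build_join_mapping(first_rows, pk, num_partitions):
    """Return (boundary_value, hash_subtable) pairs in ascending boundary order."""
    # Assumption: sorting by primary key stands in for the table's inherent order.
    ordered = sorted(first_rows, key=lambda r: r[pk])
    size = max(1, -(-len(ordered) // num_partitions))  # ceiling division
    mapping = []
    for start in range(0, len(ordered), size):
        part = ordered[start:start + size]
        boundary = part[-1][pk]                     # assumed: upper bound identifies the partition
        subtable = {row[pk]: row for row in part}   # independent per-partition hash subtable
        mapping.append((boundary, subtable))
    return mapping


rows = [{"id": i, "val": i * 10} for i in range(1, 9)]
mapping = build_join_mapping(rows, "id", 2)
```

Each subtable holds only the keys of its own partition, so threads working on different partitions never touch the same hash structure.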
With reference to the first aspect or the first possible implementation of the first aspect, in a second possible implementation of the first aspect, the performing logical partition processing on the first data table corresponding to the first data table identifier to obtain at least two first partitions includes:
if the first data table is ordered, logically partitioning the first data table according to its inherent order to obtain at least two first partitions;
or,
if the first data table is unordered, adding an ordered surrogate (proxy) column to the first data table as the primary key column, and logically partitioning the first data table according to the order of the ordered surrogate column to obtain at least two first partitions.
With reference to the second possible implementation of the first aspect, in a third possible implementation of the first aspect, the performing partition processing on the second data table corresponding to the second data table identifier to obtain at least two second partitions includes:
if the first data table is ordered and the second data table is ordered, partitioning the second data table according to the inherent order of the second data table and the parallel processing capability to obtain at least two second partitions;
or,
if the first data table is ordered and the second data table is unordered, partitioning the second data table according to the parallel processing capability to obtain at least two second partitions;
or,
if the first data table is unordered and the second data table is unordered, replacing the original foreign key values in the second data table with the new primary key values corresponding to the ordered surrogate column in the first data table, and partitioning the replaced second data table according to the parallel processing capability to obtain at least two second partitions.
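The third case (both tables unordered) combines two operations: rewriting foreign keys to the new surrogate keys, then splitting the second table by the available parallelism. The sketch below illustrates both under stated assumptions; the helper names, the `surrogate_of` lookup dict, and the reading of "parallel processing capability" as a worker count are hypothetical.

```python
def remap_foreign_keys(second_rows, fk, surrogate_of):
    """Replace each original foreign key with the first table's surrogate key."""
    return [{**row, fk: surrogate_of[row[fk]]} for row in second_rows]


def split_for_workers(rows, workers):
    """Split rows into at most `workers` contiguous chunks of near-equal size."""
    size = max(1, -(-len(rows) // workers))  # ceiling division
    return [rows[i:i + size] for i in range(0, len(rows), size)]


# Hypothetical data: original key -> surrogate key assigned in the first table.
surrogate_of = {"u-b": 1, "u-a": 2, "u-c": 3}
orders = [{"user": "u-a", "amt": 5}, {"user": "u-c", "amt": 7}, {"user": "u-b", "amt": 1}]
remapped = remap_foreign_keys(orders, "user", surrogate_of)
chunks = split_for_workers(remapped, 2)
```

After the rewrite, the foreign keys of the second table live in the same ordered key space as the first table's surrogate column, so they can be compared against partition boundary values.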
With reference to the first aspect or any one of the first to third possible implementations of the first aspect, in a fourth possible implementation of the first aspect, the scanning the first hash subtable to obtain the data corresponding to the query information includes:
scanning the first hash subtable, and obtaining, as the data, all data information in the associated rows of the first data table and the second data table that correspond to the query information.
A second aspect of the embodiments of the present invention provides a data query device based on an OLAP system, including:
a transceiver module, configured to receive a data query request sent by a user terminal, where the data query request includes query information, a first data table identifier, and a second data table identifier;
a processing module, configured to perform, according to the data query request, logical partition processing on the first data table corresponding to the first data table identifier to obtain at least two first partitions, and to establish a corresponding connection mapping relationship for each first partition, where the connection mapping relationship includes the boundary value of the first partition and the corresponding hash subtable;
the processing module is further configured to perform partition processing on the second data table corresponding to the second data table identifier to obtain at least two second partitions, and, for each second partition, to obtain a first foreign key value from the second partition, where the first foreign key value is within the boundary value range of one of the first partitions;
an acquisition module, configured to query the connection mapping relationship and obtain the first hash subtable corresponding to the first foreign key value;
the acquisition module is further configured to scan the first hash subtable to obtain the data corresponding to the query information, and to return the data to the user terminal through the transceiver module.
With reference to the second aspect, in a first possible implementation of the second aspect, the processing module is specifically configured to:
partition the first data table according to the inherent order of the first data table to obtain at least two first partitions;
for each first partition, perform a hash operation on the primary key of the first partition to build a corresponding hash subtable;
for each first partition, establish a corresponding connection mapping relationship according to the hash subtable corresponding to the first partition.
With reference to the second aspect or the first possible implementation of the second aspect, in a second possible implementation of the second aspect, the processing module is configured to:
if the first data table is ordered, logically partition the first data table according to its inherent order to obtain at least two first partitions;
or,
if the first data table is unordered, add an ordered surrogate (proxy) column to the first data table as the primary key column, and logically partition the first data table according to the order of the ordered surrogate column to obtain at least two first partitions.
With reference to the second possible implementation of the second aspect, in a third possible implementation of the second aspect, the processing module is further configured to:
if the first data table is ordered and the second data table is ordered, partition the second data table according to the inherent order of the second data table and the parallel processing capability to obtain at least two second partitions;
or,
if the first data table is ordered and the second data table is unordered, partition the second data table according to the parallel processing capability to obtain at least two second partitions;
or,
if the first data table is unordered and the second data table is unordered, replace the original foreign key values in the second data table with the new primary key values corresponding to the ordered surrogate column in the first data table, and partition the replaced second data table according to the parallel processing capability to obtain at least two second partitions.
With reference to the second aspect or any one of the first to third possible implementations of the second aspect, in a fourth possible implementation of the second aspect, the acquisition module is specifically configured to:
scan the first hash subtable, and obtain, as the data, all data information in the associated rows of the first data table and the second data table that correspond to the query information.
A third aspect of the embodiments of the present invention provides a data query system based on an OLAP system, including a user terminal and the data query device based on an OLAP system provided in the second aspect.
In the data query method, device, and system based on an OLAP system according to the embodiments of the present invention, a query request is received from a user terminal, the first data table is logically partitioned to obtain first partitions, and a connection mapping relationship is established that represents the correspondence between the boundary value of each first partition and its hash subtable. The second data table is then partitioned to obtain second partitions. During a multi-threaded query, each thread uses the connection mapping relationship to obtain and scan a first hash subtable and thereby obtain the data, which is then returned to the user terminal. This solves the prior-art problems of long multi-threaded query time and high server overhead, effectively shortening data query time and reducing server overhead.
Brief Description of the Drawings
To describe the technical solutions in the embodiments of the present invention or in the prior art more clearly, the drawings needed for describing the embodiments or the prior art are briefly introduced below. The drawings described below show some embodiments of the present invention; a person of ordinary skill in the art can derive other drawings from them without creative effort.
FIG. 1 is a flowchart of Embodiment 1 of the data query method based on an OLAP system according to the present invention;
FIG. 2 is a flowchart of Embodiment 2 of the data query method based on an OLAP system according to the present invention;
FIG. 3 is a schematic structural diagram of an embodiment of the data query device based on an OLAP system according to the present invention;
FIG. 4 is a schematic structural diagram of an embodiment of the data query system based on an OLAP system according to the present invention.
Detailed Description
To make the objectives, technical solutions, and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments are described clearly and completely below with reference to the accompanying drawings. The described embodiments are some, but not all, of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of the present invention without creative effort fall within the protection scope of the present invention.
FIG. 1 is a flowchart of Embodiment 1 of the data query method based on an OLAP system according to the present invention. The solution is mainly applied in OLAP-type database systems, in the process of obtaining data through a joint query over multiple data tables. As shown in FIG. 1, the method of this embodiment may include:
S101: Receive a data query request sent by a user terminal, where the data query request includes query information, a first data table identifier, and a second data table identifier.
In this embodiment, when the user terminal needs to obtain data, it sends a data query request to the database system. The first data table and the second data table are two associated data tables that store associated data information.
S102: According to the data query request, perform logical partition processing on the first data table corresponding to the first data table identifier to obtain at least two first partitions, and establish a corresponding connection mapping relationship for each first partition, where the connection mapping relationship includes the boundary value of the first partition and the corresponding hash subtable.
In this embodiment, the first data table is divided into at least one first partition. For each first partition, a hash subtable is obtained and a connection mapping relationship is established to represent the correspondence between the boundary value of that first partition and its hash subtable.
The connection mapping relationship may be represented as a table. Optionally, it may also be represented in other forms such as a map, an array, or a set, chosen according to the actual application environment; the present invention places no limitation on this.
S103: Perform partition processing on the second data table corresponding to the second data table identifier to obtain at least two second partitions, and, for each second partition, obtain a first foreign key value from the second partition, where the first foreign key value is within the boundary value range of one of the first partitions.
In this embodiment, the second data table is obtained according to the second data table identifier and partitioned to obtain at least one second partition. The number of second partitions may be the same as or different from the number of first partitions. From each second partition, one foreign key value is selected and compared one by one with the boundary values of the first partitions; if the foreign key value falls within the boundary value range of a first partition, it is taken as the first foreign key value of that second partition.
In this embodiment, the boundary value range of a first partition is the range of values between the boundary value of the current first partition and the boundary value of the previous first partition.
In a relational database, every data table has several attributes. If an attribute group uniquely identifies each row of the table, that attribute group is a primary key of the table. For example, a student table contains student number, name, gender, and class; each student's number is unique, so the student number is the primary key. A foreign key is mainly used to associate one table with another. For example, a grade table contains student number, course number, and grade; a grade is determined only by the student number and course number together, so the primary key of the grade table is (student number, course number). The student number in the grade table corresponds to the student number in the student table, so the student number in the grade table is a foreign key that references the student table.
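The student and grade tables from this example can be written out with Python's built-in `sqlite3` module. This is only an illustration of primary and foreign keys; the table names, column names, and sample rows are assumptions, and SQLite is used purely for convenience, not because the patent mentions it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.execute(
    "CREATE TABLE student ("
    " student_id INTEGER PRIMARY KEY, name TEXT, gender TEXT, class_name TEXT)"
)
conn.execute(
    "CREATE TABLE grade ("
    " student_id INTEGER REFERENCES student(student_id),"  # foreign key to student
    " course_id INTEGER,"
    " score REAL,"
    " PRIMARY KEY (student_id, course_id))"                 # composite primary key
)
conn.execute("INSERT INTO student VALUES (1, 'Ann', 'F', 'A1')")
conn.execute("INSERT INTO grade VALUES (1, 101, 88.5)")

# A grade row pointing at a non-existent student violates the foreign key.
try:
    conn.execute("INSERT INTO grade VALUES (2, 101, 70.0)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

rows = conn.execute(
    "SELECT s.name, g.course_id, g.score"
    " FROM grade g JOIN student s ON s.student_id = g.student_id"
).fetchall()
```

The final `SELECT` is the kind of two-table join that the method accelerates: the grade table plays the role of the second data table, and its `student_id` column supplies the foreign key values.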
S104: Query the connection mapping relationship to obtain the first hash subtable corresponding to the first foreign key value.
In this embodiment, the first foreign key value is used to query the connection mapping relationship and obtain the first hash subtable. Among the multiple hash subtables, the first hash subtable is the one that corresponds to the boundary value of the first partition matching the first foreign key value.
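One way to realize this lookup, assuming the boundary values are kept in ascending order and each boundary is the inclusive upper bound of its partition, is a binary search. The sketch below uses the standard `bisect` module; `find_subtable` and the sample data are hypothetical names introduced for illustration.

```python
from bisect import bisect_left


def find_subtable(boundaries, subtables, fk_value):
    """Return the subtable whose range (previous boundary, boundary] holds fk_value."""
    i = bisect_left(boundaries, fk_value)  # index of the first boundary >= fk_value
    return subtables[i] if i < len(boundaries) else None


# Hypothetical mapping: partition 0 covers keys up to 4, partition 1 covers 5..8.
boundaries = [4, 8]
subtables = [{1: "a", 4: "d"}, {5: "e", 8: "h"}]
chosen = find_subtable(boundaries, subtables, 5)
```

A foreign key value larger than every boundary has no matching partition, which the sketch reports as `None`.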
S105: Scan the first hash subtable to obtain the data corresponding to the query information, and return the data to the user terminal.
In the data query method based on an OLAP system according to this embodiment of the present invention, a query request is received from a user terminal, the first data table is logically partitioned to obtain first partitions, and a connection mapping relationship is established that represents the correspondence between the boundary value of each first partition and its hash subtable. The second data table is then partitioned to obtain second partitions. During a multi-threaded query, each thread uses the connection mapping relationship to obtain and scan a first hash subtable and thereby obtain the data, which is then returned to the user terminal. This solves the prior-art problems of long multi-threaded query time and high server overhead, effectively shortening data query time and reducing server overhead.
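Steps S103 to S105 can be sketched end to end with one worker thread per second partition. This is a simplified illustration under stated assumptions, not the patented implementation: the boundary list and subtables are built by hand, `probe_partition` is a hypothetical helper, and every key in a second partition is assumed to fall inside the first partition selected by that partition's first foreign key.

```python
from bisect import bisect_left
from concurrent.futures import ThreadPoolExecutor


def probe_partition(chunk, fk, boundaries, subtables):
    """Pick one hash subtable from the chunk's first foreign key, then scan it."""
    i = bisect_left(boundaries, chunk[0][fk])  # one mapping lookup per partition
    if i == len(boundaries):
        return []
    subtable = subtables[i]
    return [{**subtable[r[fk]], **r} for r in chunk if r[fk] in subtable]


# Hypothetical first table, split by hand into two partitions with boundaries 3 and 6.
first = [{"id": i, "name": f"n{i}"} for i in range(1, 7)]
boundaries = [3, 6]
subtables = [{r["id"]: r for r in first if r["id"] <= 3},
             {r["id"]: r for r in first if r["id"] > 3}]
# Hypothetical second table, already split into two second partitions.
second = [[{"fk": 1, "amt": 10}, {"fk": 3, "amt": 30}],
          [{"fk": 4, "amt": 40}, {"fk": 7, "amt": 70}]]

with ThreadPoolExecutor(max_workers=2) as pool:
    parts = pool.map(lambda c: probe_partition(c, "fk", boundaries, subtables), second)
    joined = [row for part in parts for row in part]
```

Each thread reads only the subtable chosen for its own partition, so the threads do not share a single hash table. The row with foreign key 7 has no match in the first table and is dropped.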
FIG. 2 is a flowchart of Embodiment 2 of the data query method based on an OLAP system according to the present invention. As shown in FIG. 2, on the basis of the foregoing embodiment, a specific implementation of S102 includes the following steps:
S201: Partition the first data table according to the inherent order of the first data table to obtain at least two first partitions.
In this embodiment, in an OLAP database system, the primary key of the first data table usually has a partial order relationship, that is, an inherent order. The first data table can therefore be partitioned according to this inherent order into at least two mutually disjoint first partitions.
S202: For each first partition, perform a hash operation on the primary key of the first partition to build a corresponding hash subtable.
In this embodiment, for each of the first partitions obtained above, a hash operation is performed on the primary key values of the partition to build a shared hash subtable. The hash subtables corresponding to the different first partitions are independent of one another.
S203: For each first partition, establish a corresponding connection mapping relationship according to the hash subtable corresponding to the first partition.
In this embodiment, the boundary value of each first partition is obtained. Each first partition has exactly one boundary value, which identifies that first partition. The correspondence between this boundary value and the hash subtable of the first partition is obtained, and the connection mapping relationship is established.
In the data query method based on an OLAP system according to this embodiment of the present invention, a query request is received from a user terminal, the first data table is logically partitioned to obtain first partitions, the hash subtable and the boundary value of each first partition are obtained, and a connection mapping relationship is established that represents the correspondence between the boundary value of each first partition and its hash subtable. The second data table is then partitioned to obtain second partitions. During a multi-threaded query, each thread uses the connection mapping relationship to obtain and scan a first hash subtable and thereby obtain the data, which is then returned to the user terminal. This solves the prior-art problems of long multi-threaded query time and high server overhead, effectively shortening data query time and reducing server overhead.
在上述实施例的基础上,特别的,则S102中,所述第一数据表标识对应的第一数据表进行逻辑分区处理,获取至少两个第一分区,包括以下两种实现方式:On the basis of the above-mentioned embodiments, in particular, in S102, the first data table corresponding to the first data table identifier performs logical partition processing to obtain at least two first partitions, including the following two implementations:
In a first implementation, if the first data table is ordered, the first data table is logically partitioned according to its inherent order to obtain at least two first partitions.
In this embodiment, if the primary key of the first data table has a fixed order (a partial order relationship), the first data table is logically partitioned according to that order.
In a second implementation, if the first data table is unordered, an ordered surrogate column is added to the first data table as the primary key column, and the first data table is logically partitioned in the order of the surrogate column to obtain at least two first partitions.
In this embodiment, if the first data table does not have a fixed order (a partial order relationship), an ordered surrogate column is added to it. For example, an increasing integer sequence from 1 to the number of rows in the first data table is added as the ordered surrogate column, and the first data table is then logically partitioned according to this new column.
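The second implementation can be illustrated with a short sketch that prepends an increasing integer (1 to the row count) as the surrogate key and then derives logical partitions as index ranges, so that no rows are copied. The helper names `add_surrogate_column` and `logical_partitions` are hypothetical, and spreading the remainder rows over the leading partitions is an assumption made only to keep partition sizes even.

```python
def add_surrogate_column(rows):
    """Prefix each row with an increasing surrogate key 1..len(rows)."""
    return [(i, *row) for i, row in enumerate(rows, start=1)]


def logical_partitions(n_rows, n_parts):
    """Return half-open (start, end) index ranges covering n_rows rows.

    The ranges are contiguous and disjoint; any remainder rows are
    spread over the first partitions (an illustrative choice).
    """
    base, extra = divmod(n_rows, n_parts)
    ranges, start = [], 0
    for p in range(n_parts):
        end = start + base + (1 if p < extra else 0)
        ranges.append((start, end))
        start = end
    return ranges
```

Since the surrogate keys are assigned in row order, every index range also corresponds to a contiguous range of surrogate key values, which is what lets a single boundary value identify each partition.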
Further, in S103, performing partition processing on the second data table corresponding to the second data table identifier to obtain at least two second partitions has the following three implementations:
In a first implementation, if the first data table is ordered and the second data table is ordered, the second data table is partitioned according to its inherent order and the parallel processing capability to obtain at least two second partitions.
In a second implementation, if the first data table is ordered and the second data table is unordered, the second data table is partitioned according to the parallel processing capability to obtain at least two second partitions.
In a third implementation, if the first data table is unordered and the second data table is unordered, the original foreign key values in the second data table are replaced with the new primary key values of the ordered surrogate column in the first data table, and the rewritten second data table is partitioned according to the parallel processing capability to obtain at least two second partitions.
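The foreign key replacement in the third implementation can be sketched as follows. The function name `replace_foreign_keys` is hypothetical, and the sketch assumes that the original key is the first field of every row in both tables and that every foreign key value has a matching row in the first data table.

```python
def replace_foreign_keys(first_rows, second_rows):
    """Assign surrogate keys to an unordered first table and rewrite the
    foreign key column of the second table to use them.

    Assumptions (illustrative only): row[0] is the original primary key in
    first_rows and the foreign key in second_rows, and every foreign key
    has a match.
    """
    # Map each original primary key to its new ordered surrogate key.
    surrogate_of = {row[0]: i for i, row in enumerate(first_rows, start=1)}
    first_keyed = [(surrogate_of[row[0]], *row) for row in first_rows]
    second_rewritten = [(surrogate_of[row[0]], *row[1:]) for row in second_rows]
    return first_keyed, second_rewritten
```

After the rewrite, the foreign key values of the second data table fall in the same ordered integer domain as the surrogate column, so they can be compared directly against the boundary values of the first partitions.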
In particular, the first data table and the second data table may share the same inherent order (that is, the same partial order relationship), which is very common in practice. For example, the lineitem and order tables in "The TPC Benchmark™ H" are two commonly joined data tables, and the primary key order of the order table matches the record order of the lineitem table. The order table and the lineitem table can therefore be partitioned with the same boundary values, which guarantees a one-to-one correspondence between the logical partitions of the two tables. In that case the hash join does not need the join mapping relationship, because each second partition accesses only its one corresponding hash subtable.
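When both tables share the same order, the idea of partitioning them with the same boundary values can be sketched as below. The names `split_by_boundaries` and `copartitioned_join` are hypothetical, the boundaries are assumed to be ascending inclusive upper bounds, and every key is assumed to be no larger than the last boundary.

```python
from bisect import bisect_left


def split_by_boundaries(rows, boundaries, key):
    """Assign each row to the partition whose upper bound covers its key.

    Assumes every key is <= boundaries[-1].
    """
    parts = [[] for _ in boundaries]
    for row in rows:
        parts[bisect_left(boundaries, key(row))].append(row)
    return parts


def copartitioned_join(orders, lineitems, boundaries):
    """Join two tables partition by partition using shared boundaries."""
    order_parts = split_by_boundaries(orders, boundaries, key=lambda o: o[0])
    item_parts = split_by_boundaries(lineitems, boundaries, key=lambda l: l[0])
    result = []
    # Partition i of one table pairs with partition i of the other, so no
    # join mapping lookup is needed: each pair uses exactly one subtable.
    for o_part, l_part in zip(order_parts, item_parts):
        subtable = {o[0]: o for o in o_part}
        result.extend((subtable[l[0]], l) for l in l_part if l[0] in subtable)
    return result
```

Each partition pair is independent of the others, so in a multi-threaded setting every pair could be handed to its own thread without any shared lookup structure.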
Specifically, in S105, scanning the first hash subtable to obtain the data corresponding to the query information includes: scanning the first hash subtable and obtaining, as the data, all data information in the associated rows of the first data table and the second data table that corresponds to the query information.
An example is given below to describe the technical solution of the above embodiments in detail. The first data table stores the names of 50 individuals together with their gender and age; that is, its attributes are name, gender, and age, and its primary key is name. The inherent order of the first data table is alphabetical by the first letter of the name, so its rows are arranged in alphabetical order. The second data table, which is associated with the first, stores the names, dates of birth, and mobile phone numbers of 100 individuals, and its name attribute is a foreign key referencing the first data table.
The main process of the data query method based on an OLAP system is as follows. First, the first data table is partitioned according to its inherent order (alphabetical by name initial) into five first partitions. The boundary value of each first partition is the name of the person at the boundary of that partition. A hash calculation is performed on each first partition to obtain its hash subtable. A join mapping table is then built that stores the boundary value of each first partition (a name; for example, one boundary value is Zhang San) together with the identifier of the corresponding hash subtable. The join mapping table thus records the mapping between the boundary value of each first partition and its hash subtable.
Second, the foreign key column of the second data table that references the first data table is generally unordered. The second data table is therefore not physically partitioned but only divided logically and horizontally, here into three second partitions (three threads are to be used; alternatively, the number of partitions may equal the maximum number of threads the system can handle, chosen according to the actual situation, and this application places no limit on this). Within each second partition, the foreign key values referencing the first data table are read, and a name that falls within the range of the boundary value of some first partition is taken as the first foreign key value, for example the name Zhang San. The join mapping table is queried with this first foreign key value to obtain the corresponding hash subtable. Each thread then independently scans its hash subtable in the conventional manner and obtains all required data from the first data table and the second data table. The threads do not interact with one another during this scan.
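The example can be traced end to end with a small single-threaded sketch using the same sizes: 50 rows in five first partitions, and 100 rows in three logical slices. All data values are hypothetical placeholders (`name00` to `name49`, `phone000` to `phone099`), the second table is reduced to two columns for brevity, and the boundary value is assumed to be the last name in each first partition.

```python
from bisect import bisect_left

# Hypothetical first table: 50 people (name, gender, age), sorted by name.
people = [(f"name{i:02d}", "F" if i % 2 else "M", 20 + i % 30) for i in range(50)]
# Hypothetical second table: 100 rows whose name column is an unordered
# foreign key into `people`.
contacts = [(f"name{(i * 7) % 50:02d}", f"phone{i:03d}") for i in range(100)]

# Build phase: five first partitions of ten rows, one hash subtable each.
boundaries, subtables = [], []
for start in range(0, 50, 10):
    chunk = people[start:start + 10]
    boundaries.append(chunk[-1][0])  # assumed boundary: last name in the partition
    subtables.append({person[0]: person for person in chunk})

# Probe phase: three logical slices of the second table (no data is moved).
slices = [contacts[0:34], contacts[34:67], contacts[67:100]]
joined = []
for part in slices:
    for name, phone in part:
        # The join mapping lookup selects the single subtable to probe.
        subtable = subtables[bisect_left(boundaries, name)]
        person = subtable.get(name)
        if person is not None:
            joined.append((name, person[1], person[2], phone))
```

In a real run each of the three slices would be processed by its own thread; the loop over `slices` stands in for that here so the data flow stays easy to follow.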
Finally, the obtained data is returned to the user terminal, completing the data query process.
In this embodiment, the first data table (the primary key table) is horizontally and logically divided into N first partitions, where N depends on the degree of parallelism the platform can provide. Because of the inherent order on the primary key, these first partitions are mutually disjoint and are distinguished by their recorded boundary values. After partitioning, each first partition is handed to a different thread and the partitions are scanned in parallel. During the scan, a hash function is applied to generate the hash subtable of each first partition, and the mapping between primary key boundary values and hash subtables is recorded in the join mapping structure. Each thread completes its own hash computation independently, so the threads do not conflict.
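The conflict-free parallel build can be sketched with a thread pool in which each worker scans one logical partition and fills only its own dictionary. The names `build_subtable` and `parallel_build` are hypothetical, the input is assumed to be non-empty and already sorted on the key in `row[0]`, and the boundary is assumed to be the last key of each partition. In CPython the global interpreter lock limits the actual speedup of such threads, so the sketch shows the structure rather than the performance.

```python
from concurrent.futures import ThreadPoolExecutor


def build_subtable(rows, start, end):
    """Scan logical partition [start, end) and return (boundary, subtable)."""
    subtable = {}
    for i in range(start, end):
        subtable[rows[i][0]] = rows[i]  # each worker writes only its own dict
    return rows[end - 1][0], subtable


def parallel_build(rows, n):
    """Build hash subtables for n logical partitions of non-empty rows."""
    size = -(-len(rows) // n)  # ceiling division
    ranges = [(s, min(s + size, len(rows))) for s in range(0, len(rows), size)]
    with ThreadPoolExecutor(max_workers=n) as pool:
        # map() returns results in partition order, so boundaries stay sorted.
        results = list(pool.map(lambda r: build_subtable(rows, *r), ranges))
    boundaries = [boundary for boundary, _ in results]
    subtables = [table for _, table in results]
    return boundaries, subtables
```

No locks are needed because the partitions are disjoint index ranges and no two workers ever touch the same subtable.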
The second data table (the foreign key table) in the join is unordered on the foreign key column. It is therefore not physically partitioned but only sliced logically and horizontally according to the degree of parallelism, and scanned in parallel by multiple threads. For each first foreign key value a thread reads, the thread first queries the join mapping relationship to determine the hash subtable corresponding to that value. Queries on the join mapping relationship by multiple threads are read-only operations and do not conflict with other operations.
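The probe side can be sketched in the same style: the foreign key table is cut into logical slices, and each worker resolves every foreign key value through a read-only binary search over the boundary list before probing one subtable. The names `probe_slice` and `parallel_probe` are hypothetical, and the boundaries are assumed to be ascending inclusive upper bounds with the foreign key in `row[0]`.

```python
from bisect import bisect_left
from concurrent.futures import ThreadPoolExecutor


def probe_slice(fk_rows, boundaries, subtables):
    """Probe one logical slice; the join mapping is only read, never written."""
    out = []
    for row in fk_rows:
        i = bisect_left(boundaries, row[0])  # read-only join mapping lookup
        if i < len(subtables):
            match = subtables[i].get(row[0])
            if match is not None:
                out.append((match, row))
    return out


def parallel_probe(fk_rows, boundaries, subtables, degree):
    """Slice fk_rows by the degree of parallelism and probe slices in parallel."""
    if not fk_rows:
        return []
    size = -(-len(fk_rows) // degree)  # ceiling division
    slices = [fk_rows[s:s + size] for s in range(0, len(fk_rows), size)]
    with ThreadPoolExecutor(max_workers=degree) as pool:
        parts = list(pool.map(lambda s: probe_slice(s, boundaries, subtables), slices))
    return [pair for part in parts for pair in part]
```

Since the boundary list and the subtables are never modified during the probe phase, the workers can share them without synchronization.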
The data query method based on an OLAP system according to this embodiment of the present invention receives a query request from the user terminal, partitions the first data table to obtain first partitions, obtains the hash subtable and the boundary value of each first partition, and establishes a join mapping relationship that records the correspondence between the boundary values of the first partitions and the hash subtables. The second data table is then partitioned to obtain second partitions. During the multi-threaded query, each thread uses the join mapping relationship to obtain and scan the first hash subtable, independently scanning the hash subtable corresponding to each first foreign key value, thereby obtaining the data, which is returned to the user terminal. This solves the prior-art problems of long multi-threaded query times and high server overhead, effectively shortening data query time and reducing server overhead.
FIG. 3 is a schematic structural diagram of an embodiment of the data query device based on an OLAP system according to the present invention. As shown in FIG. 3, the device of this embodiment may include a transceiver module 31, a processing module 32, and an acquisition module 33. The transceiver module 31 is configured to receive a data query request sent by a user terminal, the data query request including query information, a first data table identifier, and a second data table identifier. The processing module 32 is configured to, according to the data query request, partition the first data table corresponding to the first data table identifier to obtain at least two first partitions and, for each first partition, establish a corresponding join mapping relationship, the join mapping relationship including the boundary value of the first partition and the corresponding hash subtable. The processing module 32 is further configured to partition the second data table corresponding to the second data table identifier to obtain at least two second partitions and, for each second partition, obtain a first foreign key value from the second partition, the first foreign key value being the same as the boundary value of one of the first partitions. The acquisition module 33 is configured to query the join mapping relationship and obtain the first hash subtable corresponding to the first foreign key value. The acquisition module 33 is further configured to scan the first hash subtable, obtain the data corresponding to the query information, and return the data to the user terminal through the transceiver module 31.
The data query device based on an OLAP system provided in this embodiment can be used to execute the technical solution of the method embodiment shown in FIG. 1. The transceiver module receives the query request from the user terminal. The processing module logically partitions the first data table to obtain first partitions, obtains the hash subtable and the boundary value of each first partition, and establishes a join mapping relationship that records the correspondence between the boundary values of the first partitions and the hash subtables, and then partitions the second data table to obtain second partitions. During the multi-threaded query, for each thread the acquisition module uses the join mapping relationship to obtain and scan the first hash subtable, thereby obtaining the data, which is returned to the user terminal. This solves the prior-art problems of long multi-threaded query times and high server overhead, effectively shortening data query time and reducing server overhead.
In a second embodiment of the data query device based on an OLAP system according to the present invention, on the basis of the above embodiment, the processing module 32 is configured to:
if the first data table is ordered, logically partition the first data table according to its inherent order to obtain at least two first partitions;
or,
if the first data table is unordered, add an ordered surrogate column to the first data table as the primary key column, and logically partition the first data table in the order of the surrogate column to obtain at least two first partitions.
Optionally, the processing module 32 is further configured to:
if the first data table is ordered and the second data table is ordered, partition the second data table according to its inherent order and the parallel processing capability to obtain at least two second partitions;
or,
if the first data table is ordered and the second data table is unordered, partition the second data table according to the parallel processing capability to obtain at least two second partitions;
or,
if the first data table is unordered and the second data table is unordered, replace the original foreign key values in the second data table with the new primary key values of the ordered surrogate column in the first data table, and partition the rewritten second data table according to the parallel processing capability to obtain at least two second partitions.
Optionally, if the first data table and the second data table have the same inherent order, the processing module 32 is configured to partition the second data table according to the inherent order it shares with the first data table to obtain at least two second partitions, where the number of second partitions equals the number of first partitions.
Specifically, the acquisition module 33 is configured to scan the first hash subtable and obtain, as the data, all data information in the associated rows of the first data table and the second data table that corresponds to the query information.
The data query device based on an OLAP system provided in this embodiment can be used to execute the technical solution of any of method embodiments one to three. Its implementation principle and technical effect are similar and are not repeated here.
FIG. 4 is a schematic structural diagram of an embodiment of the data query system based on an OLAP system according to the present invention. As shown in FIG. 4, the system includes a user terminal 41 and the data query device 42 based on an OLAP system described in any of the device embodiments shown in FIG. 3. The user terminal 41 is configured to send a data query message to the data query device 42 and to receive the data that the data query device 42 returns. The data query device 42 is configured to execute the technical solution of any of the method embodiments in FIG. 1, FIG. 2, and the example. Its implementation principle and technical effect are similar and are not repeated here.
Those of ordinary skill in the art will understand that all or some of the steps of the above method embodiments may be implemented by hardware controlled by program instructions. The program may be stored in a computer-readable storage medium. When executed, the program performs the steps of the above method embodiments. The storage medium includes any medium that can store program code, such as a ROM, a RAM, a magnetic disk, or an optical disc.
Finally, it should be noted that the above embodiments are intended only to illustrate the technical solutions of the present invention, not to limit them. Although the present invention has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that the technical solutions described in those embodiments may still be modified, or some or all of their technical features may be replaced with equivalents, and that such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the scope of the technical solutions of the embodiments of the present invention.
Claims (11)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201410228109.0A CN103995879B (en) | 2014-05-27 | 2014-05-27 | Data query method, apparatus and system based on OLAP system |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201410228109.0A CN103995879B (en) | 2014-05-27 | 2014-05-27 | Data query method, apparatus and system based on OLAP system |
Publications (2)
Publication Number | Publication Date |
---|---|
CN103995879A (en) | 2014-08-20 |
CN103995879B (en) | 2017-12-15 |
Family
ID=51310044
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201410228109.0A Active CN103995879B (en) | 2014-05-27 | 2014-05-27 | Data query method, apparatus and system based on OLAP system |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN103995879B (en) |
Cited By (17)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2016165525A1 (en) * | 2015-04-16 | 2016-10-20 | 华为技术有限公司 | Data query method in crossing-partition database, and crossing-partition query device |
CN107085570A (en) * | 2016-02-14 | 2017-08-22 | 华为技术有限公司 | Data processing method, application server and router |
CN107229692A (en) * | 2017-05-19 | 2017-10-03 | 哈工大大数据产业有限公司 | A kind of distributed multi-table connecting method and system based on streamline |
CN107729500A (en) * | 2017-10-20 | 2018-02-23 | 锐捷网络股份有限公司 | A kind of data processing method of on-line analytical processing, device and background devices |
WO2018040722A1 (en) * | 2016-08-31 | 2018-03-08 | 华为技术有限公司 | Table data query method and device |
CN107818117A (en) * | 2016-09-14 | 2018-03-20 | 阿里巴巴集团控股有限公司 | A kind of method for building up of tables of data, online query method and relevant apparatus |
WO2018090557A1 (en) * | 2016-11-18 | 2018-05-24 | 华为技术有限公司 | Method and device for querying data table |
CN108427684A (en) * | 2017-02-14 | 2018-08-21 | 华为技术有限公司 | Data query method, apparatus and computing device |
CN108874873A (en) * | 2018-04-26 | 2018-11-23 | 北京空间科技信息研究所 | Data query method, apparatus, storage medium and processor |
CN108959330A (en) * | 2017-05-26 | 2018-12-07 | 阿里巴巴集团控股有限公司 | A kind of processing of database, data query method and apparatus |
CN109189808A (en) * | 2018-09-18 | 2019-01-11 | 腾讯科技(深圳)有限公司 | Data query method and relevant device |
CN109582694A (en) * | 2017-09-29 | 2019-04-05 | 北京国双科技有限公司 | A kind of method and Related product generating data query script |
CN109885574A (en) * | 2019-02-22 | 2019-06-14 | 广州荔支网络技术有限公司 | A kind of data query method and device |
CN110083658A (en) * | 2019-03-11 | 2019-08-02 | 北京达佳互联信息技术有限公司 | Method of data synchronization, device, electronic equipment and storage medium |
CN110287213A (en) * | 2019-07-03 | 2019-09-27 | 中通智新(武汉)技术研发有限公司 | Data query method, apparatus and system based on OLAP system |
CN112597248A (en) * | 2020-12-26 | 2021-04-02 | 中国农业银行股份有限公司 | Big data partition storage method and device |
WO2025002114A1 (en) * | 2023-06-30 | 2025-01-02 | 腾讯科技(深圳)有限公司 | Data table query method and apparatus, storage medium, and electronic device |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20120221510A1 (en) * | 2010-03-31 | 2012-08-30 | International Business Machines Corporation | Method and system for validating data |
CN102663117A (en) * | 2012-04-18 | 2012-09-12 | 中国人民大学 | OLAP (On Line Analytical Processing) inquiry processing method facing database and Hadoop mixing platform |
CN103235793A (en) * | 2013-04-01 | 2013-08-07 | 华为技术有限公司 | On-line data processing method, equipment and system |
CN103309958A (en) * | 2013-05-28 | 2013-09-18 | 中国人民大学 | OLAP star connection query optimizing method under CPU and GPU mixing framework |
CN103324724A (en) * | 2013-06-26 | 2013-09-25 | 华为技术有限公司 | Method and device for processing data |
-
2014
- 2014-05-27 CN CN201410228109.0A patent/CN103995879B/en active Active
Non-Patent Citations (1)
Title |
---|
Zhu Yue'an et al.: "A column-oriented OLAP query execution engine based on triple storage", Journal of Software (软件学报) * |
Cited By (28)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106156168B (en) * | 2015-04-16 | 2019-10-22 | 华为技术有限公司 | Method for querying data in cross-partition database and cross-partition query device |
CN106156168A (en) * | 2015-04-16 | 2016-11-23 | 华为技术有限公司 | The method of data is being inquired about and across subregion inquiry unit in partitioned data base |
WO2016165525A1 (en) * | 2015-04-16 | 2016-10-20 | 华为技术有限公司 | Data query method in crossing-partition database, and crossing-partition query device |
CN107085570A (en) * | 2016-02-14 | 2017-08-22 | 华为技术有限公司 | Data processing method, application server and router |
CN107784044B (en) * | 2016-08-31 | 2020-02-14 | 华为技术有限公司 | Table data query method and device |
WO2018040722A1 (en) * | 2016-08-31 | 2018-03-08 | 华为技术有限公司 | Table data query method and device |
CN107818117A (en) * | 2016-09-14 | 2018-03-20 | 阿里巴巴集团控股有限公司 | A kind of method for building up of tables of data, online query method and relevant apparatus |
CN107818117B (en) * | 2016-09-14 | 2022-02-15 | 阿里巴巴集团控股有限公司 | Data table establishing method, online query method and related device |
WO2018090557A1 (en) * | 2016-11-18 | 2018-05-24 | 华为技术有限公司 | Method and device for querying data table |
CN108073641A (en) * | 2016-11-18 | 2018-05-25 | 华为技术有限公司 | The method and apparatus for inquiring about tables of data |
CN108073641B (en) * | 2016-11-18 | 2020-06-16 | 华为技术有限公司 | Method and device for querying data table |
CN108427684B (en) * | 2017-02-14 | 2020-12-25 | 华为技术有限公司 | Data query method and device and computing equipment |
CN108427684A (en) * | 2017-02-14 | 2018-08-21 | 华为技术有限公司 | Data query method, apparatus and computing device |
CN107229692B (en) * | 2017-05-19 | 2018-05-01 | 哈工大大数据产业有限公司 | A kind of distributed multi-table connecting method and system based on assembly line |
CN107229692A (en) * | 2017-05-19 | 2017-10-03 | 哈工大大数据产业有限公司 | A kind of distributed multi-table connecting method and system based on streamline |
CN108959330A (en) * | 2017-05-26 | 2018-12-07 | 阿里巴巴集团控股有限公司 | A kind of processing of database, data query method and apparatus |
CN109582694A (en) * | 2017-09-29 | 2019-04-05 | 北京国双科技有限公司 | A kind of method and Related product generating data query script |
CN107729500A (en) * | 2017-10-20 | 2018-02-23 | 锐捷网络股份有限公司 | A kind of data processing method of on-line analytical processing, device and background devices |
CN108874873A (en) * | 2018-04-26 | 2018-11-23 | 北京空间科技信息研究所 | Data query method, apparatus, storage medium and processor |
CN108874873B (en) * | 2018-04-26 | 2022-04-12 | 北京空间科技信息研究所 | Data query method, device, storage medium and processor |
CN109189808A (en) * | 2018-09-18 | 2019-01-11 | 腾讯科技(深圳)有限公司 | Data query method and relevant device |
CN109885574A (en) * | 2019-02-22 | 2019-06-14 | 广州荔支网络技术有限公司 | A kind of data query method and device |
CN110083658A (en) * | 2019-03-11 | 2019-08-02 | 北京达佳互联信息技术有限公司 | Method of data synchronization, device, electronic equipment and storage medium |
CN110287213A (en) * | 2019-07-03 | 2019-09-27 | 中通智新(武汉)技术研发有限公司 | Data query method, apparatus and system based on OLAP system |
CN110287213B (en) * | 2019-07-03 | 2023-02-17 | 中通智新(武汉)技术研发有限公司 | Data query method, device and system based on OLAP system |
CN112597248A (en) * | 2020-12-26 | 2021-04-02 | 中国农业银行股份有限公司 | Big data partition storage method and device |
CN112597248B (en) * | 2020-12-26 | 2024-04-12 | 中国农业银行股份有限公司 | Big data partition storage method and device |
WO2025002114A1 (en) * | 2023-06-30 | 2025-01-02 | 腾讯科技(深圳)有限公司 | Data table query method and apparatus, storage medium, and electronic device |
Also Published As
Publication number | Publication date |
---|---|
CN103995879B (en) | 2017-12-15 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN103995879B (en) | Data query method, apparatus and system based on OLAP system | |
US11334544B2 (en) | Method, apparatus, device and medium for storing and querying data | |
CN103902698B (en) | A kind of data-storage system and storage method | |
US8924365B2 (en) | System and method for range search over distributive storage systems | |
EP3716090A1 (en) | Data processing method, apparatus and system | |
US9953102B2 (en) | Creating NoSQL database index for semi-structured data | |
CN103514201B (en) | Method and device for querying data in non-relational database | |
US9619492B2 (en) | Data migration | |
US9639542B2 (en) | Dynamic mapping of extensible datasets to relational database schemas | |
US20170255709A1 (en) | Atomic updating of graph database index structures | |
CN107977396B (en) | A kind of updating method and table data updating device of data table of KeyValue database | |
US9128967B2 (en) | Storing graph data in a column-oriented data store | |
US20180150536A1 (en) | Instance-based distributed data recovery method and apparatus | |
US20170255708A1 (en) | Index structures for graph databases | |
EP3376403A1 (en) | Method of accessing distributed database and device providing distributed data service | |
CN103810224A (en) | Information persistence and query method and device | |
CN104699796A (en) | Data cleaning method based on data warehouse | |
US20180137158A1 (en) | Foreign key learner | |
US8312050B2 (en) | Avoiding database related joins with specialized index structures | |
US8407255B1 (en) | Method and apparatus for exploiting master-detail data relationships to enhance searching operations | |
CN107958023A (en) | Method of data synchronization, data synchronization unit and computer-readable recording medium | |
CN107169003B (en) | Data association method and device | |
CN109614411B (en) | Data storage method, device and storage medium | |
CN110147396B (en) | A method and device for generating a mapping relationship | |
CN114564621A (en) | A method, apparatus, device and readable storage medium for associating data |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||