CN104102710A - Massive data query method - Google Patents
- Publication number
- CN104102710A (application CN201410336964.3A)
- Authority
- CN
- China
- Prior art keywords
- mapping
- index
- rowkey
- hbase
- query
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/24—Querying
- G06F16/245—Query processing
- G06F16/2458—Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries
- G06F16/2471—Distributed queries
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/24—Querying
- G06F16/248—Presentation of query results
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/27—Replication, distribution or synchronisation of data between databases or within a distributed database system; Distributed database system architectures therefor
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- Databases & Information Systems (AREA)
- General Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- Data Mining & Analysis (AREA)
- Computational Linguistics (AREA)
- Computing Systems (AREA)
- Fuzzy Systems (AREA)
- Mathematical Physics (AREA)
- Probability & Statistics with Applications (AREA)
- Software Systems (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The invention discloses a massive data query method comprising: establishing an index mapping between the non-rowkey query fields of HBase and the rowkey; at query time, looking up in SolrCloud, according to the index mapping, the rowkey that corresponds to the query field; and using that rowkey to look up the data in HBase and displaying the query results page by page.
Description
Technical field
The invention relates to the field of big data, and in particular to a massive data query method based on SolrCloud and HBase.
Background
Big data usually refers to the large volumes of unstructured and semi-structured data that a company creates, data that would take too much time and money to load into a relational database for analysis. Big data analysis is often associated with cloud computing, because real-time analysis of large data sets requires frameworks such as MapReduce and HBase to distribute work across tens, hundreds, or even thousands of computers. Compared with traditional data warehouse applications, big data analysis involves larger data volumes and more complex queries and analyses. Big data requires special techniques to process large amounts of data efficiently within a tolerable elapsed time. Technologies applicable to big data include massively parallel processing (MPP) databases, data mining grids, distributed file systems, distributed databases, cloud computing platforms, the Internet, and scalable storage systems.
Solr is a standalone enterprise search server that exposes a web-service-like API. Users can submit XML documents in a prescribed format to the search server over HTTP to build an index, and can issue search requests with HTTP GET and receive results in XML or JSON. SolrCloud is the distributed search solution, based on Solr and ZooKeeper, available from Solr 4.0 onward; in other words, it is a ZooKeeper-based deployment mode of Solr.
HBase is a distributed, column-oriented open-source database whose design derives from the Google paper "Bigtable: A Distributed Storage System for Structured Data" by Fay Chang et al. HBase (Hadoop Database) is a highly reliable, high-performance, column-oriented, scalable distributed storage system with which large structured-storage clusters can be built on inexpensive PC servers. Although HBase supports highly concurrent reads and writes, it has notable shortcomings: because HBase sorts rows only by rowkey (row key), it cannot quickly search or retrieve by fields other than the rowkey. HBase also cannot provide query-based paged display or page-by-page querying. A massive data query method based on SolrCloud and HBase can therefore solve these problems effectively.
Summary of the invention
To solve the above technical problems, the invention provides a massive data query method and device that support flexible multi-condition queries over massive data, fuzzy queries, and pagination of query results.
A massive data query method, comprising:
establishing an index mapping between the non-rowkey query fields of HBase and the rowkey;
at query time, looking up in SolrCloud, according to the index mapping, the rowkey that corresponds to the query field;
using the rowkey to look up the data in HBase, and displaying the query results page by page.
Preferably, when the data in HBase changes, the index mapping in SolrCloud is updated periodically.
Preferably, the index mapping is stored in a distributed manner:
when a master server receives an update to the index mapping, it sends the updated index mapping to the other replica servers of the same shard;
when a replica server receives an update to the index mapping, it sends the updated index mapping to the master server to which it belongs.
Preferably, the MapReduce model is used to speed up building the index mapping.
A massive data query device, comprising:
a mapping module, which establishes an index mapping between the non-rowkey query fields of HBase and the rowkey;
a query module, which, according to the index mapping, first looks up in SolrCloud the HBase rowkey that corresponds to the query field and then uses that rowkey to query the required data in HBase;
a display module, which displays the query results to the user page by page.
Preferably, the device includes an update module, which periodically updates the index mapping in SolrCloud when the data in HBase changes.
Preferably, the device includes a synchronization module, which, when the device acts as a master server, sends the updated index mapping to the other replica servers of the same shard.
Preferably, when the device acts as a replica server, the synchronization module sends the updated index mapping to the master server to which the device belongs after the update module has updated the index mapping.
The technical solution of this application uses SolrCloud to store and maintain the index mapping from the non-rowkey fields that need to be queried in HBase to the rowkey. The corresponding rowkey is found according to the query conditions and is then used to look up the data in HBase, which enables flexible multi-condition queries over massive data, fuzzy queries, and pagination of query results. In addition, SolrCloud is deployed in a distributed manner, which provides centralized information storage, automatic fault tolerance, near-real-time search, and automatic load balancing.
Description of the drawings
The drawings described here provide a further understanding of the invention and form part of this application. The illustrative embodiments of the invention and their descriptions serve to explain the invention and do not unduly limit it. In the drawings:
Figure 1 is a schematic diagram of a Solr + HBase query according to an embodiment of the invention;
Figure 2 is a schematic diagram of the HBase rowkey index mapping according to an embodiment of the invention;
Figure 3 is a schematic diagram of the SolrCloud cluster layout according to an embodiment of the invention;
Figure 4 is a flow chart of the massive data query method according to an embodiment of the invention;
Figure 5 is a structural diagram of the massive data query device according to an embodiment of the invention.
Detailed description
The invention adopts a SolrCloud + HBase approach in which an index mapping to the rowkey is built for specified non-rowkey fields in HBase. A query first finds the rowkey that corresponds to the queried field and then looks the row up in HBase, which avoids the limitation that a direct HBase query supports only a single kind of query condition. Query results can be displayed page by page. The invention thus provides multi-condition queries and result pagination that are convenient and easy to implement, along with full-text indexing and fuzzy query capabilities that plain HBase storage lacks. Count-style statistical requests can be answered directly from the Solr index mapping, without issuing any query to HBase.
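The three capabilities named above (multi-condition queries, fuzzy matching, and counts answered from the index alone) can be illustrated with a minimal in-memory sketch. It does not use Solr; the `INDEX` list, the field names, and the `search` and `count` helpers are hypothetical stand-ins, and shell-style wildcards from Python's `fnmatch` module merely approximate a fuzzy query.

```python
from fnmatch import fnmatchcase

# Hypothetical index documents: one per HBase row, identified by its rowkey.
INDEX = [
    {"rowkey": "r001", "city": "Beijing", "name": "zhang wei"},
    {"rowkey": "r002", "city": "Beijing", "name": "zhao lei"},
    {"rowkey": "r003", "city": "Shanghai", "name": "zhang min"},
]


def search(conditions):
    """Return the rowkeys of documents that match every field pattern (AND)."""
    return [
        doc["rowkey"]
        for doc in INDEX
        if all(
            fnmatchcase(doc.get(field, ""), pattern)
            for field, pattern in conditions.items()
        )
    ]


def count(conditions):
    """Answer a count request from the index alone, without touching HBase."""
    return len(search(conditions))
```

Here `search({"city": "Beijing", "name": "zhang*"})` combines an exact condition with a wildcard condition, while `count` never needs the row data itself.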
The invention is described in detail below with reference to the drawings and specific embodiments.
Every record in HBase is indexed in sorted order by rowkey. As depicted in Figure 2, the index is multi-level, and locating a row resembles a lookup in a three-level B+ tree. First, the location of the root region is obtained from ZooKeeper, and the -ROOT- region is loaded. The -ROOT- region is the first region of the .META. table and stores the location information of all other regions of the .META. table. The .META. table is resident in the memory of all RegionServers and stores the region location information of all data tables. A rowkey query in HBase therefore consults -ROOT- and then .META. to locate the region that holds the data, and then retrieves the data from that region. Because the index at each step is already built and sorted, rowkey-based queries in HBase are very efficient. Queries on non-rowkey fields, however, are significantly less efficient.
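The multi-level lookup described above can be sketched as repeated binary searches over sorted start keys. This is only a toy model under stated assumptions: the region names, start keys, and the `locate` and `get` helpers are hypothetical, and the ZooKeeper step is reduced to already holding the `ROOT` list.

```python
from bisect import bisect_right


def locate(sorted_entries, key):
    """Return the value of the last entry whose start key is <= key."""
    starts = [start for start, _ in sorted_entries]
    return sorted_entries[bisect_right(starts, key) - 1][1]


# Level 1: -ROOT- maps start keys to .META. regions (hypothetical names).
ROOT = [("", "meta-1"), ("m", "meta-2")]

# Level 2: each .META. region maps start keys to data regions.
META = {
    "meta-1": [("", "region-a"), ("g", "region-b")],
    "meta-2": [("m", "region-c"), ("t", "region-d")],
}

# Level 3: each data region holds the rows of its key range.
REGIONS = {
    "region-a": {"apple": 1},
    "region-b": {"grape": 2},
    "region-c": {"mango": 3},
    "region-d": {"tomato": 4},
}


def get(rowkey):
    """Walk -ROOT-, then .META., then read the row from its data region."""
    meta_region = locate(ROOT, rowkey)
    data_region = locate(META[meta_region], rowkey)
    return data_region, REGIONS[data_region].get(rowkey)
```

Every step is a search over sorted keys, which is why rowkey lookups stay cheap. A predicate on any other column has no such sorted structure to exploit and would have to scan rows.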
To solve this problem, the invention uses SolrCloud to build, in advance, a set of index mappings from the non-rowkey query fields to their corresponding rowkeys. The query process is shown in Figure 1: the HBase rowkey that corresponds to the query condition is first looked up in SolrCloud, that rowkey is then used to query the data in HBase, and finally the query result is returned to the client. This approach can greatly improve query efficiency.
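The two-phase flow of Figure 1 can be sketched with plain dictionaries. This is an illustrative model, not the patented implementation: `HBASE_TABLE` stands in for the HBase table, `CITY_INDEX` for the mapping held in SolrCloud, and all names and sample rows are assumptions.

```python
# Hypothetical in-memory stand-ins; a real system would call SolrCloud and HBase.
HBASE_TABLE = {
    "r001": {"name": "zhang wei", "city": "Beijing"},
    "r002": {"name": "li na", "city": "Shanghai"},
    "r003": {"name": "wang fang", "city": "Beijing"},
}


def build_index(table, field):
    """Map each value of a non-rowkey field to the rowkeys that carry it."""
    index = {}
    for rowkey in sorted(table):
        index.setdefault(table[rowkey][field], []).append(rowkey)
    return index


CITY_INDEX = build_index(HBASE_TABLE, "city")


def query_by_city(city):
    """Phase 1: resolve rowkeys in the index. Phase 2: fetch rows by rowkey."""
    rowkeys = CITY_INDEX.get(city, [])
    return [HBASE_TABLE[rowkey] for rowkey in rowkeys]
```

The expensive part of a non-rowkey query, finding which rows match, is handled entirely by the index, so HBase only ever serves direct rowkey reads.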
SolrCloud is deployed in a distributed manner, which provides centralized information storage, automatic fault tolerance, near-real-time search, and automatic load balancing. Figure 3 shows a SolrCloud cluster with six nodes (servers). The index mapping is distributed across two shards, and each shard contains three Solr nodes: one leader (master) node and two replica nodes. Each shard therefore exists as three copies, so the system can still operate normally when two nodes go down at the same time. All state information of the cluster is maintained centrally by the ZooKeeper cluster. Any of the six nodes can accept an index mapping update request, which achieves load balancing. For example, when Server4 receives an update request for the index mapping of Shard1, it forwards the information to the leader node that owns that index mapping, namely Server1. After Server1 finishes the update, it sends the version number and the index mapping to the other replica nodes of the same shard, namely Server2 and Server3, to complete the synchronization.
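The forwarding and synchronization sequence in the Server4 example can be modelled in a few lines. The sketch below makes several assumptions that the text does not state: shards are chosen by a CRC32 hash of the rowkey, Server4 is taken to be the leader of the second shard, and the `Node`, `Shard`, and `Cluster` classes are hypothetical. No actual SolrCloud API is involved.

```python
import zlib


class Node:
    def __init__(self, name):
        self.name = name
        self.index = {}  # rowkey -> (version, document)


class Shard:
    def __init__(self, leader, replicas):
        self.leader = leader
        self.replicas = replicas
        self.version = 0


class Cluster:
    def __init__(self, shards):
        self.shards = shards

    def shard_for(self, rowkey):
        # Assumed routing rule: stable hash of the rowkey modulo shard count.
        return self.shards[zlib.crc32(rowkey.encode()) % len(self.shards)]

    def update(self, receiving_node, rowkey, document):
        """Accept an update on any node and return the path it travelled."""
        shard = self.shard_for(rowkey)
        path = [receiving_node.name]
        if receiving_node is not shard.leader:
            path.append(shard.leader.name)  # forward to the owning leader
        shard.version += 1
        entry = (shard.version, document)
        shard.leader.index[rowkey] = entry
        for replica in shard.replicas:  # leader pushes version and entry
            replica.index[rowkey] = entry
        return path


servers = {f"Server{i}": Node(f"Server{i}") for i in range(1, 7)}
cluster = Cluster([
    Shard(servers["Server1"], [servers["Server2"], servers["Server3"]]),
    Shard(servers["Server4"], [servers["Server5"], servers["Server6"]]),
])
```

Because every node can resolve the owning leader, clients may send updates to whichever node is convenient, which is the load-balancing property the paragraph describes.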
The massive data query method provided by the invention, as shown in Figure 4, includes:
Step 401: establish an index mapping between the non-rowkey query fields of HBase and the rowkey.
When the HBase data is created, SolrCloud is used to build the index mapping between non-rowkey fields and the rowkey according to the configured query conditions. The query conditions are defined on the non-rowkey fields of HBase.
While the Solr index mapping is being built, the MapReduce model can be used to speed up the process.
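How a MapReduce job could build the mapping is sketched below in a single process. The map step emits one `((field, value), rowkey)` pair per indexed field, the shuffle groups pairs by key, and the reduce step collects the rowkeys for each key. The function names and the representation of the index as a dictionary are assumptions made for illustration; a real job would run on Hadoop and write documents to Solr.

```python
from collections import defaultdict


def map_phase(rowkey, row, indexed_fields):
    """Emit ((field, value), rowkey) for every indexed field present in the row."""
    for field in indexed_fields:
        if field in row:
            yield (field, row[field]), rowkey


def reduce_phase(key, rowkeys):
    """Collect all rowkeys that share one (field, value) pair."""
    return key, sorted(rowkeys)


def build_index(table, indexed_fields):
    """Run map, shuffle, and reduce in one process over an in-memory table."""
    shuffled = defaultdict(list)
    for rowkey, row in table.items():
        for key, value in map_phase(rowkey, row, indexed_fields):
            shuffled[key].append(value)
    return dict(reduce_phase(key, values) for key, values in shuffled.items())
```

The map step touches each row independently, so on a cluster it can run in parallel across regions of the table, which is where the speed-up comes from.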
Step 402: at query time, look up the corresponding rowkey in SolrCloud according to the index mapping.
When a query is needed, the HBase rowkey that corresponds to the query condition is first looked up in SolrCloud according to the index mapping, and that rowkey is then used to query the required data in HBase.
Preferably, the index mapping is stored in a distributed manner:
when a master server receives an update to the index mapping, it sends the updated index mapping to the other replica servers of the same shard;
when a replica server receives an update to the index mapping, it sends the updated index mapping to the master server to which it belongs.
Step 403: display the query results to the user page by page.
After the data has been obtained from HBase by rowkey, it is displayed to the user according to the configured paging scheme.
In HBase's native approach, query results cannot be displayed page by page, so a user can only view the full result set. The invention improves on this by paging the results, for example 20 items per page, so that the user can take in the displayed items at a glance.
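Paging follows naturally once the matching rowkeys are known: only the rowkeys of the requested page are read from HBase. The sketch below assumes a hypothetical `fetch_page` helper and an in-memory `table`; the default of 20 items per page mirrors the example in the text, and the slice plays the role that offset and page-size parameters would play in a search request.

```python
def fetch_page(matching_rowkeys, table, page, page_size=20):
    """Return the rows of one 1-based page and the total number of pages."""
    if page < 1 or page_size < 1:
        raise ValueError("page and page_size must be positive")
    start = (page - 1) * page_size
    page_keys = matching_rowkeys[start:start + page_size]
    rows = [table[rowkey] for rowkey in page_keys]
    total_pages = -(-len(matching_rowkeys) // page_size)  # ceiling division
    return rows, total_pages
```

Only `page_size` rows are fetched per request, regardless of how many rowkeys matched the query.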
Preferably, the method may further include: when the data in HBase changes, periodically updating the index mapping in SolrCloud.
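One way to realize the periodic update is to record which rowkeys changed and re-index only those on each scheduled run. The source does not say how changes are detected, so the dirty-set approach, the `IndexSyncer` class, and its method names are all assumptions of this sketch.

```python
class IndexSyncer:
    """Track changed rows and apply them to the index on each periodic sync."""

    def __init__(self, table, indexed_fields):
        self.table = table
        self.indexed_fields = indexed_fields
        self.index = {}     # rowkey -> indexed field values
        self.dirty = set()  # rowkeys changed since the last sync

    def put(self, rowkey, row):
        self.table[rowkey] = row
        self.dirty.add(rowkey)

    def delete(self, rowkey):
        self.table.pop(rowkey, None)
        self.dirty.add(rowkey)

    def sync(self):
        """Apply pending changes; intended to be called by a scheduler."""
        changed = len(self.dirty)
        for rowkey in self.dirty:
            row = self.table.get(rowkey)
            if row is None:
                self.index.pop(rowkey, None)  # row was deleted
            else:
                self.index[rowkey] = {
                    field: row[field]
                    for field in self.indexed_fields
                    if field in row
                }
        self.dirty.clear()
        return changed
```

Between two sync runs the index lags behind the table, which is the trade-off of a periodic update compared with a synchronous one.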
The invention also provides a corresponding massive data query device, as shown in Figure 5, including:
a mapping module, which establishes an index mapping between the non-rowkey query fields of HBase and the rowkey;
a query module, which, according to the index mapping, first looks up in SolrCloud the HBase rowkey that corresponds to the query condition and then uses that rowkey to query the required data in HBase;
a display module, which displays the query results to the user page by page.
Preferably, the device further includes an update module, which periodically updates the index mapping in SolrCloud when the data in HBase changes.
The massive data query device of the invention can serve as a server node. As shown in Figure 3, it can be installed on every server so that the servers together form a cluster.
Preferably, the device further includes a synchronization module. When the device acts as a replica server, the synchronization module sends the updated index mapping to the master server to which the device belongs after the update module has updated the index mapping.
Preferably, when the device acts as a master server, the synchronization module sends the updated index mapping to the other replica servers of the same shard.
Application example
1. Definition and configuration of the Solr schema file
Modify the schema.xml file to add the fields that need to be indexed. Also change the original uniqueKey so that the rowkey of the HBase table becomes the Solr uniqueKey.
2. Building the index mapping
Build the Solr index for the data in HBase either by a full table scan (Scan) through the HBase API or by a MapReduce job.
3. Implementing query and paging
At query time, Solr is searched for the rowkey or set of rowkeys that corresponds to the query conditions. Once these rowkeys have been obtained, they are used in groups to query HBase, which retrieves the actual results and implements paged lookup.
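Using the rowkeys "in groups" can be read as splitting them into fixed-size batches and issuing one batched read per group. The sketch below follows that reading; the batch size of 100, the `multi_get` stand-in for a batched HBase read, and the other helper names are assumptions.

```python
def batches(items, size):
    """Yield consecutive groups of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def multi_get(table, rowkeys):
    """Stand-in for one batched read request against HBase."""
    return [table[rowkey] for rowkey in rowkeys if rowkey in table]


def fetch_all(table, rowkeys, batch_size=100):
    """Fetch rows group by group and report how many requests were issued."""
    rows, requests = [], 0
    for group in batches(rowkeys, batch_size):
        rows.extend(multi_get(table, group))
        requests += 1
    return rows, requests
```

Grouping keeps each request bounded in size while avoiding one round trip per rowkey, and a single group can double as one page of results.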
Those of ordinary skill in the art will understand that all or some of the steps of the above method can be carried out by a program that instructs the relevant hardware, and that the program can be stored in a computer-readable storage medium such as a read-only memory, a magnetic disk, or an optical disc. Optionally, all or some of the steps of the above embodiments can also be implemented with one or more integrated circuits. Accordingly, each module or unit in the above embodiments can be implemented in hardware or as a software function module. This application is not limited to any particular combination of hardware and software.
The above are merely preferred examples of the invention and are not intended to limit its scope of protection. Any modification, equivalent replacement, or improvement made within the spirit and principles of the invention shall fall within its scope of protection.
Claims (8)
Priority Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN201410336964.3A CN104102710A (en) | 2014-07-15 | 2014-07-15 | Massive data query method |
Applications Claiming Priority (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN201410336964.3A CN104102710A (en) | 2014-07-15 | 2014-07-15 | Massive data query method |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| CN104102710A (en) | 2014-10-15 |
Family
ID=51670864
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| CN201410336964.3A Pending CN104102710A (en) | 2014-07-15 | 2014-07-15 | Massive data query method |
Country Status (1)
| Country | Link |
|---|---|
| CN (1) | CN104102710A (en) |
Cited By (35)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN104731945A (en) * | 2015-03-31 | 2015-06-24 | 浪潮集团有限公司 | Full-text searching method and device based on HBase |
| CN104834730A (en) * | 2015-05-15 | 2015-08-12 | 北京京东尚科信息技术有限公司 | Data analysis system and method |
| CN104951509A (en) * | 2015-05-25 | 2015-09-30 | 中国科学院信息工程研究所 | Big data online interactive query method and system |
| CN105095458A (en) * | 2015-07-29 | 2015-11-25 | 南威软件股份有限公司 | Method for big data retrieval based on time characteristics and supporting complicated conditions |
| CN105320746A (en) * | 2015-09-25 | 2016-02-10 | 北京北信源软件股份有限公司 | Big data based index acquisition method and system |
| CN105787058A (en) * | 2016-02-26 | 2016-07-20 | 广州品唯软件有限公司 | User label system and data pushing system based on same |
| CN105989117A (en) * | 2015-02-13 | 2016-10-05 | 中国移动通信集团山西有限公司 | Method and system for rapidly and jointly processing semi-structured data |
| CN106202490A (en) * | 2016-07-19 | 2016-12-07 | 浪潮电子信息产业股份有限公司 | A kind of SolrCloud configuration file amending method, Apparatus and system |
| CN106326429A (en) * | 2016-08-25 | 2017-01-11 | 武汉光谷信息技术股份有限公司 | Hbase second-level query scheme based on solr |
| CN106326309A (en) * | 2015-07-03 | 2017-01-11 | 阿里巴巴集团控股有限公司 | Data query method and device |
| CN106446145A (en) * | 2016-09-21 | 2017-02-22 | 郑州云海信息技术有限公司 | Quick creation method based on Hadoop for big data index |
| CN106528051A (en) * | 2016-11-15 | 2017-03-22 | 国云科技股份有限公司 | High-efficiency operation method for queuing and stacking big data based on MongoDB |
| CN106649828A (en) * | 2016-12-29 | 2017-05-10 | 中国银联股份有限公司 | Data query method and system |
| CN106682148A (en) * | 2016-12-22 | 2017-05-17 | 北京锐安科技有限公司 | Method and device based on Solr data search |
| CN106844374A (en) * | 2015-12-04 | 2017-06-13 | 北京四维图新科技股份有限公司 | A kind of storage, the method and device of retrieval photo |
| CN106909671A (en) * | 2017-02-28 | 2017-06-30 | 湖南蚁坊软件股份有限公司 | A kind of method and system of NoSQL databases condition query |
| CN107038207A (en) * | 2017-02-20 | 2017-08-11 | 阿里巴巴集团控股有限公司 | A kind of data query method, data processing method and device |
| CN107239517A (en) * | 2017-05-23 | 2017-10-10 | 中国联合网络通信集团有限公司 | Many condition searching method and device based on Hbase databases |
| CN107291964A (en) * | 2017-08-16 | 2017-10-24 | 南京华飞数据技术有限公司 | A kind of method that fuzzy query is realized based on HBase |
| CN107515867A (en) * | 2016-06-15 | 2017-12-26 | 阿里巴巴集团控股有限公司 | The generation method and device that data storage, querying method and the device and a kind of rowKey of a kind of NoSQL databases combine entirely |
| CN107704475A (en) * | 2016-08-10 | 2018-02-16 | 泰康保险集团股份有限公司 | Multilayer distributed unstructured data storage method, querying method and device |
| CN108153805A (en) * | 2017-11-17 | 2018-06-12 | 广东睿江云计算股份有限公司 | A kind of method, the system of efficient cleaning Hbase time series datas |
| CN108319636A (en) * | 2017-11-27 | 2018-07-24 | 大象慧云信息技术有限公司 | Electronic invoice data querying method |
| CN108628893A (en) * | 2017-03-21 | 2018-10-09 | 华为技术有限公司 | Metadata access method and storage device in a kind of storage device |
| CN109144995A (en) * | 2017-06-26 | 2019-01-04 | 辽宁艾特斯智能交通技术有限公司 | A kind of highway magnanimity transaction data search method |
| CN109271437A (en) * | 2018-09-27 | 2019-01-25 | 智庭(北京)智能科技有限公司 | A kind of Query method in real time of magnanimity rent information |
| CN110297832A (en) * | 2019-07-01 | 2019-10-01 | 联想(北京)有限公司 | A kind of time series data storage method and device, time series data querying method and device |
| CN110362549A (en) * | 2019-06-17 | 2019-10-22 | 平安普惠企业管理有限公司 | Log memory search method, electronic device and computer equipment |
| CN110555021A (en) * | 2018-03-26 | 2019-12-10 | 深圳先进技术研究院 | Data storage method, query method and related device |
| CN110765132A (en) * | 2019-10-22 | 2020-02-07 | 北京思特奇信息技术股份有限公司 | Data storage and retrieval method and device based on HBase |
| CN111797134A (en) * | 2020-06-23 | 2020-10-20 | 北京小米松果电子有限公司 | Data query method, device and storage medium for distributed database |
| CN112148731A (en) * | 2020-08-13 | 2020-12-29 | 新华三大数据技术有限公司 | Data paging query method, device and storage medium |
| CN112632157A (en) * | 2021-03-11 | 2021-04-09 | 全时云商务服务股份有限公司 | Multi-condition paging query method under distributed system |
| CN112687364A (en) * | 2020-12-24 | 2021-04-20 | 宁波金唐软件有限公司 | Hbase-based medical data management method and system |
| WO2023143095A1 (en) * | 2022-01-25 | 2023-08-03 | Zhejiang Dahua Technology Co., Ltd. | Method and system for data query |
Citations (3)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| EP2610759A1 (en) * | 2010-08-26 | 2013-07-03 | Tencent Technology (Shenzhen) Company Limited | Method and apparatus for managing massive data messages |
| CN103399887A (en) * | 2013-07-19 | 2013-11-20 | 蓝盾信息安全技术股份有限公司 | Query and statistical analysis system for mass logs |
| CN103701633A (en) * | 2013-12-09 | 2014-04-02 | 国家电网公司 | Setup and maintenance system of visual cluster application for distributed search SolrCloud |
-
2014
- 2014-07-15 CN CN201410336964.3A patent/CN104102710A/en active Pending
Non-Patent Citations (1)
| Title |
|---|
| MR.CHENZ: "基于Solr的HBase多条件查询" (Solr-based multi-condition query for HBase), 《博客园》 (cnblogs) * |
Cited By (50)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN105989117A (en) * | 2015-02-13 | 2016-10-05 | 中国移动通信集团山西有限公司 | Method and system for rapidly and jointly processing semi-structured data |
| CN104731945A (en) * | 2015-03-31 | 2015-06-24 | 浪潮集团有限公司 | Full-text searching method and device based on HBase |
| CN104731945B (en) * | 2015-03-31 | 2018-04-06 | 浪潮集团有限公司 | A kind of text searching method and device based on HBase |
| CN104834730A (en) * | 2015-05-15 | 2015-08-12 | 北京京东尚科信息技术有限公司 | Data analysis system and method |
| CN104834730B (en) * | 2015-05-15 | 2018-06-01 | 北京京东尚科信息技术有限公司 | data analysis system and method |
| CN104951509A (en) * | 2015-05-25 | 2015-09-30 | 中国科学院信息工程研究所 | Big data online interactive query method and system |
| CN106326309B (en) * | 2015-07-03 | 2020-02-21 | 阿里巴巴集团控股有限公司 | Data query method and device |
| CN106326309A (en) * | 2015-07-03 | 2017-01-11 | 阿里巴巴集团控股有限公司 | Data query method and device |
| CN105095458A (en) * | 2015-07-29 | 2015-11-25 | 南威软件股份有限公司 | Method for big data retrieval based on time characteristics and supporting complicated conditions |
| CN105320746A (en) * | 2015-09-25 | 2016-02-10 | 北京北信源软件股份有限公司 | Big data based index acquisition method and system |
| CN106844374B (en) * | 2015-12-04 | 2020-04-03 | 北京四维图新科技股份有限公司 | A method and device for storing and retrieving photos |
| CN106844374A (en) * | 2015-12-04 | 2017-06-13 | 北京四维图新科技股份有限公司 | A kind of storage, the method and device of retrieval photo |
| CN105787058A (en) * | 2016-02-26 | 2016-07-20 | 广州品唯软件有限公司 | User label system and data pushing system based on same |
| CN105787058B (en) * | 2016-02-26 | 2019-08-02 | 广州品唯软件有限公司 | A kind of user tag system and the data delivery system based on user tag system |
| CN107515867B (en) * | 2016-06-15 | 2021-06-29 | 阿里巴巴集团控股有限公司 | Data storage and query method and device of NoSQL database and generation method and device of rowKey full combination |
| CN107515867A (en) * | 2016-06-15 | 2017-12-26 | 阿里巴巴集团控股有限公司 | The generation method and device that data storage, querying method and the device and a kind of rowKey of a kind of NoSQL databases combine entirely |
| CN106202490A (en) * | 2016-07-19 | 2016-12-07 | 浪潮电子信息产业股份有限公司 | A kind of SolrCloud configuration file amending method, Apparatus and system |
| CN107704475A (en) * | 2016-08-10 | 2018-02-16 | 泰康保险集团股份有限公司 | Multilayer distributed unstructured data storage method, querying method and device |
| CN106326429A (en) * | 2016-08-25 | 2017-01-11 | 武汉光谷信息技术股份有限公司 | Hbase second-level query scheme based on solr |
| CN106446145A (en) * | 2016-09-21 | 2017-02-22 | 郑州云海信息技术有限公司 | Quick creation method based on Hadoop for big data index |
| CN106528051A (en) * | 2016-11-15 | 2017-03-22 | 国云科技股份有限公司 | High-efficiency operation method for queuing and stacking big data based on MongoDB |
| CN106528051B (en) * | 2016-11-15 | 2019-02-19 | 国云科技股份有限公司 | The method of big data queue stack manipulation based on MongoDB |
| CN106682148A (en) * | 2016-12-22 | 2017-05-17 | 北京锐安科技有限公司 | Method and device based on Solr data search |
| CN106649828B (en) * | 2016-12-29 | 2019-12-24 | 中国银联股份有限公司 | A data query method and system |
| CN106649828A (en) * | 2016-12-29 | 2017-05-10 | 中国银联股份有限公司 | Data query method and system |
| CN107038207B (en) * | 2017-02-20 | 2021-03-19 | 创新先进技术有限公司 | A data query method, data processing method and device |
| CN107038207A (en) * | 2017-02-20 | 2017-08-11 | 阿里巴巴集团控股有限公司 | Data query method, data processing method and device |
| CN106909671A (en) * | 2017-02-28 | 2017-06-30 | 湖南蚁坊软件股份有限公司 | Method and system for conditional queries on NoSQL databases |
| CN108628893A (en) * | 2017-03-21 | 2018-10-09 | 华为技术有限公司 | Metadata access method in a storage device, and storage device |
| CN107239517A (en) * | 2017-05-23 | 2017-10-10 | 中国联合网络通信集团有限公司 | Multi-condition search method and device based on Hbase database |
| CN107239517B (en) * | 2017-05-23 | 2020-09-29 | 中国联合网络通信集团有限公司 | Multi-condition search method and device based on Hbase database |
| CN109144995A (en) * | 2017-06-26 | 2019-01-04 | 辽宁艾特斯智能交通技术有限公司 | Method for searching mass transaction data on highway |
| CN109144995B (en) * | 2017-06-26 | 2022-09-13 | 辽宁艾特斯智能交通技术有限公司 | Method for searching mass transaction data on highway |
| CN107291964B (en) * | 2017-08-16 | 2019-11-15 | 南京华飞数据技术有限公司 | A Method of Realizing Fuzzy Query Based on HBase |
| CN107291964A (en) * | 2017-08-16 | 2017-10-24 | 南京华飞数据技术有限公司 | Method for implementing fuzzy query based on HBase |
| CN108153805A (en) * | 2017-11-17 | 2018-06-12 | 广东睿江云计算股份有限公司 | Method and system for efficiently cleaning Hbase time series data |
| CN108319636A (en) * | 2017-11-27 | 2018-07-24 | 大象慧云信息技术有限公司 | Electronic invoice data querying method |
| CN110555021A (en) * | 2018-03-26 | 2019-12-10 | 深圳先进技术研究院 | Data storage method, query method and related device |
| CN110555021B (en) * | 2018-03-26 | 2023-09-19 | 深圳先进技术研究院 | Data storage method, query method and related devices |
| CN109271437A (en) * | 2018-09-27 | 2019-01-25 | 智庭(北京)智能科技有限公司 | Real-time query method for massive rental information |
| CN110362549A (en) * | 2019-06-17 | 2019-10-22 | 平安普惠企业管理有限公司 | Log memory search method, electronic device and computer equipment |
| CN110297832A (en) * | 2019-07-01 | 2019-10-01 | 联想(北京)有限公司 | Time series data storage method and device, and time series data query method and device |
| CN110297832B (en) * | 2019-07-01 | 2021-12-24 | 联想(北京)有限公司 | Time sequence data storage method and device and time sequence data query method and device |
| CN110765132A (en) * | 2019-10-22 | 2020-02-07 | 北京思特奇信息技术股份有限公司 | Data storage and retrieval method and device based on HBase |
| CN111797134A (en) * | 2020-06-23 | 2020-10-20 | 北京小米松果电子有限公司 | Data query method, device and storage medium for distributed database |
| CN112148731B (en) * | 2020-08-13 | 2022-05-27 | 新华三大数据技术有限公司 | Data paging query method, device and storage medium |
| CN112148731A (en) * | 2020-08-13 | 2020-12-29 | 新华三大数据技术有限公司 | Data paging query method, device and storage medium |
| CN112687364A (en) * | 2020-12-24 | 2021-04-20 | 宁波金唐软件有限公司 | Hbase-based medical data management method and system |
| CN112632157A (en) * | 2021-03-11 | 2021-04-09 | 全时云商务服务股份有限公司 | Multi-condition paging query method under distributed system |
| WO2023143095A1 (en) * | 2022-01-25 | 2023-08-03 | Zhejiang Dahua Technology Co., Ltd. | Method and system for data query |
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| CN104102710A (en) | Massive data query method | |
| US11816126B2 (en) | Large scale unstructured database systems | |
| Tauro et al. | Comparative study of the new generation, agile, scalable, high performance NOSQL databases | |
| US9081837B2 (en) | Scoped database connections | |
| CN108431804B (en) | Ability to group multiple container databases into a single container database cluster | |
| US8244743B2 (en) | Scalable rendering of large spatial databases | |
| US8977646B2 (en) | Leveraging graph databases in a federated database system | |
| US9805137B2 (en) | Virtualizing schema relations over a single database relation | |
| CN111767303A (en) | A data query method, device, server and readable storage medium | |
| US10733172B2 (en) | Method and computing device for minimizing accesses to data storage in conjunction with maintaining a B-tree | |
| CN107506464A (en) | Method for implementing HBase secondary indexes based on ES | |
| Borkar et al. | Have your data and query it too: From key-value caching to big data management | |
| CN105824868A (en) | Distributed type database data processing method and distributed type database system | |
| CN107480252A (en) | Data query method, client, server and system | |
| Srivastava et al. | Analysis of various NoSql database | |
| CN101963993B (en) | Method for fast searching database sheet table record | |
| CN105117442B (en) | Probability-based big data query method | |
| Das et al. | A study on big data integration with data warehouse | |
| CN111221785A (en) | A Semantic Data Lake Construction Method for Multi-source Heterogeneous Data | |
| Anand et al. | MongoDB and Oracle NoSQL: A technical critique for design decisions | |
| Ramanathan et al. | Comparison of cloud database: Amazon's SimpleDB and Google's Bigtable | |
| CN109388659B (en) | Data storage method, apparatus and computer readable storage medium | |
| CN105930354B (en) | Storage model conversion method and device | |
| JP6371136B2 (en) | Data virtualization server, query processing method and query processing program in data virtualization server | |
| CN115168361A (en) | Label management method and device |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| C06 | Publication | ||
| PB01 | Publication | ||
| C10 | Entry into substantive examination | ||
| SE01 | Entry into force of request for substantive examination | ||
| WD01 | Invention patent application deemed withdrawn after publication | | Application publication date: 20141015 |