CN110019204A - HDFS-oriented split-internal indexing method and apparatus - Google Patents
HDFS-oriented split-internal indexing method and apparatus
- Publication number
- CN110019204A
- Authority
- CN
- China
- Prior art keywords
- index
- attributes
- split
- index attributes
- nonclustered
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/10—File systems; File servers
- G06F16/18—File system types
- G06F16/182—Distributed file systems
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/22—Indexing; Data structures therefor; Storage structures
- G06F16/2228—Indexing structures
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/24—Querying
- G06F16/245—Query processing
- G06F16/2458—Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries
- G06F16/2471—Distributed queries
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Databases & Information Systems (AREA)
- General Engineering & Computer Science (AREA)
- Data Mining & Analysis (AREA)
- Software Systems (AREA)
- Fuzzy Systems (AREA)
- Mathematical Physics (AREA)
- Probability & Statistics with Applications (AREA)
- Computational Linguistics (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
Embodiments of the present invention provide an HDFS-oriented split-internal indexing method and apparatus, belonging to the field of data retrieval. The method comprises: receiving a query request; determining index attributes according to the query request; determining a split, according to the index attribute values of the index attributes, by means of a pre-established clustered index or non-clustered index; and loading the determined split to obtain the data corresponding to the query request. With this technical solution, the pre-established clustered or non-clustered index is used to determine, from the query request, which split should be loaded. This reduces the disk I/O generated by unnecessary data scanning and improves HDFS query speed.
Description
Technical field
The present invention relates to the field of data retrieval, and more particularly to an HDFS-oriented split-internal indexing method and apparatus.
Background
HDFS (Hadoop Distributed File System), the underlying foundation of the Hadoop ecosystem, is typically used to store offline data and, in combination with Map/Reduce, to process analytical queries. For selective and interactive queries with relatively strict response-time requirements, however, its performance falls short.
In traditional database management, the most common way to speed up query processing is indexing. An index quickly filters out data that does not satisfy the query, which greatly reduces I/O, narrows the search range, and shortens the response time. Traditional indexing techniques, however, cannot be applied directly to queries on HDFS.
In HDFS, a table file is divided into multiple splits for processing, and each split contains a large number of records. If every record is scanned during a query, a large amount of disk I/O is generated and query efficiency drops.
Summary of the invention
The purpose of the embodiments of the present invention is to provide an HDFS-oriented split-internal indexing method and apparatus that solve the problem of high I/O overhead.
To achieve this purpose, an embodiment of the present invention provides an HDFS-oriented split-internal indexing method, the method comprising: receiving a query request; determining index attributes according to the query request; determining a split, according to the index attribute values of the index attributes, by means of a pre-established clustered index or non-clustered index; and loading the determined split to obtain data corresponding to the query request.
Preferably, determining the split, according to the index attribute values of the index attributes, by means of the pre-established clustered index or non-clustered index comprises: when the index attributes comprise only one attribute, determining the split by means of the clustered index; and when the index attributes comprise multiple attributes, determining the split by means of the non-clustered index.
Preferably, the clustered index is established as follows: the index attribute values of one index attribute are sorted, and the clustered index is established based on the sorted index attribute values.
Preferably, the non-clustered index is established as follows: the index attribute values of a first attribute among multiple index attributes are sorted, and a clustered index is established based on the sorted index attribute values; and a non-clustered index is established for the attributes other than the first attribute among the multiple index attributes.
Preferably, the method further comprises: comparing the range of index attribute values of the index attribute determined from the query request with the range of index attribute values of the corresponding index attribute in the non-clustered index, and judging whether the two ranges intersect; if they intersect, loading the data of the split corresponding to the intersecting portion of the index attribute values; and if they do not intersect, discarding the split corresponding to the index attribute values of the corresponding index attribute in the non-clustered index.
Correspondingly, an embodiment of the present invention provides an HDFS-oriented split-internal indexing apparatus, the apparatus comprising: a receiving module configured to receive a query request; a processing module configured to determine index attributes according to the query request and to determine a split, according to the index attributes, by means of a pre-established clustered index or non-clustered index; and a loading module configured to load the determined split to obtain data corresponding to the query request.
Preferably, the processing module is further configured to: when the index attributes comprise only one attribute, determine the split by means of the clustered index; and when the index attributes comprise multiple attributes, determine the split by means of the non-clustered index.
Preferably, the clustered index is established as follows: the index attribute values of one index attribute are sorted, and the clustered index is established based on the sorted index attribute values.
Preferably, the non-clustered index is established as follows: the index attribute values of a first attribute among multiple index attributes are sorted, and a clustered index is established based on the sorted index attribute values; and a non-clustered index is established for the attributes other than the first attribute among the multiple index attributes.
Preferably, the processing module is further configured to compare the range of index attribute values of the index attribute determined from the query request with the range of index attribute values of the corresponding index attribute in the non-clustered index, and to judge whether the two ranges intersect; and the loading module is further configured to: if the ranges intersect, load the data of the split corresponding to the intersecting portion of the index attribute values; and if they do not intersect, discard the split corresponding to the index attribute values of the corresponding index attribute in the non-clustered index.
With the above technical solution, the present invention uses the pre-established clustered or non-clustered index to determine, from the query request, which split should be loaded. This reduces the disk I/O generated by unnecessary data scanning and improves HDFS query speed.
Other features and advantages of the embodiments of the present invention are described in detail in the detailed description section below.
Brief description of the drawings
The accompanying drawings provide a further understanding of the embodiments of the present invention and constitute a part of the specification. Together with the detailed description below, they serve to explain the embodiments of the present invention but do not limit them. In the drawings:
Fig. 1 is a flowchart of the HDFS-oriented split-internal indexing method provided by the present invention;
Fig. 2 is a diagram of the clustered index structure provided by the present invention;
Fig. 3 is a flowchart of establishing the clustered index provided by the present invention;
Fig. 4 is a flowchart of query processing with the clustered index provided by the present invention;
Fig. 5 is a diagram of the non-clustered index structure provided by the present invention;
Fig. 6 is a flowchart of query processing with the non-clustered index provided by the present invention; and
Fig. 7 is a block diagram of the HDFS-oriented split-internal indexing apparatus provided by the present invention.
Detailed description
Specific implementations of the embodiments of the present invention are described in detail below with reference to the accompanying drawings. It should be understood that the specific implementations described here serve only to illustrate and explain the embodiments of the present invention and are not intended to limit them.
Fig. 1 is a flowchart of the HDFS-oriented split-internal indexing method provided by the present invention. As shown in Fig. 1, the method includes:
Step 101, receive a query request.
Step 102, determine index attributes according to the query request.
Step 103, determine a split, according to the index attribute values of the index attributes, by means of a pre-established clustered index or non-clustered index.
Step 104, load the determined split to obtain data corresponding to the query request.
The clustered index and the non-clustered index are both established in advance. The clustered index is used when the index attributes determined from the query request comprise only one attribute; the non-clustered index is used when they comprise multiple attributes. Accordingly, determining the split in step 103 by means of the pre-established clustered or non-clustered index includes: when the index attributes comprise only one attribute, determining the split by means of the clustered index; and when the index attributes comprise multiple attributes, determining the split by means of the non-clustered index.
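The selection rule of step 103 can be sketched in Python as follows. This is only a minimal illustration under stated assumptions, not the patent's implementation: the function name `choose_index` and the string labels it returns are hypothetical.

```python
def choose_index(index_attributes):
    """Pick which pre-built index to consult for a query (hypothetical helper).

    One index attribute selects the clustered index; several attributes
    select the non-clustered index, as described for step 103.
    """
    if not index_attributes:
        raise ValueError("the query must reference at least one index attribute")
    return "clustered" if len(index_attributes) == 1 else "non-clustered"
```

For example, a query filtering only on a hypothetical `id` field would use the clustered index, while a query on `id` and `age` would use the non-clustered index.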
The clustered index is established as follows: the index attribute values of one index attribute are sorted, and the clustered index is established based on the sorted index attribute values. When a table file stored on HDFS is processed, it is divided into individual splits. When the clustered index is established, the data in each split is sorted by the index attribute values of the index attribute, and the index attribute values in the clustered index are sorted by the same rule. In other words, for the clustered index, the data in the split and the index attribute values are ordered by the same rule, which is what establishing the clustered index based on the sorted index attribute values means. The established clustered index is then stored after the split data and saved in HDFS.
Fig. 2 is a diagram of the clustered index structure provided by the present invention. In Fig. 2:
Split data is the data of the split.
Trojan index is the clustered index established on the index attribute of the split; it contains the index attribute values of the split data and their offsets.
Header is the metadata of the clustered index. It contains five fields: DataSize, IndexSize, Max, Min and RecordNum. DataSize is the size of the split data, IndexSize is the size of the clustered index itself, Max is the maximum value of the index attribute in the split, Min is the minimum value of the index attribute in the split, and RecordNum is the number of records in the split.
Footer holds the information of the newly re-divided split: the original data and the generated clustered index are combined into one new split, and the fields SplitSize and FooterSize in Footer give the size of that split and the size of the Footer, respectively.
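As a reading aid, the Header and Footer fields listed above can be modeled as plain records. The sketch below is illustrative only: the Python class and attribute names, the integer field types, and the helper `may_contain` are assumptions; only the field names DataSize, IndexSize, Max, Min, RecordNum, SplitSize and FooterSize come from the description.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Header:
    data_size: int   # DataSize: size of the split data
    index_size: int  # IndexSize: size of the clustered index itself
    max_value: int   # Max: largest index attribute value in the split
    min_value: int   # Min: smallest index attribute value in the split
    record_num: int  # RecordNum: number of records in the split

    def may_contain(self, value):
        """Hypothetical helper: can this split hold the given attribute value?"""
        return self.min_value <= value <= self.max_value


@dataclass(frozen=True)
class Footer:
    split_size: int   # SplitSize: size of the re-divided split
    footer_size: int  # FooterSize: size of the Footer
```

Keeping Min and Max in the metadata is what lets a reader rule a split in or out without touching its records.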
Fig. 3 is a flowchart of establishing the clustered index provided by the present invention. As shown in Fig. 3, the process includes:
Step 301, select the index attribute, that is, choose a field of the table file on HDFS as the index attribute on which the clustered index is established.
Step 302, sort the data in each split by the index attribute values of the index attribute, for example in ascending order. Ascending order is the usual choice, but those skilled in the art may also sort in descending order or by another rule as the situation requires; this embodiment merely gives one example.
Step 303, take the sorted result as the value of Split data.
Step 304, take the index attribute values of Split data and their offsets as the value of Trojan index.
Step 305, compute the values of Header and Footer from Split data and Trojan index.
Step 306, generate the clustered index.
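Steps 301 to 306 can be sketched in memory as follows. The sketch rests on several assumptions that are not in the description: records are Python dictionaries, an "offset" is the record's position in the sorted split rather than a byte offset, and the function name `build_clustered_index` is hypothetical. DataSize, IndexSize and the Footer are omitted because they depend on a byte layout the description does not specify.

```python
def build_clustered_index(records, attr):
    """Build the sorted split data, the Trojan index and part of the Header."""
    # Steps 302-303: sort the split's records by the index attribute (ascending).
    split_data = sorted(records, key=lambda rec: rec[attr])
    # Step 304: pair each index attribute value with the record's offset.
    # Assumption: the offset is the record's position in the sorted split.
    trojan_index = [(rec[attr], pos) for pos, rec in enumerate(split_data)]
    # Step 305 (partial): Header statistics that need no byte layout.
    header = {
        "Min": split_data[0][attr] if split_data else None,
        "Max": split_data[-1][attr] if split_data else None,
        "RecordNum": len(split_data),
    }
    return split_data, trojan_index, header
```

Because the split data and the index entries share one ordering, a value range in the index maps to one contiguous run of records.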
Fig. 4 is a flowchart of query processing with the clustered index provided by the present invention. As shown in Fig. 4, the process includes:
Step 401, receive a query request.
Step 402, determine the index attribute and the corresponding index attribute value from the query request. This index attribute value is the condition referred to in step 405; what this embodiment accomplishes is retrieving the data that corresponds to the index attribute value determined from the query request. The index attribute value may be a range; for the clustered index, the range of index attribute values of the split in the table file is contained within the range of index attribute values to be queried.
Step 403, starting from the last Footer field at the end of the table file, read the SplitSize field of each Footer in turn, moving toward the front of the file, to mark off each split. Those skilled in the art will understand that the splits are stored contiguously in the table file, so each split must be marked off by its size. That size is stored in the SplitSize field, which is why the splits are marked off according to SplitSize.
Step 404, read the Header field of each split to obtain the metadata of the index, for example the index size.
Step 405, scan the index to determine the offsets of the data that satisfies the condition; for the condition, see the explanation in step 402.
Step 406, read the data that satisfies the condition, that is, read the corresponding data at the offsets obtained by scanning the index, in other words load the corresponding split.
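Steps 403 and 405 can be illustrated with the following Python sketch. The byte layout is an assumption made for the example: each split ends with a 16-byte Footer holding SplitSize and FooterSize as big-endian unsigned 64-bit integers, and SplitSize counts the whole split including its Footer. The names `split_boundaries` and `offsets_in_range` are hypothetical. The description only says that the index is scanned; binary search is used here as one possible way to exploit the sorted order.

```python
import struct
from bisect import bisect_left, bisect_right

# Assumed Footer layout: SplitSize, FooterSize as big-endian unsigned 64-bit ints.
FOOTER = struct.Struct(">QQ")


def split_boundaries(blob):
    """Step 403: walk the Footers from the end of the file and return the
    (start, end) byte range of every split, in file order."""
    bounds = []
    end = len(blob)
    while end > 0:
        if end < FOOTER.size:
            raise ValueError("truncated footer")
        split_size, footer_size = FOOTER.unpack_from(blob, end - FOOTER.size)
        # Assumption: SplitSize counts the whole split, Footer included.
        if footer_size != FOOTER.size or not (FOOTER.size <= split_size <= end):
            raise ValueError("corrupt footer")
        bounds.append((end - split_size, end))
        end -= split_size
    bounds.reverse()
    return bounds


def offsets_in_range(index, low, high):
    """Step 405: return the offsets of index entries with low <= value <= high.
    `index` is a list of (value, offset) pairs sorted by value."""
    values = [value for value, _ in index]
    first = bisect_left(values, low)
    last = bisect_right(values, high)
    return [offset for _, offset in index[first:last]]
```

Reading the Footers from the end is necessary because the splits are stored back to back with no separate table of their sizes.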
The clustered index described above suits the case where the query condition involves only one attribute. If the query condition involves multiple attributes, a non-clustered index is needed to improve query efficiency, that is, a non-clustered index must be established in advance. A non-clustered index is generally built on top of a clustered index; a table file can have one clustered index and one or more non-clustered indexes at the same time, to support different query requirements.
The non-clustered index is established as follows: the index attribute values of a first attribute among multiple index attributes are sorted, and a clustered index is established based on the sorted index attribute values; and a non-clustered index is established for the attributes other than the first attribute among the multiple index attributes. That is, establishing a non-clustered index first requires establishing a clustered index, and only then is the non-clustered index established. Note that the clustered index is built on sorted index attribute values, whereas the non-clustered index follows no ordering rule: the clustered index is ordered and the non-clustered index is unordered.
Regarding the establishment of the non-clustered index, note that the multiple index attributes mentioned there must be distinguished from the index attributes determined from the query request. The non-clustered index is established before the query request is received, so at the time of establishment it is not known which of the targeted index attributes will match the index attributes determined from the query request, or whether any of them will.
Suppose, for example, that the index attributes targeted when establishing the non-clustered index comprise three attributes: a first attribute, a second attribute and a third attribute. A clustered index can then be established on the index attribute values of the first attribute, and non-clustered indexes on the second and third attributes. The labels first, second and third serve only to distinguish the three attributes in the description and are not limiting.
Fig. 5 is a diagram of the non-clustered index structure provided by the present invention. In Fig. 5:
Split data is the data of the split.
Trojan index is the clustered index established on the index attribute of the split; it contains the index attribute values of the split data and their offsets.
Header is the metadata of the clustered index; see the description of the clustered index structure given with Fig. 2.
Non-Clustered Index is the non-clustered index; it mainly stores the offsets of the unsorted non-clustered index attribute.
Non-Clustered Header is the metadata of the non-clustered index.
File Header mainly records which attributes of the split have an index established on them.
Footer holds the information of the newly re-divided split; see the description of the clustered index structure given with Fig. 2.
Non-Clustered Index and Non-Clustered Header may each occur multiple times, that is, multiple non-clustered indexes can be established at the same time.
Because the data in the split is sorted by the index attribute of the clustered index when the clustered index is established, Split data is already sorted. Establishing a non-clustered index therefore only requires taking, for the selected non-clustered index attribute, the index attribute value and offset of each record from Split data and storing them as the Non-Clustered Index.
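Building a Non-Clustered Index from already sorted Split data can be sketched as follows. The sketch assumes that records are Python dictionaries, that an offset is the record's position in Split data, and that the Non-Clustered Header records the minimum and maximum value of the attribute; the description only calls it the metadata of the non-clustered index. The function name `build_non_clustered_index` is hypothetical.

```python
def build_non_clustered_index(split_data, attr):
    """Collect (value, offset) pairs for `attr` in split order.

    `split_data` is assumed to be sorted by the clustered index attribute,
    so the values collected here are generally NOT in sorted order.
    """
    entries = [(rec[attr], offset) for offset, rec in enumerate(split_data)]
    values = [value for value, _ in entries]
    # Assumed metadata: the value range of the attribute within this split.
    meta = {
        "attr": attr,
        "min": min(values) if values else None,
        "max": max(values) if values else None,
    }
    return entries, meta
```

No second sort is performed: the entries simply follow the order that the clustered index attribute imposed on the records.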
Because the data is sorted by the clustered index attribute and not by the non-clustered index attribute, the data is unordered from the point of view of the non-clustered index. The query process for the non-clustered index therefore differs from that for the clustered index.
For the non-clustered index, the query process includes: comparing the range of index attribute values of the index attribute determined from the query request with the range of index attribute values of the corresponding index attribute in the non-clustered index, and judging whether the two ranges intersect; if they intersect, loading the data of the split corresponding to the intersecting portion of the index attribute values; and if they do not intersect, discarding the split corresponding to the index attribute values of the corresponding index attribute in the non-clustered index.
When the ranges intersect, the data of the split corresponding to the non-intersecting portion of the index attribute values of the corresponding index attribute in the non-clustered index may or may not be loaded; to speed up the query, it is generally not loaded.
The query processing of the non-clustered index is now described with reference to its structure. Fig. 6 is a flowchart of query processing with the non-clustered index provided by the present invention. As shown in Fig. 6, the process includes:
Step 601, receive a query request.
Step 602, determine the index attributes and the corresponding index attribute values from the query request.
Step 603, starting from the last Footer field at the end of the table file, read the SplitSize field of each Footer in turn, moving toward the front of the file, to mark off each split.
Step 604, read the Header field of each split and determine the offset of the index attribute of the non-clustered index.
Step 605, read the Non-Clustered Header to obtain the metadata of the non-clustered index, and determine the scanning strategy from the index attribute values of the split recorded there and the index attribute values determined from the query request. The scanning strategies are explained below.
Step 606, scan the split data according to the determined scanning strategy and return the result.
The scanning strategies are described in detail below. Assume that the range of index attribute values of a split in the table file is [c, d] and that the range of index attribute values of the index attribute determined from the query request is [a, b]. Based on the ranges [c, d] and [a, b], the strategies for loading the split are as follows:
(1) When c≤a≤d and b≥d, the scan starts at the minimum offset of all values in the interval [a, d] and ends at the maximum offset of all values in [a, d]. That is, the data of the split corresponding to [a, d], the intersecting portion of the index attribute values, is loaded.
(2) When c≤a≤d and c≤b≤d, the scan starts at the minimum offset of all values in the interval [a, b] and ends at the maximum offset of all values in [a, b]. That is, the data of the split corresponding to [a, b], the intersecting portion of the index attribute values, is loaded.
(3) When a≤c and c≤b≤d, the scan starts at the minimum offset of all values in the interval [c, b] and ends at the maximum offset of all values in [c, b]. That is, the data of the split corresponding to [c, b], the intersecting portion of the index attribute values, is loaded.
(4) When a≤c and b≥d, the entire split is scanned, that is, all data of the split is loaded.
(5) When a and b satisfy none of the four cases above, the entire split is discarded.
For the clustered index, only case (4) above occurs.
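The five cases above can be written down directly, as in the following sketch. It assumes a ≤ b and c ≤ d, treats offsets as plain integers, and uses the hypothetical names `intersect_range` and `scan_window`; it is an illustration of the rules rather than the patent's implementation.

```python
def intersect_range(split_range, query_range):
    """Return the value interval to scan, or None if the split is discarded.

    split_range is [c, d] (values present in the split);
    query_range is [a, b] (values requested by the query).
    """
    c, d = split_range
    a, b = query_range
    if c <= a <= d and b >= d:       # case (1): scan [a, d]
        return (a, d)
    if c <= a <= d and c <= b <= d:  # case (2): scan [a, b]
        return (a, b)
    if a <= c and c <= b <= d:       # case (3): scan [c, b]
        return (c, b)
    if a <= c and b >= d:            # case (4): scan the entire split
        return (c, d)
    return None                      # case (5): discard the split


def scan_window(entries, value_range):
    """Return (min offset, max offset) over all (value, offset) entries whose
    value lies in value_range, or None if no entry qualifies."""
    low, high = value_range
    offsets = [off for value, off in entries if low <= value <= high]
    return (min(offsets), max(offsets)) if offsets else None
```

Whenever the two ranges overlap, the interval returned by `intersect_range` equals `(max(a, c), min(b, d))`, which is the intersection the cases enumerate. Because the non-clustered entries are unordered, the window between the minimum and maximum offset may also cover records outside the requested range, so those records still need to be filtered during the scan.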
In addition, those skilled in the art will understand that the data may be queried or scanned by traversal.
Fig. 7 is a block diagram of the HDFS-oriented split-internal indexing apparatus provided by the present invention. As shown in Fig. 7, the apparatus includes a receiving module 701, a processing module 702 and a loading module 703. The receiving module 701 is configured to receive a query request. The processing module 702 is configured to determine index attributes according to the query request and to determine a split, according to the index attributes, by means of a pre-established clustered index or non-clustered index. The loading module 703 is configured to load the determined split to obtain the data corresponding to the query request.
It should be noted that the specific details and benefits of the HDFS-oriented split-internal indexing apparatus provided by the present invention are similar to those of the HDFS-oriented split-internal indexing method provided by the present invention and are not repeated here.
Optional implementations of the embodiments of the present invention have been described in detail above with reference to the accompanying drawings. However, the embodiments of the present invention are not limited to the specific details of those implementations. Within the scope of the technical concept of the embodiments of the present invention, many simple variations of the technical solution may be made, and all of them fall within the protection scope of the embodiments of the present invention.
The technical solution provided by the present invention optimizes indexing inside the split: in the query execution phase, only the data that satisfies the query condition is read by way of the split-internal index, which greatly reduces the disk I/O generated by unnecessary data scanning. The present invention can also be combined with optimization methods at other levels to further improve HDFS query speed.
It should further be noted that the specific technical features described in the above implementations may be combined in any suitable manner, provided there is no contradiction. To avoid unnecessary repetition, the embodiments of the present invention do not describe the various possible combinations separately.
In addition, the different implementations of the embodiments of the present invention may also be combined with one another in any way; as long as a combination does not depart from the idea of the embodiments of the present invention, it should likewise be regarded as content disclosed by the embodiments of the present invention.
Claims (10)
1. An HDFS-oriented split-internal indexing method, characterized in that the method comprises:
receiving a query request;
determining index attributes according to the query request;
determining a split, according to the index attribute values of the index attributes, by means of a pre-established clustered index or non-clustered index; and
loading the determined split to obtain data corresponding to the query request.
2. The method according to claim 1, characterized in that determining the split, according to the index attribute values of the index attributes, by means of the pre-established clustered index or non-clustered index comprises:
when the index attributes comprise only one attribute, determining the split by means of the clustered index; and
when the index attributes comprise multiple attributes, determining the split by means of the non-clustered index.
3. The method according to claim 1, characterized in that the clustered index is established as follows:
the index attribute values of one index attribute are sorted, and the clustered index is established based on the sorted index attribute values.
4. The method according to claim 1, characterized in that the non-clustered index is established as follows:
the index attribute values of a first attribute among multiple index attributes are sorted, and a clustered index is established based on the sorted index attribute values; and
a non-clustered index is established for the attributes other than the first attribute among the multiple index attributes.
5. The method according to claim 4, characterized in that the method further comprises:
comparing the range of index attribute values of the index attribute determined from the query request with the range of index attribute values of the corresponding index attribute in the non-clustered index, and judging whether the two ranges intersect;
if the ranges intersect, loading the data of the split corresponding to the intersecting portion of the index attribute values; and
if the ranges do not intersect, discarding the split corresponding to the index attribute values of the corresponding index attribute in the non-clustered index.
6. indexing unit inside a kind of split towards HDFS, which is characterized in that the device includes:
Receiving module, for receiving inquiry request;
Processing module, for determining index attributes according to the inquiry request, and it is preparatory for being passed through according to the index attributes
The aggregat ion pheromones or nonclustered index of foundation determine piecemeal split;And
Loading module, for loading identified split to obtain data corresponding with the inquiry request.
7. device according to claim 6, which is characterized in that the processing module is also used to:
In the case where the index attributes only include an attribute, split is determined by the aggregat ion pheromones;And
In the case where the index attributes include multiple attributes, split is determined by the nonclustered index.
8. device according to claim 6, which is characterized in that the establishment process of the aggregat ion pheromones is as follows:
It is ranked up for the index attributes value of an index attributes, and aggregation rope is established based on the index attributes value after sequence
Draw.
9. device according to claim 6, which is characterized in that the establishment process of the nonclustered index is as follows:
It is ranked up for the index attributes value of the first attribute in multiple index attributes, and based on the index attributes value after sequence
Establish aggregat ion pheromones;And
Nonclustered index is established for other attributes other than first attribute in the multiple index attributes.
10. The apparatus according to claim 9, characterized in that the processing module is further configured to: compare the range of index attribute values of the index attribute determined according to the query request with the range of index attribute values of the corresponding index attribute in the nonclustered index, and judge whether an intersection exists; and
The loading module is further configured to: in the case where an intersection exists, load the data of the split corresponding to the intersecting part of the index attribute values; and in the case where no intersection exists, discard the split corresponding to the index attribute value of the corresponding index attribute in the nonclustered index.
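The intersection test in claim 10 can be sketched as a range-overlap check. The sketch assumes closed integer ranges and a nonclustered index stored as one `(min, max)` pair per split; the name `prune_splits` is hypothetical.

```python
from typing import List, Tuple

Range = Tuple[int, int]


def prune_splits(query_range: Range, split_ranges: List[Range]) -> List[int]:
    """Return the ids of splits whose value range intersects the query range.

    Ranges are treated as closed intervals (an assumption). Splits that are
    not returned can be discarded without being loaded.
    """
    q_lo, q_hi = query_range
    keep = []
    for split_id, (lo, hi) in enumerate(split_ranges):
        # Two closed intervals intersect when the larger lower bound
        # does not exceed the smaller upper bound.
        if max(q_lo, lo) <= min(q_hi, hi):
            keep.append(split_id)
    return keep
```

With split ranges `[(20, 40), (10, 30), (50, 60)]`, a query for values in `(15, 25)` keeps splits 0 and 1 and discards split 2, so only two of the three splits would need to be loaded.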
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201711023820.2A CN110019204A (en) | 2017-10-27 | 2017-10-27 | Method and apparatus are indexed inside split towards HDFS |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201711023820.2A CN110019204A (en) | 2017-10-27 | 2017-10-27 | Method and apparatus are indexed inside split towards HDFS |
Publications (1)
Publication Number | Publication Date |
---|---|
CN110019204A (en) | 2019-07-16 |
Family
ID=67186662
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201711023820.2A Pending CN110019204A (en) | 2017-10-27 | 2017-10-27 | Method and apparatus are indexed inside split towards HDFS |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110019204A (en) |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101520801A (en) * | 2009-01-14 | 2009-09-02 | 中国科学院地理科学与资源研究所 | Method for storing space geometric objects to database |
CN101866358A (en) * | 2010-06-12 | 2010-10-20 | 中国科学院计算技术研究所 | A multi-dimensional interval query method and system |
CN102722531A (en) * | 2012-05-17 | 2012-10-10 | 北京大学 | Query method based on regional bitmap indexes in cloud environment |
CN103324762A (en) * | 2013-07-17 | 2013-09-25 | 陆嘉恒 | Hadoop-based index creation method and indexing method thereof |
US20150220529A1 (en) * | 2014-02-06 | 2015-08-06 | International Business Machines Corporation | Split elimination in mapreduce systems |
Non-Patent Citations (1)
Title |
---|
He Long et al.: "A multi-layer indexing technique for HDFS" (一种面向HDFS的多层索引技术), Journal of Software (《软件学报》) * |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN110222072A (en) | Data Query Platform, method, equipment and storage medium | |
CN104408159B (en) | A kind of data correlation, loading, querying method and device | |
CN103995879B (en) | Data query method, apparatus and system based on OLAP system | |
CN102147792B (en) | Customized knowledge intelligent system | |
CN110990447B (en) | Data exploration method, device, equipment and storage medium | |
CN106407244A (en) | Multi-database-based data query method, system and apparatus | |
KR20120120159A (en) | Table Search Device, Table Search Method, and Table Search System | |
CN105096174A (en) | Transaction matching method and transaction matching system | |
CN106383860A (en) | Data query method and apparatus | |
CN106325756A (en) | Data storage and data computation methods and devices | |
CN102467525A (en) | Document associating method and system | |
CN112069175A (en) | Data query method and device and electronic equipment | |
CN110362563A (en) | The processing method and processing device of tables of data, storage medium, electronic device | |
CN111930823A (en) | Data query method and device, data center station and storage medium | |
CN115757477A (en) | Database query processing method, device, equipment and storage medium | |
CN108319652A (en) | A kind of the column document storage system and method for the elevator data based on HDFS | |
CN110287213B (en) | Data query method, device and system based on OLAP system | |
CN107832342B (en) | Robot chatting method and system | |
CN104598485B (en) | The method and apparatus for handling database table | |
CN110019204A (en) | Method and apparatus are indexed inside split towards HDFS | |
EP2990983A1 (en) | Method and apparatus for scanning files | |
CN108243348B (en) | A kind of stream process request distribution server | |
CN105094810A (en) | Data processing method and apparatus based on plug-in of common gateway interface | |
CN111639846A (en) | Demand processing method and device, electronic equipment and computer readable storage medium | |
CN111324737A (en) | Bag-of-words model-based distributed text clustering method, storage medium and computing device |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |
RJ01 | Rejection of invention patent application after publication | Application publication date: 20190716 |