
CN110222074A - Index search method, search device, electronic device and storage medium - Google Patents

Index search method, search device, electronic device and storage medium

Info

Publication number
CN110222074A
Authority
CN
China
Prior art keywords
index
target
storage medium
target index
section
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201910515781.0A
Other languages
Chinese (zh)
Inventor
尹滔
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Kingsoft Cloud Network Technology Co Ltd
Beijing Kingsoft Cloud Technology Co Ltd
Original Assignee
Beijing Kingsoft Cloud Network Technology Co Ltd
Beijing Kingsoft Cloud Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Kingsoft Cloud Network Technology Co Ltd, Beijing Kingsoft Cloud Technology Co Ltd
Priority to CN201910515781.0A
Publication of CN110222074A
Legal status: Pending


Classifications

    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/22 Indexing; Data structures therefor; Storage structures
    • G06F16/2228 Indexing structures
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2453 Query optimisation

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • Software Systems (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

Embodiments of the present invention provide an index search method, a search device, an electronic device, and a storage medium. The method includes: acquiring a target index identifier; comparing, based on a preset search algorithm, the target index identifier with a plurality of reference index identifiers pre-stored in a first storage medium to determine the target index interval corresponding to the target index identifier; determining, from an index file stored in a second storage medium, the index information corresponding to the target index interval to obtain target index information, and reading the target index information into the first storage medium; and searching for the target index identifier in the target index information based on the preset search algorithm to obtain the index to be searched. Embodiments of the present invention can shorten index search time.

Description

Index search method, search device, electronic device and storage medium

Technical field

The present invention relates to the technical field of data storage, and in particular to an index search method, a search device, an electronic device, and a storage medium.

Background

An index is a structure that sorts the values of one or more columns in a database table. In the prior art, indexing is commonly used to speed up database access.

Specifically, because memory offers much higher read/write performance than disk, the prior art usually stores the index file in memory. When a database access request is received, the index file is read first to locate where the data is stored in the database, and the database is then accessed at that storage location, which greatly speeds up database access.

In the course of implementing the present invention, the inventor found that when the index file is too large it cannot be kept in memory for long and can only be stored on disk, because keeping it in memory would consume too much of the storage system's memory and slow the storage system down. As a result, every data read requires loading the entire index file from disk into memory, querying the database based on the result of reading the index file, and releasing the memory occupied by the index file once the access result has been obtained from the database. Because the index file is large, loading it into memory on every read takes a long time and degrades database access speed.

Summary of the invention

The purpose of the embodiments of the present invention is to provide an index search method, a search device, an electronic device, and a storage medium, so as to shorten index search time. The specific technical solutions are as follows:

In a first aspect, an embodiment of the present invention provides an index search method, including:

acquiring a target index identifier, where the target index identifier is the index identifier of the index to be searched;

comparing, based on a preset search algorithm, the target index identifier with a plurality of reference index identifiers pre-stored in a first storage medium, and determining the target index interval corresponding to the target index identifier;

determining, from an index file stored in a second storage medium, the index information corresponding to the target index interval to obtain target index information, and reading the target index information into the first storage medium, where the data read/write performance of the first storage medium is higher than that of the second storage medium;

searching for the target index identifier in the target index information based on the preset search algorithm to obtain the index to be searched.

Optionally, the reference index identifiers are obtained as follows:

traversing, in advance, the index file stored in the second storage medium;

extracting a plurality of index identifiers from the index file at intervals in a predetermined order, and using them as the reference index identifiers.
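The sampling step above can be sketched in Python as follows. This is a minimal illustration, not the patent's implementation: the function name `build_reference_ids`, the `interval` parameter, and the sample identifiers are all hypothetical, and the identifiers are assumed to be already read from the index file in ascending order.

```python
from typing import List, Sequence


def build_reference_ids(seqs: Sequence[int], interval: int) -> List[int]:
    """Return the identifier of every `interval`-th index entry, starting with the first.

    `seqs` is assumed to hold the index identifiers in file order (ascending).
    """
    if interval <= 0:
        raise ValueError("interval must be positive")
    return [seqs[i] for i in range(0, len(seqs), interval)]


# Hypothetical identifiers: ascending, with gaps left by deleted entries.
seqs = [0, 2, 3, 4, 6, 9, 10, 12]
refs = build_reference_ids(seqs, interval=4)
print(refs)
```

Only the sampled identifiers need to stay in the faster storage medium; the full index file remains on the slower one.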

Optionally, the step of determining, from the index file stored in the second storage medium, the index information corresponding to the target index interval includes:

determining the sequence number of the target index interval;

determining, based on the sequence number of the target index interval, the starting offset of the target index interval in the index file;

determining the total number of bytes of the indexes contained in the target index interval;

determining the target index information based on the starting offset and the total number of bytes of the indexes contained in the target index interval.

Optionally, the sequence number of the target index interval is determined as follows: the sequence number is determined from the position, among the plurality of reference index identifiers, of the reference index identifier corresponding to the target index interval.

Optionally, the starting offset O of the target index interval in the index file is determined based on a first preset expression, where the first preset expression is:

O = N*Q*S

where N is the sequence number of the target index interval, Q is the number of indexes contained in the target index interval, and S is the number of bytes of each index.

Optionally, the total number of bytes TS of the indexes contained in the target index interval is determined based on a second preset expression, where the second preset expression is:

TS = Q*S

where Q is the number of indexes contained in the target index interval, and S is the number of bytes of each index.
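The two expressions can be checked with a small Python sketch. The function name `interval_byte_range` is hypothetical; the sample values N = 1, Q = 16, S = 24 are the ones used in the worked example later in the description.

```python
from typing import Tuple


def interval_byte_range(n: int, q: int, s: int) -> Tuple[int, int]:
    """Return (O, TS) for interval number n (counted from 0).

    O  = N*Q*S is the starting offset of the interval in the index file.
    TS = Q*S   is the total number of bytes occupied by the interval.
    """
    offset = n * q * s
    total_size = q * s
    return offset, total_size


offset, total_size = interval_byte_range(n=1, q=16, s=24)
print(offset, total_size)  # 384 384
```

Both expressions rely on every index having the same size S and every interval holding the same number Q of indexes, so a single multiplication replaces a scan of the file.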

Optionally, the preset search algorithm may be any one of the following: binary search, binary search tree lookup, hash table lookup, or block search.

Optionally, the second storage medium is a disk, and the first storage medium is memory.

In a second aspect, an embodiment of the present invention provides an index search device, including:

an acquisition module, configured to acquire a target index identifier, where the target index identifier is the index identifier of the index to be searched;

a determination module, configured to compare, based on a preset search algorithm, the target index identifier with a plurality of reference index identifiers pre-stored in a first storage medium, and determine the target index interval corresponding to the target index identifier;

a reading module, configured to determine, from an index file stored in a second storage medium, the index information corresponding to the target index interval to obtain target index information, and read the target index information into the first storage medium, where the data read/write performance of the first storage medium is higher than that of the second storage medium;

a search module, configured to search for the target index identifier in the target index information based on the preset search algorithm to obtain the index to be searched.

Optionally, the device further includes:

a traversal module, configured to traverse, in advance, the index file stored in the second storage medium;

an extraction module, configured to extract a plurality of index identifiers from the index file at intervals in a predetermined order, and use them as the reference index identifiers.

Optionally, the determination module includes:

a first determination submodule, configured to determine the sequence number of the target index interval;

a second determination submodule, configured to determine, based on the sequence number of the target index interval, the starting offset of the target index interval in the index file;

a third determination submodule, configured to determine the total number of bytes of the indexes contained in the target index interval;

a fourth determination submodule, configured to determine the target index information based on the starting offset and the total number of bytes of the indexes contained in the target index interval.

Optionally, the first determination submodule is specifically configured to:

determine the sequence number of the target index interval from the position, among the plurality of reference index identifiers, of the reference index identifier corresponding to the target index interval.

Optionally, the starting offset O of the target index interval in the index file is determined based on a first preset expression, where the first preset expression is:

O = N*Q*S

where N is the sequence number of the target index interval, Q is the number of indexes contained in the target index interval, and S is the number of bytes of each index.

Optionally, the total number of bytes TS of the indexes contained in the target index interval is determined based on a second preset expression, where the second preset expression is:

TS = Q*S

where Q is the number of indexes contained in the target index interval, and S is the number of bytes of each index.

Optionally, the preset search algorithm may be any one of the following: binary search, binary search tree lookup, hash table lookup, or block search.

Optionally, the second storage medium is a disk, and the first storage medium is memory.

In a third aspect, an embodiment of the present invention provides an electronic device including a processor and a machine-readable storage medium. The machine-readable storage medium stores machine-executable instructions that can be executed by the processor, and the processor executes the machine-executable instructions to implement the method steps of the index search method provided in the first aspect.

In a fourth aspect, an embodiment of the present invention provides a computer-readable storage medium storing a computer program which, when executed by a processor, implements the method steps of the index search method provided in the first aspect.

In a fifth aspect, an embodiment of the present invention further provides a computer program product containing instructions which, when run on a computer, causes the computer to execute the method steps of the index search method provided in the first aspect.

In a sixth aspect, an embodiment of the present invention further provides a computer program which, when run on a computer, causes the computer to execute the method steps of the index search method provided in the first aspect.

With the index search method, search device, electronic device, and storage medium provided by the embodiments of the present invention, after the target index identifier is acquired, it is compared with a plurality of reference index identifiers pre-stored in the first storage medium to determine the corresponding target index interval. The index information corresponding to the target index interval is then determined from the index file stored in the second storage medium and read into the first storage medium as the target index information. Compared with the prior art, which reads the entire index file, far fewer indexes are read into the first storage medium, so the index read time is shorter and the index search time is shortened accordingly. Of course, a product or method implementing the present invention does not necessarily need to achieve all of the above advantages at the same time.

Brief description of the drawings

To explain the technical solutions in the embodiments of the present invention or in the prior art more clearly, the drawings needed in the description of the embodiments or the prior art are briefly introduced below. The drawings described below are only some embodiments of the present invention; a person of ordinary skill in the art can obtain other drawings from them without creative effort.

FIG. 1 is a first schematic flowchart of the index search method provided by an embodiment of the present invention;

FIG. 2 is a schematic diagram of how multiple indexes are stored in an embodiment of the present invention;

FIG. 3 is a schematic flowchart of step S103 in the index search method provided by an embodiment of the present invention;

FIG. 4 is a second schematic flowchart of the index search method provided by an embodiment of the present invention;

FIG. 5 is a first schematic structural diagram of the index search device provided by an embodiment of the present invention;

FIG. 6 is a second schematic structural diagram of the index search device provided by an embodiment of the present invention;

FIG. 7 is a schematic structural diagram of the determination module in the index search device provided by an embodiment of the present invention;

FIG. 8 is a schematic structural diagram of an electronic device provided by an embodiment of the present invention.

Detailed description

The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the drawings in the embodiments. The described embodiments are only some, not all, of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of the present invention without creative effort fall within the protection scope of the present invention.

Existing index search methods may also commonly adopt the following scheme:

In current application scenarios, so many indexes are generated during data storage that they cannot all be placed in memory. Each data read therefore has to read the index from disk first and then read the data according to the index. This approach traverses the indexes in the on-disk index file in order until the required index is found. In the worst case every index in the file must be traversed, which takes a long time, so the approach is unsuitable for application scenarios that demand high search performance.

In view of this, as shown in FIG. 1, an embodiment of the present invention provides an index search method, which may include the following steps:

S101, acquire a target index identifier.

It can be understood that after an index is stored, an index identifier can be generated for it; the identifier indicates the index's position in the storage order. After a query request for a piece of data in the database is received, the index identifier of the index corresponding to the requested data, i.e. the target index identifier, is obtained from the query request.

S102, based on a preset search algorithm, compare the target index identifier with a plurality of reference index identifiers pre-stored in the first storage medium, and determine the target index interval corresponding to the target index identifier.

In the embodiment of the present invention, a plurality of reference index identifiers may be stored in the first storage medium in advance. These reference index identifiers mark interval boundaries among the indexes saved in the index file, thereby identifying a plurality of index intervals. The reference index identifiers are obtained by traversing the index file and extracting index identifiers from it at intervals in a predetermined order. They may be stored in the form of an array.

The index storage process is described first:

After a piece of data is stored on disk, an index corresponding to that data can be generated so that the data can later be found quickly through the index. Multiple indexes can be saved to the index file in a preset storage order, for example in the order in which the indexes were generated.

As an example, the indexes in the embodiment of the present invention may be stored in the form shown in FIG. 2. In FIG. 2, when an index is stored in the index file, the following are generated for it: the index identifier (denoted seq), the index number (denoted id), the offset of the data corresponding to the index (denoted off), and the size of the data corresponding to the index (denoted size).

The index identifier corresponds to the storage order of the indexes, that is, it identifies where each index falls in the storage order. The index number is the unique identification number of an index; usually, the index number of an index is the same as its current index identifier. The offset gives the position on disk of the data corresponding to the index, so that the data can be located on disk. The size gives the amount of disk storage space occupied by the data corresponding to the index.

As FIG. 2 shows, the index identifiers are not always consecutive. For example, the index identifiers in FIG. 2 appear in the order 0, 2, 3, 4, 6, ...: 0 and 2 are not consecutive, 2 to 4 are consecutive, and 4 and 6 are again not consecutive. This is because when stored data is deleted, its index is deleted as well, so gaps can appear. The overall order of the indexes does not change, however: the index identifiers are still arranged in ascending order.
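A fixed-size record holding the four fields could be modelled as in the Python sketch below. The field widths are an assumption made for illustration (two 8-byte and two 4-byte unsigned integers, chosen so that one record is 24 bytes, matching the per-index size used in the description's worked example); the patent does not specify the widths. `IndexEntry`, `encode`, and `decode` are hypothetical names.

```python
import struct
from typing import NamedTuple

# Assumed layout (not specified in the source): little-endian, no padding,
# seq and id as 8-byte unsigned integers, off and size as 4-byte unsigned integers.
RECORD = struct.Struct("<QQII")


class IndexEntry(NamedTuple):
    seq: int   # index identifier (storage order)
    id: int    # index number
    off: int   # offset of the indexed data on disk
    size: int  # size of the indexed data on disk


def encode(entry: IndexEntry) -> bytes:
    """Serialize one index entry into a fixed-size record."""
    return RECORD.pack(*entry)


def decode(raw: bytes) -> IndexEntry:
    """Parse one fixed-size record back into an index entry."""
    return IndexEntry(*RECORD.unpack(raw))


entry = IndexEntry(seq=2, id=2, off=4096, size=512)
raw = encode(entry)
print(RECORD.size, decode(raw))
```

Fixed-size records are what make the offset arithmetic in the later steps possible: the k-th record always starts at byte k times the record size.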

Still taking FIG. 2 as an example, the mapping between the reference index identifiers and the index file, and the process of generating the reference index identifiers, are further described below:

The index file shown in FIG. 2 is traversed, and the index identifiers 0, 28, 61 are extracted at intervals of 16 index entries and placed in a predefined array seq_array as the reference index identifiers, thereby establishing the mapping between the reference index identifiers and the index file. Specifically, the reference index identifiers 0, 28, 61 divide the index file into 3 index intervals: [0, 28), [28, 61), [61, ∞). Here [0, 28) denotes index identifiers from 0 up to but not including 28, [28, 61) denotes index identifiers from 28 up to but not including 61, and [61, ∞) denotes index identifiers from 61 onward (61 included). The interval [61, ∞) also contains the index identifiers of 16 indexes.

In the embodiment of the present invention, the target index identifier can be compared with the plurality of reference index identifiers to determine the index interval of the index file that corresponds to the target index identifier, i.e. the target index interval.

As an example, if the target index identifier is 39, then 39 can be compared with 0, 28, 61 above to determine the target index interval corresponding to 39. With binary search, the comparison may proceed as follows: first compare 39 with 28, and 39 is greater than 28; then compare 39 with 61, and 39 is less than 61. So 39 lies between 28 and 61, i.e. in [28, 61), which is the target index interval.
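The interval lookup in the example above can be sketched with Python's standard `bisect` module. This is one possible realization of the binary search, not necessarily the one intended by the patent; `find_interval` is a hypothetical name, and the reference identifiers are the ones from the example.

```python
from bisect import bisect_right
from typing import Sequence


def find_interval(refs: Sequence[int], target: int) -> int:
    """Return the 0-based number N of the interval [refs[N], refs[N+1]) containing target.

    `refs` must be sorted ascending; the last interval is open-ended.
    """
    n = bisect_right(refs, target) - 1
    if n < 0:
        raise KeyError(f"{target} is smaller than the first reference identifier")
    return n


seq_array = [0, 28, 61]
n = find_interval(seq_array, 39)
print(n)  # 1, i.e. the interval [28, 61)
```

Using `bisect_right` means a target equal to a reference identifier (for example 28) lands in the interval that starts with it, which matches the half-open intervals described above.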

As an optional implementation of the embodiment of the present invention, other search algorithms may be used, for example binary search tree lookup, hash table lookup, or block search.

As an optional implementation of the embodiment of the present invention, the first storage medium may specifically be memory. Memory offers high read/write performance and low latency, which is exactly what the reference index identifiers need, since they must be read quickly. Saving the reference index identifiers in memory therefore makes it possible to determine the target index interval corresponding to the target index identifier quickly, improving search efficiency.

As an optional implementation of the embodiment of the present invention, for multiple indexes stored in order, the indexes saved in the index file can be divided into intervals in advance based on the reference index identifiers, that is, assigned to different index intervals, so that the target index interval corresponding to the target index identifier can be determined quickly among the pre-divided index intervals.

Referring to FIG. 1, S103, determine, from the index file stored in the second storage medium, the index information corresponding to the target index interval to obtain the target index information, and read the target index information into the first storage medium.

After the target index interval is determined, the index information corresponding to the target index interval can be determined from the index file stored in the second storage medium, which yields the target index information. As an example, referring to FIG. 2, when the target index interval is [28, 61), the index information corresponding to [28, 61) is determined in the index file and then read into the first storage medium, where the data read/write performance of the first storage medium is higher than that of the second storage medium. The second storage medium may be, for example, a disk.

As an optional implementation of the embodiment of the present invention, as shown in FIG. 3, step S103 may specifically be:

S1031, determine the sequence number of the target index interval.

S1032, determine, based on the sequence number of the target index interval, the starting offset of the target index interval in the index file.

S1033, determine the total number of bytes of the indexes contained in the target index interval.

S1034, determine the target index information based on the starting offset and the total number of bytes of the indexes contained in the target index interval.

Optionally, the starting offset O of the target index interval in the index file may be determined based on a first preset expression, where the first preset expression is:

O = N*Q*S

where N is the sequence number of the target index interval, Q is the number of indexes contained in the target index interval, and S is the number of bytes of each index. A person skilled in the art can set the number of indexes contained in an index interval flexibly according to actual business requirements; the embodiment of the present invention does not limit the specific value.

Steps S1031 to S1034 are described below.

Referring to FIG. 2, the three index intervals determined by the three reference index identifiers 0, 28 and 61 may be [0, 28), [28, 61) and [61, ∞). When the target index interval is [28, 61), it is the second interval, so its sequence number N is 1 (counting from 0). Its starting offset in the index file is N*16*24, where 16 is the number of indexes in an index interval and 24 is the number of bytes each index occupies in the index file (usually a preset fixed size, in bytes). According to the first preset expression, N*16*24 = 1*16*24 = 384, so the starting offset of the target index interval in the index file is 384.
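The arithmetic of the first preset expression can be sketched in a few lines. This is only an illustration under the assumption that every interval holds the same number of fixed-size indexes; the function name `start_offset` is hypothetical and does not appear in the specification.

```python
def start_offset(n: int, q: int, s: int) -> int:
    """Starting byte offset O = N*Q*S of interval n in the index file.

    n: sequence number of the target index interval (counted from 0)
    q: number of indexes per interval (assumed equal for all intervals)
    s: number of bytes of each index (assumed fixed)
    """
    if n < 0 or q <= 0 or s <= 0:
        raise ValueError("n must be >= 0, q and s must be positive")
    return n * q * s


# Worked example from the description: N = 1, Q = 16, S = 24.
offset = start_offset(1, 16, 24)
print(offset)  # 384
```

Interval 0 starts at offset 0, which is why the sequence number is counted from 0.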

Optionally, the total number of bytes TS of the indexes contained in the target index interval may be determined based on a second preset expression, the second preset expression being:

TS = Q*S

where Q is the number of indexes contained in the target index interval and S is the number of bytes of each index.

For example, if the target index interval contains 16 indexes and each index occupies 24 bytes, the total number of bytes TS of the indexes contained in the target index interval is 16*24 = 384.
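With the starting offset O and the byte total TS known, step S1034 amounts to reading TS bytes starting at O. The sketch below assumes the index file is an ordinary seekable binary file; `read_interval` is a hypothetical helper, and an in-memory `io.BytesIO` stands in for the file on the second storage medium.

```python
import io


def read_interval(index_file, n: int, q: int, s: int) -> bytes:
    """Read the bytes of interval n: seek to O = N*Q*S, then read TS = Q*S bytes."""
    offset = n * q * s       # first preset expression
    total_bytes = q * s      # second preset expression
    index_file.seek(offset)
    return index_file.read(total_bytes)


# Stand-in "disk" file with three intervals of 16 indexes x 24 bytes each;
# interval i is filled with the byte value i so the result is easy to check.
disk = io.BytesIO(b"".join(bytes([i]) * (16 * 24) for i in range(3)))

block = read_interval(disk, n=1, q=16, s=24)
print(len(block))  # 384
```

Only 384 of the 1152 bytes in the stand-in file are read, which is the saving the method relies on.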

As an optional implementation of the embodiment of the present invention, the sequence number of the target index interval may be determined from the position of the reference index identifier corresponding to the target index interval among the plurality of reference index identifiers.

For example, the target index identifier is 39, which falls in the target index interval [28, 61). The reference index identifier of this interval, 28, is the 2nd number in the array [0, 28, 61]. Because sequence numbers are counted from 0, 1 is subtracted from 2 to give 1. That is, 39 belongs to the interval with sequence number 1, so the sequence number of the target index interval containing 39 is 1.
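One possible way to obtain the sequence number is a binary search over the sorted reference index identifiers held in the first storage medium. The sketch uses Python's standard `bisect` module; the function name `interval_number` is an assumption made for illustration.

```python
from bisect import bisect_right


def interval_number(reference_ids, target_id: int) -> int:
    """Return the 0-based sequence number of the interval containing target_id.

    reference_ids must be sorted ascending; interval k is
    [reference_ids[k], reference_ids[k + 1]), and the last interval is unbounded.
    """
    position = bisect_right(reference_ids, target_id)  # 1-based position of the lower bound
    if position == 0:
        raise KeyError(f"{target_id} is below the first reference identifier")
    return position - 1  # subtract 1 because sequence numbers start at 0


reference_ids = [0, 28, 61]
print(interval_number(reference_ids, 39))  # 1
```

`bisect_right` is chosen so that an identifier equal to a reference identifier, such as 28, lands in the interval that starts with it.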

Referring to FIG. 1, S104: search for the target index identifier in the target index information based on a preset lookup algorithm to obtain the index to be found.

Once the target index information is determined, it contains the index to be found together with its index identifier, so a preset lookup algorithm can be used to search the target index information for the target index identifier and thereby obtain the index to be found.

As an optional implementation of the embodiment of the present invention, different preset lookup algorithms may be used to speed up the search, for example binary search, binary search tree lookup, hash table lookup, or block search.

When binary search is used, for example, if the target index identifier to be found is 39, it is first compared with the index identifier 46 located in the middle of the target index interval [28, 61). Since 39 is less than 46, the index lies in the sub-interval [28, 46]. Binary search is then applied again, comparing 39 with the index identifier in the middle of the sub-interval [28, 46]. That middle identifier happens to be 39, so the index is found and its record can be read: seq: 39, id: 39, off: 10210, size: 40687.
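The binary search over the target index information can be sketched as follows. The 24-byte record layout used here (two unsigned 64-bit fields and two unsigned 32-bit fields for seq, id, off and size) is purely an assumption chosen to match the 24-byte size in the example; the specification does not define the layout, and `find_record` is a hypothetical name.

```python
import struct

# Hypothetical record layout: seq (uint64), id (uint64), off (uint32), size (uint32).
RECORD = struct.Struct("<QQII")  # 8 + 8 + 4 + 4 = 24 bytes, no padding


def find_record(block: bytes, target_id: int):
    """Binary search a block of fixed-size records sorted by id; None if absent."""
    low, high = 0, len(block) // RECORD.size
    while low < high:
        mid = (low + high) // 2
        seq, rec_id, off, size = RECORD.unpack_from(block, mid * RECORD.size)
        if rec_id == target_id:
            return {"seq": seq, "id": rec_id, "off": off, "size": size}
        if rec_id < target_id:
            low = mid + 1
        else:
            high = mid
    return None


# Stand-in block for the interval [28, 61); only record 39 carries the
# off/size values quoted in the description, the others are filler.
block = b"".join(
    RECORD.pack(i, i, 10210 if i == 39 else 0, 40687 if i == 39 else 0)
    for i in range(28, 61)
)
print(find_record(block, 39))
```

The midpoints visited by this sketch depend on how many records the block holds, so they need not match the identifiers 46 and 39 named in the description.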

As an optional implementation of the embodiment of the present invention, as shown in FIG. 4, the reference index identifiers may be obtained as follows:

S201: traverse in advance the index file stored in the second storage medium.

This step may precede step S101. In the embodiment of the present invention, the index file stored in the second storage medium, for example an index file stored on a disk, may be traversed in advance to generate the index identifier of each index (see FIG. 2). The index identifier has been described in the foregoing embodiments and is not repeated here.

S202: extract a plurality of index identifiers from the index file at intervals in a predetermined order, as the reference index identifiers.

It can be understood that the extracted reference index identifiers may specifically be storage sequence numbers or index numbers. The reference index identifiers can therefore be used first to determine the range of index identifiers covered by each index interval, and then to determine which range the index identifier being searched for falls into. Because index identifiers correspond one-to-one with indexes, once that range is known, the target index interval corresponding to the target index identifier is easily determined. These identifiers are called reference identifiers precisely because they serve to determine the target index interval corresponding to the target index identifier. Generating reference index identifiers shortens the time needed to determine that interval.

After the reference identifiers are obtained, they may be saved in the first storage medium, for example in memory, for later use.
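Steps S201 and S202 can be illustrated by sampling the identifier of the first index of every interval and keeping the resulting list in memory. The sketch assumes fixed-size records in a hypothetical 24-byte layout (seq, id, off, size) and a fixed number of indexes per interval; `build_reference_ids` is an illustrative name, not one from the specification.

```python
import io
import struct

# Hypothetical record layout: seq (uint64), id (uint64), off (uint32), size (uint32).
RECORD = struct.Struct("<QQII")


def build_reference_ids(index_file, q: int):
    """Collect the id of the first record of every interval of q records."""
    reference_ids = []
    interval = 0
    while True:
        index_file.seek(interval * q * RECORD.size)
        raw = index_file.read(RECORD.size)
        if len(raw) < RECORD.size:   # reached the end of the index file
            break
        reference_ids.append(RECORD.unpack(raw)[1])  # field 1 is the id
        interval += 1
    return reference_ids


# Stand-in index file with 40 records whose ids are 0..39.
disk = io.BytesIO(b"".join(RECORD.pack(i, i, 0, 0) for i in range(40)))
reference_ids = build_reference_ids(disk, q=16)
print(reference_ids)  # [0, 16, 32]
```

The list grows by one identifier per interval, so it stays small relative to the index file and can reasonably be held in memory.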

With the index lookup method provided by the embodiment of the present invention, after the target index identifier is obtained, it is compared with the plurality of reference index identifiers pre-stored in the first storage medium to determine the target index interval corresponding to the target index identifier. The index information corresponding to the target index interval is then located in the index file stored in the second storage medium, and this target index information is read into the first storage medium. Compared with the prior art, which reads the entire index file, far fewer indexes are read into the first storage medium, so the index read time is shorter and the index lookup time is reduced accordingly.
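Taken together, steps S101 to S104 can be sketched end to end as below. This is a minimal illustration rather than the claimed implementation: the record layout, the constant `Q`, and the function name `lookup` are all assumptions, binary search is used for both lookups, and an in-memory buffer stands in for the disk file.

```python
import io
import struct
from bisect import bisect_right

RECORD = struct.Struct("<QQII")  # hypothetical 24-byte record: seq, id, off, size
Q = 16                           # assumed number of indexes per interval


def lookup(index_file, reference_ids, target_id: int):
    """Find the record for target_id while reading only one interval from the file."""
    # S102: compare with the in-memory reference identifiers to pick the interval.
    n = bisect_right(reference_ids, target_id) - 1
    if n < 0:
        return None
    # S103: read only that interval (offset N*Q*S, length Q*S) into memory.
    index_file.seek(n * Q * RECORD.size)
    block = index_file.read(Q * RECORD.size)
    # S104: binary search inside the block that was read.
    low, high = 0, len(block) // RECORD.size
    while low < high:
        mid = (low + high) // 2
        seq, rec_id, off, size = RECORD.unpack_from(block, mid * RECORD.size)
        if rec_id == target_id:
            return seq, rec_id, off, size
        if rec_id < target_id:
            low = mid + 1
        else:
            high = mid
    return None


# Stand-in index file with ids 0..39 and the matching reference identifiers.
disk = io.BytesIO(b"".join(RECORD.pack(i, i, i * 100, 50) for i in range(40)))
reference_ids = [0, 16, 32]
result = lookup(disk, reference_ids, 39)
print(result)  # (39, 39, 3900, 50)
```

The last interval in the stand-in file is shorter than `Q` records; the sketch handles this by sizing the search from the number of bytes actually read.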

Corresponding to the above method embodiments, the embodiments of the present invention also provide corresponding device embodiments.

As shown in FIG. 5, an embodiment of the present invention provides an index lookup device corresponding to the embodiment shown in FIG. 1. The device includes:

An obtaining module 501, configured to obtain a target index identifier, the target index identifier being the index identifier of the index to be found.

A determining module 502, configured to compare, based on a preset lookup algorithm, the target index identifier with a plurality of reference index identifiers pre-stored in the first storage medium, and to determine the target index interval corresponding to the target index identifier.

A reading module 503, configured to determine the index information corresponding to the target index interval from the index file stored in the second storage medium to obtain the target index information, and to read the target index information into the first storage medium, the data read/write performance of the first storage medium being higher than that of the second storage medium.

A lookup module 504, configured to search for the target index identifier in the target index information based on a preset lookup algorithm to obtain the index to be found.

As shown in FIG. 6, on the basis of the device structure shown in FIG. 5, the index lookup device of the embodiment of the present invention may further include:

A traversal module 601, configured to traverse in advance the index file stored in the second storage medium.

An extraction module 602, configured to extract a plurality of index identifiers from the index file at intervals in a predetermined order, as the reference index identifiers.

As shown in FIG. 7, the determining module 502 includes:

A first determining submodule 5021, configured to determine the sequence number of the target index interval.

A second determining submodule 5022, configured to determine the starting offset of the target index interval in the index file based on the sequence number of the target index interval.

A third determining submodule 5023, configured to determine the total number of bytes of the indexes contained in the target index interval.

A fourth determining submodule 5024, configured to determine the target index information based on the starting offset and the total number of bytes of the indexes contained in the target index interval.

The first determining submodule is specifically configured to:

determine the sequence number of the target index interval from the position of the reference index identifier corresponding to the target index interval among the plurality of reference index identifiers.

The starting offset O of the target index interval in the index file is determined based on the first preset expression, the first preset expression being:

O = N*Q*S

where N is the sequence number of the target index interval, Q is the number of indexes contained in the target index interval, and S is the number of bytes of each index.

The total number of bytes TS of the indexes contained in the target index interval is determined based on the second preset expression, the second preset expression being:

TS = Q*S

where Q is the number of indexes contained in the target index interval and S is the number of bytes of each index.

The preset lookup algorithm may be any one of the following: binary search, binary search tree lookup, hash table lookup, block search.

The second storage medium is a disk, and the first storage medium is memory.

With the index lookup device provided by the embodiment of the present invention, after the target index identifier is obtained, it is compared with the plurality of reference index identifiers pre-stored in the first storage medium to determine the target index interval corresponding to the target index identifier. The index information corresponding to the target index interval is then located in the index file stored in the second storage medium, and this target index information is read into the first storage medium. Compared with the prior art, which reads the entire index file, far fewer indexes are read into the first storage medium, so the index read time is shorter and the index lookup time is reduced accordingly.

An embodiment of the present invention also provides an electronic device, which may specifically be a server. As shown in FIG. 8, the device 800 includes a processor 801 and a machine-readable storage medium 802. The machine-readable storage medium stores machine-executable instructions that can be executed by the processor, and the processor executes the machine-executable instructions to implement the following steps:

obtaining a target index identifier, the target index identifier being the index identifier of the index to be found;

comparing, based on a preset lookup algorithm, the target index identifier with a plurality of reference index identifiers pre-stored in the first storage medium, and determining the target index interval corresponding to the target index identifier;

determining the index information corresponding to the target index interval from the index file stored in the second storage medium to obtain the target index information, and reading the target index information into the first storage medium, the data read/write performance of the first storage medium being higher than that of the second storage medium;

searching for the target index identifier in the target index information based on a preset lookup algorithm to obtain the index to be found.

With the electronic device provided by the embodiment of the present invention, after the target index identifier is obtained, it is compared with the plurality of reference index identifiers pre-stored in the first storage medium to determine the target index interval corresponding to the target index identifier. The index information corresponding to the target index interval is then located in the index file stored in the second storage medium, and this target index information is read into the first storage medium. Compared with the prior art, which reads the entire index file, far fewer indexes are read into the first storage medium, so the index read time is shorter and the index lookup time is reduced accordingly.

The machine-readable storage medium may include random access memory (RAM) and may also include non-volatile memory, for example at least one disk memory. Optionally, the memory may also be at least one storage device located remotely from the aforementioned processor.

The processor may be a general-purpose processor, including a central processing unit (CPU), a network processor (NP), and the like; it may also be a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or another programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component.

An embodiment of the present invention provides a computer-readable storage medium in which a computer program is stored. When executed by a processor, the computer program performs the following steps:

obtaining a target index identifier, the target index identifier being the index identifier of the index to be found;

comparing, based on a preset lookup algorithm, the target index identifier with a plurality of reference index identifiers pre-stored in the first storage medium, and determining the target index interval corresponding to the target index identifier;

determining the index information corresponding to the target index interval from the index file stored in the second storage medium to obtain the target index information, and reading the target index information into the first storage medium, the data read/write performance of the first storage medium being higher than that of the second storage medium;

searching for the target index identifier in the target index information based on a preset lookup algorithm to obtain the index to be found.

With the computer-readable storage medium provided by the embodiment of the present invention, after the target index identifier is obtained, it is compared with the plurality of reference index identifiers pre-stored in the first storage medium to determine the target index interval corresponding to the target index identifier. The index information corresponding to the target index interval is then located in the index file stored in the second storage medium, and this target index information is read into the first storage medium. Compared with the prior art, which reads the entire index file, far fewer indexes are read into the first storage medium, so the index read time is shorter and the index lookup time is reduced accordingly.

An embodiment of the present invention provides a computer program product containing instructions which, when run on a computer, causes the computer to perform the following steps:

obtaining a target index identifier, the target index identifier being the index identifier of the index to be found;

comparing, based on a preset lookup algorithm, the target index identifier with a plurality of reference index identifiers pre-stored in the first storage medium, and determining the target index interval corresponding to the target index identifier;

determining the index information corresponding to the target index interval from the index file stored in the second storage medium to obtain the target index information, and reading the target index information into the first storage medium, the data read/write performance of the first storage medium being higher than that of the second storage medium;

searching for the target index identifier in the target index information based on a preset lookup algorithm to obtain the index to be found.

With the computer program product containing instructions provided by the embodiment of the present invention, after the target index identifier is obtained, it is compared with the plurality of reference index identifiers pre-stored in the first storage medium to determine the target index interval corresponding to the target index identifier. The index information corresponding to the target index interval is then located in the index file stored in the second storage medium, and this target index information is read into the first storage medium. Compared with the prior art, which reads the entire index file, far fewer indexes are read into the first storage medium, so the index read time is shorter and the index lookup time is reduced accordingly.

An embodiment of the present invention also provides a computer program which, when run on a computer, causes the computer to perform the following steps:

obtaining a target index identifier, the target index identifier being the index identifier of the index to be found;

comparing, based on a preset lookup algorithm, the target index identifier with a plurality of reference index identifiers pre-stored in the first storage medium, and determining the target index interval corresponding to the target index identifier;

determining the index information corresponding to the target index interval from the index file stored in the second storage medium to obtain the target index information, and reading the target index information into the first storage medium, the data read/write performance of the first storage medium being higher than that of the second storage medium;

searching for the target index identifier in the target index information based on a preset lookup algorithm to obtain the index to be found.

With the computer program containing instructions provided by the embodiment of the present invention, after the target index identifier is obtained, it is compared with the plurality of reference index identifiers pre-stored in the first storage medium to determine the target index interval corresponding to the target index identifier. The index information corresponding to the target index interval is then located in the index file stored in the second storage medium, and this target index information is read into the first storage medium. Compared with the prior art, which reads the entire index file, far fewer indexes are read into the first storage medium, so the index read time is shorter and the index lookup time is reduced accordingly.

The device, electronic device and storage medium embodiments are described relatively briefly because they are substantially similar to the method embodiments; for the relevant details, refer to the corresponding parts of the description of the method embodiments.

It should be noted that, in this document, relational terms such as "first" and "second" are used only to distinguish one entity or operation from another, and do not necessarily require or imply any actual relationship or order between those entities or operations. Moreover, the terms "include", "comprise" and any variants thereof are intended to cover a non-exclusive inclusion, so that a process, method, article or device that includes a series of elements includes not only those elements but also other elements not expressly listed, or elements inherent to such a process, method, article or device. Without further limitation, an element defined by the phrase "including a ..." does not exclude the presence of additional identical elements in the process, method, article or device that includes the element.

The embodiments in this specification are described in a related manner; for identical or similar parts, the embodiments may be referred to one another, and each embodiment focuses on its differences from the others. In particular, the system embodiment is described relatively briefly because it is substantially similar to the method embodiment; for the relevant details, refer to the corresponding parts of the description of the method embodiment.

The above are only preferred embodiments of the present invention and are not intended to limit its scope of protection. Any modification, equivalent replacement or improvement made within the spirit and principles of the present invention falls within the scope of protection of the present invention.

Claims (18)

1. An index lookup method, characterized in that the method comprises:
obtaining a target index identifier, the target index identifier being the index identifier of an index to be found;
comparing, based on a preset lookup algorithm, the target index identifier with a plurality of reference index identifiers pre-stored in a first storage medium, and determining a target index interval corresponding to the target index identifier;
determining, from an index file stored in a second storage medium, index information corresponding to the target index interval to obtain target index information, and reading the target index information into the first storage medium, the data read/write performance of the first storage medium being higher than the data read/write performance of the second storage medium;
searching for the target index identifier in the target index information based on a preset lookup algorithm to obtain the index to be found.
2. The method according to claim 1, characterized in that the reference index identifiers are obtained as follows:
traversing in advance the index file stored in the second storage medium;
extracting a plurality of index identifiers from the index file at intervals in a predetermined order, as the reference index identifiers.
3. The method according to claim 1, characterized in that the step of determining, from the index file stored in the second storage medium, the index information corresponding to the target index interval comprises:
determining the sequence number of the target index interval;
determining the starting offset of the target index interval in the index file based on the sequence number of the target index interval;
determining the total number of bytes of the indexes contained in the target index interval;
determining the target index information based on the starting offset and the total number of bytes of the indexes contained in the target index interval.
4. The method according to claim 3, characterized in that the sequence number of the target index interval is determined as follows: the sequence number of the target index interval is determined from the position of the reference index identifier corresponding to the target index interval among the plurality of reference index identifiers.
5. The method according to claim 3, characterized in that the starting offset O of the target index interval in the index file is determined based on a first preset expression, the first preset expression being:
O = N*Q*S
where N is the sequence number of the target index interval, Q is the number of indexes contained in the target index interval, and S is the number of bytes of each index.
6. The method according to claim 3, characterized in that the total number of bytes TS of the indexes contained in the target index interval is determined based on a second preset expression, the second preset expression being:
TS = Q*S
where Q is the number of indexes contained in the target index interval and S is the number of bytes of each index.
7. The method according to any one of claims 1 to 6, characterized in that the preset lookup algorithm may be any one of the following: binary search, binary search tree lookup, hash table lookup, block search.
8. The method according to any one of claims 1 to 6, characterized in that the second storage medium is a disk and the first storage medium is memory.
9. An index lookup device, characterized in that the device comprises:
an obtaining module, configured to obtain a target index identifier, the target index identifier being the index identifier of an index to be found;
a determining module, configured to compare, based on a preset lookup algorithm, the target index identifier with a plurality of reference index identifiers pre-stored in a first storage medium, and to determine a target index interval corresponding to the target index identifier;
a reading module, configured to determine, from an index file stored in a second storage medium, index information corresponding to the target index interval to obtain target index information, and to read the target index information into the first storage medium, the data read/write performance of the first storage medium being higher than the data read/write performance of the second storage medium;
a lookup module, configured to search for the target index identifier in the target index information based on a preset lookup algorithm to obtain the index to be found.
10. The device according to claim 9, characterized in that the device further comprises:
a traversal module, configured to traverse in advance the index file stored in the second storage medium;
an extraction module, configured to extract multiple index identifiers from the index file at a predetermined sequential interval to serve as the reference index identifiers.
11. The device according to claim 9, characterized in that the determining module comprises:
a first determining submodule, configured to determine the serial number of the target index interval;
a second determining submodule, configured to determine, based on the serial number of the target index interval, the start offset of the target index interval in the index file;
a third determining submodule, configured to determine the total number of bytes of the indexes contained in the target index interval;
a fourth determining submodule, configured to determine the target index information based on the start offset and the total number of bytes of the indexes contained in the target index interval.
12. The device according to claim 11, characterized in that the first determining submodule is specifically configured to:
determine the serial number of the target index interval according to the position, among the multiple reference index identifiers, of the reference index identifier corresponding to the target index interval.
13. The device according to claim 11, characterized in that the start offset O of the target index interval in the index file is determined based on a first preset expression, the first preset expression being:
O=N*Q*S
wherein N is the serial number of the target index interval, Q is the number of indexes contained in the target index interval, and S is the number of bytes of each index.
14. The device according to claim 11, characterized in that the total number of bytes TS of the indexes contained in the target index interval is determined based on a second preset expression, the second preset expression being:
TS=Q*S
wherein Q is the number of indexes contained in the target index interval, and S is the number of bytes of each index.
15. The device according to any one of claims 9-14, characterized in that the preset lookup algorithm may be any one of the following: binary search, binary search tree lookup, hash table lookup, or block search.
16. The device according to any one of claims 9-14, characterized in that the second storage medium is a disk and the first storage medium is memory.
17. An electronic device, characterized by comprising a processor and a machine-readable storage medium, the machine-readable storage medium storing machine-executable instructions executable by the processor, wherein the processor executes the machine-executable instructions to implement the method steps of the index lookup method according to any one of claims 1-8.
18. A computer-readable storage medium, characterized in that a computer program is stored in the computer-readable storage medium, and the computer program, when executed by a processor, implements the method steps of the index lookup method according to any one of claims 1-8.
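
For illustration only, the lookup flow recited in the claims can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the applicant's implementation: the 16-byte record layout (an 8-byte key followed by an 8-byte value), the interval size Q=4, the zero-based serial number N, and the function names build_reference_ids and lookup are all hypothetical. Binary search is used here as one of the preset lookup algorithms the claims permit, and the index file is assumed to be sorted by key.

```python
import bisect
import io
import struct

# Hypothetical record layout (an assumption): 8-byte key, 8-byte value.
RECORD = struct.Struct(">QQ")
S = RECORD.size  # bytes per index entry (16 with this layout)
Q = 4            # index entries per interval (assumed constant)


def build_reference_ids(index_file):
    """Traverse the sorted index file once, keeping the key of every Q-th entry."""
    refs = []
    index_file.seek(0)
    position = 0
    while True:
        chunk = index_file.read(S)
        if len(chunk) < S:
            break
        if position % Q == 0:
            refs.append(RECORD.unpack(chunk)[0])
        position += 1
    return refs


def lookup(index_file, refs, target):
    """Return the value stored for target, or None if the key is absent."""
    # Compare the target with the in-memory reference keys to find the
    # zero-based serial number N of the interval that could contain it.
    n = bisect.bisect_right(refs, target) - 1
    if n < 0:
        return None  # target is smaller than every key in the file

    # First expression: O = N*Q*S. Second expression: TS = Q*S.
    offset = n * Q * S
    total_bytes = Q * S

    # Read only that interval from the slower medium into memory.
    index_file.seek(offset)
    block = index_file.read(total_bytes)  # the last interval may be shorter

    # Binary search inside the interval that was just read.
    keys = [RECORD.unpack_from(block, i)[0]
            for i in range(0, len(block) - S + 1, S)]
    pos = bisect.bisect_left(keys, target)
    if pos < len(keys) and keys[pos] == target:
        return RECORD.unpack_from(block, pos * S)[1]
    return None
```

In this sketch any seekable binary file object can play the role of the second storage medium, for example a file opened with open(path, "rb") or an io.BytesIO buffer in a test. Only the list of reference keys stays resident in memory, so each lookup costs one seek and one read of Q*S bytes.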
CN201910515781.0A 2019-06-14 2019-06-14 It indexes lookup method, search device, electronic equipment and storage medium Pending CN110222074A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910515781.0A CN110222074A (en) 2019-06-14 2019-06-14 It indexes lookup method, search device, electronic equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910515781.0A CN110222074A (en) 2019-06-14 2019-06-14 It indexes lookup method, search device, electronic equipment and storage medium

Publications (1)

Publication Number Publication Date
CN110222074A true CN110222074A (en) 2019-09-10

Family

ID=67817375

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910515781.0A Pending CN110222074A (en) 2019-06-14 2019-06-14 It indexes lookup method, search device, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN110222074A (en)

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110765321A (en) * 2019-10-28 2020-02-07 北京明略软件系统有限公司 Data storage path generation method and device and readable storage medium
CN111459883A (en) * 2020-03-31 2020-07-28 潍柴动力股份有限公司 Data processing method and device
CN113190507A (en) * 2021-05-14 2021-07-30 杭州海康威视数字技术股份有限公司 Index information synchronization method and device and electronic equipment
CN113961477A (en) * 2020-07-20 2022-01-21 科尔奇普投资公司 Binary search method and system
CN114661666A (en) * 2022-03-03 2022-06-24 北京城市网邻信息技术有限公司 Data searching method, device, equipment and storage medium
CN114978646A (en) * 2022-05-13 2022-08-30 京东科技控股股份有限公司 Access authority determination method, device, equipment and storage medium
CN115617848A (en) * 2021-07-15 2023-01-17 北京希姆计算科技有限公司 A data lookup method, device and readable storage medium based on a lookup table

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106407250A (en) * 2015-07-28 2017-02-15 阿里巴巴集团控股有限公司 Information query method, device and system, server and client side
US20170270150A1 (en) * 2015-05-14 2017-09-21 Walleye Software, LLC Dynamic table index mapping
CN107391769A (en) * 2017-09-12 2017-11-24 北京优网助帮信息技术有限公司 A kind of search index method and device
CN108255958A (en) * 2017-12-21 2018-07-06 百度在线网络技术(北京)有限公司 Data query method, apparatus and storage medium

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20170270150A1 (en) * 2015-05-14 2017-09-21 Walleye Software, LLC Dynamic table index mapping
CN106407250A (en) * 2015-07-28 2017-02-15 阿里巴巴集团控股有限公司 Information query method, device and system, server and client side
CN107391769A (en) * 2017-09-12 2017-11-24 北京优网助帮信息技术有限公司 A kind of search index method and device
CN108255958A (en) * 2017-12-21 2018-07-06 百度在线网络技术(北京)有限公司 Data query method, apparatus and storage medium

Cited By (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110765321A (en) * 2019-10-28 2020-02-07 北京明略软件系统有限公司 Data storage path generation method and device and readable storage medium
CN111459883A (en) * 2020-03-31 2020-07-28 潍柴动力股份有限公司 Data processing method and device
CN111459883B (en) * 2020-03-31 2023-08-18 潍柴动力股份有限公司 Data processing method and device
CN113961477A (en) * 2020-07-20 2022-01-21 科尔奇普投资公司 Binary search method and system
CN113961477B (en) * 2020-07-20 2025-12-23 美商光禾科技股份有限公司 Binary search method and system
CN113190507A (en) * 2021-05-14 2021-07-30 杭州海康威视数字技术股份有限公司 Index information synchronization method and device and electronic equipment
CN113190507B (en) * 2021-05-14 2022-06-03 杭州海康威视数字技术股份有限公司 Index information synchronization method and device and electronic equipment
CN115617848A (en) * 2021-07-15 2023-01-17 北京希姆计算科技有限公司 A data lookup method, device and readable storage medium based on a lookup table
CN114661666A (en) * 2022-03-03 2022-06-24 北京城市网邻信息技术有限公司 Data searching method, device, equipment and storage medium
CN114661666B (en) * 2022-03-03 2023-01-24 北京城市网邻信息技术有限公司 Data searching method, device, equipment and storage medium
CN114978646A (en) * 2022-05-13 2022-08-30 京东科技控股股份有限公司 Access authority determination method, device, equipment and storage medium

Similar Documents

Publication Publication Date Title
CN110222074A (en) It indexes lookup method, search device, electronic equipment and storage medium
CN110909266B (en) Deep paging method and device and server
US20150199433A1 (en) Method and system for search engine indexing and searching using the index
WO2018064962A1 (en) Data storage method, electronic device and computer non-volatile storage medium
CN110532347A (en) A kind of daily record data processing method, device, equipment and storage medium
CN102332030A (en) Data storage, management and query method and system for distributed key-value storage system
CN106055621A (en) Log retrieval method and device
US10346496B2 (en) Information category obtaining method and apparatus
WO2018036549A1 (en) Distributed database query method and device, and management system
CN106776809A (en) A kind of data query method and system
CN111339293A (en) Data processing method and device of alarm event and classification method of alarm event
CN103714121B (en) The management method and device of a kind of index record
US9213759B2 (en) System, apparatus, and method for executing a query including boolean and conditional expressions
CN115455207A (en) Reference relation retrieval method and device, electronic equipment and storage medium
CN106407322A (en) Quick file searching method based on Android system
CN116383192A (en) Data query method, device, equipment and storage medium
CN109992708B (en) Method, device, equipment and storage medium for metadata query
US12399708B2 (en) Software recognition using tree-structured pattern matching rules for software asset management
CN110019829A (en) Data attribute determines method, apparatus
CN112232970B (en) Data relationship identification method and device, storage medium and electronic equipment
CN116821135A (en) A database full-text retrieval processing method and system
CN116303354A (en) A storage engine determination method, device, electronic equipment and storage medium
CN110046180A (en) Method and device for positioning similar examples and electronic equipment
CN116610808A (en) Data query method, device, equipment and storage medium
CN103220355A (en) Multi-user configuration method in content distribution network

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20190910