CN112860840A - Search processing method, device, equipment and storage medium - Google Patents

Search processing method, device, equipment and storage medium

Publication number
CN112860840A
Authority
CN
China
Prior art keywords
keyword
keywords
candidate
index relationship
query word
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201911102850.1A
Other languages
Chinese (zh)
Inventor
连义江
傅畅
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Baidu Netcom Science and Technology Co Ltd
Original Assignee
Beijing Baidu Netcom Science and Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Baidu Netcom Science and Technology Co Ltd filed Critical Beijing Baidu Netcom Science and Technology Co Ltd
Priority to CN201911102850.1A priority Critical patent/CN112860840A/en
Publication of CN112860840A publication Critical patent/CN112860840A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/953 Querying, e.g. by the use of web search engines
    • G06F16/9535 Search customisation based on user profiles and personalisation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/31 Indexing; Data structures therefor; Storage structures
    • G06F16/316 Indexing structures
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/35 Clustering; Classification

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Software Systems (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The present application discloses a search processing method, apparatus, device and storage medium, and relates to the technical field of intelligent search. The specific implementation is as follows: a query word entered by a user is obtained; at least one candidate keyword matching the query word is determined based on a pre-established index relationship, where the index relationship is established, after the keywords have been clustered according to semantics, between the segmented words of some of the keywords in each resulting category and the corresponding keywords; the relevance between each candidate keyword and the query word is determined, and the candidate keywords whose relevance satisfies a set condition are selected; and target keywords matching the query word are determined based on the selected candidate keywords. By introducing and using an index relationship between the segmented words of only some keywords in each category and the corresponding keywords, the embodiments of the present application reduce search time, improve search efficiency, and improve the recall capability of the search system.

Figure 201911102850

Description

A search processing method, apparatus, device and storage medium

Technical Field

The present application relates to computer technology, and in particular to the technical field of intelligent search.

Background

After a search engine receives a query word entered by a user, it needs to determine, according to an index table, the keywords that match the query word, and to display corresponding information (such as advertisements associated with those keywords) to the user according to the matching result.

In the prior art, when keywords matching the query word are determined according to the index table, a relevance verification model must separately compute the relevance between the query word and each candidate keyword in the index table, and keyword matching is then performed according to that relevance.

However, the relevance verification model is time-consuming and can compute the relevance between the query word and only a small number of candidate keywords at a time, which greatly limits the recall capability of the search system.

Summary of the Invention

The embodiments of the present application provide a search processing method, apparatus, device and storage medium, so as to reduce search time, improve search efficiency, and improve the recall capability of the search system.

In a first aspect, an embodiment of the present application provides a search processing method, including:

obtaining a query word entered by a user;

determining, based on a pre-established index relationship, at least one candidate keyword matching the query word; wherein the index relationship is established, after the keywords have been clustered according to semantics, for some of the keywords in each category obtained by the clustering, and relates the segmented words of those keywords to the corresponding keywords;

determining the relevance between each candidate keyword and the query word, and selecting the candidate keywords whose relevance satisfies a set condition;

determining, based on the selected candidate keywords, target keywords matching the query word.

In the embodiments of the present application, a query word entered by a user is obtained; at least one candidate keyword matching the query word is determined based on an index relationship that is established in advance, after the keywords have been clustered according to semantics, between the segmented words of some of the keywords in each resulting category and the corresponding keywords; the relevance between each candidate keyword and the query word is determined, and the candidate keywords whose relevance satisfies a set condition are selected; and target keywords matching the query word are determined based on the selected candidate keywords. For keywords with the same semantics, the above technical solution indexes the segmented words of only some of the keywords rather than of all keywords. Fewer candidate keywords are therefore matched to the query word through the index relationship, which reduces the workload of computing the relevance between candidate keywords and the query word. This reduces search time, improves search efficiency, and also improves the recall capability of the search system.

Optionally, the index relationship is established by:

clustering the keywords according to semantics;

for each category obtained by the clustering, selecting one keyword in the current category as the representative keyword;

performing word segmentation on the representative keyword, and establishing an index relationship between each segmented word obtained and the corresponding representative keyword.

In this optional implementation, the keywords are clustered according to semantics; for each resulting category, one keyword is selected as the representative keyword; the representative keyword is segmented; and an index relationship is established between each segmented word and the corresponding representative keyword. This completes the mechanism for establishing the index relationship and lays the foundation for matching query words. When query-word matching and relevance determination are performed based on the established index relationship, both the matching time and the amount of relevance computation are reduced, which shortens search time and improves search efficiency.
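The index construction described above can be illustrated with a minimal sketch. It assumes the keywords have already been clustered, uses whitespace splitting as a stand-in for a real word segmenter, and simply takes the first keyword of each category as the representative; the names `segment` and `build_token_index` are hypothetical and do not come from the application.

```python
from collections import defaultdict


def segment(text):
    # Stand-in tokenizer (assumption): a real system would use a proper
    # word segmenter, especially for Chinese keywords.
    return text.lower().split()


def build_token_index(clusters):
    """Map each segmented word of a representative keyword to that keyword.

    clusters: list of lists, each inner list holding synonymous keywords.
    """
    token_index = defaultdict(set)
    for cluster in clusters:
        # Placeholder selection rule (assumption): first keyword in the category.
        representative = cluster[0]
        for token in segment(representative):
            token_index[token].add(representative)
    return dict(token_index)


clusters = [
    ["eyelid surgery cost", "price of eyelid surgery", "how much is eyelid surgery"],
    ["flower pot", "pot for flowers"],
]
token_index = build_token_index(clusters)
```

Only the representative keywords are indexed, so a word such as "price" that appears only in a non-representative keyword has no entry; this is what keeps the candidate set small.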

Optionally, after one keyword in the current category is selected as the representative keyword, the method further includes:

establishing an index relationship between each representative keyword and its corresponding category;

correspondingly, determining the target keywords matching the query word based on the selected candidate keywords includes:

for all or some of the selected candidate keywords, determining the category that the current candidate keyword represents as a representative keyword;

reading each keyword in the determined category, and using each keyword read as a target keyword matching the query word.

In this optional implementation, during construction of the index relationship and after the keywords have been clustered according to semantics, an index relationship is established between each representative keyword and its corresponding category. Target keywords are then determined through two index relationships: one from segmented words to representative keywords, and another from representative keywords to categories. This improves the coverage of the target keywords and ensures that they are comprehensive.
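As a rough illustration of the second index relationship, the sketch below expands each selected candidate (a representative keyword) into every keyword of its category. The function name and the dictionary layout are assumptions made for the example; the application does not prescribe a data structure.

```python
def expand_to_targets(selected_candidates, representative_to_category):
    """Expand representative keywords into all keywords of their categories.

    representative_to_category: dict mapping a representative keyword to the
    list of keywords in its category (hypothetical layout).
    """
    targets = []
    seen = set()
    for representative in selected_candidates:
        # Fall back to the candidate itself if no category is recorded.
        for keyword in representative_to_category.get(representative, [representative]):
            if keyword not in seen:
                seen.add(keyword)
                targets.append(keyword)
    return targets


representative_to_category = {
    "flower pot": ["flower pot", "pot for flowers"],
}
targets = expand_to_targets(["flower pot", "vase"], representative_to_category)
```

The relevance model only has to score the representatives, while the expansion step restores the non-representative synonyms as target keywords.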

Optionally, for some of the selected candidate keywords, determining the category that the current candidate keyword represents as a representative keyword includes:

determining, among the selected candidate keywords, a preset number of candidate keywords with the highest relevance to the query word;

for each candidate keyword so determined, determining the category that it represents as a representative keyword.

In this optional implementation, when the categories corresponding to representative keywords are determined, the candidate keywords are filtered by relevance and a preset number, so that target keywords are determined only from candidate keywords with higher relevance. This improves the degree of matching between the target keywords and the query word.

Optionally, selecting one keyword in the current category as the representative keyword includes:

selecting the shortest keyword in the current category as the representative keyword.

In this optional implementation, the determination of the representative keyword is refined to selecting the shortest keyword in the current category, which ensures that the representative keyword is representative of the other keywords in the category and safeguards the relevance between the query word and the candidate keywords.

Optionally, clustering the keywords according to semantics includes:

mining synonymous keyword pairs among the keywords, and determining at least one keyword group based on the mined keyword pairs;

wherein the keywords in each keyword group all have the same semantics.

In this optional implementation, when the keywords are clustered, synonymous keyword pairs are mined and keyword groups are determined from the mined pairs. This completes the keyword clustering mechanism, and mining synonyms ensures that the classification result is comprehensive.

Optionally, after the target keywords matching the query word are determined, the method further includes:

retrieving, based on the target keywords, the information to be delivered of the information delivery parties that purchased the target keywords;

pushing the information to be delivered to the user's client for display.

In this optional implementation, after the target keywords are determined, the information to be delivered by the information delivery parties of the target keywords is determined and pushed to the user's client for display. This implements information push based on the user's query operation, and improves both the degree of matching between the pushed information and the user's query word and the comprehensiveness of the pushed information.

In a second aspect, an embodiment of the present application further provides a search processing apparatus, including:

a query word acquisition module, configured to obtain a query word entered by a user;

a candidate keyword matching module, configured to determine, based on a pre-established index relationship, at least one candidate keyword matching the query word; wherein the index relationship is established, after the keywords have been clustered according to semantics, for some of the keywords in each category obtained by the clustering, and relates the segmented words of those keywords to the corresponding keywords;

a candidate keyword screening module, configured to determine the relevance between each candidate keyword and the query word, and to select the candidate keywords whose relevance satisfies a set condition;

a target keyword determination module, configured to determine, based on the selected candidate keywords, target keywords matching the query word.

In a third aspect, an embodiment of the present application further provides an electronic device, including:

at least one processor; and

a memory communicatively connected to the at least one processor; wherein

the memory stores instructions executable by the at least one processor, and the instructions are executed by the at least one processor to enable the at least one processor to perform the search processing method provided by the embodiments of the first aspect.

In a fourth aspect, an embodiment of the present application further provides a non-transitory computer-readable storage medium storing computer instructions, the computer instructions being used to cause a computer to perform the search processing method provided by the embodiments of the first aspect.

Other effects of the above optional implementations are described below with reference to specific embodiments.

Brief Description of the Drawings

The accompanying drawings are provided for a better understanding of the present solution and do not limit the present application. In the drawings:

FIG. 1 is a flowchart of a search processing method in Embodiment 1 of the present application;

FIG. 2 is a flowchart of a search processing method in Embodiment 2 of the present application;

FIG. 3 is a flowchart of a search processing method in Embodiment 3 of the present application;

FIG. 4 is a structural diagram of a search processing apparatus in Embodiment 4 of the present application;

FIG. 5 is a block diagram of an electronic device used to implement the search processing method of the embodiments of the present application.

Detailed Description

Exemplary embodiments of the present application are described below with reference to the accompanying drawings. Various details of the embodiments are included to aid understanding and should be regarded as merely exemplary. Accordingly, those of ordinary skill in the art will recognize that various changes and modifications can be made to the embodiments described herein without departing from the scope and spirit of the present application. Likewise, for clarity and conciseness, descriptions of well-known functions and structures are omitted from the following description.

Embodiment 1

FIG. 1 is a flowchart of a search processing method in Embodiment 1 of the present application. This embodiment is applicable to the case where a user searches for information through a search engine. The method is executed by a search processing apparatus, which is implemented in software and/or hardware and is configured in an electronic device with a certain data computing capability.

As shown in FIG. 1, the search processing method includes:

S101. Obtain a query word entered by a user.

The query word may be a content word entered by the user, such as a noun or a verb.

S102. Determine, based on a pre-established index relationship, at least one candidate keyword matching the query word; wherein the index relationship is established, after the keywords have been clustered according to semantics, for some of the keywords in each category obtained by the clustering, and relates the segmented words of those keywords to the corresponding keywords.

The index relationship may be an inverted index.

A candidate keyword may be a single content word or a phrase that includes at least two content words.

The index relationship may be stored in advance locally on the electronic device, on another storage device associated with the electronic device, or in the cloud. Correspondingly, when at least one candidate keyword matching the query word needs to be determined based on the index relationship, the index relationship is obtained or queried from the local storage of the electronic device, the associated storage device, or the cloud.
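The lookup in S102 can be sketched as follows, assuming the inverted index is held in memory as a dictionary from segmented word to a set of keywords and that the query is segmented by whitespace. Both are simplifications for illustration, and `find_candidates` is a hypothetical name.

```python
def find_candidates(query, token_index):
    """Return the indexed keywords that share at least one segmented word with the query."""
    candidates = set()
    # Assumption: whitespace splitting stands in for real word segmentation.
    for token in query.lower().split():
        candidates |= token_index.get(token, set())
    return candidates


token_index = {
    "eyelid": {"eyelid surgery cost"},
    "surgery": {"eyelid surgery cost"},
    "cost": {"eyelid surgery cost"},
    "flower": {"flower pot"},
    "pot": {"flower pot"},
}
candidates = find_candidates("cheap flower pot", token_index)
```

A query with no indexed word yields an empty candidate set, so the relevance step in S103 has nothing to score for it.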

S103. Determine the relevance between each candidate keyword and the query word, and select the candidate keywords whose relevance satisfies a set condition.

The relevance between a candidate keyword and the query word may be determined by mutual information (MI). The relevance may of course also be determined by other measures known in the prior art, which are not described further here.
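The application names mutual information but does not say how it is computed. One common choice, shown here purely as an assumption, is pointwise mutual information estimated from co-occurrence counts (for example, counts taken from a search log); the function name and the count-based estimate are illustrative only.

```python
import math


def pointwise_mutual_information(joint_count, query_count, keyword_count, total):
    """PMI(q, k) = log( p(q, k) / (p(q) * p(k)) ), estimated from counts.

    All four arguments are hypothetical statistics: how often the query and
    keyword co-occur, how often each occurs alone, and the total number of
    observations.
    """
    if joint_count == 0:
        return float("-inf")  # never observed together
    p_joint = joint_count / total
    p_query = query_count / total
    p_keyword = keyword_count / total
    return math.log(p_joint / (p_query * p_keyword))


score = pointwise_mutual_information(joint_count=10, query_count=20,
                                     keyword_count=50, total=1000)
```

A positive score means the query and keyword co-occur more often than they would if independent; a score of zero means no association.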

The relevance between each candidate keyword found and the query word is determined separately, and the candidate keywords are filtered according to the relevance values and the number of candidate keywords.

Exemplarily, the candidate keywords may be sorted by relevance, and a set number of candidate keywords with the highest relevance to the query word selected. The set number may be chosen by a technician as needed or from experience. It may be a preset count of candidate keywords to select, or it may be derived from a preset percentage of the candidate keywords.
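A minimal sketch of this filtering step, supporting both a fixed count and a percentage, is given below. The function name, the parameter names, and the choice to round the percentage up are assumptions for illustration.

```python
import math


def select_top(scored, count=None, fraction=None):
    """Keep the highest-relevance candidates.

    scored: dict mapping candidate keyword to relevance score.
    count: fixed number of candidates to keep, or
    fraction: share of candidates to keep (rounded up; an assumption).
    """
    if count is None and fraction is None:
        raise ValueError("either count or fraction must be given")
    ranked = sorted(scored.items(), key=lambda item: item[1], reverse=True)
    if count is None:
        count = math.ceil(len(ranked) * fraction)
    return [keyword for keyword, _ in ranked[:count]]


scored = {"a": 0.9, "b": 0.2, "c": 0.5, "d": 0.7}
top_two = select_top(scored, count=2)
top_half = select_top(scored, fraction=0.5)
```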

S104. Determine, based on the selected candidate keywords, target keywords matching the query word.

Exemplarily, the target keywords matching the query word may be determined from the selected candidate keywords in any of the following ways: the selected candidate keywords may be used directly as the target keywords; or, optionally, at least one keyword in the category to which a candidate keyword belongs in the index relationship may be used as a target keyword; or, optionally, an associated category of the category to which a candidate keyword belongs may be determined, and at least one keyword in the associated category used as a target keyword. An associated category may be a category with similar semantics, a similar use, or the same parent category. For example, when the determined candidate keyword is "flower pot", the associated categories may be "vase", "flower stand" or "flower fertilizer".

In the embodiments of the present application, a query word entered by a user is obtained; at least one candidate keyword matching the query word is determined based on an index relationship that is established in advance, after the keywords have been clustered according to semantics, between the segmented words of some of the keywords in each category and the corresponding keywords; the relevance between each candidate keyword and the query word is determined, and the candidate keywords whose relevance satisfies a set condition are selected; and target keywords matching the query word are determined based on the selected candidate keywords. For keywords with the same semantics, the above technical solution indexes the segmented words of only some of the keywords rather than of all keywords. Fewer candidate keywords are therefore matched to the query word through the index relationship, which reduces the workload of computing the relevance between candidate keywords and the query word. This reduces search time, improves search efficiency, and also improves the recall capability of the search system.

On the basis of the technical solutions of the above embodiments, to apply the search processing method to an information push scenario, after the target keywords matching the query word are determined, the information to be delivered of the information delivery parties that purchased the target keywords may be retrieved based on the target keywords, and pushed to the user's client for display.

Search engines usually recommend information according to what the user searches for, in order to increase the return on information delivery. After the target keywords are determined from the query word entered by the user, the information delivery party of each target keyword is determined, and the information to be delivered in the current period is then determined for each information delivery party. The determined information is pushed to the user's client or to the displayed web page. Information is thus pushed to the user according to the user's query word, which improves the match between the pushed information and the user's needs.
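The retrieval step described above can be pictured as two lookups: from target keyword to the parties that purchased it, and from each party to its pending information. The sketch below is an illustration only; the dictionary layouts, the party identifiers, and the rule of returning each party's information once are all assumptions, and selection by time period is omitted.

```python
def retrieve_pending_info(target_keywords, purchases, pending_info):
    """Collect the information to be delivered for the given target keywords.

    purchases: dict mapping keyword to a list of purchasing party ids.
    pending_info: dict mapping party id to its information to be delivered.
    Both layouts are hypothetical.
    """
    results = []
    seen_parties = set()
    for keyword in target_keywords:
        for party in purchases.get(keyword, []):
            if party in seen_parties:
                continue  # each party's information is returned once
            seen_parties.add(party)
            if party in pending_info:
                results.append(pending_info[party])
    return results


purchases = {
    "flower pot": ["garden_shop"],
    "pot for flowers": ["garden_shop", "home_store"],
}
pending_info = {
    "garden_shop": "Spring sale on ceramic pots",
    "home_store": "Free delivery on planters",
}
info = retrieve_pending_info(["flower pot", "pot for flowers"], purchases, pending_info)
```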

实施例二Embodiment 2

图2是本申请实施例二中的一种搜索处理方法的流程图,本申请实施例在上述各实施例的技术方案的基础上进行了优化改进。FIG. 2 is a flowchart of a search processing method in Embodiment 2 of the present application. The embodiments of the present application are optimized and improved on the basis of the technical solutions of the foregoing embodiments.

进一步的,在执行“基于预先建立的索引关系,确定与所述查询词匹配的至少一个候选关键词”之前,追加索引关系的建立操作;进一步的,将索引关系的建立操作,细化为“将各关键词按照语义进行聚类;对于聚类后得到的各个分类,选取当前分类中的一个关键词作为代表元关键词;对所述代表元关键词进行切词处理,建立切词处理得到的各个分词与对应的代表元关键词之间的索引关系”,以完善索引关系的建立机制,从而为候选关键词的确定奠定基础。Further, before executing "determine at least one candidate keyword matching the query word based on the pre-established index relationship", the establishment operation of the index relationship is added; further, the establishment operation of the index relationship is refined as " Cluster each keyword according to semantics; for each category obtained after clustering, select a keyword in the current category as a representative meta-keyword; perform word segmentation processing on the representative meta-keyword, and establish a word segmentation process to obtain The index relationship between each participle and the corresponding representative meta-keyword”, to improve the establishment mechanism of the index relationship, thus laying the foundation for the determination of candidate keywords.

如图2所示的一种搜索处理方法,包括:A search processing method as shown in Figure 2, including:

S201、将各关键词按照语义进行聚类。S201. Cluster each keyword according to semantics.

示例性地,可以通过机器学习模型的训练和使用,对各关键词进行语义分类,将各关键词划分为不同类别。Exemplarily, by training and using a machine learning model, each keyword can be semantically classified, and each keyword can be divided into different categories.

示例性地,还可以针对各关键词挖掘同义的关键词对,并基于挖掘出的关键词对确定至少一个关键词组;其中,每个关键词组中的关键词的语义全部相同,也即每个关键词组对应一个分类类别。Exemplarily, synonymous keyword pairs may also be mined for each keyword, and at least one keyword group may be determined based on the mined keyword pairs; wherein, the semantics of the keywords in each keyword group are all the same, that is, each keyword group has the same semantics. Each keyword group corresponds to a classification category.

可以理解的是,由于同义关系满足自反、对称和传递性,因此在进行关键词对挖掘时,还可以通过关键词的自反性、对称性和传递性等中的至少一个,确定与关键词对应的关联关键词,并根据关键词和与关键词对应的关联关键词构建关键词组。It can be understood that, since the synonymous relationship satisfies reflexivity, symmetry and transitivity, during the keyword pair mining, at least one of the reflexivity, symmetry and transitivity of the keyword can also be used to determine the Associated keywords corresponding to the keywords, and a keyword group is constructed according to the keywords and the associated keywords corresponding to the keywords.

例如,当向用户进行信息展现时,通常用户对关键词A对应的消息产生交互行为时,也会对关键词B对应的消息产生相应的交互行为,则认定关键词A和关键词B互为关联关键词,可形成关键词对;其中,交互行为可以是浏览、点击、收藏、下单以及评论等中的至少一个。For example, when displaying information to the user, usually when the user interacts with the message corresponding to keyword A, it also generates corresponding interactive behavior for the message corresponding to keyword B, then it is determined that keyword A and keyword B are mutually exclusive. The associated keywords can form a keyword pair; wherein, the interactive behavior can be at least one of browsing, clicking, bookmarking, placing an order, and commenting.

又如,当关键词A对应的信息可以通过改写关键词B对应的信息得到,或者关键词A对应的信息与关键词B对应的信息可以基于相同的信息生成模板所生成,则认定关键词A和关键词B互为关联关键词,可形成关键词对。For another example, when the information corresponding to keyword A can be obtained by rewriting the information corresponding to keyword B, or the information corresponding to keyword A and the information corresponding to keyword B can be generated based on the same information generation template, then keyword A is determined. and keyword B are related keywords to each other, and a keyword pair can be formed.

As a further example, if the information corresponding to keyword A and the information corresponding to keyword B are delivered and displayed in the same delivery area, and/or their delivery areas are purchased by the same information delivery party, then keyword A and keyword B are deemed to be associated keywords of each other and may form a keyword pair.

Of course, if keyword A and keyword B are associated keywords of each other, and keyword B and keyword C are associated keywords of each other, then keyword A and keyword C can likewise be deemed associated keywords of each other.
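The grouping of mined pairs into keyword groups through reflexivity, symmetry and transitivity can be sketched with a union-find structure. This is only an illustration under stated assumptions: the source does not name a grouping algorithm, and `group_keywords` and its parameters are hypothetical names.

```python
from collections import defaultdict


def group_keywords(keywords, pairs):
    """Group keywords into synonym groups from mined keyword pairs.

    Union-find is an assumed technique; the source only states that the
    synonym relation is reflexive, symmetric and transitive.
    """
    # Reflexivity: every keyword starts in a group of its own.
    parent = {k: k for k in keywords}
    for a, b in pairs:
        parent.setdefault(a, a)
        parent.setdefault(b, b)

    def find(k):
        # Follow parent links to the group root, shortening the path on the way.
        while parent[k] != k:
            parent[k] = parent[parent[k]]
            k = parent[k]
        return k

    # Symmetry and transitivity: merging roots joins (A, B) and (B, C)
    # into one group regardless of the order inside each pair.
    for a, b in pairs:
        parent[find(a)] = find(b)

    groups = defaultdict(set)
    for k in parent:
        groups[find(k)].add(k)
    return list(groups.values())


groups = group_keywords(["A", "B", "C", "D"], [("A", "B"), ("B", "C")])
```

With the pairs (A, B) and (B, C), keywords A, B and C end up in one group and D stays alone, which mirrors the transitivity example in the text.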

S202. For each category obtained by clustering, select one keyword in the current category as the representative keyword.

By way of example, for each category obtained by clustering, any keyword in the current category may be selected at random as the representative keyword. Because a keyword may be a short phrase containing several words, the shortest keyword in the current category is typically selected as the representative keyword, so as to preserve the degree of matching and relevance between the selected keyword and the other keywords in the same category.

For instance, if a category contains the keywords "double eyelid surgery cost" (双眼皮手术费用), "price of double eyelid surgery" (双眼皮手术的价格) and "how much does eyelid surgery cost" (眼睑手术要多少钱), then "double eyelid surgery cost" may be selected as the representative keyword.
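A minimal sketch of the shortest-keyword selection, using the three keywords of the example. Measuring length in characters and breaking ties lexicographically are assumptions made here for determinism; the source specifies neither, and `pick_representative` is a hypothetical name.

```python
def pick_representative(group):
    """Return the shortest keyword of a category as its representative.

    Length is counted in characters and ties are broken by string order;
    both choices are assumptions, not taken from the source.
    """
    return min(group, key=lambda keyword: (len(keyword), keyword))


category = [
    "双眼皮手术费用",    # "double eyelid surgery cost", 7 characters
    "双眼皮手术的价格",  # "price of double eyelid surgery", 8 characters
    "眼睑手术要多少钱",  # "how much does eyelid surgery cost", 8 characters
]
representative = pick_representative(category)
```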

S203. Perform word segmentation on the representative keyword, and establish an index relationship between each resulting segment and the corresponding representative keyword.

S204. Obtain the query word input by the user.

S205. Determine, based on the pre-established index relationship, at least one candidate keyword matching the query word.

Specifically, based on the index relationship between segments and representative keywords, at least one representative keyword matching the query word is determined as a candidate keyword.

It will be understood that, because only one representative keyword is selected from each category (according to the clustering result) for building the index relationship, the amount of computation and the time needed to build the index relationship are greatly reduced, which significantly improves the efficiency of index construction.

Correspondingly, when the established index relationship is later used to determine the candidate keywords matching a query word, the amount of computation and the matching time are also reduced, which significantly improves the efficiency of candidate keyword determination. Moreover, because the index relationship contains far fewer representative keywords than a prior-art index relationship contains keywords, the number of candidate keywords determined during matching is also significantly reduced.

S206. Determine the relevance of each candidate keyword to the query word, and select the candidate keywords whose relevance satisfies a set condition.

It will be understood that, because the number of candidate keywords is significantly reduced, the amount of computation and the time needed to determine the relevance of each candidate keyword to the query word are reduced as well, which improves the efficiency of relevance determination.

S207. Determine, based on the selected candidate keywords, the target keywords matching the query word.

It should be noted that the index-building operations of S201 to S203 may be performed before or after S204, provided that they are performed before S205.

In this embodiment of the present application, an operation of establishing the index relationship is added before at least one candidate keyword matching the query word is determined based on the pre-established index relationship. Specifically, the keywords are clustered by semantics; for each category obtained by clustering, one keyword in the current category is selected as the representative keyword; word segmentation is performed on the representative keyword; and an index relationship is established between each resulting segment and the corresponding representative keyword, which lays the foundation for determining candidate keywords. Because only one keyword per category is selected as the representative keyword for building the index relationship, the time cost of building the index is greatly reduced, and the time and computation costs of the subsequent candidate keyword matching and relevance calculation are reduced as well, which improves search efficiency.
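The index construction of S203 and the lookup and selection of S205 and S206 can be sketched as follows. This is a minimal illustration rather than the method of the application: whitespace splitting stands in for a real word segmenter, Jaccard overlap stands in for the unspecified relevance measure, and the 0.3 threshold stands in for the unspecified "set condition". All function names are hypothetical.

```python
from collections import defaultdict


def segment(text):
    # Assumption: whitespace splitting replaces a real word segmenter.
    return text.lower().split()


def build_index(representatives):
    """S203: map each segment to the representative keywords containing it."""
    index = defaultdict(set)
    for rep in representatives:
        for token in segment(rep):
            index[token].add(rep)
    return index


def match_candidates(index, query):
    """S205: collect representatives sharing at least one segment with the query."""
    found = set()
    for token in segment(query):
        found |= index.get(token, set())
    return found


def relevance(query, keyword):
    # Assumption: Jaccard overlap of segments as the relevance measure.
    q, k = set(segment(query)), set(segment(keyword))
    return len(q & k) / len(q | k) if q | k else 0.0


def select_candidates(query, candidates, threshold=0.3):
    """S206: keep candidates whose relevance meets the (assumed) threshold."""
    kept = [c for c in candidates if relevance(query, c) >= threshold]
    return sorted(kept, key=lambda c: (-relevance(query, c), c))


index = build_index(["double eyelid surgery cost", "nose surgery recovery"])
query = "eyelid surgery cost"
candidates = match_candidates(index, query)
selected = select_candidates(query, candidates)
```

Both representatives are retrieved because each shares the segment "surgery" with the query, but only "double eyelid surgery cost" passes the relevance threshold. Only the two representatives are indexed, not every keyword of their categories, which is the source of the savings described above.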

Embodiment 3

FIG. 3 is a flowchart of a search processing method in Embodiment 3 of the present application. This embodiment optimizes and improves on the technical solutions of the foregoing embodiments.

Further, after the operation "clustering the keywords by semantics", the operation "establishing an index relationship between each representative keyword and the corresponding category" is added. Correspondingly, the operation "determining, based on the selected candidate keywords, the target keywords matching the query word" is refined into "for all or some of the selected candidate keywords, determining the category for which the current candidate keyword serves as the representative keyword; reading the keywords in the determined category, and taking the keywords read as the target keywords matching the query word", so as to improve the coverage and comprehensiveness of the target keywords.

A search processing method, as shown in FIG. 3, includes:

S301. Cluster the keywords by semantics.

S302. For each category obtained by clustering, select one keyword in the current category as the representative keyword.

S303. Perform word segmentation on the representative keyword, and establish a first index relationship between each resulting segment and the corresponding representative keyword.

S304. Establish a second index relationship between each representative keyword and the corresponding category.

It should be noted that S303 and S304 may be performed one after the other or simultaneously; this embodiment places no limitation on their order of execution.

S305. Obtain the query word input by the user.

S306. Based on the first index relationship, determine at least one representative keyword matching the query word as a candidate keyword.

Based on the pre-established first index relationship between segments and representative keywords, the representative keywords matching the query word are determined and taken as candidate keywords.

S307. Determine the relevance of each candidate keyword to the query word, and select the candidate keywords whose relevance satisfies a set condition.

It should be noted that, during relevance calculation, each candidate keyword stands in for all keywords in the category to which it belongs. Calculating the relevance between one candidate keyword and the query word is therefore equivalent to obtaining, at the same time, the relevance between the query word and the other keywords in that category. This significantly reduces the amount of computation in the relevance calculation and in turn improves the recall capability of the search system.

S308. For all or some of the selected candidate keywords, determine, based on the second index relationship, the category corresponding to the current candidate keyword.

In an optional implementation of this embodiment, the category for which each selected candidate keyword serves as the representative keyword may be determined directly from the second index relationship between representative keywords and categories.

To reduce the amount of computation in determining categories, and the number of target keywords subsequently determined from those categories, in another optional implementation of this embodiment the category for which the current candidate keyword serves as the representative keyword may be determined, based on the second index relationship, for only some of the selected candidate keywords. These candidate keywords may be chosen from the selected candidate keywords by at least one of random selection, relevance to the query word, and candidate keyword length.

By way of example, determining the category for some of the selected candidate keywords based on the second index relationship may be: determining, among the selected candidate keywords, a preset number of candidate keywords with the highest relevance to the query word; and, for each candidate keyword so determined, determining, based on the second index relationship, the category for which it serves as the representative keyword.

S309. Read the keywords in the determined category, and take the keywords read as the target keywords matching the query word.

It will be understood that determining and selecting the candidate keywords for a query word through the first index relationship, between segments and representative keywords, reduces the amount of computation and the time consumed in that process and thus improves search efficiency. Determining the target keywords through the second index relationship, between categories and representative keywords, allows the target keywords to cover the whole category to which each selected candidate keyword belongs, which improves the coverage and comprehensiveness of the target keywords.

This embodiment of the present application introduces an index relationship between categories and representative keywords and, when determining the target keywords, uses that index relationship to determine the category to which each selected candidate keyword belongs. The keywords in the determined categories are then taken as the target keywords, which improves the coverage and comprehensiveness of the target keywords and also improves the recall capability of the search system.
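The second index relationship of S304 and the expansion of S308 and S309 can be sketched as a plain mapping from each representative keyword to the keywords of its category. The data, the relevance scores and the names `category_index`, `expand_targets` and `top_n` are illustrative assumptions; `top_n` plays the role of the "preset number" of most relevant candidates.

```python
# Second index relationship: representative keyword -> keywords of its category.
# The entries are made-up illustration data.
category_index = {
    "double eyelid surgery cost": [
        "double eyelid surgery cost",
        "price of double eyelid surgery",
        "how much does eyelid surgery cost",
    ],
    "nose surgery cost": [
        "nose surgery cost",
        "nose job price",
    ],
}


def expand_targets(scored_candidates, category_index, top_n=None):
    """S308/S309: expand selected candidates into their categories' keywords.

    scored_candidates is a list of (representative keyword, relevance) pairs.
    When top_n is given, only the top_n most relevant candidates are expanded.
    """
    ranked = sorted(scored_candidates, key=lambda pair: (-pair[1], pair[0]))
    if top_n is not None:
        ranked = ranked[:top_n]
    targets = []
    for representative, _score in ranked:
        # Fall back to the candidate itself if no category is recorded for it.
        targets.extend(category_index.get(representative, [representative]))
    return targets


scored = [("nose surgery cost", 0.4), ("double eyelid surgery cost", 0.75)]
top_targets = expand_targets(scored, category_index, top_n=1)
all_targets = expand_targets(scored, category_index)
```

Relevance is computed once per representative, yet every keyword of the chosen categories is returned, which is how the embodiment obtains broader coverage without extra relevance calculations.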

Embodiment 4

FIG. 4 is a structural diagram of a search processing apparatus in Embodiment 4 of the present application. This embodiment is applicable where a user searches for information through a search engine. The apparatus is implemented in software and/or hardware and is configured in an electronic device with a certain data computing capability.

A search processing apparatus 400, as shown in FIG. 4, includes a query word acquisition module 401, a candidate keyword matching module 402, a candidate keyword screening module 403 and a target keyword determination module 404, wherein:

the query word acquisition module 401 is configured to obtain the query word input by the user;

the candidate keyword matching module 402 is configured to determine, based on a pre-established index relationship, at least one candidate keyword matching the query word, wherein the index relationship is established, after the keywords are clustered by semantics, for some of the keywords in each category obtained by clustering, and relates the segments of those keywords to the corresponding keywords;

the candidate keyword screening module 403 is configured to determine the relevance of each candidate keyword to the query word, and to select the candidate keywords whose relevance satisfies a set condition;

the target keyword determination module 404 is configured to determine, based on the selected candidate keywords, the target keywords matching the query word.

In this embodiment of the present application, the query word acquisition module obtains the query word input by the user. The candidate keyword matching module determines at least one candidate keyword matching the query word, based on an index relationship that is established in advance, after the keywords are clustered by semantics, between the segments of some of the keywords in each category and the corresponding keywords. The candidate keyword screening module determines the relevance of each candidate keyword to the query word and selects the candidate keywords whose relevance satisfies a set condition. The target keyword determination module determines, based on the selected candidate keywords, the target keywords matching the query word. Because this technical solution indexes only some of the keywords that share the same semantics, rather than all keywords, fewer candidate keywords are matched to a query word and less work is needed to calculate the relevance between candidate keywords and the query word. This shortens the search time, improves search efficiency and also improves the recall capability of the search system.
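One way to picture how modules 401 to 404 cooperate is as a pipeline of pluggable stages. The sketch below is an assumption about structure only: the class `SearchProcessor`, its field names and the toy stage functions are hypothetical, and the application does not prescribe any particular software design.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass
class SearchProcessor:
    """Hypothetical composition of the apparatus; each field stands for one module."""
    match_candidates: Callable[[str], Iterable[str]]            # module 402
    screen_candidates: Callable[[str, Iterable[str]], List[str]]  # module 403
    determine_targets: Callable[[List[str]], List[str]]          # module 404

    def handle(self, query: str) -> List[str]:
        # The query argument plays the role of module 401's output.
        candidates = self.match_candidates(query)
        kept = self.screen_candidates(query, candidates)
        return self.determine_targets(kept)


# Toy stages, for illustration only.
processor = SearchProcessor(
    match_candidates=lambda query: ["a b", "c"],
    screen_candidates=lambda query, cands: [c for c in cands if query in c],
    determine_targets=lambda kept: [k.upper() for k in kept],
)
result = processor.handle("a")
```

Keeping each stage behind a callable lets the index-based matching, the relevance screening and the category expansion be replaced independently.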

Further, the apparatus also includes an index relationship establishment module, which includes:

a clustering unit, configured to cluster the keywords by semantics;

a representative keyword selection unit, configured to select, for each category obtained by clustering, one keyword in the current category as the representative keyword;

a first index relationship establishment unit, configured to perform word segmentation on the representative keyword and to establish an index relationship between each resulting segment and the corresponding representative keyword.

Further, the apparatus also includes a second index relationship establishment unit, configured to:

establish, after one keyword in the current category is selected as the representative keyword, an index relationship between each representative keyword and the corresponding category.

Correspondingly, the target keyword determination module 404 includes:

a category determination unit, configured to determine, for all or some of the selected candidate keywords, the category for which the current candidate keyword serves as the representative keyword;

a target keyword determination unit, configured to read the keywords in the determined category and to take the keywords read as the target keywords matching the query word.

Further, the category determination unit is specifically configured to:

determine, among the selected candidate keywords, a preset number of candidate keywords with the highest relevance to the query word;

for each candidate keyword so determined, determine the category for which the current candidate keyword serves as the representative keyword.

Further, the representative keyword selection unit, when selecting one keyword in the current category as the representative keyword, is specifically configured to:

select the shortest keyword in the current category as the representative keyword.

Further, the clustering unit is specifically configured to:

mine synonymous keyword pairs from the keywords, and determine at least one keyword group based on the mined keyword pairs;

wherein all keywords in each keyword group share the same semantics.

Further, the apparatus also includes an information push module, specifically configured to:

after the target keywords matching the query word are determined, retrieve, based on the target keywords, the information to be delivered by the information delivery parties that have purchased the target keywords;

push the information to be delivered to the user's client for display.

The above search processing apparatus can perform the search processing method provided by any embodiment of the present application, and has the functional modules and beneficial effects corresponding to that method.

Embodiment 5

According to the embodiments of the present application, the present application further provides an electronic device and a readable storage medium.

FIG. 5 is a block diagram of an electronic device for implementing the search processing method of the embodiments of the present application. The electronic device is intended to represent various forms of digital computers, such as laptop computers, desktop computers, workstations, personal digital assistants, servers, blade servers, mainframe computers and other suitable computers. The electronic device may also represent various forms of mobile devices, such as personal digital assistants, cellular phones, smartphones, wearable devices and other similar computing devices. The components shown herein, their connections and relationships, and their functions are examples only and are not intended to limit the implementations of the application described and/or claimed herein.

As shown in FIG. 5, the electronic device includes one or more processors 501, a memory 502, and interfaces for connecting the components, including a high-speed interface and a low-speed interface. The components are interconnected by different buses and may be mounted on a common motherboard or in other ways as required. The processor may process instructions executed within the electronic device, including instructions stored in or on the memory for displaying graphical information of a GUI on an external input/output device (such as a display device coupled to an interface). In other implementations, multiple processors and/or multiple buses may be used together with multiple memories, if desired. Likewise, multiple electronic devices may be connected, each providing part of the necessary operations (for example, as a server array, a group of blade servers, or a multiprocessor system). FIG. 5 takes one processor 501 as an example.

The memory 502 is the non-transitory computer-readable storage medium provided by the present application. The memory stores instructions executable by at least one processor, so that the at least one processor performs the search processing method provided by the present application. The non-transitory computer-readable storage medium of the present application stores computer instructions for causing a computer to perform the search processing method provided by the present application.

As a non-transitory computer-readable storage medium, the memory 502 can store non-transitory software programs, non-transitory computer-executable programs and modules, such as the program instructions/modules corresponding to the search processing method in the embodiments of the present application (for example, the search processing apparatus 400 shown in FIG. 4, which includes the query word acquisition module 401, the candidate keyword matching module 402, the candidate keyword screening module 403 and the target keyword determination module 404). By running the non-transitory software programs, instructions and modules stored in the memory 502, the processor 501 executes the various functional applications and data processing of the server, that is, it implements the search processing method of the above method embodiments.

The memory 502 may include a program storage area and a data storage area. The program storage area may store an operating system and the application program required by at least one function; the data storage area may store data created through use of the electronic device implementing the search processing method, and the like. In addition, the memory 502 may include high-speed random access memory and may also include non-transitory memory, such as at least one magnetic disk storage device, flash memory device or other non-transitory solid-state storage device. In some embodiments, the memory 502 optionally includes memories located remotely from the processor 501, and these remote memories may be connected over a network to the electronic device performing the search processing method. Examples of such networks include, but are not limited to, the Internet, an intranet, a local area network, a mobile communication network, and combinations thereof.

The electronic device performing the search processing method may further include an input device 503 and an output device 504. The processor 501, the memory 502, the input device 503 and the output device 504 may be connected by a bus or in other ways; FIG. 5 takes connection by a bus as an example.

The input device 503 may receive input numeric or character information and generate key signal inputs related to user settings and function control of the electronic device performing the search processing method. Examples of input devices include a touch screen, keypad, mouse, trackpad, touchpad, pointing stick, one or more mouse buttons, trackball and joystick. The output device 504 may include a display device, an auxiliary lighting device (for example, an LED), a haptic feedback device (for example, a vibration motor), and the like. The display device may include, but is not limited to, a liquid crystal display (LCD), a light-emitting diode (LED) display and a plasma display. In some implementations, the display device may be a touch screen.

Various implementations of the systems and techniques described herein can be realized in digital electronic circuitry, integrated circuitry, application-specific integrated circuits (ASICs), computer hardware, firmware, software, and/or combinations thereof. These implementations may include implementation in one or more computer programs that are executable and/or interpretable on a programmable system including at least one programmable processor. The programmable processor may be special-purpose or general-purpose, and can receive data and instructions from a storage system, at least one input device and at least one output device, and transmit data and instructions to the storage system, the at least one input device and the at least one output device.

These computer programs (also known as programs, software, software applications or code) include machine instructions for a programmable processor, and can be implemented in a high-level procedural and/or object-oriented programming language, and/or in assembly/machine language. As used herein, the terms "machine-readable medium" and "computer-readable medium" refer to any computer program product, apparatus and/or device (for example, magnetic disks, optical disks, memory, programmable logic devices (PLDs)) used to provide machine instructions and/or data to a programmable processor, including a machine-readable medium that receives machine instructions as a machine-readable signal. The term "machine-readable signal" refers to any signal used to provide machine instructions and/or data to a programmable processor.

To provide for interaction with a user, the systems and techniques described herein can be implemented on a computer having a display device (for example, a CRT (cathode ray tube) or LCD (liquid crystal display) monitor) for displaying information to the user, and a keyboard and a pointing device (for example, a mouse or a trackball) by which the user can provide input to the computer. Other kinds of devices can also be used to provide for interaction with the user; for example, the feedback provided to the user can be any form of sensory feedback (for example, visual, auditory or tactile feedback), and input from the user can be received in any form, including acoustic, speech or tactile input.

The systems and techniques described herein can be implemented in a computing system that includes a back-end component (for example, a data server), or that includes a middleware component (for example, an application server), or that includes a front-end component (for example, a user computer having a graphical user interface or a web browser through which the user can interact with an implementation of the systems and techniques described herein), or in a computing system that includes any combination of such back-end, middleware or front-end components. The components of the system can be interconnected by any form or medium of digital data communication (for example, a communication network). Examples of communication networks include a local area network (LAN), a wide area network (WAN) and the Internet.

A computer system can include clients and servers. A client and a server are generally remote from each other and typically interact through a communication network. The client-server relationship arises from computer programs that run on the respective computers and have a client-server relationship to each other.

According to the technical solutions of the embodiments of the present application, a query word input by a user is acquired; at least one candidate keyword matching the query word is determined based on a pre-established index relationship, where the index relationship is built by first clustering the keywords according to semantics and then, for only some of the keywords in each category obtained by the clustering, mapping the segmented words of those keywords to the corresponding keywords; the relevance between each candidate keyword and the query word is determined, and the candidate keywords whose relevance satisfies a set condition are selected; and the target keywords matching the query word are determined based on the selected candidate keywords. Because this solution indexes only some of the keywords among those with the same semantics, rather than all keywords, fewer candidate keywords are matched against the query word, and the workload of computing the relevance between candidate keywords and the query word falls accordingly. This shortens search time and improves search efficiency, while also improving the recall capability of the search system.
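The flow summarized above can be illustrated with a minimal sketch. It is not the implementation disclosed in the application: the helper names (`tokenize`, `build_index`, `relevance`, `search`), the whitespace tokenizer standing in for word segmentation, the Jaccard overlap used as the relevance score, and the 0.5 threshold used as the "set condition" are all assumptions made for illustration. The shortest keyword of each category is used as its representative, which is one option the claims describe.

```python
from collections import defaultdict


def tokenize(text):
    # Hypothetical stand-in for word segmentation: lowercase whitespace split.
    return text.lower().split()


def build_index(clusters):
    """Index only one representative keyword per semantic cluster."""
    token_to_reps = defaultdict(set)   # segmented word -> representative keywords
    rep_to_cluster = {}                # representative keyword -> whole cluster
    for cluster in clusters:
        rep = min(cluster, key=len)    # shortest keyword as representative
        rep_to_cluster[rep] = list(cluster)
        for token in tokenize(rep):
            token_to_reps[token].add(rep)
    return token_to_reps, rep_to_cluster


def relevance(query, keyword):
    # Assumed relevance measure: Jaccard overlap of token sets.
    q, k = set(tokenize(query)), set(tokenize(keyword))
    return len(q & k) / len(q | k) if q | k else 0.0


def search(query, token_to_reps, rep_to_cluster, threshold=0.5):
    # Step 1: candidate keywords are the representatives sharing a token with the query.
    candidates = set()
    for token in tokenize(query):
        candidates |= token_to_reps.get(token, set())
    # Step 2: keep candidates whose relevance meets the (assumed) threshold.
    kept = [c for c in candidates if relevance(query, c) >= threshold]
    # Step 3: expand each kept representative to every keyword in its cluster.
    targets = []
    for rep in sorted(kept, key=lambda c: relevance(query, c), reverse=True):
        targets.extend(rep_to_cluster[rep])
    return targets


clusters = [
    ["cheap flights", "low cost flights", "inexpensive air tickets"],
    ["flower delivery", "send flowers online"],
]
token_to_reps, rep_to_cluster = build_index(clusters)
results = search("cheap flights to paris", token_to_reps, rep_to_cluster)
```

In this toy data only two representatives are indexed, so relevance is computed for a single candidate, yet the result still contains "inexpensive air tickets", which shares no word with the query. That is the intended effect: less relevance computation without losing recall.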

It should be understood that steps may be reordered, added, or deleted using the various forms of flow shown above. For example, the steps described in the present application may be performed in parallel, sequentially, or in a different order, as long as the desired result of the technical solutions disclosed in the present application can be achieved; no limitation is imposed herein.

The above specific embodiments do not limit the protection scope of the present application. Those skilled in the art should understand that various modifications, combinations, sub-combinations, and substitutions may be made depending on design requirements and other factors. Any modification, equivalent replacement, or improvement made within the spirit and principles of this application shall be included within the protection scope of this application.

Claims (10)

1. A search processing method, comprising: acquiring a query word input by a user; determining, based on a pre-established index relationship, at least one candidate keyword matching the query word, wherein the index relationship is established, after the keywords are clustered according to semantics, for some of the keywords in each category obtained by the clustering, and is an index relationship between the segmented words of those keywords and the corresponding keywords; determining the relevance between each candidate keyword and the query word, and selecting the candidate keywords whose relevance satisfies a set condition; and determining, based on the selected candidate keywords, target keywords matching the query word.

2. The method according to claim 1, wherein the method for establishing the index relationship comprises: clustering the keywords according to semantics; for each category obtained by the clustering, selecting one keyword in the current category as a representative keyword; and performing word segmentation on the representative keyword, and establishing an index relationship between each segmented word obtained by the word segmentation and the corresponding representative keyword.

3. The method according to claim 2, wherein, after selecting one keyword in the current category as the representative keyword, the method further comprises: establishing an index relationship between each representative keyword and the corresponding category; and correspondingly, determining, based on the selected candidate keywords, the target keywords matching the query word comprises: for all or some of the selected candidate keywords, determining the category to which the current candidate keyword corresponds as a representative keyword; and reading the keywords in the determined category, and using the read keywords as the target keywords matching the query word.

4. The method according to claim 3, wherein, for some of the selected candidate keywords, determining the category to which the current candidate keyword corresponds as a representative keyword comprises: determining, among the selected candidate keywords, a preset number of candidate keywords having the highest relevance to the query word; and for each candidate keyword so determined, determining the category to which the current candidate keyword corresponds as a representative keyword.

5. The method according to claim 2, wherein selecting one keyword in the current category as the representative keyword comprises: selecting the shortest keyword in the current category as the representative keyword.

6. The method according to any one of claims 2-5, wherein clustering the keywords according to semantics comprises: mining synonymous keyword pairs for the keywords, and determining at least one keyword group based on the mined keyword pairs, wherein the keywords in each keyword group all have the same semantics.

7. The method according to any one of claims 1-5, wherein, after determining the target keywords matching the query word, the method further comprises: retrieving, based on the target keywords, information to be delivered of an information deliverer that has purchased the target keywords; and pushing the information to be delivered to the client of the user for display.

8. A search processing device, comprising: a query word acquisition module, configured to acquire a query word input by a user; a candidate keyword matching module, configured to determine, based on a pre-established index relationship, at least one candidate keyword matching the query word, wherein the index relationship is established, after the keywords are clustered according to semantics, for some of the keywords in each category obtained by the clustering, and is an index relationship between the segmented words of those keywords and the corresponding keywords; a candidate keyword screening module, configured to determine the relevance between each candidate keyword and the query word, and to select the candidate keywords whose relevance satisfies a set condition; and a target keyword determination module, configured to determine, based on the selected candidate keywords, target keywords matching the query word.

9. An electronic device, comprising: at least one processor; and a memory communicatively connected to the at least one processor; wherein the memory stores instructions executable by the at least one processor, and the instructions are executed by the at least one processor to enable the at least one processor to perform the search processing method according to any one of claims 1-7.

10. A non-transitory computer-readable storage medium storing computer instructions, wherein the computer instructions are used to cause a computer to perform the search processing method according to any one of claims 1-7.
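The grouping step of claims 5 and 6 can be sketched as follows. The claims do not say how mined synonym pairs are merged into keyword groups; the union-find merge below, which treats synonymy as transitive, is an assumption chosen for illustration, and `group_synonyms` is a hypothetical helper name. Every keyword in a pair is assumed to appear in the `keywords` list.

```python
def group_synonyms(keywords, synonym_pairs):
    """Merge mined synonym pairs into keyword groups with union-find (assumed approach)."""
    parent = {k: k for k in keywords}

    def find(k):
        # Follow parent links to the group root, halving the path on the way.
        while parent[k] != k:
            parent[k] = parent[parent[k]]
            k = parent[k]
        return k

    for a, b in synonym_pairs:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    groups = {}
    for k in keywords:
        groups.setdefault(find(k), []).append(k)
    return list(groups.values())


keywords = [
    "cheap flights",
    "low cost flights",
    "inexpensive air tickets",
    "flower delivery",
]
pairs = [
    ("cheap flights", "low cost flights"),
    ("low cost flights", "inexpensive air tickets"),
]
groups = group_synonyms(keywords, pairs)
# Claim 5: the shortest keyword of each group serves as its representative.
representatives = [min(group, key=len) for group in groups]
```

A keyword that appears in no pair, such as "flower delivery" here, ends up in a group of its own and is its own representative.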
CN201911102850.1A 2019-11-12 2019-11-12 Search processing method, device, equipment and storage medium Pending CN112860840A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911102850.1A CN112860840A (en) 2019-11-12 2019-11-12 Search processing method, device, equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201911102850.1A CN112860840A (en) 2019-11-12 2019-11-12 Search processing method, device, equipment and storage medium

Publications (1)

Publication Number Publication Date
CN112860840A true CN112860840A (en) 2021-05-28

Family

ID=75984445

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911102850.1A Pending CN112860840A (en) 2019-11-12 2019-11-12 Search processing method, device, equipment and storage medium

Country Status (1)

Country Link
CN (1) CN112860840A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115033797A (en) * 2022-06-30 2022-09-09 拉扎斯网络科技(上海)有限公司 Content search method and device, storage medium, and computer equipment
CN117577350A (en) * 2023-11-20 2024-02-20 北京壹永科技有限公司 Training and reasoning method, device, equipment and medium of medical large language model

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101763441A (en) * 2010-01-13 2010-06-30 北京中加国道科技有限公司 Technology organizing search results in active directory mode
JP2012033171A (en) * 2011-08-09 2012-02-16 Ricoh Co Ltd Apparatus, program and method for processing content
CN104636334A (en) * 2013-11-06 2015-05-20 阿里巴巴集团控股有限公司 Keyword recommending method and device
CN109241274A (en) * 2017-07-04 2019-01-18 腾讯科技(深圳)有限公司 text clustering method and device
CN109271574A (en) * 2018-08-28 2019-01-25 麒麟合盛网络技术股份有限公司 A kind of hot word recommended method and device

Similar Documents

Publication Publication Date Title
JP6714024B2 (en) Automatic generation of N-grams and conceptual relationships from language input data
CN112650907B (en) Search word recommendation method, target model training method, device and equipment
US20170351687A1 (en) Method and system for enhanced query term suggestion
US20210200813A1 (en) Human-machine interaction method, electronic device, and storage medium
JP7300475B2 (en) Entity Relationship Mining Method, Apparatus, Electronic Device, Computer Readable Storage Medium and Computer Program
CN112530576A (en) Online doctor-patient matching method and device, electronic equipment and storage medium
JP2021099890A (en) Determination method of cause-and-effect relationship, device, electronic apparatus, and storage medium
JP7093825B2 (en) Man-machine dialogue methods, devices, and equipment
CN112765452B (en) Search recommendation method and device and electronic equipment
WO2021139221A1 (en) Method and apparatus for query auto-completion, device and computer storage medium
CN111563198B (en) Material recall method, device, equipment and storage medium
CN112148895B (en) Training method, device, equipment and computer storage medium for retrieval model
JP7241122B2 (en) Smart response method and device, electronic device, storage medium and computer program
JP2022106948A (en) Information display method, device, electronic apparatus, storage media, and computer program
US10198497B2 (en) Search term clustering
CN111666417B (en) Method, device, electronic equipment and readable storage medium for generating synonyms
CN111428489B (en) Comment generation method and device, electronic equipment and storage medium
CN111783013A (en) Method, apparatus, device and computer-readable storage medium for publishing comment information
CN112860840A (en) Search processing method, device, equipment and storage medium
CN111881255B (en) Synonymous text acquisition method and device, electronic equipment and storage medium
CN112052410A (en) Map point of interest update method and device
CN112699314A (en) Hot event determination method and device, electronic equipment and storage medium
CN118733633A (en) Entity search method, large language model fine-tuning method, device and equipment
CN113377922B (en) Method, device, electronic equipment and medium for matching information
EP3842961A2 (en) Method and apparatus for mining tag, device, storage medium and computer program product

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination