
CN101820592A - Method and device for mobile search - Google Patents


Publication number
CN101820592A
Authority
CN
China
Prior art keywords
search
interest
user
search type
score value
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN200910140119A
Other languages
Chinese (zh)
Inventor
胡汉强
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Huawei Technologies Co Ltd
Original Assignee
Huawei Technologies Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Huawei Technologies Co Ltd
Priority to CN200910140119A (CN101820592A)
Priority to PCT/CN2009/074758 (WO2010096986A1)
Publication of CN101820592A
Priority to US13/219,058 (US20110314059A1)
Legal status: Pending


Landscapes

  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a mobile search method and device. The method includes: receiving a search request that contains one or more query keywords; calculating a score value for each search type domain, where the score value is the score of any one of the following items or a combined score of several of them: the similarity between the search request and the search type domain, the popular search rate of the search request for the search type domain, and the personalized user interest score value of the search type domain, the popular search rate being the number of popular searches or the number of clicks on popular search results; and selecting, according to the score values of the search type domains, one or more search type domains in which to search for the query keywords. The invention can provide users with personalized and accurate search results.

Figure 200910140119

Description

Mobile search method and device

Technical field

The invention relates to mobile communication technology, and in particular to a mobile search method and device.

Background

Mobile search, which combines search engines and mobile communication, two of the most active fields in today's information industry, has become a new highlight and growth point of mobile value-added services. The mobile search framework is an open platform based on meta-search; it integrates the capabilities of many professional/vertical search engines to provide users with a comprehensive search capability.

When using mobile search, a user usually enters search keywords and searches directly, without selecting a search type domain. The prior art offers no good solution for correctly understanding the user's search intent and providing personalized, accurate search results.

Summary of the invention

Embodiments of the present invention provide a mobile search method and device that can provide users with personalized and accurate search results.

An embodiment of the present invention provides a mobile search method, including:

receiving a search request, where the search request contains one or more query keywords;

calculating a score value for each search type domain, where the score value is the score of any one of the following items or a combined score of several of them: the similarity between the search request and the search type domain, the popular search rate of the search request for the search type domain, and the personalized user interest score value of the search type domain;

selecting, according to the score values of the search type domains, one or more search type domains in which to search for the query keywords.

An embodiment of the present invention provides a mobile search device, including:

a receiving unit, configured to receive a search request, where the search request contains one or more query keywords;

a calculation unit, configured to calculate a score value for each search type domain, where the score value is the score of any one of the following items or a combined score of several of them: the similarity between the search request and the search type domain, the popular search rate of the search request for the search type domain, and the personalized user interest score value of the search type domain;

a selection unit, configured to select one or more search type domains according to the score values of the search type domains;

a search unit, configured to search for the query keywords in the search type domains selected by the selection unit.

The mobile search method and device provided by the embodiments of the present invention determine the user's personalized query category by analyzing popular interest and the user's personalized interest, and thereby provide the user with personalized and accurate search results.

Description of drawings

Fig. 1 is a flowchart of the mobile search method according to an embodiment of the present invention;

Fig. 2 is a flowchart of one implementation of the mobile search method according to an embodiment of the present invention;

Fig. 3 is a flowchart of another implementation of the mobile search method according to an embodiment of the present invention;

Fig. 4 is a flowchart of another implementation of the mobile search method according to an embodiment of the present invention;

Fig. 5 is a flowchart of another implementation of the mobile search method according to an embodiment of the present invention;

Fig. 6 is a schematic structural diagram of the mobile search device according to an embodiment of the present invention;

Fig. 7 is a schematic diagram of one specific structure of the mobile search device according to an embodiment of the present invention;

Fig. 8 is a schematic diagram of another specific structure of the mobile search device according to an embodiment of the present invention;

Fig. 9 is a schematic diagram of another specific structure of the mobile search device according to an embodiment of the present invention;

Fig. 10 is a schematic structural diagram of the interest model extraction subunit in the device shown in Fig. 9;

Fig. 11 is another schematic structural diagram of the interest model extraction subunit in the device shown in Fig. 9;

Fig. 12 is a schematic diagram of another specific structure of the mobile search device according to an embodiment of the present invention.

Detailed description

To help those skilled in the art better understand the solutions of the embodiments of the present invention, the embodiments are described in further detail below with reference to the drawings and implementations.

For a user's search request, the mobile search method and device of the embodiments determine the user's personalized query category by analyzing the popular interest corresponding to the request and the user's personalized interest. Specifically, a score value is calculated for each search type domain; the score value is the score of any one of the following items or a combined score of several of them: the similarity between the search request and the search type domain, the popular search rate of the search request for the search type domain, and the personalized user interest score value of the search type domain. The popular search rate is the number of popular searches or the number of clicks on popular search results. One or more search type domains are then selected according to their score values to search for the query keywords, providing the user with personalized and accurate search results.

Fig. 1 is a flowchart of the mobile search method according to an embodiment of the present invention.

Step 101: receive a search request that contains one or more query keywords.

Step 102: calculate a score value for each search type domain, where the score value is the score of any one of the following items or a combined score of several of them: the similarity between the search request and the search type domain, the popular search rate of the search request for the search type domain, and the personalized user interest score value of the search type domain. The popular search rate is the number of popular searches or the number of clicks on popular search results.

Step 103: select, according to the score values of the search type domains, one or more search type domains in which to search for the query keywords.

In the embodiments of the present invention, the user's personalized query category can be determined in several ways. For example, one or more search type domains with high similarity can be selected according to the similarity between the search request and each search type domain; one or more search type domains with a high popular search rate can be selected according to the popular search rate of the search request for each search type domain; or one or more search type domains with high personalized user interest score values can be selected according to those score values. These items can also be considered together: a combined score value is calculated for each search type domain, and one or more search type domains with high combined score values are selected for searching. Each approach is described in detail below with examples.

Fig. 2 is a flowchart of one implementation of the mobile search method according to an embodiment of the present invention.

In this embodiment, search type domains are selected for searching according to the similarity between the search request and each search type domain, so as to provide the user with personalized and accurate search results.

Step 201: receive a search request that contains one or more query keywords.

Step 202: calculate the similarity between the search request and each search type domain according to the query keywords.

Weights can be set for the query keywords in the search request, and a query vector Query(q1, q2, ..., qn') is generated from these weights, where q1, q2, ..., qn' are the weights of the respective query keywords. Specifically, all keywords can be given the same weight, for example weight = 1. Alternatively, different keywords can be given different weights: for example, the first keyword gets the largest weight (weight = 1), keywords in the middle get an intermediate weight (0.5 < weight < 1), and the last keyword gets the smallest weight (weight = 0.5).

A domain vector for each search type domain is generated from the weights of the words in that domain. For example, every subject word and related word of a search type domain is given a weight, and these weights form the domain vector Domain(t1, t2, ..., tn), where t1, t2, ..., tn are the weights of the words in that search type domain. The similarity between the search request and the search type domain is obtained from the query vector and the domain vector.

The similarity between the vector Domain(t1, t2, ..., tn) and the vector Query(q1, q2, ..., qn') can be calculated with the following formula:

$$\mathrm{Sim}\big(\mathrm{Query}(q_1, q_2, \ldots, q_{n'}),\ \mathrm{Domain}(t_1, t_2, \ldots, t_n)\big) = \frac{q_1 t_{i1} + q_2 t_{i2} + \cdots + q_{n'} t_{in'}}{\sqrt{q_1^2 + q_2^2 + \cdots + q_{n'}^2}\;\sqrt{t_{i1}^2 + t_{i2}^2 + \cdots + t_{in'}^2}} \tag{1}$$

where t_i1, t_i2, ..., t_in' are the weights, in the vector Domain(t1, t2, ..., tn), of the words that are identical to the query keywords whose weights are q1, q2, ..., qn'.

Suppose there are m search type domains with domain vectors Domain1(t1, t2, ..., tn), Domain2(t1, t2, ..., tn), ..., Domainm(t1, t2, ..., tn). The similarity between the vector Query(q1, q2, ..., qn') and each of these domain vectors is then calculated with formula (1).

Step 203: select one or more search type domains with high similarity for searching.
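As an illustration of steps 202 and 203, the following Python sketch builds a position-weighted query vector, evaluates formula (1) against each domain vector, and keeps the best-scoring domains. The function names, the linear interpolation of weights between 1 and 0.5, the treatment of keywords absent from a domain as weight 0, and the sample domain data are assumptions made for this example; they are not specified by the embodiment.

```python
import math

def position_weights(keywords):
    """Weight keywords by position: first 1.0, last 0.5, linear in between (assumed scheme)."""
    n = len(keywords)
    if n == 1:
        return {keywords[0]: 1.0}
    return {kw: 1.0 - 0.5 * i / (n - 1) for i, kw in enumerate(keywords)}

def similarity(query, domain):
    """Formula (1): dot product over the query keywords, divided by the norm of the
    query weights times the norm of the matching domain weights."""
    matched = [domain.get(kw, 0.0) for kw in query]  # missing word -> 0 (assumption)
    dot = sum(q * t for q, t in zip(query.values(), matched))
    norm = math.sqrt(sum(q * q for q in query.values())) * math.sqrt(sum(t * t for t in matched))
    return dot / norm if norm else 0.0

def select_domains(keywords, domains, k=1):
    """Step 203: return the k domain names with the highest similarity."""
    query = position_weights(keywords)
    scores = {name: similarity(query, vec) for name, vec in domains.items()}
    return sorted(scores, key=scores.get, reverse=True)[:k]

# Hypothetical domain vectors: word -> weight.
domains = {
    "catering": {"Sichuan cuisine": 1.0, "spicy": 0.8, "fragrant": 0.5},
    "music": {"song": 1.0, "singer": 0.8},
}
selected = select_domains(["spicy", "Sichuan cuisine"], domains)
```

Because formula (1) normalizes only over the domain words that match query keywords, a domain that matches a single keyword with a small weight can still score highly; a real system might combine this score with the other criteria described in the embodiments.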

In this embodiment, the subject words and related words of each search type domain, and the weight of each word, can be set in several ways.

1. Manual assignment

Subject words get the largest weight, strongly related words an intermediate weight, and weakly related words the smallest weight.

For example, a subject word (such as "Sichuan cuisine" in the catering search type domain) gets weight 1, a strongly related word (such as "spicy" in the catering search type domain) gets weight 0.8, and a weakly related word (such as "fragrant" in the catering search type domain) gets weight 0.5.

2. Automatic assignment through learning

The process is as follows:

(1) For each search type domain, obtain training text corpus samples for that domain;

(2) Segment the corpus samples into words to generate the lexicon of the search type domain;

(3) Calculate the weight of each word in the lexicon as weight = TF*GIDF, where TF is the total frequency of the word in all corpus samples of the search type domain, and GIDF is the global inverse document frequency, GIDF = log(1+N/GDF). Here N is the total number of corpus samples across all search type domains, and GDF is the global corpus sample frequency, that is, the number of corpus samples across all search type domains that contain the word;

(4) Determine the subject words and related words of the search type domain according to the weights of the words.

Suppose the lexicon of a search type domain contains n words with weights T1, T2, ..., Tn, where T1 > T2 > ... > Tn. The word corresponding to T1 can then be regarded as the subject word and the other words as related words.

Further, all words in the lexicon can be divided by weight into sets of different tiers. A final score value is set for each tier and used as the weight of every word in that tier. For example, with L tiers, the first tier gets the highest score value, the middle tiers get intermediate score values, and tier L gets the lowest score value. The words in the lexicon and their final score values then form the domain vector of the corresponding search type domain.
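The learning procedure above can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the corpus is already segmented into word lists, the logarithm is the natural logarithm (the text does not state the base), tiers are formed by splitting the ranked words into roughly equal groups, and the names `domain_word_weights` and `tier_weights` and the tier scores are hypothetical.

```python
import math
from collections import Counter

def domain_word_weights(corpora):
    """corpora: domain name -> list of samples, each sample a list of words.
    Returns domain name -> {word: TF * GIDF} with GIDF = log(1 + N / GDF)."""
    n_samples = sum(len(samples) for samples in corpora.values())  # N
    gdf = Counter()  # GDF: samples, over all domains, that contain the word
    for samples in corpora.values():
        for sample in samples:
            gdf.update(set(sample))
    weights = {}
    for name, samples in corpora.items():
        tf = Counter(word for sample in samples for word in sample)  # TF in this domain
        weights[name] = {w: tf[w] * math.log(1 + n_samples / gdf[w]) for w in tf}
    return weights

def tier_weights(word_weights, tier_scores=(1.0, 0.8, 0.5)):
    """Rank words by weight, split them into len(tier_scores) roughly equal tiers,
    and replace each weight with its tier's final score (assumed tiering rule)."""
    ranked = sorted(word_weights, key=word_weights.get, reverse=True)
    size = math.ceil(len(ranked) / len(tier_scores))
    return {w: tier_scores[i // size] for i, w in enumerate(ranked)}

# Hypothetical, already-segmented corpus samples.
corpora = {
    "catering": [["spicy", "hotpot"], ["spicy", "noodles"]],
    "music": [["song", "spicy"]],
}
weights = domain_word_weights(corpora)
```

Passing one domain's weights through `tier_weights` yields a domain vector whose entries take only the tier scores, as in the tiered variant described above.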

Of course, the embodiments of the present invention are not limited to the settings above; the subject words and related words of each search type domain, and the weight of each word, can also be set in other ways, which are not described one by one here.

In the mobile search method of this embodiment, for a user's search request, the similarity between the query vector of the search request and the domain vector of each search type domain is calculated, and one or more search type domains with high similarity are selected for searching. This determines a personalized query category for the user and provides the user with personalized and accurate search results.

Fig. 3 is a flowchart of another implementation of the mobile search method according to an embodiment of the present invention.

In this embodiment, search type domains are selected for searching according to the popular search rate of the search request for each search type domain, so as to provide the user with personalized and accurate search results.

Step 301: receive a search request that contains one or more query keywords.

Step 302: calculate, according to the query keywords, the popular search rate of the search request for each search type domain.

Step 303: select one or more search type domains with a high popular search rate for searching.

In the embodiments of the present invention, the popular search rate may be the number of popular searches, the number of clicks on popular search results, or the like.

The processes for calculating the number of popular searches and the number of clicks on popular search results of the search request for each search type domain are described in detail below.

The number of popular searches of the search request for a given search type domain is calculated as follows:

(1) Calculate, for each keyword in the search request, the total number of popular searches for the search type domain;

Based on historical records, sum over all users the number of times that search requests containing the keyword were searched in the given search type domain. This sum is the total number of times the public searched that domain for the keyword, that is, the keyword's total number of popular searches for the domain;

(2) Take the sum, over all keywords in the search request, of the total numbers of popular searches for the search type domain as the search request's total number of popular searches for that domain.

Similarly, the number of clicks on popular search results of the search request for a given search type domain is calculated as follows:

(1) Calculate, for each keyword in the search request, the total number of clicks on popular search results for the search type domain;

Based on historical records, sum over all users the number of search result clicks for search requests that contained the keyword and were searched in the given search type domain. This sum is the total number of times the public clicked search results of that domain for the keyword, that is, the keyword's total number of clicks on popular search results for the domain;

(2) Take the sum, over all keywords in the search request, of the total numbers of clicks on popular search results for the search type domain as the search request's total number of clicks on popular search results for that domain.
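Both procedures reduce to the same aggregation: look up a per-keyword, per-domain count in the history and sum it over the query keywords. The Python sketch below illustrates this; the `history` layout (a mapping from (keyword, domain) to a count already summed over all users), the function name, and the sample numbers are assumptions made for the example.

```python
from collections import defaultdict

def popular_search_rate(keywords, history):
    """Sum, per search type domain, the historical counts of every query keyword.

    history maps (keyword, domain) -> count, where the count is either the number
    of popular searches or the number of result clicks (hypothetical layout).
    """
    wanted = set(keywords)
    totals = defaultdict(int)
    for (keyword, domain), count in history.items():
        if keyword in wanted:
            totals[domain] += count
    return dict(totals)

# Hypothetical history, already aggregated over all users.
history = {
    ("jay", "music"): 120,
    ("jay", "news"): 15,
    ("concert", "music"): 40,
    ("concert", "tickets"): 60,
}
rates = popular_search_rate(["jay", "concert"], history)
best = max(rates, key=rates.get)  # domain with the highest popular search rate
```

The same function serves for search counts and click counts; only the contents of `history` differ.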

In the mobile search method of this embodiment, for a user's search request, the popular search rate of the request for each search type domain is calculated, and one or more search type domains with a high popular search rate are selected for searching. This determines a personalized query category for the user and provides the user with personalized and accurate search results.

Fig. 4 is a flowchart of another implementation of the mobile search method according to an embodiment of the present invention.

In this embodiment, search type domains with high score values are selected for searching according to the personalized user interest score values of the search type domains, so as to provide the user with personalized and accurate search results.

Step 401: receive a search request that contains one or more query keywords.

Step 402: extract the user's interest model from user data.

The user's interest model is a vector of the score values of the user data for multiple interest dimensions, for example IM(I1, I2, ..., In), where Ii is the score value of the user's i-th interest dimension. The user interest model can be extracted from the user's personalized data (such as static profiles, search and click history data, presence service information, and local information). Alternatively, user interest models can be extracted from the personalized data in advance and saved, and the required model retrieved directly from the saved models when needed.

The user's interest model may be a static interest model or a dynamic interest model, or an interest model generated by combining the two.

The user's static interest model can be extracted from the user's static profile in either of the following two ways:

(1) Calculate the sum of the word frequencies of all words in the user's static profile that belong to each interest dimension, take it as the score value of that interest dimension, and form the user interest model as the vector of the score values of all interest dimensions;

(2) Calculate the similarity score between the user's static profile and each interest dimension, take it as the score value of that interest dimension, and form the user interest model as the vector of the score values of all interest dimensions;

The user's dynamic interest model can be extracted from user data in either of the following two ways:

(1) Calculate the sum of the word frequencies of all words in the user's search and click history that belong to each interest dimension, take it as the score value of that interest dimension, and form the user's dynamic interest model as the vector of the score values of all interest dimensions;

(2) Calculate the similarity score between the search and click history and each interest dimension, take it as the score value of that interest dimension, and form the user's dynamic interest model as the vector of the score values of all interest dimensions.
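The word-frequency variant, option (1) in both lists above, can be sketched as follows. The sketch assumes the profile or history text has already been segmented into words and that a word-to-dimension lookup table is available; `interest_model`, `dimension_of`, and the sample data are hypothetical names and values.

```python
from collections import Counter

def interest_model(tokens, dimension_of, dimensions):
    """Score each interest dimension by the total frequency of the tokens that
    belong to it, and return the scores as a vector ordered like `dimensions`."""
    counts = Counter(dimension_of[t] for t in tokens if t in dimension_of)
    return [counts[d] for d in dimensions]

dimensions = ["sports", "music", "finance"]
# Hypothetical lookup table: word -> interest dimension.
dimension_of = {"football": "sports", "guitar": "music", "stocks": "finance"}

# Words from a static profile give a static model; words from the search and
# click history give a dynamic model.
static_model = interest_model(["football", "guitar", "football", "hello"], dimension_of, dimensions)
```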

An interest model that combines static and dynamic interest models can be generated in either of the following ways:

(1) First normalize the static interest models and the dynamic interest models separately, then calculate the sum of the one or more normalized static interest models and the one or more normalized dynamic interest models, and take that sum as the user's interest model.

(2) First compute a weighted sum of the one or more static interest models and the one or more dynamic interest models, then normalize the weighted sum, and take the normalized result as the user's interest model.
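The two combination orders can be compared with a short Python sketch. The text does not say which normalization is meant; the sketch assumes each vector is scaled so that its entries sum to 1, and the function names and model weights are illustrative only.

```python
def normalize(vec):
    """Scale a vector so its entries sum to 1 (assumed normalization)."""
    total = sum(vec)
    return [v / total for v in vec] if total else list(vec)

def combine_normalize_then_sum(models):
    """Option (1): normalize every model, then add them element-wise."""
    normed = [normalize(m) for m in models]
    return [sum(column) for column in zip(*normed)]

def combine_weighted_then_normalize(models, weights):
    """Option (2): weighted element-wise sum of the raw models, then normalize."""
    summed = [sum(w * v for w, v in zip(weights, column)) for column in zip(*models)]
    return normalize(summed)

static_model = [2, 2]   # hypothetical scores for two interest dimensions
dynamic_model = [1, 3]
option1 = combine_normalize_then_sum([static_model, dynamic_model])
option2 = combine_weighted_then_normalize([static_model, dynamic_model], [0.5, 0.5])
```

Option (1) gives every model equal influence regardless of its raw magnitude, whereas option (2) lets the chosen weights and the raw magnitudes both affect the result.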

Step 403: take the sum of the score values of the one or more interest dimensions of the user interest model that correspond to a search type domain as the personalized user interest score value of that search type domain.

Step 404: select one or more search type domains with high score values to search for the query keywords.
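Steps 403 and 404 can be sketched as follows, assuming the interest model is held as a mapping from dimension name to score and that each search type domain is associated with a list of interest dimensions. The mapping `domain_dimensions`, the function names, and the sample scores are hypothetical.

```python
def domain_interest_scores(interest, domain_dimensions):
    """Step 403: a domain's score is the sum of the user's scores for the
    interest dimensions associated with that domain."""
    return {
        domain: sum(interest.get(dim, 0.0) for dim in dims)
        for domain, dims in domain_dimensions.items()
    }

def top_domains(scores, k=1):
    """Step 404: keep the k domains with the highest scores."""
    return sorted(scores, key=scores.get, reverse=True)[:k]

interest = {"music": 0.5, "entertainment": 0.25, "finance": 0.125}
# Hypothetical association between search type domains and interest dimensions.
domain_dimensions = {
    "music search": ["music", "entertainment"],
    "stock search": ["finance"],
}
scores = domain_interest_scores(interest, domain_dimensions)
chosen = top_domains(scores)
```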

For example, the user's interests are represented by n dimensions, such as news, sports, entertainment, finance, technology, real estate, games, women, forums, weather, commodities, home appliances, music, reading, blogs, mobile phones, military, education, tourism, MMS, ring-back tones, catering, civil aviation, industry, agriculture, computers, and geography. The user interest model is then a vector W(r1, r2, r3, ..., rn) of the user's interest score values for each dimension.

When the user interest model is extracted from the user's personalized data, it can be extracted from the user's static profile or from the user's search history data.

The user interest model W1 can be extracted from the user's static profile in the following ways:

(1) W1 = (p1, p2, p3, ..., pn), where pi is the sum of the word frequencies of all words in the static profile that belong to the i-th interest dimension.

(2) W1 = (p1, p2, p3, ..., pn), where pi is the similarity score between the static profile and the i-th interest dimension.

The similarity pi between the static profile and an interest dimension is calculated as follows:

(a) Extract the feature lexicon of the classifier, specifically:

(i) Collect a corpus set for each of the user's interest dimensions to generate a corpus;

(ii) Segment the corpus into words to form a series of terms;

(iii) Determine whether each term obtained by segmentation is a feature word, for which the chi-square statistic (CHI) can be used:

$$\chi^2(t, c) = \frac{N\,(AD - BC)^2}{(A + C)(B + D)(A + B)(C + D)}$$

where the parameters have the following meanings: t: a term; c: a category; N: the total number of training texts; A: the number of training texts that belong to c and contain t; B: the number of texts that do not belong to c but contain t; C: the number of texts that belong to c but do not contain t; D: the number of texts that neither belong to c nor contain t. If C and D are both 0, then χ²(t, c) = 0;

The CHI value of term t over the entire training set can be defined as:

Figure B2009101401198D0000091

or

Figure B2009101401198D0000092

Terms whose CHI value is below a specified threshold need not be considered as feature words.

P(c) is calculated as follows:

Let the categories be C1, C2, ..., Cn. Then

Figure B2009101401198D0000093

where N(Ci) is the number of training texts contained in category Ci;

or,

Figure B2009101401198D0000094

where M(Ci) is the total number of terms contained in all training texts of category Ci, and M is the total number of terms contained in all training texts.

The feature terms finally obtained are denoted t1, t2, ..., tn.

Of course, determining whether a segmented term is a feature word is not limited to the CHI algorithm above; other algorithms can also be used, for example, χ²(t, c) = |AD - BC|.
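The feature-word test in step (iii) can be sketched in Python as follows. This is a minimal illustration, not the patent's implementation: the function names and the `term_counts` layout are hypothetical, and because the formulas for the CHI value over the whole training set are only available as figure images, the sketch assumes the common choice of taking the maximum per-category value.

```python
def chi_square(n_total, a, b, c, d):
    """CHI statistic of a term t for a category c from the four text counts.

    a: texts in c containing t        b: texts outside c containing t
    c: texts in c without t           d: texts outside c without t
    """
    denominator = (a + c) * (b + d) * (a + b) * (c + d)
    if denominator == 0:
        # Covers the special case C = D = 0 named in the text, and any other
        # empty margin (assumption: those are also scored as 0).
        return 0.0
    return n_total * (a * d - b * c) ** 2 / denominator


def select_features(term_counts, n_total, threshold):
    """Keep terms whose best per-category CHI value reaches the threshold.

    term_counts maps term -> {category: (A, B, C, D)}.
    Using the maximum over categories is an assumption of this sketch.
    """
    selected = []
    for term, per_category in term_counts.items():
        best = max(chi_square(n_total, *counts) for counts in per_category.values())
        if best >= threshold:
            selected.append(term)
    return selected
```

A term spread evenly across categories gives AD = BC and therefore a CHI value of 0, so it is dropped for any positive threshold.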

(b) Using the feature words obtained in step (a), generate the feature vector Wi = (wi1, wi2, ..., wij, ..., win) of the i-th interest dimension, where wij is the weight of feature word tj in the i-th interest dimension.

wij = TFj * log(1 + N/GDFj), where TFj is the word frequency of feature word tj in all corpora belonging to the i-th interest dimension, N is the total number of documents in all corpora of all interest dimensions, and GDFj (global document frequency) is the number of documents, across all corpora of all interest dimensions, that contain feature word tj.
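The weight formula above can be illustrated with a short Python sketch. The function name and parameters are hypothetical; the source does not state the base of the logarithm, so the natural logarithm is assumed here, and the zero-frequency guard is likewise an assumption.

```python
import math


def dimension_weight(tf, n_docs, gdf):
    """Weight of one feature word in one interest dimension: TF * log(1 + N / GDF).

    tf:     word frequency of the feature word in the dimension's corpora
    n_docs: number of documents over all corpora of all dimensions
    gdf:    number of those documents that contain the feature word
    """
    if gdf == 0:
        return 0.0  # assumption: a word found in no document gets zero weight
    return tf * math.log(1 + n_docs / gdf)  # natural log assumed
```

With the same term frequency, a word that occurs in fewer documents overall receives a larger weight, which is the usual effect of an inverse-document-frequency factor.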

(c) Using the feature words obtained in step (a), generate the feature vector S = (s1, s2, ..., sn) of the user's static profile, where si is the weight of feature word ti in the user's static profile.

si = the word frequency of feature word ti in the static profile.

(d) Calculate the similarity between the feature vector S of the user's static profile and the feature vector Wi of the i-th interest dimension to obtain the similarity score value pi:

$$p_i = \frac{W_i \cdot S}{|W_i| \, |S|} = \frac{w_{i1} s_1 + w_{i2} s_2 + \cdots + w_{in} s_n}{\sqrt{w_{i1}^2 + w_{i2}^2 + \cdots + w_{in}^2} \, \sqrt{s_1^2 + s_2^2 + \cdots + s_n^2}}$$
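Step (d) is a cosine similarity between two vectors over the same feature words. A minimal Python sketch follows; the function names are hypothetical, and returning 0 when either vector is all zeros is an assumption, since the source does not cover that case.

```python
import math


def cosine_similarity(w, s):
    """Cosine of the angle between two equal-length weight vectors."""
    dot = sum(x * y for x, y in zip(w, s))
    norm = math.sqrt(sum(x * x for x in w)) * math.sqrt(sum(y * y for y in s))
    return dot / norm if norm else 0.0  # assumption: empty vectors score 0


def static_interest_model(dimension_vectors, profile_vector):
    """One similarity score per interest dimension, giving W1 = (p1, ..., pn)."""
    return [cosine_similarity(w, profile_vector) for w in dimension_vectors]
```

For example, a profile whose feature words all belong to the first of two dimensions yields a model of `[1.0, 0.0]`.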

The user interest model W2 can be extracted from the user's search history as follows:

W2 = d1 + d2 + d3 + ... + dm, where di is the interest model vector corresponding to a document clicked by the user;

There are two ways to obtain the interest model vector corresponding to a clicked document:

(1) di = (t1, t2, t3, ..., tn): when the user has newly clicked the document, tj equals the sum of the word frequencies of all words in the document whose type belongs to the j-th interest dimension.

(2) di = (t1, t2, t3, ..., tn), where each component is the similarity score value between the document and the corresponding interest dimension (the steps below denote the score value for the i-th interest dimension as di). It is calculated as follows:

(a) Extract the feature lexicon of the classifier, specifically:

(i) Collect a corresponding corpus set for each interest dimension of the user to generate a corpus;

(ii) Perform word segmentation on the corpus to form a series of terms;

(iii) Determine whether each segmented term is a feature word; specifically, the CHI algorithm can be used:

$$\chi^2(t, c) = \frac{N \cdot (AD - BC)^2}{(A + C)(B + D)(A + B)(C + D)};$$

where the parameters have the following meanings: t: a term; c: a category; N: the total number of training texts; A: the number of texts that belong to c and contain t; B: the number of texts that do not belong to c but contain t; C: the number of texts that belong to c but do not contain t; D: the number of texts that neither belong to c nor contain t. If both C and D are 0, then χ²(t, c) = 0.

The CHI value of term t over the entire training set can be defined as:

Figure B2009101401198D0000102

or

Figure B2009101401198D0000103

Terms whose CHI value is below a specified threshold need not be considered as feature words.

Let the categories be C1, C2, ..., Cn. P(c) is calculated as follows:

Figure B2009101401198D0000104

where N(Ci) is the number of training texts contained in category Ci;

or,

Figure B2009101401198D0000105

where M(Ci) is the total number of terms contained in all training texts of category Ci, and M is the total number of terms contained in all training texts.

The feature terms finally obtained are denoted t1, t2, ..., tn.

Of course, determining whether a segmented term is a feature word is not limited to the CHI algorithm above; other algorithms can also be used, for example, χ²(t, c) = |AD - BC|.

(b) Using the feature words obtained in step (a), generate the feature vector Wi = (wi1, wi2, ..., wij, ..., win) of the i-th interest dimension, where wij is the weight of feature word tj in the i-th interest dimension.

wij = TFj * log(1 + N/GDFj), where TFj is the word frequency of feature word tj in all corpora belonging to the i-th interest dimension, N is the total number of documents in all corpora of all interest dimensions, and GDFj (global document frequency) is the number of documents, across all corpora of all interest dimensions, that contain feature word tj.

(c) Using the feature words obtained in step (a), generate the feature vector V = (v1, v2, ..., vn) of the document, where vi is the weight of feature word ti in the document, and vi = the word frequency of feature word ti in the document.

(d) Calculate the similarity between the feature vector V of the document and the feature vector Wi of the i-th interest dimension to obtain the similarity score value di:

$$d_i = \frac{W_i \cdot V}{|W_i| \, |V|} = \frac{w_{i1} v_1 + w_{i2} v_2 + \cdots + w_{in} v_n}{\sqrt{w_{i1}^2 + w_{i2}^2 + \cdots + w_{in}^2} \, \sqrt{v_1^2 + v_2^2 + \cdots + v_n^2}}$$

If the user rates a clicked document and the rating is good, the vector di is multiplied by a positive constant c, indicating that the importance of the document increases, that is, di = c*di = (c*t1, c*t2, c*t3, ..., c*tn); if the rating is bad, the vector di is multiplied by the reciprocal of the positive constant c, indicating that the importance of the document decreases, that is, di = 1/c*di = (1/c*t1, 1/c*t2, 1/c*t3, ..., 1/c*tn);

After a period of time, the value of each tj is automatically reduced by a certain percentage, indicating that its importance weakens over time. Once enough time has passed for the values of tj to fall to zero, di can be deleted from the history.
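The rating and ageing rules for a clicked-document vector can be sketched in Python as follows. All names and default values are hypothetical. Two assumptions go beyond the text: c is taken to be greater than 1 so that a good rating actually increases importance, and a small `floor` snaps near-zero values to 0, because a pure percentage reduction never reaches zero exactly.

```python
def apply_feedback(d, rating, c=2.0):
    """Scale a document vector by c for a good rating and by 1/c for a bad one."""
    if rating == "good":
        return [c * t for t in d]
    if rating == "bad":
        return [t / c for t in d]
    return list(d)  # unrated documents are left unchanged


def decay(d, rate=0.1, floor=1e-3):
    """Reduce every component by `rate`; values below `floor` become 0 (assumption)."""
    decayed = [t * (1 - rate) for t in d]
    return [t if t >= floor else 0.0 for t in decayed]


def prune_history(history):
    """Drop document vectors whose components have all decayed to zero."""
    return [d for d in history if any(t > 0 for t in d)]
```

Calling `decay` once per ageing period and `prune_history` afterwards keeps the history limited to documents that still carry some weight.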

W1 and W2 are each normalized, and the user interest model is then obtained as W = r1*W1 + r2*W2, where r1 + r2 = 1.
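The combination of the static and dynamic models can be illustrated as below. The source does not say which normalization is used; this sketch assumes scaling each vector so that its non-negative components sum to 1, and the function names are hypothetical.

```python
def normalize(v):
    """Scale a non-negative vector so its components sum to 1 (assumed scheme)."""
    total = sum(v)
    return [x / total for x in v] if total else list(v)


def combine_models(w1, w2, r1=0.5):
    """W = r1 * W1 + r2 * W2 on the normalized vectors, with r2 = 1 - r1."""
    r2 = 1.0 - r1
    n1, n2 = normalize(w1), normalize(w2)
    return [r1 * a + r2 * b for a, b in zip(n1, n2)]
```

Normalizing first keeps one model from dominating simply because its raw values are on a larger scale.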

In the mobile search method of this embodiment of the present invention, in response to a user's search request, the personalized user interest score value of each search type domain is calculated, and the one or more search type domains with the highest score values are selected for searching. A personalized query classification can thus be determined for the user, and the user is provided with personalized and accurate search results.

In the above embodiments, when a search type domain is selected, the similarity between the search request and the search type domain, the popular search rate of the search type domain corresponding to the search request, and the personalized user interest score value of the search type domain are each used as the basis for the selection, so as to determine the user's personalized query classification and provide the user with personalized and accurate search results.

In the embodiments of the present invention, any two or more of the above items may also be considered together to calculate a comprehensive score value for each search type domain, and the one or more search type domains with the highest comprehensive score values are selected for searching. The embodiment is described in detail below, taking the combination of all three items as the basis for search type domain selection as an example.

FIG. 5 is a flowchart of another implementation of the mobile search method according to an embodiment of the present invention.

Step 501: Receive a search request, where the search request contains one or more query keywords.

Step 502: Calculate, respectively, the similarity between the search request and each search type domain, the popular search rate of each search type domain corresponding to the search request, and the personalized user interest score value of each search type domain.

Step 503: Normalize the values obtained for each search type domain to obtain the comprehensive score value of each search type domain.

For example, calculate the similarity between the search request and a search type domain and normalize it to obtain the value Score1;

calculate the popular search rate of that search type domain corresponding to the search request and normalize it to obtain the value Score2;

calculate the personalized user interest score value of that search type domain and normalize it to obtain the value Score3;

calculate the comprehensive score value of that search type domain = r1*Score1 + r2*Score2 + r3*Score3, where r1, r2, and r3 are the weights of Score1, Score2, and Score3 respectively, and r1 + r2 + r3 = 1.

The comprehensive score value can also be calculated in other ways, such as:

comprehensive score value = Score1*Score2*Score3, or

comprehensive score value = (Score1 + Score2 + Score3)/3, and so on.

Step 504: Select the one or more search type domains with the highest comprehensive score values for searching.
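Steps 503 and 504 can be sketched in Python as follows, assuming the three per-domain values have already been normalized. The function names, the dictionary layout, and the example domain names are hypothetical.

```python
def weighted_score(s1, s2, s3, r1, r2, r3):
    """r1*Score1 + r2*Score2 + r3*Score3, with the weights expected to sum to 1."""
    return r1 * s1 + r2 * s2 + r3 * s3


def product_score(s1, s2, s3):
    """Alternative: product of the three normalized scores."""
    return s1 * s2 * s3


def average_score(s1, s2, s3):
    """Alternative: arithmetic mean of the three normalized scores."""
    return (s1 + s2 + s3) / 3


def select_domains(scores, k=1):
    """Return the k search type domains with the highest comprehensive score."""
    return sorted(scores, key=scores.get, reverse=True)[:k]
```

For instance, `select_domains({"news": 0.2, "music": 0.9, "video": 0.5}, k=2)` picks the music and video domains. Note that the product form drives a domain's score to 0 whenever any single factor is 0, whereas the weighted and average forms do not.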

It can be seen that, in this embodiment of the present invention, multiple factors are considered together to determine the user's personalized query classification: a comprehensive score value is calculated for each search type domain, and the one or more search type domains with the highest comprehensive score values are selected for searching, thereby providing the user with personalized and accurate search results.

Those of ordinary skill in the art will understand that all or part of the steps of the methods in the above embodiments can be completed by a program instructing the relevant hardware, and that the program can be stored in a computer-readable storage medium such as a ROM/RAM, a magnetic disk, or an optical disc.

An embodiment of the present invention further provides a mobile search device. FIG. 6 is a schematic structural diagram of the device.

In this embodiment, the device includes a receiving unit 601, a calculation unit 602, a selection unit 603, and a search unit 604, where:

the receiving unit 601 is configured to receive a search request, where the search request contains one or more query keywords;

the calculation unit 602 is configured to calculate the score value of each search type domain, where the score value is the score value of any one of the following items or a comprehensive score value of several of them: the similarity between the search request and the search type domain, the popular search rate of the search type domain corresponding to the search request, and the personalized user interest score value of the search type domain;

the calculation unit 602 calculates the comprehensive score value of each search type domain as a product score value, an average score value, or a weighted score value computed from two or more of the similarity between the search request and the search type domain, the popular search rate of the search type domain corresponding to the search request, and the personalized user interest score value of the search type domain;

the selection unit 603 is configured to select one or more search type domains according to the score value of each search type domain;

the search unit 604 is configured to search for the query keywords using the search type domains selected by the selection unit.

In the embodiments of the present invention, the calculation unit 602 and the selection unit 603 can determine the user's personalized query classification in several ways. For example, the one or more search type domains with the highest similarity can be selected for searching according to the similarity between the search request and each search type domain; the one or more search type domains with the highest popular search rate can be selected according to the popular search rate of each search type domain corresponding to the search request; or the one or more search type domains with the highest personalized user interest score values can be selected according to the personalized user interest score value of each search type domain. Of course, these items can also be considered together: a comprehensive score value is calculated for each search type domain, and the one or more search type domains with the highest comprehensive score values are selected for searching. Accordingly, the calculation unit 602 includes any one or more of the following units:

a similarity calculation unit, configured to calculate the similarity between the search request and each search type domain;

a popular search rate calculation unit, configured to calculate the popular search rate of each search type domain corresponding to the search request;

a user interest score value calculation unit, configured to calculate the personalized user interest score value of each search type domain.

Each of these is described in detail below with an example.

FIG. 7 is a schematic diagram of a specific structure of the mobile search device according to an embodiment of the present invention.

In this embodiment, the device includes a receiving unit 701, a similarity calculation unit 702, a selection unit 703, and a search unit 704. The receiving unit 701, the selection unit 703, and the search unit 704 are the same as the corresponding units in the embodiment shown in FIG. 6 and are not described again here.

The similarity calculation unit 702 includes a weight setting subunit 721, a query vector generation subunit 722, a domain vector generation unit 723, and a first calculation subunit 724, where: the weight setting subunit 721 is configured to set weights for the query keywords; the query vector generation subunit 722 is configured to generate a query vector from the weights of the query keywords; the domain vector generation unit 723 is configured to generate, from the weights of the words in a search type domain, the domain vector corresponding to that search type domain; and the first calculation subunit 724 is configured to obtain the similarity between the search request and the search type domain by calculation on the query vector and the domain vector.

In this embodiment, the device may further include a setting unit (not shown) or a learning unit 705. The setting unit is configured to determine manually the topic words and related words in the search type domain, as well as the weight of each word; the learning unit 705 is configured to determine the topic words and related words in the search type domain, as well as the weight of each word, through automatic learning.

The learning unit 705 includes a corpus sample acquisition subunit 751, a lexicon generation subunit 752, a weight calculation subunit 753, and a topic word determination subunit 754, where: the corpus sample acquisition subunit 751 is configured to obtain, for each search type domain, the training text corpus samples corresponding to that search type domain; the lexicon generation subunit 752 is configured to perform word segmentation on the corpus samples to generate the lexicon of that search type domain;

the weight calculation subunit 753 is configured to calculate the weight of each word in the lexicon; and the topic word determination subunit 754 is configured to determine the topic words and related words in the search type domain according to the weight of each word.

In the embodiments of the present invention, the learning unit 705 may further include a grade division subunit 755 and a score value setting subunit 756. The grade division subunit 755 is configured to divide all the words in the lexicon into sets of different grades according to their weights; the score value setting subunit 756 is configured to set a final score value for the set of each grade and to use the final score value of each grade as the weight of every word in that grade.

The mobile search device of this embodiment of the present invention, in response to a user's search request, calculates the similarity between the search request and each search type domain and selects the one or more search type domains with the highest similarity for searching. A personalized query classification can thus be determined for the user, and the user is provided with personalized and accurate search results. For the specific process, refer to the description of the embodiment shown in FIG. 2 above; details are not repeated here.

FIG. 8 is a schematic diagram of another specific structure of the mobile search device according to an embodiment of the present invention.

In this embodiment, the device includes a receiving unit 801, a popular search rate calculation unit 802, a selection unit 803, and a search unit 804. The receiving unit 801, the selection unit 803, and the search unit 804 are the same as the corresponding units in the embodiment shown in FIG. 6 and are not described again here.

The popular search rate calculation unit 802 includes a second calculation subunit 821 and an addition subunit 822. The second calculation subunit 821 is configured to calculate the popular search rate of each search type domain corresponding to each query keyword in the search request; the addition subunit 822 is configured to take the sum of the popular search rates of the same search type domain over all query keywords in the search request as the popular search rate of that search type domain corresponding to the search request.

In the embodiments of the present invention, the popular search rate may specifically be the number of popular searches. When the second calculation subunit 821 calculates the total number of popular searches of a search type domain corresponding to each keyword in the search request, it can use the history records to collect, over all users, the total number of times that search requests containing that keyword were searched using that search type domain. This total is the number of times the public has searched that search type domain for the keyword, that is, the total number of popular searches for that search type domain. The addition subunit 822 then takes the sum, over all keywords in the search request, of the total numbers of popular searches of that search type domain as the total number of popular searches of that search type domain corresponding to the search request.

In the embodiments of the present invention, the popular search rate may also be the number of clicks on popular search results. When the second calculation subunit 821 calculates the total number of popular search result clicks of a search type domain corresponding to each keyword in the search request, it can use the history records to collect, over all users, the total number of clicks on the search results of search requests that contained that keyword and were searched using that search type domain. This total is the number of times the public has clicked search results of that search type domain for the keyword, that is, the total number of popular search result clicks for that search type domain. The addition subunit 822 then takes the sum, over all keywords in the search request, of the total numbers of popular search result clicks of that search type domain as the total number of popular search result clicks of that search type domain corresponding to the search request.
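The two-stage calculation performed by subunits 821 and 822 can be sketched in Python as follows. The log format, a list of `(keyword, domain, count)` records in which `count` is either a number of searches or a number of result clicks, is a hypothetical simplification, as are the function names.

```python
from collections import defaultdict


def keyword_domain_totals(log):
    """Per-keyword, per-domain totals over all users (role of subunit 821)."""
    totals = defaultdict(int)
    for keyword, domain, count in log:
        totals[(keyword, domain)] += count
    return totals


def popular_search_rate(query_keywords, domains, totals):
    """Sum the per-keyword totals of each domain (role of subunit 822)."""
    return {
        domain: sum(totals.get((keyword, domain), 0) for keyword in query_keywords)
        for domain in domains
    }
```

Because the same function works for search counts and for click counts, the choice between the two definitions of popular search rate only changes what is written into the log.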

The mobile search device of this embodiment of the present invention, in response to a user's search request, calculates the popular search rate of each search type domain corresponding to the search request and selects the one or more search type domains with the highest popular search rate for searching. A personalized query classification can thus be determined for the user, and the user is provided with personalized and accurate search results. For the specific process, refer to the description of the embodiment shown in FIG. 3 above; details are not repeated here.

FIG. 9 is a schematic diagram of another specific structure of the mobile search device according to an embodiment of the present invention.

In this embodiment, the device includes a receiving unit 901, a user interest score value calculation unit 902, a selection unit 903, and a search unit 904. The receiving unit 901, the selection unit 903, and the search unit 904 are the same as the corresponding units in the embodiment shown in FIG. 6 and are not described again here.

The user interest score value calculation unit 902 includes an interest model extraction subunit 921 and a third calculation subunit 922. The interest model extraction subunit 921 is configured to extract the user's interest model from user data, where the user's interest model is a vector composed of the score values of the user data for multiple interest dimensions; the third calculation subunit 922 is configured to take the sum of the score values of the one or more interest dimensions of the user interest model that correspond to a search type domain as the personalized user interest score value of that search type domain.
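The calculation performed by the third calculation subunit 922 can be illustrated with a short Python sketch. How a search type domain is mapped to interest dimensions is not specified in this passage, so the list of dimension indices used here is a hypothetical representation.

```python
def domain_interest_score(interest_model, domain_dimensions):
    """Sum the interest-model scores of the dimensions mapped to one domain.

    interest_model:    list of per-dimension scores (the user's interest vector)
    domain_dimensions: indices of the dimensions that correspond to the domain
                       (hypothetical mapping, not defined in the source)
    """
    return sum(interest_model[i] for i in domain_dimensions)
```

For an interest model `[0.5, 0.25, 0.25]`, a domain mapped to dimensions 0 and 2 scores 0.75.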

In this embodiment, the user's interest model is a static interest model or a dynamic interest model, or an interest model generated by combining the static interest model and the dynamic interest model. Accordingly, the interest model extraction subunit 921 can be structured in several ways.

The interest model extraction subunit 921 may include only a first extraction subunit (not shown), configured to calculate the sum of the word frequencies of all words in the user's static profile that belong to each interest dimension, use that sum as the score value of the corresponding interest dimension, and generate the user's static interest model as the vector of the score values of all interest dimensions;

The interest model extraction subunit 921 may instead include only a second extraction subunit (not shown), configured to calculate the sum of the word frequencies of all words that belong to each interest dimension in the documents clicked in the user's search history, use that sum as the score value of the corresponding interest dimension, and generate the user's dynamic interest model as the vector of the score values of all interest dimensions.

As shown in FIG. 10, the interest model extraction subunit 921 may also include the first extraction subunit 1001 and the second extraction subunit 1002, together with a first processing subunit 1003 and a first weighting subunit 1004. The first processing subunit 1003 is configured to normalize the static interest model and the dynamic interest model separately; the first weighting subunit 1004 is configured to calculate the sum of the normalized static interest model and the normalized dynamic interest model and to use that sum as the user's interest model.

As shown in FIG. 11, the interest model extraction subunit 921 may also include the first extraction subunit 1101 and the second extraction subunit 1102, together with a second weighting subunit 1103 and a second processing subunit 1104. The second weighting subunit 1103 is configured to compute the weighted sum of the static interest model and the dynamic interest model; the second processing subunit 1104 is configured to normalize the result output by the second weighting subunit and to use the normalized result as the user's interest model.
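The structures of FIG. 10 and FIG. 11 differ only in whether normalization happens before or after the two models are added, and the order can change the result. The Python sketch below contrasts them; the function names, the sum-to-1 normalization, and the equal default weights are assumptions of this sketch, not details given in the source.

```python
def normalize(v):
    """Scale a non-negative vector so its components sum to 1 (assumed scheme)."""
    total = sum(v)
    return [x / total for x in v] if total else list(v)


def normalize_then_add(static, dynamic, r1=0.5, r2=0.5):
    """FIG. 10 order: normalize each model, then add them with weights."""
    ns, nd = normalize(static), normalize(dynamic)
    return [r1 * a + r2 * b for a, b in zip(ns, nd)]


def add_then_normalize(static, dynamic, r1=0.5, r2=0.5):
    """FIG. 11 order: weighted addition first, then normalize the sum."""
    return normalize([r1 * a + r2 * b for a, b in zip(static, dynamic)])
```

With `static = [4, 0]` and `dynamic = [1, 1]`, the first order gives `[0.75, 0.25]`, while the second weights the first dimension more heavily because the raw static values are larger.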

本发明实施例移动搜索装置,针对用户的搜索请求,通过计算各搜索类型域的个性化用户兴趣评分值,选择评分值高的一个或几个搜索类型域进行搜索,从而可以为用户确定个性化查询分类,为用户提供个性化的精确的搜索结果。具体过程可参照前面本发明实施例移动搜索方法中的描述。According to the user's search request, the mobile search device in the embodiment of the present invention calculates the personalized user interest score value of each search type domain, and selects one or several search type domains with high score values to search, so as to determine the personalization for the user. Query classification to provide users with personalized and accurate search results. For the specific process, reference may be made to the description in the mobile search method in the embodiment of the present invention.

在上面各实施例的移动搜索装置中,在进行搜索类型域选择时,分别以所述搜索请求与所述搜索类型域的相似度、所述搜索请求对应所述搜索类型域的大众搜索率、以及搜索类型域的个性化用户兴趣评分值作为搜索类型域选择的依据,确定用户的个性化查询分类,为用户提供个性化的精确的搜索结果。In the mobile search device in each of the above embodiments, when selecting a search type domain, the similarity between the search request and the search type domain, the public search rate of the search request corresponding to the search type domain, And the personalized user interest score value of the search type field is used as the basis for selecting the search type field to determine the user's personalized query category and provide the user with personalized and accurate search results.

In the embodiments of the present invention, any two or more of the above items may also be considered together: a composite score value is calculated for each search type domain, and the one or more search type domains with the highest composite score values are selected for searching. The embodiment below is described in detail using the case where all three items together serve as the basis for search type domain selection.

FIG. 12 is another structural diagram of the mobile search device according to an embodiment of the present invention.

In this embodiment, the device includes a receiving unit 1201, a calculation unit 1202, a selection unit 1203, and a search unit 1204. The receiving unit 1201 is configured to receive a search request containing one or more query keywords. The calculation unit 1202 is configured to calculate a score value for each search type domain, where the score value is the score value of any one of the following items or a composite score value of several of them: the similarity between the search request and the search type domain, the popular search rate of the search request for the search type domain, and the personalized user interest score value of the search type domain. The selection unit 1203 is configured to select one or more search type domains according to the score values of the search type domains. The search unit 1204 is configured to search for the query keywords in the search type domains selected by the selection unit.
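The flow through the four units (receive, score, select, search) can be sketched as follows. This is a hypothetical illustration, not the patented implementation: `select_domains`, `mobile_search`, the parameter `k`, and the two callables standing in for the calculation unit and the search unit are names introduced here for the example.

```python
from typing import Callable, Dict, List, Sequence


def select_domains(scores: Dict[str, float], k: int = 2) -> List[str]:
    """Role of the selection unit: pick the k highest-scoring domains.

    Ties are broken alphabetically so the result is deterministic
    (a choice made for this sketch only).
    """
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [domain for domain, _ in ranked[:k]]


def mobile_search(
    keywords: Sequence[str],
    score_domains: Callable[[Sequence[str]], Dict[str, float]],
    search_in_domain: Callable[[str, Sequence[str]], List[str]],
    k: int = 2,
) -> Dict[str, List[str]]:
    """Receive keywords, score every domain, select, then search."""
    scores = score_domains(keywords)      # calculation unit (stand-in)
    chosen = select_domains(scores, k)    # selection unit
    # Search unit (stand-in): query only the selected domains.
    return {d: search_in_domain(d, keywords) for d in chosen}
```

Passing the scoring and searching steps in as callables keeps the sketch independent of how the score values are actually computed.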

In this embodiment, the calculation unit 1202 includes a similarity calculation unit 1221, a popular search rate calculation unit 1222, a user interest score value calculation unit 1223, a normalization processing unit 1224, and a comprehensive processing unit 1225. The similarity calculation unit 1221 is configured to calculate the similarity between the search request and each search type domain. The popular search rate calculation unit 1222 is configured to calculate the popular search rate of the search request for each search type domain. The user interest score value calculation unit 1223 is configured to calculate the personalized user interest score value of each search type domain. The normalization processing unit 1224 is configured to normalize, separately, the values calculated by the similarity calculation unit, the popular search rate calculation unit, and the user interest score value calculation unit. The comprehensive processing unit 1225 is configured to combine any two or more of the normalized values obtained by the normalization processing unit 1224, for example by taking their product, their average, or their weighted sum, to obtain the score value of each search type domain.
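A minimal sketch of the normalization and combination steps performed by units 1224 and 1225 is given below. The names `normalize` and `combine_scores`, the divide-by-maximum normalization, and the equal default weights are assumptions made for illustration; the text names product, average, and weighted sum as examples but does not fix the normalization method or the weights.

```python
from typing import Dict, Sequence


def normalize(values: Dict[str, float]) -> Dict[str, float]:
    """Scale scores into [0, 1] by dividing by the maximum (assumed method)."""
    top = max(values.values(), default=0.0)
    if top <= 0:
        return {domain: 0.0 for domain in values}
    return {domain: v / top for domain, v in values.items()}


def combine_scores(
    similarity: Dict[str, float],
    popularity: Dict[str, float],
    interest: Dict[str, float],
    mode: str = "weighted",
    weights: Sequence[float] = (1 / 3, 1 / 3, 1 / 3),  # assumed weights
) -> Dict[str, float]:
    """Combine three normalized factors into one score per search type domain."""
    factors = [normalize(similarity), normalize(popularity), normalize(interest)]
    domains = set().union(*factors)
    combined = {}
    for domain in domains:
        vals = [f.get(domain, 0.0) for f in factors]
        if mode == "product":
            score = vals[0] * vals[1] * vals[2]
        elif mode == "average":
            score = sum(vals) / len(vals)
        elif mode == "weighted":
            score = sum(w * v for w, v in zip(weights, vals))
        else:
            raise ValueError(f"unknown mode: {mode}")
        combined[domain] = score
    return combined
```

Normalizing each factor first matters because the raw values live on different scales: a popular search rate is a count, while a similarity is typically a small fraction. Note that the product mode drives a domain's score to zero whenever any single factor is zero, whereas the average and weighted modes do not.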

As can be seen, the mobile search device of this embodiment of the present invention considers multiple factors together to determine the user's personalized query classification: it calculates a composite score value for each search type domain and selects the one or more search type domains with the highest composite score values for searching, and can thereby provide the user with personalized, accurate search results.

The embodiments of the present invention have been described in detail above, and specific implementations have been used herein to explain the invention. The description of the above embodiments is intended only to help in understanding the method and device of the present invention. A person of ordinary skill in the art may, following the idea of the present invention, make changes to the specific implementations and to the scope of application. In summary, the content of this specification should not be construed as limiting the present invention.

Claims (27)

1. A mobile search method, characterized by comprising: receiving a search request, the search request containing one or more query keywords; calculating a score value for each search type domain, the score value being the score value of any one of the following items or a composite score value of several of them: the similarity between the search request and the search type domain, the popular search rate of the search request for the search type domain, and the personalized user interest score value of the search type domain; and selecting, according to the score values of the search type domains, one or more of the search type domains in which to search for the query keywords.

2. The method according to claim 1, characterized in that the composite score value of each search type domain is a product score value, an average score value, or a weighted score value calculated from several of: the similarity between the search request and the search type domain, the popular search rate of the search request for the search type domain, and the personalized user interest score value of the search type domain.

3. The method according to claim 1, characterized in that calculating the similarity between the search request and the search type domain comprises: setting weights for the query keywords; generating a query vector from the weights of the query keywords; generating a domain vector corresponding to the search type domain from the weights of the words in the search type domain; and obtaining the similarity between the search request and the search type domain by calculation on the query vector and the domain vector.

4. The method according to claim 3, characterized in that the method further comprises: determining manually the subject words and related words in the search type domain and the weight of each word; or determining by automatic learning the subject words and related words in the search type domain and the weight of each word.

5. The method according to claim 4, characterized in that determining by automatic learning the subject words and related words in the search type domain and the weight of each word comprises: for each search type domain, obtaining training text corpus samples corresponding to that search type domain; performing word segmentation on the corpus samples to generate a lexicon for the search type domain; calculating the weight of each word in the lexicon; and determining the subject words and related words in the search type domain according to the weight of each word.

6. The method according to claim 5, characterized in that determining by automatic learning the subject words and related words in the search type domain and the weight of each word further comprises: dividing all words in the lexicon into sets of different tiers according to weight; and setting a final score value for the set of each tier and using the final score value of each tier as the weight of each word in that tier.

7. The method according to claim 3, characterized in that setting weights for the query keywords comprises: setting the same weight for all query keywords; or setting the largest weight for the first-ranked keyword, an intermediate weight for keywords ranked in the middle, and the smallest weight for the last-ranked keyword.

8. The method according to claim 1, characterized in that calculating the popular search rate of the search request for the search type domain comprises: calculating the popular search rate of each search type domain for each query keyword in the search request; and taking the sum of the popular search rates of the same search type domain over all query keywords in the search request as the popular search rate of the search request for that search type domain.

9. The method according to claim 8, characterized in that the popular search rate is the number of searches made by the general public, or the number of clicks by the general public on search results.

10. The method according to claim 1, characterized in that calculating the personalized user interest score value of the search type domain comprises: extracting the user's interest model from user data, the user's interest model being a vector composed of the score values of the user data for multiple interest dimensions; and taking the sum of the score values of the one or more interest dimensions of the user interest model to which the search type domain corresponds as the personalized user interest score value of the search type domain.

11. The method according to claim 10, characterized in that the user's interest model is a static interest model or a dynamic interest model; extracting the user's static interest model from user data comprises: calculating the sum of the word frequencies of all words in the user's static profile that belong to each interest dimension and using it as the score value for that interest dimension, or calculating the similarity score value between the user's static profile and each interest dimension and using it as the score value for that interest dimension; and generating the user interest model as a vector of the score values for the interest dimensions; extracting the user's dynamic interest model from user data comprises: calculating the sum of the word frequencies of all words in the user's search click history that belong to each interest dimension and using it as the score value for that interest dimension, or calculating the similarity score value between the search click history and each interest dimension and using it as the score value for that interest dimension; and generating the user's dynamic interest model as a vector of the score values for the interest dimensions.

12. The method according to claim 11, characterized in that extracting the user's interest model from user data further comprises: normalizing the static interest model and the dynamic interest model separately; and calculating the sum of one or more normalized static interest models and one or more normalized dynamic interest models, and using the sum as the user's interest model.

13. The method according to claim 11, characterized in that extracting the user's interest model from user data further comprises: performing a weighted addition of one or more of the static interest models and one or more of the dynamic interest models; and normalizing the weighted sum and using the normalized result as the user's interest model.

14. The method according to claim 1, characterized in that calculating the weighted score value of each search type domain comprises: calculating the similarity between the search request and the search type domain and normalizing it; calculating the popular search rate of the search request for the search type domain and normalizing it; calculating the personalized user interest score value of the search type domain and normalizing it; and performing a weighted addition of any two or more of the above normalized values to obtain the weighted score value of the search type domain.

15. A mobile search device, characterized by comprising: a receiving unit, configured to receive a search request, the search request containing one or more query keywords; a calculation unit, configured to calculate a score value for each search type domain, the score value being the score value of any one of the following items or a composite score value of several of them: the similarity between the search request and the search type domain, the popular search rate of the search request for the search type domain, and the personalized user interest score value of the search type domain; a selection unit, configured to select one or more of the search type domains according to the score values of the search type domains; and a search unit, configured to search for the query keywords in the search type domains selected by the selection unit.

16. The device according to claim 15, characterized in that the composite score value of each search type domain calculated by the calculation unit is a product score value, an average score value, or a weighted score value calculated from several of: the similarity between the search request and the search type domain, the popular search rate of the search request for the search type domain, and the personalized user interest score value of the search type domain.

17. The device according to claim 15, characterized in that the calculation unit comprises any one or more of the following units: a similarity calculation unit, configured to calculate the similarity between the search request and each search type domain; a popular search rate calculation unit, configured to calculate the popular search rate of the search request for each search type domain; and a user interest score value calculation unit, configured to calculate the personalized user interest score value of each search type domain.

18. The device according to claim 17, characterized in that the similarity calculation unit comprises: a weight setting subunit, configured to set weights for the query keywords; a query vector generation subunit, configured to generate a query vector from the weights of the query keywords; a domain vector generation unit, configured to generate a domain vector corresponding to the search type domain from the weights of the words in the search type domain; and a first calculation subunit, configured to obtain the similarity between the search request and the search type domain by calculation on the query vector and the domain vector.

19. The device according to claim 18, characterized in that the device further comprises: a setting unit, configured to determine manually the subject words and related words in the search type domain and the weight of each word; or a learning unit, configured to determine by automatic learning the subject words and related words in the search type domain and the weight of each word.

20. The device according to claim 19, characterized in that the learning unit comprises: a corpus sample acquisition subunit, configured to obtain, for each search type domain, training text corpus samples corresponding to that search type domain; a lexicon generation subunit, configured to perform word segmentation on the corpus samples to generate a lexicon for the search type domain; a weight calculation subunit, configured to calculate the weight of each word in the lexicon; and a subject word determination subunit, configured to determine the subject words and related words in the search type domain according to the weight of each word.

21. The device according to claim 20, characterized in that the learning unit further comprises: a tier division subunit, configured to divide all words in the lexicon into sets of different tiers according to weight; and a score value setting subunit, configured to set a final score value for the set of each tier and to use the final score value of each tier as the weight of each word in that tier.

22. The device according to claim 17, characterized in that the popular search rate calculation unit comprises: a second calculation subunit, configured to calculate the popular search rate of each search type domain for each query keyword in the search request; and an adding subunit, configured to take the sum of the popular search rates of the same search type domain over all query keywords in the search request as the popular search rate of the search request for that search type domain.

23. The device according to claim 17, characterized in that the user interest score value calculation unit comprises: an interest model extraction subunit, configured to extract the user's interest model from user data, the user's interest model being a vector composed of the score values of the user data for multiple interest dimensions; and a third calculation subunit, configured to take the sum of the score values of the one or more interest dimensions of the user interest model to which the search type domain corresponds as the personalized user interest score value of the search type domain.

24. The device according to claim 23, characterized in that the user's interest model is a static interest model or a dynamic interest model; the interest model extraction subunit comprises: a first extraction subunit, configured to calculate the sum of the word frequencies of all words in the user's static profile that belong to each interest dimension and use it as the score value for that interest dimension, or to calculate the similarity score value between the user's static profile and each interest dimension and use it as the score value for that interest dimension, and to generate the user interest model as a vector of the score values for the interest dimensions; or a second extraction subunit, configured to calculate the sum of the word frequencies of all words in the user's search click history that belong to each interest dimension and use it as the score value for that interest dimension, or to calculate the similarity score value between the search click history and each interest dimension and use it as the score value for that interest dimension, and to generate the user's dynamic interest model as a vector of the score values for the interest dimensions.

25. The device according to claim 23, characterized in that the interest model extraction subunit further comprises: a first processing subunit, configured to normalize the static interest model and the dynamic interest model separately; and a first weighting subunit, configured to calculate the sum of one or more normalized static interest models and one or more normalized dynamic interest models and to use the sum as the user's interest model.

26. The device according to claim 23, characterized in that the interest model extraction subunit further comprises: a second weighting subunit, configured to perform a weighted addition of one or more of the static interest models and one or more of the dynamic interest models; and a second processing subunit, configured to normalize the result output by the second weighting subunit and to use the normalized result as the user's interest model.

27. The device according to claim 23, characterized in that the calculation unit further comprises: a normalization processing unit, configured to normalize, separately, the values calculated by the similarity calculation unit, the popular search rate calculation unit, and the user interest score value calculation unit; and a weighting processing unit, configured to perform a weighted addition of any two or more of the normalized values obtained by the normalization processing unit to obtain the score value of each search type domain.
CN200910140119A 2009-02-27 2009-07-01 Method and device for mobile search Pending CN101820592A (en)

Priority Applications (3)

Application Number Priority Date Filing Date Title
CN200910140119A CN101820592A (en) 2009-02-27 2009-07-01 Method and device for mobile search
PCT/CN2009/074758 WO2010096986A1 (en) 2009-02-27 2009-11-05 Mobile search method and device
US13/219,058 US20110314059A1 (en) 2009-02-27 2011-08-26 Mobile search method and apparatus

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN200910118632.7 2009-02-27
CN200910140119A CN101820592A (en) 2009-02-27 2009-07-01 Method and device for mobile search

Publications (1)

Publication Number Publication Date
CN101820592A true CN101820592A (en) 2010-09-01

Family

ID=42655489

Family Applications (1)

Application Number Title Priority Date Filing Date
CN200910140119A Pending CN101820592A (en) 2009-02-27 2009-07-01 Method and device for mobile search

Country Status (1)

Country Link
CN (1) CN101820592A (en)

Cited By (18)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102364467A (en) * 2011-09-29 2012-02-29 北京亿赞普网络技术有限公司 Network search method and system
CN102436495A (en) * 2011-11-14 2012-05-02 百度在线网络技术(北京)有限公司 Method and device for providing dynamic search page
CN102436496A (en) * 2011-11-14 2012-05-02 百度在线网络技术(北京)有限公司 Method for providing personated searching labels and device thereof
CN102521350A (en) * 2011-12-12 2012-06-27 浙江大学 Selection method of distributed information retrieval sets based on historical click data
CN102955813A (en) * 2011-08-29 2013-03-06 中国移动通信集团四川有限公司 Information searching method and information searching system
CN102999521A (en) * 2011-09-15 2013-03-27 北京百度网讯科技有限公司 Method and device for identifying search requirement
CN102999520A (en) * 2011-09-15 2013-03-27 北京百度网讯科技有限公司 Method and device for identifying search request
CN103339623A (en) * 2010-09-08 2013-10-02 纽昂斯通讯公司 Method and apparatus relating to internet searching
CN103455499A (en) * 2012-05-29 2013-12-18 北京百度网讯科技有限公司 Method and system for automatically matching search types according to search terms in mobile terminal
CN103530385A (en) * 2013-10-18 2014-01-22 北京奇虎科技有限公司 Method and device for searching for information based on vertical searching channels
CN103729359A (en) * 2012-10-12 2014-04-16 阿里巴巴集团控股有限公司 Method and system for recommending search terms
CN104699737A (en) * 2013-12-09 2015-06-10 国际商业机器公司 Method and system for managing a search
CN104933090A (en) * 2015-05-18 2015-09-23 深圳市金立通信设备有限公司 Information searching method and terminal
CN105245589A (en) * 2015-09-28 2016-01-13 小米科技有限责任公司 Information display method and device
CN105512298A (en) * 2015-12-10 2016-04-20 成都陌云科技有限公司 Interested content prediction method based on machine learning
CN105550282A (en) * 2015-12-10 2016-05-04 成都陌云科技有限公司 User interest forecasting method by utilizing multidimensional data
CN108415903A (en) * 2018-03-12 2018-08-17 武汉斗鱼网络科技有限公司 Judge evaluation method, storage medium and the equipment of search intention identification validity
CN104915429B (en) * 2015-06-15 2018-09-04 小米科技有限责任公司 Keyword search methodology and device

Cited By (28)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103339623A (en) * 2010-09-08 2013-10-02 纽昂斯通讯公司 Method and apparatus relating to internet searching
CN102955813B (en) * 2011-08-29 2015-11-25 中国移动通信集团四川有限公司 A kind of information search method and system
CN102955813A (en) * 2011-08-29 2013-03-06 中国移动通信集团四川有限公司 Information searching method and information searching system
CN102999521A (en) * 2011-09-15 2013-03-27 北京百度网讯科技有限公司 Method and device for identifying search requirement
CN102999520A (en) * 2011-09-15 2013-03-27 北京百度网讯科技有限公司 Method and device for identifying search request
CN102999521B (en) * 2011-09-15 2016-06-15 北京百度网讯科技有限公司 A kind of method and device identifying search need
CN102999520B (en) * 2011-09-15 2016-04-27 北京百度网讯科技有限公司 A kind of method and apparatus of search need identification
CN102364467A (en) * 2011-09-29 2012-02-29 北京亿赞普网络技术有限公司 Network search method and system
CN102436496A (en) * 2011-11-14 2012-05-02 百度在线网络技术(北京)有限公司 Method for providing personated searching labels and device thereof
CN102436495A (en) * 2011-11-14 2012-05-02 百度在线网络技术(北京)有限公司 Method and device for providing dynamic search page
CN102521350A (en) * 2011-12-12 2012-06-27 浙江大学 Selection method of distributed information retrieval sets based on historical click data
CN102521350B (en) * 2011-12-12 2014-07-16 浙江大学 Selection method of distributed information retrieval sets based on historical click data
CN103455499A (en) * 2012-05-29 2013-12-18 北京百度网讯科技有限公司 Method and system for automatically matching search types according to search terms in mobile terminal
US9489688B2 (en) 2012-10-12 2016-11-08 Alibaba Group Holding Limited Method and system for recommending search phrases
CN103729359A (en) * 2012-10-12 2014-04-16 阿里巴巴集团控股有限公司 Method and system for recommending search terms
CN103729359B (en) * 2012-10-12 2017-03-01 阿里巴巴集团控股有限公司 A kind of method and system recommending search word
CN103530385A (en) * 2013-10-18 2014-01-22 北京奇虎科技有限公司 Method and device for searching for information based on vertical searching channels
CN104699737A (en) * 2013-12-09 2015-06-10 国际商业机器公司 Method and system for managing a search
US11176124B2 (en) 2013-12-09 2021-11-16 International Business Machines Corporation Managing a search
US10176227B2 (en) 2013-12-09 2019-01-08 International Business Machines Corporation Managing a search
US9996588B2 (en) 2013-12-09 2018-06-12 International Business Machines Corporation Managing a search
CN104933090A (en) * 2015-05-18 2015-09-23 深圳市金立通信设备有限公司 Information searching method and terminal
CN104915429B (en) * 2015-06-15 2018-09-04 小米科技有限责任公司 Keyword search methodology and device
CN105245589A (en) * 2015-09-28 2016-01-13 小米科技有限责任公司 Information display method and device
CN105245589B (en) * 2015-09-28 2019-06-14 小米科技有限责任公司 Information displaying method and device
CN105550282A (en) * 2015-12-10 2016-05-04 成都陌云科技有限公司 User interest forecasting method by utilizing multidimensional data
CN105512298A (en) * 2015-12-10 2016-04-20 成都陌云科技有限公司 Interested content prediction method based on machine learning
CN108415903A (en) * 2018-03-12 2018-08-17 武汉斗鱼网络科技有限公司 Judge evaluation method, storage medium and the equipment of search intention identification validity

Similar Documents

Publication Publication Date Title
CN101820592A (en) Method and device for mobile search
CN109815308B (en) Method and device for determining intention recognition model and method and device for searching intention recognition
CN103593425B (en) Intelligent retrieval method and system based on preference
CN101661475B (en) Search method and system
CN106815297B (en) Academic resource recommendation service system and method
CN102056335B (en) Mobile search method, device and system
US8380697B2 (en) Search and retrieval methods and systems of short messages utilizing messaging context and keyword frequency
US20110314059A1 (en) Mobile search method and apparatus
CN107958014B (en) Search engine
CN108694647B (en) Method and device for mining merchant recommendation reason and electronic equipment
CN105260390B (en) A Group-Oriented Item Recommendation Method Based on Joint Probability Matrix Factorization
CN111090771B (en) Song searching method, device and computer storage medium
CN103310003A (en) Method and system for predicting click rate of new advertisement based on click log
CN108334610A (en) A kind of newsletter archive sorting technique, device and server
CN102831128A (en) Method and device for sorting information of namesake persons on Internet
CN102968417A (en) Searching method and system applied to computer network
CN103440242A (en) User search behavior-based personalized recommendation method and system
CN103020049A (en) Searching method and searching system
CN101685456B (en) A search method, system and device
CN103473244A (en) Device and method for recommending applications used in application group
CN103744918A (en) Vertical domain based micro blog searching ranking method and system
CN105653546B (en) Method and system for retrieving a target subject
CN106951420A (en) Literature search method and apparatus, author&#39;s searching method and equipment
CN115168700A (en) Information flow recommendation method, system and medium based on pre-training algorithm
CN104077327A (en) Core word importance recognition method and equipment and search result sorting method and equipment

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C12 Rejection of a patent application after its publication
RJ01 Rejection of invention patent application after publication

Application publication date: 20100901