CN114077712B - Search result ordering method and device - Google Patents
- Publication number
- CN114077712B (application CN202010818963.8A)
- Authority
- CN
- China
- Prior art keywords
- cluster
- picture
- search
- target
- result
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/95—Retrieval from the web
- G06F16/953—Querying, e.g. by the use of web search engines
- G06F16/9538—Presentation of query results
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/50—Information retrieval; Database structures therefor; File system structures therefor of still image data
- G06F16/55—Clustering; Classification
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02D—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
- Y02D10/00—Energy efficient computing, e.g. low power processors, power management or thermal management
Landscapes
- Engineering & Computer Science (AREA)
- Databases & Information Systems (AREA)
- Theoretical Computer Science (AREA)
- Data Mining & Analysis (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The embodiments of the present application disclose a search result sorting method and device. The method comprises: receiving an image search request sent by a client and obtaining the search results corresponding to the image search request; obtaining an information clustering result of time-sensitive events; judging, according to the degree of matching between the search results and the image clusters in the information clustering result, whether the image search request is a time-sensitive event search request; and, when it is, sorting the search results using the information clustering result to generate a search result queue. Because the information clustering result of time-sensitive events is obtained in real time, whether a user's image search request is a time-sensitive event search request can be judged promptly, and sorting the search results with the information clustering result yields a more accurate search result queue.
Description
Technical Field
The present application relates to the field of Internet technology, and in particular to a search result sorting method and device.
Background Art
When a user performs a search query, corresponding search results are generated according to the user's search request. When the search request is determined to be a search request for a time-sensitive event, the search results can be adjusted so that the more timely results are ranked higher for the user to browse and select.
At present, determining whether a search request is a search request for a time-sensitive event involves a certain lag. As a result, timely search results cannot be fed back to the user promptly, and the search result queue obtained for the user's search request is not sorted accurately enough.
Summary of the Invention
In view of this, the embodiments of the present application provide a search result sorting method and device that can judge a user's search request in real time and sort the search results accordingly.
To solve the above problems, the technical solutions provided in the embodiments of the present application are as follows:
A search result sorting method, the method comprising:
receiving an image search request sent by a client, and obtaining search results corresponding to the image search request;
obtaining an information clustering result of time-sensitive events, wherein the information clustering result of time-sensitive events is obtained by clustering information generated in the network within a preset time period, the information clustering result includes at least one image cluster, and each image cluster includes at least one image;
judging whether the image search request is a time-sensitive event search request according to the degree of matching between the search results corresponding to the image search request and the image clusters;
when the image search request is a time-sensitive event search request, sorting the search results using the information clustering result to generate a search result queue.
In a possible implementation, judging whether the image search request is a time-sensitive event search request according to the degree of matching between the search results corresponding to the image search request and the image clusters includes:
determining, for each image cluster, a first number of images in that cluster that appear in the search results corresponding to the image search request;
determining the image clusters whose first number is greater than a first threshold as first target image clusters;
if the number of first target image clusters to which the search results corresponding to the image search request belong is greater than a second threshold, judging that the image search request is a time-sensitive event search request.
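The two-threshold judgment described above can be sketched in Python as follows. This is only an illustrative sketch, not the patented implementation: the function name `is_time_sensitive_request`, the use of string image identifiers, and the representation of clusters as a dictionary of sets are all assumptions made for the example.

```python
from typing import Dict, Iterable, Set


def is_time_sensitive_request(
    result_ids: Iterable[str],
    clusters: Dict[str, Set[str]],
    first_threshold: int,
    second_threshold: int,
) -> bool:
    """Judge a request from the overlap between its results and the image clusters.

    All names and data shapes here are hypothetical; the source only states
    the two comparisons against the first and second thresholds.
    """
    results = set(result_ids)
    # First number: how many images of each cluster appear in the search results.
    first_numbers = {cid: len(images & results) for cid, images in clusters.items()}
    # First target image clusters: first number strictly greater than the first threshold.
    first_target = [cid for cid, n in first_numbers.items() if n > first_threshold]
    # The request is time-sensitive if enough clusters qualify.
    return len(first_target) > second_threshold
```

Both comparisons are strict ("greater than"), following the wording of the text; how the thresholds are chosen is not specified there.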
In a possible implementation, when the image search request is a time-sensitive event search request, sorting the search results using the information clustering result to generate a search result queue includes:
when the image search request is a time-sensitive event search request, obtaining cluster features of the second target image clusters to which the search results corresponding to the image search request belong, wherein the second target image clusters are the image clusters corresponding to the individual images in the search results;
sorting the second target image clusters according to their cluster features to generate a sorting result;
selecting one image in each second target image cluster as a first target search result;
according to the sorting result, sorting the first target search results and the other search results that do not belong to any second target image cluster to generate a search result queue, in which the first target search results are ranked higher than the other search results.
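As a rough illustration of the queue construction described above, the sketch below places one representative image per cluster at the head of the queue, in cluster order, followed by the results that belong to no cluster. The names `build_result_queue`, `cluster_of`, and `cluster_order` are hypothetical. Picking the first-seen image as the representative, and omitting the non-selected cluster members from the queue, are assumptions; the text does not say how the image is chosen or what happens to the remaining members.

```python
from typing import Dict, List


def build_result_queue(
    results: List[str],
    cluster_of: Dict[str, str],
    cluster_order: List[str],
) -> List[str]:
    """Build a queue: one representative per ranked cluster, then unclustered results."""
    # Assumption: the representative of a cluster is its first image in the
    # original result order.
    representative: Dict[str, str] = {}
    for image_id in results:
        cid = cluster_of.get(image_id)
        if cid is not None and cid not in representative:
            representative[cid] = image_id
    # First target search results, ordered by the cluster sorting result.
    head = [representative[cid] for cid in cluster_order if cid in representative]
    # Other search results: those that belong to no cluster, in original order.
    tail = [image_id for image_id in results if image_id not in cluster_of]
    return head + tail
```

For example, with results `["x", "a1", "b1", "a2", "y"]`, clusters A = {a1, a2} and B = {b1}, and cluster order `["B", "A"]`, the sketch returns `["b1", "a1", "x", "y"]`.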
In a possible implementation, obtaining the cluster features of the second target image clusters to which the search results corresponding to the image search request belong includes:
obtaining a second number of images in each second target image cluster to which the search results corresponding to the image search request belong, and the number of source websites corresponding to the images in that second target image cluster;
obtaining text features corresponding to the images in the second target image cluster;
and sorting the target image clusters according to the cluster features of the second target image clusters to generate a sorting result includes:
sorting the second target image clusters according to the degree of matching between the query terms included in the search request and the text features corresponding to the images in the second target image cluster, the second number of images in the second target image cluster, and the number of source websites corresponding to the images in the second target image cluster, to generate a sorting result.
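The text names three cluster features but does not say how they are combined. The sketch below is one possible reading, offered only as an assumption: clusters are compared lexicographically by query match degree, then image count, then source website count, all in descending order. `ClusterFeatures`, `match_degree`, and `rank_clusters` are hypothetical names, and the match degree (the fraction of query terms found among the text features) is likewise an illustrative choice.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable, List


@dataclass(frozen=True)
class ClusterFeatures:
    cluster_id: str
    text_features: FrozenSet[str]  # e.g. the most frequent words of the cluster
    image_count: int               # the "second number" of images in the cluster
    source_site_count: int         # number of distinct source websites


def match_degree(query_terms: Iterable[str], text_features: FrozenSet[str]) -> float:
    """Fraction of query terms that occur among the text features (assumed metric)."""
    terms = set(query_terms)
    if not terms:
        return 0.0
    return len(terms & text_features) / len(terms)


def rank_clusters(query_terms: List[str], clusters: List[ClusterFeatures]) -> List[str]:
    """Return cluster ids sorted best first under the assumed lexicographic order."""
    ordered = sorted(
        clusters,
        key=lambda c: (
            match_degree(query_terms, c.text_features),
            c.image_count,
            c.source_site_count,
        ),
        reverse=True,
    )
    return [c.cluster_id for c in ordered]
```

A weighted sum of the three features would be an equally plausible reading; the lexicographic order is used here only because it needs no invented weights.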
In a possible implementation, obtaining the text features corresponding to the images in the second target image cluster includes:
obtaining the word frequency of each segmented word in the description texts corresponding to the images in the second target image cluster, and taking at least one segmented word with the highest word frequency as the text feature corresponding to the images in the second target image cluster;
or, obtaining the feature vector of each segmented word in the description texts corresponding to the images in the second target image cluster, and taking at least one feature vector that occurs most often among those feature vectors as the text feature corresponding to the images in the second target image cluster.
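The word-frequency variant can be illustrated with a short sketch. The function name `top_terms` is hypothetical, and whitespace splitting stands in for a real word segmenter, which Chinese description texts would require; both are assumptions for the example.

```python
from collections import Counter
from typing import Iterable, List


def top_terms(descriptions: Iterable[str], k: int = 1) -> List[str]:
    """Return the k most frequent words across a cluster's description texts."""
    counts: Counter = Counter()
    for text in descriptions:
        # Assumption: lowercase whitespace tokens approximate segmented words.
        counts.update(text.lower().split())
    return [term for term, _ in counts.most_common(k)]
```

For the descriptions `"flood rescue boat"`, `"flood rescue"`, and `"flood city"`, calling `top_terms(..., k=2)` returns `["flood", "rescue"]`.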
In a possible implementation, the method further includes:
determining spam information clusters in the information clustering result of time-sensitive events, and removing the spam information clusters from that information clustering result;
wherein determining spam information clusters in the information clustering result of time-sensitive events includes:
if a third number of images in a third target image cluster is greater than a third threshold, or the number of source websites corresponding to the third target image cluster is less than a fourth threshold, determining the third target image cluster as a spam information cluster, wherein the third target image cluster is an image cluster included in the information clustering result of time-sensitive events;
or, if the similarities between the description texts corresponding to the individual images in the third target image cluster are all less than a fifth threshold, determining the third target image cluster as a spam information cluster, wherein the third target image cluster is an image cluster included in the information clustering result of time-sensitive events.
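The spam checks above can be sketched as a single predicate. This is a minimal sketch under several assumptions: the text presents the size/website test and the similarity test as alternatives, whereas the example applies both; text similarity is approximated with Jaccard overlap of whitespace tokens, which the source does not specify; and a cluster with fewer than two descriptions is not flagged by the similarity test. The names `jaccard` and `is_spam_cluster` are hypothetical.

```python
from itertools import combinations
from typing import List


def jaccard(a: str, b: str) -> float:
    """Jaccard similarity of whitespace tokens (an assumed similarity measure)."""
    tokens_a, tokens_b = set(a.split()), set(b.split())
    union = tokens_a | tokens_b
    return len(tokens_a & tokens_b) / len(union) if union else 0.0


def is_spam_cluster(
    descriptions: List[str],
    source_sites: List[str],
    third_threshold: int,
    fourth_threshold: int,
    fifth_threshold: float,
) -> bool:
    """Flag a cluster as spam using the three conditions named in the text."""
    # Too many images, or too few distinct source websites.
    if len(descriptions) > third_threshold or len(set(source_sites)) < fourth_threshold:
        return True
    # All pairwise description similarities below the fifth threshold.
    pairs = list(combinations(descriptions, 2))
    return bool(pairs) and all(jaccard(a, b) < fifth_threshold for a, b in pairs)
```

Clusters for which the predicate returns true would then be dropped from the clustering result before it is used for judging or sorting.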
A search result sorting device, the device comprising:
a search result acquisition unit, configured to receive an image search request sent by a client and obtain search results corresponding to the image search request;
an information clustering result acquisition unit, configured to obtain an information clustering result of time-sensitive events, wherein the information clustering result of time-sensitive events is obtained by clustering information generated in the network within a preset time period, the information clustering result includes at least one image cluster, and each image cluster includes at least one image;
a judging unit, configured to judge whether the image search request is a time-sensitive event search request according to the degree of matching between the search results corresponding to the image search request and the image clusters;
a search result queue generating unit, configured to, when the image search request is a time-sensitive event search request, sort the search results using the information clustering result to generate a search result queue.
In a possible implementation, the judging unit includes:
a first number determination module, configured to determine, for each image cluster, a first number of images in that cluster that appear in the search results corresponding to the image search request;
an image cluster determination module, configured to determine the image clusters whose first number is greater than a first threshold as first target image clusters;
a first judgment module, configured to judge that the image search request is a time-sensitive event search request if the number of first target image clusters to which the search results corresponding to the image search request belong is greater than a second threshold.
In a possible implementation, the search result queue generating unit includes:
a cluster feature acquisition subunit, configured to, when the image search request is a time-sensitive event search request, obtain cluster features of the second target image clusters to which the search results corresponding to the image search request belong, wherein the second target image clusters are the image clusters corresponding to the individual images in the search results;
a sorting result generating subunit, configured to sort the second target image clusters according to their cluster features to generate a sorting result;
a first selection subunit, configured to select one image in each second target image cluster as a first target search result;
a first search result queue generating subunit, configured to sort, according to the sorting result, the first target search results and the other search results that do not belong to any second target image cluster, to generate a search result queue in which the first target search results are ranked higher than the other search results.
In a possible implementation, the cluster feature acquisition subunit includes:
a source website number acquisition module, configured to obtain a second number of images in each second target image cluster to which the search results corresponding to the image search request belong, and the number of source websites corresponding to the images in that second target image cluster;
a text feature acquisition module, configured to obtain text features corresponding to the images in the second target image cluster;
and the sorting result generating subunit is specifically configured to:
sort the second target image clusters according to the degree of matching between the query terms included in the search request and the text features corresponding to the images in the second target image cluster, the second number of images in the second target image cluster, and the number of source websites corresponding to the images in the second target image cluster, to generate a sorting result.
In a possible implementation, the text feature acquisition module is specifically configured to:
obtain the word frequency of each segmented word in the description texts corresponding to the images in the second target image cluster, and take at least one segmented word with the highest word frequency as the text feature corresponding to the images in the second target image cluster;
or, obtain the feature vector of each segmented word in the description texts corresponding to the images in the second target image cluster, and take at least one feature vector that occurs most often among those feature vectors as the text feature corresponding to the images in the second target image cluster.
In a possible implementation, the device further includes:
a spam information cluster determination unit, configured to determine spam information clusters in the information clustering result of time-sensitive events;
a spam information cluster removal unit, configured to remove the spam information clusters from the information clustering result of time-sensitive events;
wherein the spam information cluster determination unit is specifically configured to:
if a third number of images in a third target image cluster is greater than a third threshold, or the number of source websites corresponding to the third target image cluster is less than a fourth threshold, determine the third target image cluster as a spam information cluster, wherein the third target image cluster is an image cluster included in the information clustering result of time-sensitive events;
or, if the similarities between the description texts corresponding to the individual images in the third target image cluster are all less than a fifth threshold, determine the third target image cluster as a spam information cluster, wherein the third target image cluster is an image cluster included in the information clustering result of time-sensitive events.
A search result sorting device, comprising a memory and one or more programs, wherein the one or more programs are stored in the memory and configured to be executed by one or more processors, the one or more programs including instructions for performing the following operations:
receiving an image search request sent by a client, and obtaining search results corresponding to the image search request;
obtaining an information clustering result of time-sensitive events, wherein the information clustering result of time-sensitive events is obtained by clustering information generated in the network within a preset time period, the information clustering result includes at least one image cluster, and each image cluster includes at least one image;
judging whether the image search request is a time-sensitive event search request according to the degree of matching between the search results corresponding to the image search request and the image clusters;
when the image search request is a time-sensitive event search request, sorting the search results using the information clustering result to generate a search result queue.
A computer-readable medium having instructions stored thereon which, when executed by one or more processors, cause a device to perform the search result sorting method described above.
It can be seen that the embodiments of the present application have the following beneficial effects:
In the search result sorting method provided in the embodiments of the present application, the server receives an image search request sent by a client, obtains the search results corresponding to the image search request, and obtains an information clustering result of time-sensitive events, wherein the information clustering result of time-sensitive events is obtained by clustering information generated in the network within a preset time period, the information clustering result includes at least one image cluster, and each image cluster includes at least one image. The server judges whether the image search request is a time-sensitive event search request according to the degree of matching between the search results corresponding to the image search request and the image clusters; when it is, the server sorts the search results using the information clustering result to generate a search result queue. Because the judgment is based on the degree of matching between the search results and the image clusters, it is made in real time: whether a user's image search request is a time-sensitive event search request can be discovered promptly, and when it is, sorting the search results using the information clustering result yields a more accurate search result queue.
Brief Description of the Drawings
FIG. 1 is a schematic diagram of the framework of an exemplary application scenario provided in an embodiment of the present application;
FIG. 2 is a flowchart of a search result sorting method provided in an embodiment of the present application;
FIG. 3 is a flowchart of a method for judging whether a search request is a time-sensitive event search request provided in an embodiment of the present application;
FIG. 4 is a flowchart of a method for generating a search result queue provided in an embodiment of the present application;
FIG. 5 is a schematic structural diagram of a search result sorting device provided in an embodiment of the present application;
FIG. 6 is a schematic structural diagram of a device for sorting search results provided in an embodiment of the present application;
FIG. 7 is a schematic structural diagram of a server device provided in an embodiment of the present application.
Detailed Description
To make the above objects, features, and advantages of the present application more apparent and easier to understand, the embodiments of the present application are described in further detail below with reference to the accompanying drawings and specific implementations.
To facilitate understanding and explanation of the technical solutions provided in the embodiments of the present application, the background of the present application is described first.
After studying conventional search result sorting, the inventor found that, before the search results are sorted, it is first judged whether the search request is a time-sensitive event search request, where a time-sensitive event is a hot event that occurred recently. In the prior art, time-sensitive events are identified from how often query terms appear in search requests: when a query term appears frequently in search requests on the network, the event corresponding to that query term is determined to be a time-sensitive event. However, this approach needs a large number of search requests containing the query term before the event meets the condition for being determined as time-sensitive. The determination therefore lags, and whether a search request is a time-sensitive event search request cannot be determined promptly when the request is made.
Based on this, an embodiment of the present application provides a search result sorting method. First, an image search request sent by a client is received, the search results corresponding to the image search request are obtained, and an information clustering result of time-sensitive events is also obtained; the information clustering result of time-sensitive events is obtained by clustering information generated in the network within a preset time period, the information clustering result includes at least one image cluster, and each image cluster includes at least one image. Second, whether the image search request is a time-sensitive event search request is judged according to the degree of matching between the search results corresponding to the image search request and the image clusters, so that the judgment is made on the image search request in real time. Finally, when the image search request is a time-sensitive event search request, the search results are sorted using the information clustering result to generate a search result queue, which yields more accurate search results.
Referring to FIG. 1, which is a schematic diagram of the framework of an exemplary application scenario provided in an embodiment of the present application, the search result sorting method provided in the embodiments of the present application can be applied to a server 20.
In practice, the server 20 obtains the search request sent by the client 10, obtains the corresponding image search results according to the image search request, and obtains the information clustering result of time-sensitive events. It then uses the degree of matching between the search results corresponding to the image search request and the image clusters to judge whether the image search request is a time-sensitive event search request. When it is, the server 20 sorts the search results using the information clustering result, generates a search result queue, and sends the search result queue to the client 10 so that the client 10 can display it.
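The server-side control flow just described can be sketched as follows. This is an illustrative sketch only: `handle_image_search` and its four callable parameters are hypothetical stand-ins for components the text leaves unspecified, and returning the results in their original order when the request is not time-sensitive is an assumption, since the text only describes the time-sensitive branch.

```python
from typing import Callable, Dict, List, Set

Clusters = Dict[str, Set[str]]


def handle_image_search(
    query: str,
    search: Callable[[str], List[str]],
    load_clusters: Callable[[], Clusters],
    is_time_sensitive: Callable[[List[str], Clusters], bool],
    rerank: Callable[[str, List[str], Clusters], List[str]],
) -> List[str]:
    """Return the search result queue to send back to the client."""
    # Obtain the search results for the image search request.
    results = search(query)
    # Obtain the information clustering result of time-sensitive events.
    clusters = load_clusters()
    # Sort with the clustering result only for time-sensitive requests.
    if is_time_sensitive(results, clusters):
        return rerank(query, results, clusters)
    # Assumption: otherwise keep the original order.
    return results
```

Passing the components in as callables keeps the sketch independent of any particular search backend or clustering store.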
Those skilled in the art will appreciate that the framework diagram shown in FIG. 1 is only one example in which the embodiments of the present application can be implemented. The scope of application of the embodiments is not limited by any aspect of this framework.
It should be noted that the client 10 may be any user device, whether existing, under development, or developed in the future, that can interact with other devices over any form of wired and/or wireless connection (for example, Wi-Fi, LAN, cellular, or coaxial cable). Such devices include, but are not limited to, smart wearable devices, smartphones, non-smart phones, tablet computers, laptop personal computers, desktop personal computers, minicomputers, midrange computers, and mainframe computers. The embodiments of the present application are not limited in this respect. It should also be noted that the server 20 in the embodiments may be an example of a device, whether existing, under development, or developed in the future, that can provide information recommendation application services to users. The embodiments of the present application are not limited in this respect either.
To facilitate understanding of the technical solution provided by the embodiments of the present application, the search result sorting method is described below with reference to the accompanying drawings.
Refer to FIG. 2, which is a flowchart of a search result sorting method provided by an embodiment of the present application. As shown in FIG. 2, the method may include S201-S204:
S201: Receive an image search request sent by a client, and obtain the search results corresponding to the image search request.
The image search request sent by the client may be a request that contains a query term and is used to perform an image search. The image search request indicates the event for which images are to be searched.
Given the received image search request, the corresponding search results can be determined from information crawled from the network. Before the search results are sent to the client, their order needs to be adjusted to obtain a search result queue, so that the results are displayed in an order that is convenient for the user to browse and use.
In the embodiment of the present application, receiving the image search request sent by the client makes it possible to obtain the corresponding search results and to judge promptly whether the request is a time-sensitive event search request. When it is, the search results are further sorted to obtain a more accurate search result queue.
S202: Obtain the information clustering result of time-sensitive events. The information clustering result is obtained by clustering information generated on the network within a preset time period; it includes at least one image cluster, and each image cluster includes at least one image.
A time-sensitive event is a trending event that occurred recently. Such an event is usually first reported on the network; the reports then spread and attract users' attention, and only afterwards do users send search requests from their clients to search for the event. It follows that by obtaining the information generated on the network within a preset time period and clustering it, the clustering result for that period can be obtained in a timely manner. This clustering result is the information clustering result of time-sensitive events.
The embodiments of the present application do not limit the length of the preset time period, which can be set according to how time-sensitive the events to be captured are. For example, when highly time-sensitive events are to be captured, the preset time period is short, such as 3 hours; when somewhat less time-sensitive events are to be captured, the preset time period can be longer, such as 24 hours.
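The effect of the preset time period can be illustrated with a minimal Python sketch. The function name `items_in_window`, the `(id, publish time)` tuple layout, and the sample data are assumptions made for illustration only; the source specifies nothing beyond the 3-hour and 24-hour examples.

```python
from datetime import datetime, timedelta
from typing import Iterable, List, Tuple


def items_in_window(
    items: Iterable[Tuple[str, datetime]],
    now: datetime,
    window: timedelta,
) -> List[str]:
    """Return the ids of items published within `window` before `now`."""
    start = now - window
    return [item_id for item_id, published in items if start <= published <= now]


# Hypothetical crawled items: (id, publish time).
now = datetime(2020, 8, 14, 12, 0)
items = [
    ("img_a", datetime(2020, 8, 14, 10, 30)),  # 1.5 hours old
    ("img_b", datetime(2020, 8, 14, 2, 0)),    # 10 hours old
    ("img_c", datetime(2020, 8, 12, 12, 0)),   # 2 days old
]

recent_3h = items_in_window(items, now, timedelta(hours=3))    # short window
recent_24h = items_in_window(items, now, timedelta(hours=24))  # longer window
```

A shorter window feeds fewer, fresher items into clustering; a longer window trades freshness for coverage.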
The information clustering result is obtained by clustering the information generated on the network within the preset time period. It includes at least one information cluster, and one information cluster can correspond to the information of one time-sensitive event. An information cluster contains related information of the same type belonging to the same event, for example, images of the same type related to one time-sensitive event, or text keywords of the same type related to one time-sensitive event. Specifically, when the information clusters are image clusters, the information clustering result includes at least one image cluster, each image cluster includes at least one image, and the images in one cluster are the same image. Images are considered the same when they have identical content and the same viewing angle; images derived from one another by scaling, adding a watermark, or similar processing can also be considered the same image.
S203: Judge whether the image search request is a time-sensitive event search request according to the degree of matching between the search results corresponding to the image search request and the image clusters.
Once the information clustering result of time-sensitive events has been obtained, it can be used to judge whether a request is a time-sensitive event search request.
Specifically, an image search request corresponds to a particular event. Using the image clusters in the information clustering result, whether the image search request is a time-sensitive event search request can be determined from the degree of matching between the search results and the image clusters. When it is, the corresponding search results need to be sorted so that more time-sensitive results are ranked appropriately, which yields a more accurate search result queue.
The degree of matching between the search results and the image clusters reflects how many images in the search results are similar or identical to images in the clusters. When the degree of matching is high, the search results contain many images that are similar or identical to images in the image clusters, and the request is a time-sensitive event search request. When the degree of matching is low, the search results contain few such images, and the request is not a time-sensitive event search request.
An embodiment of the present application provides a specific implementation of S203, which is described below.
It should be noted that after the information clustering result of time-sensitive events is obtained, it must be loaded promptly so that image search requests can be judged against it in real time.
In the embodiment of the present application, the information clustering result is obtained by clustering information generated on the network within a preset time period, so the result is up to date. Judging image search requests against the image clusters in this result allows requests to be judged promptly, so that the search results can be sorted accurately.
S204: When the image search request is a time-sensitive event search request, sort the search results using the information clustering result and generate a search result queue.
The search results obtained for an image search request are unsorted and must be sorted before they are displayed. If the image search request is a time-sensitive event search request, the results that are highly relevant to the request and highly time-sensitive should be ranked near the top, so that the user can find them quickly. Otherwise, results with low relevance to the time-sensitive event may occupy the top positions of the queue and prevent the user from quickly obtaining accurate information.
The resulting search result queue contains the search results arranged in a definite order. The client uses the queue to display the search results so that the user can browse them.
In the embodiment of the present application, when the image search request is a time-sensitive event search request, the search results are sorted using the information clustering result to obtain a search result queue. A queue sorted in this way places results that are highly relevant to the time-sensitive event and highly time-sensitive nearer the top, making it easier for the user to obtain accurate information quickly.
In addition, in one possible implementation, an embodiment of the present application provides a method for sorting the search results using the information clustering result to generate a search result queue. For details, see the specific implementation described later.
From S201 to S204 above, it can be seen that the embodiment of the present application receives an image search request sent by a client, obtains the corresponding search results, and obtains the information clustering result of time-sensitive events. Because this clustering result is up to date, it can be used to judge the search request promptly. If the request is a time-sensitive event search request, the search results are sorted using the information clustering result to generate a search result queue. Clustering the information generated on the network within a preset time period therefore yields a timely clustering result, allows search requests to be judged promptly, and produces a more accurate search result queue.
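The control flow of S201 to S204 can be summarized in a short Python sketch. It is only an outline under stated assumptions: the function `handle_image_search`, the `Clusters` representation (cluster id mapped to a set of image ids), and the four callables are hypothetical placeholders for components the source describes but does not name.

```python
from typing import Callable, Dict, List, Set

# Hypothetical representation: cluster id -> ids of the images in that cluster.
Clusters = Dict[str, Set[str]]


def handle_image_search(
    query: str,
    search: Callable[[str], List[str]],
    load_clusters: Callable[[], Clusters],
    is_time_sensitive: Callable[[List[str], Clusters], bool],
    rerank: Callable[[List[str], Clusters], List[str]],
) -> List[str]:
    """Return the search result queue for one image search request."""
    results = search(query)                   # S201: obtain search results
    clusters = load_clusters()                # S202: obtain clustering result
    if is_time_sensitive(results, clusters):  # S203: judge the request
        return rerank(results, clusters)      # S204: sort using the clusters
    return results                            # otherwise keep the original order
```

Passing the four stages in as callables keeps the sketch independent of any particular search backend, clustering job, or ranking rule.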
Not all information generated on the network within the preset time period relates to time-sensitive events. As a result, the information clustering result may contain spam clusters that cannot be used for judging search requests or sorting search results. If the clustering result contains spam clusters, a request that is not a time-sensitive event search request may be misjudged as one, which reduces the accuracy of the judgment.
To address the problem that spam clusters in the obtained information clustering result affect the judgment of search requests and the sorting of search results, an embodiment of the present application provides another implementation of search result sorting. In this implementation, in addition to S201 to S204 above, the method may include the following step after S202:
Determine the spam clusters in the information clustering result of time-sensitive events, and remove them from that result.
A spam cluster is an information cluster that interferes with the judgment of search requests and the sorting of search results. Spam clusters may include information clusters that do not belong to any time-sensitive event, information clusters that are only weakly related to a time-sensitive event, and information clusters with low credibility. An information cluster that does not belong to a time-sensitive event is one formed by clustering commonly used information generated on the Internet within the preset time period, or promotional information unrelated to any specific event; examples are clusters related to common phrases or common emoticons, and clusters related to advertisements. An information cluster that is only weakly related to a time-sensitive event is one formed by clustering information that is only partially related to the event. For example, if a time-sensitive event mainly involves a person and a place, information about only that person or only that place is partial information that is weakly related to the event and cannot fully reflect it, and the cluster formed from such information is weakly related to the event. Information clusters with low credibility can also be spam clusters. For example, a cluster that contains a large amount of information drawn from only a small number of source websites is an information cluster with low credibility.
After the spam clusters are determined, they can be removed from the information clustering result of time-sensitive events to obtain an updated clustering result. Using the updated result, search requests can be judged more accurately, misjudgments caused by spam clusters are avoided, and the accuracy of the search result queue ordering improves.
In the embodiment of the present application, a spam cluster may be a spam image cluster. In that case, determining the spam clusters in the information clustering result of time-sensitive events may specifically include A1-A3:
A1: If the third number, that is, the number of images in a third target image cluster, is greater than a third threshold, determine the third target image cluster to be a spam cluster.
The third target image cluster is an image cluster included in the information clustering result of time-sensitive events. Whether it is a spam cluster is determined by judging whether the third number of images in it is greater than the third threshold. The third threshold can be a large value, for example 1000. When the third number is large, the cluster contains many images and may carry no information specific to a time-sensitive event. Such a cluster may not belong to any time-sensitive event: for example, a cluster of images associated with common phrases, or a cluster of promotional images, contains many images, is unrelated to time-sensitive events, and includes no information about them. Alternatively, the cluster may be only weakly related to a time-sensitive event: because the information associated with its images covers a broad scope, the cluster contains a correspondingly large number of images.
A2: If the number of source websites corresponding to the third target image cluster is less than a fourth threshold, determine the third target image cluster to be a spam cluster.
The third target image cluster is an image cluster included in the information clustering result of time-sensitive events. When the number of source websites corresponding to the cluster is less than the fourth threshold, the images in the cluster come from a narrow range of sources and their credibility is low. The fourth threshold can be a small value.
A3: If the similarities between the description texts of the images in the third target image cluster are all less than a fifth threshold, determine the third target image cluster to be a spam cluster.
The third target image cluster is an image cluster included in the information clustering result of time-sensitive events. The description text of an image characterizes the image, so the similarity between description texts can be used to judge whether the cluster is a spam cluster. When the similarities between the description texts of the images in the cluster are all less than the fifth threshold, the description texts do not reflect any common feature of the images in the cluster. The images may have no strong connection to their description texts and may be commonly used images, such as decorative images on web pages or common emoticons. An image cluster composed of such images is unrelated to any specific time-sensitive event and is a spam cluster.
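Checks A1 to A3 can be sketched in Python as follows. Several details are assumptions rather than statements of the source: the `Image` record, the word-overlap `jaccard` similarity (the source does not say how description texts are compared), the default values of the fourth and fifth thresholds (only the example value 1000 for the third threshold comes from the text), and the choice to treat a cluster as spam when any one of the three checks fires.

```python
from dataclasses import dataclass
from itertools import combinations
from typing import Callable, Dict, List


@dataclass
class Image:
    source_site: str  # website the image was crawled from
    description: str  # description text attached to the image


def jaccard(a: str, b: str) -> float:
    """Word-overlap similarity; a stand-in for any text similarity measure."""
    wa, wb = set(a.lower().split()), set(b.lower().split())
    return len(wa & wb) / len(wa | wb) if wa | wb else 0.0


def is_spam_cluster(
    images: List[Image],
    third_threshold: int = 1000,    # A1: example value from the text
    fourth_threshold: int = 2,      # A2: assumed value
    fifth_threshold: float = 0.2,   # A3: assumed value
    similarity: Callable[[str, str], float] = jaccard,
) -> bool:
    # A1: too many images in the cluster.
    if len(images) > third_threshold:
        return True
    # A2: too few distinct source websites.
    if len({img.source_site for img in images}) < fourth_threshold:
        return True
    # A3: every pair of description texts is below the similarity threshold.
    pairs = list(combinations(images, 2))
    if pairs and all(
        similarity(a.description, b.description) < fifth_threshold for a, b in pairs
    ):
        return True
    return False


def remove_spam_clusters(
    clusters: Dict[str, List[Image]], **thresholds
) -> Dict[str, List[Image]]:
    """Return the clustering result with spam clusters removed."""
    return {
        cid: imgs
        for cid, imgs in clusters.items()
        if not is_spam_cluster(imgs, **thresholds)
    }
```

Checking A1 first avoids the quadratic pairwise comparison in A3 for very large clusters. Note that with the assumed fourth threshold of 2, a cluster whose images all come from a single website is removed.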
From the above, it can be seen that removing spam clusters from the obtained information clustering result of time-sensitive events prevents those clusters from distorting the judgment of image search requests. Whether an image search request is a time-sensitive event search request can therefore be judged more accurately, which in turn yields a more accurate search result queue.
In one possible implementation, the degree of matching between the search results and the image clusters can be determined from the number of images of each image cluster that appear in the search results corresponding to the image search request, and this in turn determines whether the request is a time-sensitive event search request. Refer to FIG. 3, which is a flowchart of a method for judging whether a search request is a time-sensitive event search request, provided by an embodiment of the present application. As shown in FIG. 3, the method may include S301-S303:
S301: Determine the first number for each image cluster, that is, the number of images in the cluster that appear in the search results corresponding to the image search request.
The search results corresponding to the image search request are obtained, and for each image cluster the first number is determined. The first number is the number of images in the cluster that appear in the search results. It shows how strongly each image cluster is represented in the search results and is used to judge the search request.
Because the search results for an image search request may contain a large number of images, the search results can first be given a preliminary sort, and only the top-ranked images are then used to determine the first number for each image cluster. The embodiments of the present application do not limit the method of the preliminary sort; it may be based on, for example, the relevance of each image to the search request, the timeliness of the image, or the quality of the image.
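A minimal Python sketch of this truncation follows. The function `top_k_for_matching`, the dictionary layout of a result, and the `relevance` field are assumptions for illustration; the source leaves the preliminary sort criterion open.

```python
from typing import Callable, List, TypeVar

T = TypeVar("T")


def top_k_for_matching(results: List[T], score: Callable[[T], float], k: int) -> List[T]:
    """Preliminarily sort results by descending score and keep the first k."""
    return sorted(results, key=score, reverse=True)[:k]


# Hypothetical results carrying a precomputed relevance score.
results = [
    {"id": "img1", "relevance": 0.4},
    {"id": "img2", "relevance": 0.9},
    {"id": "img3", "relevance": 0.7},
]
head = top_k_for_matching(results, score=lambda r: r["relevance"], k=2)
```

Only the images in `head` would then be compared against the image clusters, which bounds the cost of computing the first number.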
S302: Determine the image clusters whose first number is greater than a first threshold to be first target image clusters.
After the first number has been obtained for each image cluster, the image clusters whose first number is greater than the first threshold are determined to be first target image clusters. The first threshold may be a threshold on the number of images in the cluster that appear in the search results, or a threshold on the ratio of that number to the total number of images in the cluster. When a cluster exceeds the first threshold, a large number or a large proportion of its images appear in the search results, and the cluster can be determined to be a first target image cluster.
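The two readings of the first threshold, absolute count and ratio, can select different clusters. The Python sketch below illustrates this; the function name `first_target_clusters`, the `use_ratio` flag, and the sample data are hypothetical.

```python
from typing import Dict, Set


def first_target_clusters(
    clusters: Dict[str, Set[str]],
    result_ids: Set[str],
    first_threshold: float,
    use_ratio: bool = False,
) -> Set[str]:
    """Select clusters whose first number (or ratio) exceeds the first threshold."""
    selected = set()
    for cluster_id, image_ids in clusters.items():
        if not image_ids:
            continue  # defensive: the text guarantees at least one image per cluster
        first_number = len(image_ids & result_ids)
        value = first_number / len(image_ids) if use_ratio else first_number
        if value > first_threshold:
            selected.add(cluster_id)
    return selected


clusters = {
    "big": set("abcdefgh"),  # 8 images, 3 of them in the results
    "small": {"x", "y"},     # 2 images, both in the results
}
result_ids = {"a", "b", "c", "x", "y"}

by_count = first_target_clusters(clusters, result_ids, first_threshold=2)
by_ratio = first_target_clusters(clusters, result_ids, first_threshold=0.5, use_ratio=True)
```

Here the count reading selects only `big` (3 > 2), while the ratio reading selects only `small` (2/2 > 0.5, whereas 3/8 is not), so the choice of reading matters when cluster sizes differ widely.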
The first target image clusters are used to further determine whether the search request is a time-sensitive event search request. Because an image cluster contains copies of the same image, a cluster with a large number or proportion of its images in the search results can be considered highly relevant to the search results and is therefore used in the determination. Conversely, if an image cluster is not a first target image cluster, only a small number or proportion of its images appear in the search results and its relevance to the search results is low, so it can be disregarded when determining whether the request is a time-sensitive event search request.
In the embodiment of the present application, the first number is compared with the first threshold, and image clusters whose first number is greater than the first threshold are determined to be first target image clusters. The image search request can then be judged according to the number of first target image clusters. This excludes image clusters with only a small first number of images in the search results, prevents them from affecting the judgment, and improves the accuracy with which the search request is judged.
S303: If the number of first target image clusters to which the search results corresponding to the image search request belong is greater than a second threshold, judge the image search request to be a time-sensitive event search request.
When the search results belong to a large number of first target image clusters, the search results contain images corresponding to multiple kinds of time-sensitive event information, so the search request can be determined to be a time-sensitive event search request. The second threshold is a threshold on the number of first target image clusters.
From S301 to S303 above, it can be seen that the embodiment of the present application first determines the first target image clusters from the images in the search results and the images in the image clusters, and then uses the number of first target image clusters to determine whether the search request is a time-sensitive event search request. The first target image clusters are the image clusters that are highly relevant to the search results. When there are many of them, the search results can be regarded as search results for a time-sensitive event, and the corresponding request is a time-sensitive event search request. Determining the first target image clusters removes the influence of image clusters with low relevance to the search results and further improves the accuracy with which image search requests are judged.
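Steps S301 to S303 can be put together in a compact Python sketch. It assumes the count reading of the first threshold and a hypothetical representation in which each cluster is a set of image ids; the function name `is_time_sensitive_request` and the threshold values in the usage note are illustrative only.

```python
from typing import Dict, List, Set


def is_time_sensitive_request(
    result_ids: List[str],
    clusters: Dict[str, Set[str]],
    first_threshold: int,
    second_threshold: int,
) -> bool:
    """Judge a request from how many clusters are well represented in its results."""
    results = set(result_ids)
    # S301: first number = images of each cluster that appear in the results.
    first_numbers = {cid: len(images & results) for cid, images in clusters.items()}
    # S302: clusters whose first number exceeds the first threshold.
    first_targets = [cid for cid, n in first_numbers.items() if n > first_threshold]
    # S303: compare the number of first target clusters with the second threshold.
    return len(first_targets) > second_threshold
```

For example, with clusters `{"c1": {"a", "b", "c"}, "c2": {"d", "e"}, "c3": {"f"}}`, results `["a", "b", "d", "e", "z"]`, and both thresholds set to 1, clusters `c1` and `c2` each contribute two images, giving two first target clusters, so the request is judged time-sensitive.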
Further, an embodiment of the present application provides a method for sorting the search results using the information clustering result and generating a search result queue when the search request is an image search request. Refer to FIG. 4, which is a flowchart of a method for generating a search result queue provided by an embodiment of the present application. As shown in FIG. 4, the method may include S401-S404:
S401: When the image search request is a time-sensitive event search request, obtain the cluster features of the second target image clusters to which the search results for the image search request belong; a second target image cluster is an image cluster corresponding to an image in the search results for the image search request.
After the image search request is determined to be a time-sensitive event search request, the search results need to be sorted. The search results may include results that are related to the time-sensitive event search request but are themselves less timely, so the more timely search results need to be ranked higher to make browsing and selection easier for the user.
Before the search results are sorted, the cluster features of the second target image clusters of the search results for the image search request can be obtained. A second target image cluster is an image cluster corresponding to an image in the search results for the image search request; that is, if an image in an image cluster appears in the search results, that image cluster is determined to be a second target image cluster.
The cluster features of a second target image cluster may be features that characterize information about the images in that cluster. They may be determined from all or some of the images in the cluster, which avoids the inaccuracy that would result from deriving the cluster features from a single image only.
In a possible implementation, the cluster features of a second target image cluster may include the second number of images in the second target image cluster, the corresponding number of source websites, and the corresponding text features.
Correspondingly, obtaining the cluster features of the second target image clusters to which the search results for the image search request belong may include B1-B2:
B1: Obtain the second number of images in the second target image cluster to which the search results for the image search request belong, and the number of source websites corresponding to the images in the second target image cluster.
The second number of images in a second target image cluster may be the number of images the cluster contains. The number of source websites corresponding to the images in a second target image cluster is the number of distinct websites from which those images originate.
B2: Obtain the text features corresponding to the images in the second target image cluster.
The text features corresponding to the images in a second target image cluster are the text features of the description text associated with those images. Because the text associated with an image reflects the features of that image, the text features corresponding to the images in the second target image cluster can serve as one of the cluster features of that cluster. Embodiments of the present application also provide implementations for obtaining these text features, described later in this specification.
S402: Sort the second target image clusters according to their cluster features to generate a sorting result.
The second target image clusters are sorted according to the obtained cluster features, yielding a sorting result for the second target image clusters. Different second target image clusters differ in quality, so the clusters can be sorted first and the resulting order can then be used to sort the search results.
In a possible implementation, when the cluster features of a second target image cluster include the second number of images in the cluster, the corresponding number of source websites, and the corresponding text features, sorting the second target image clusters according to their cluster features to generate a sorting result may include:
Sorting the second target image clusters according to the degree of match between the query terms included in the search request and the text features corresponding to the images in each second target image cluster, the second number of images in each cluster, and the number of source websites corresponding to the images in each cluster, to generate a sorting result. Note that the query terms may come from different sources depending on the query method. In a possible implementation, if the search request entered by the user is a character string, the query terms may be the string itself or the word segments obtained from it. If the user submits an image, the query terms may be the semantic words obtained by recognizing the image.
The search request contains query terms, and the search results are obtained from those query terms, so the query terms can represent the features of the search results. The degree of match between the query terms in the search request and the text features corresponding to the images in a second target image cluster can therefore be used as a basis for sorting. The embodiments of the present application do not limit how this degree of match is determined: the feature vectors of the query terms can be obtained and compared with the text features of the images, or the query terms themselves can be compared directly with the text features of the images.
When the text features of the images in a second target image cluster match the query terms well, the second number of images is large, and the images come from many source websites, the cluster is closely related to the search request, is rich in images, and its images are more credible. Therefore, when the second target image clusters are sorted, clusters with a higher degree of match, a larger second number of images, and more source websites can be placed higher in the sorting result, so that the search result queue derived from that sorting result is more accurate.
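The cluster ordering described for S402 can be sketched as below. The source names the three sorting criteria but does not say how they are combined; the lexicographic priority used here (match degree, then image count, then source website count) is an assumption, as are the `ClusterFeatures` type and its field names.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ClusterFeatures:
    cluster_id: str
    match_score: float       # query vs. text-feature match, higher is better (assumed scale)
    image_count: int         # the "second number" of images in the cluster
    source_site_count: int   # number of source websites for the cluster's images


def rank_clusters(clusters: List[ClusterFeatures]) -> List[ClusterFeatures]:
    """Order clusters so that better-matching, larger, more widely sourced ones come first."""
    return sorted(
        clusters,
        key=lambda c: (c.match_score, c.image_count, c.source_site_count),
        reverse=True,
    )
```

A weighted sum of the three criteria would be an equally plausible combination; the tuple key is used only because it needs no tuning parameters.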
S403: Select one image from each second target image cluster as a first target search result.
Each second target image cluster contains at least one image, and one image can be selected from each cluster as a first target search result for generating the search result queue. When selecting, the image with the best image quality in the cluster can be chosen as the corresponding first target search result. Because exactly one image is selected from each second target image cluster, each cluster corresponds to one first target search result. This avoids duplicate search results, makes the results in the search result queue more useful, and makes the queue more accurate.
S404: According to the sorting result, sort the first target search results and the other search results that do not belong to any second target image cluster, and generate a search result queue in which the first target search results are ranked higher than the other search results.
The first target search results are selected from the second target image clusters, so they are more timely and more closely related to the time-sensitive event. When the search results are sorted, the first target search results can be ranked higher to make browsing and selection easier for the user. The search results for the search request also include results that do not correspond to any image cluster; these may be less timely or less relevant to the search request, and they are ranked lower when the search results are sorted.
In some cases, the information clustering result may contain image clusters formed from historical images. Such a cluster is obtained by clustering image information generated within the preset time period, but its images are identical to historical images, so it can be determined to be a historical image cluster. When the search results are sorted, the other search results that do not belong to a second target image cluster can be placed after the first target search results. For example, a first target search result that belongs to a historical image cluster can be placed at a lower position, so that historical images do not interfere with the ordering of the search results.
The search results are sorted according to the sorting result of the second target image clusters, and the search result queue is generated. The search results consist of the first target search results and the other search results; the first target search results are placed before the others, so that the user can quickly reach the results most closely related to the search request.
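Steps S403 and S404 can be sketched together as follows. This is an illustrative sketch under stated assumptions: the `quality` mapping is a hypothetical per-image quality score (the source only says the best-quality image may be chosen), and leaving the remaining results in their original order is also an assumption.

```python
from typing import Dict, List


def build_result_queue(
    ranked_cluster_ids: List[str],
    cluster_images: Dict[str, List[str]],
    quality: Dict[str, float],
    search_results: List[str],
) -> List[str]:
    """Return one representative image per ranked cluster, followed by unclustered results."""
    first_targets: List[str] = []
    clustered = set()
    for cluster_id in ranked_cluster_ids:
        images = cluster_images[cluster_id]
        clustered.update(images)
        # S403: keep only the best-quality image of each cluster to avoid duplicates.
        first_targets.append(max(images, key=lambda image: quality.get(image, 0.0)))
    # S404: results outside every second target cluster go after the representatives.
    others = [result for result in search_results if result not in clustered]
    return first_targets + others
```

The clusters are assumed to arrive already ordered by the S402 sorting result, so the representatives inherit that order.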
From S401-S404 above, in this embodiment of the present application, when the search request is an image search request and is a time-sensitive event search request, the second target image clusters to which the search results belong are sorted by their cluster features, an image is then selected from each second target image cluster as a first target search result, and finally the search results are sorted according to the sorting result and the first target search results. Sorting the second target image clusters and selecting images from them yields first target search results that are more timely and more closely related to the time-sensitive event. Placing these first target search results higher in the search result queue produces a more accurate queue, so that the user can quickly obtain the results most relevant to the time-sensitive event search request.
Building on the above description of obtaining the text features corresponding to the images in a second target image cluster, embodiments of the present application further provide two methods for obtaining those text features, C1 and C2.
C1: Obtain the term frequency of each word segment in the description text corresponding to the images in the second target image cluster, and take at least one word segment with the highest term frequency as the text feature corresponding to the images in the second target image cluster.
The description text corresponding to the images in a second target image cluster may be a sentence made up of several words, whereas a search request usually contains query terms. The description text is therefore segmented into words, and word segments are selected as the text features of the images, so that the selected word segments can later be matched against the query terms to determine the degree of match between the text features and the query terms.
Note that the description text used here may be the description text of all images in the second target image cluster or of most of them. Deriving the text features from the description text of a large proportion of the images helps ensure their accuracy. The term frequency of each word segment in the description text is obtained, and the word segment with the highest term frequency can represent the text feature of the images. Because the text feature of some images consists of several word segments, at least one word segment with the highest term frequency can be taken as the text feature corresponding to the images in the second target image cluster, which keeps the text feature complete.
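Method C1 can be sketched as a term-frequency count over the description texts of a cluster. This is a minimal sketch under assumptions: whitespace splitting stands in for a real word segmenter (Chinese text would need a dedicated one), and the function name and the parameter `k` are introduced here for illustration.

```python
from collections import Counter
from typing import List


def top_terms(descriptions: List[str], k: int = 2) -> List[str]:
    """Return the k most frequent word segments across a cluster's description texts."""
    counts: Counter = Counter()
    for text in descriptions:
        # Placeholder segmentation: lowercase and split on whitespace.
        counts.update(text.lower().split())
    return [term for term, _ in counts.most_common(k)]
```

For example, `top_terms(["flood rescue city", "flood rescue team", "flood news"])` returns `["flood", "rescue"]`, which would then be matched against the query terms.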
The above method selects word segments as text features based on term frequency alone. It considers only how often a word segment occurs and ignores the effect of semantics on the text features. To address this, embodiments of the present application provide another method for obtaining text features.
C2: Obtain the feature vector of each word segment in the description text corresponding to the images in the second target image cluster, and take at least one feature vector that occurs most often among these feature vectors as the text feature corresponding to the images in the second target image cluster.
After the description text corresponding to the images in the second target image cluster has been segmented into words, the feature vector of each word segment is obtained, and at least one feature vector that occurs most often among these feature vectors is taken as the text feature. Because the text feature is then chosen with regard to both semantics and frequency, the resulting feature vector reflects the features of the images better.
Further, the degree of match obtained later between the query terms in the search request and the text features corresponding to the images in the second target image cluster is, in this case, the degree of match between the feature vectors of the query terms and the text features corresponding to the images.
From the above, embodiments of the present application take as the text feature either the word segments with the highest term frequency in the description text of the images in the second target image cluster or the feature vectors that occur most often. This yields more accurate text features for the images and, in turn, a more accurate degree of match between the query terms and those text features. That degree of match is used to sort the search results, so that a more accurate search result queue is finally obtained.
In addition to image search requests, embodiments of the present application can also receive a web page search request sent by a client and obtain the search results for that request. The information clustering result of time-sensitive events is obtained; it is produced by clustering information generated on the network within a preset time period, and it includes at least one text keyword. Whether the web page search request is a time-sensitive event search request is then judged from the degree of match between the query terms in the search request and the text keywords.
When the search request is a web page search, the search results obtained contain the keywords of the search request. The query terms in the search request can be matched against the text keywords in the information clustering result, and whether the search request is a time-sensitive event search request is judged from the degree of match. The embodiments of the present application do not limit how this degree of match is determined. In a possible implementation, the feature vectors of the query terms and of the text keywords are obtained, and the degree of match is determined by computing the similarity between the two sets of feature vectors.
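One way to compute the similarity between a query-term vector and a text-keyword vector is cosine similarity. The source does not name a particular similarity measure, so cosine similarity, the function names, and the `threshold` value below are all assumptions made for illustration.

```python
import math
from typing import List, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def query_matches_keywords(
    query_vec: Sequence[float],
    keyword_vecs: List[Sequence[float]],
    threshold: float = 0.8,  # hypothetical cut-off, not taken from the source
) -> bool:
    """Treat the request as time-sensitive if any text keyword is similar enough."""
    return any(cosine_similarity(query_vec, kv) >= threshold for kv in keyword_vecs)
```

How the feature vectors themselves are produced (for instance, by a word-embedding model) is outside the scope of this sketch.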
Further, an embodiment of the present application also provides a method that, when the web page search request is a time-sensitive event search request, sorts the search results using the information clustering result and generates a search result queue. Referring to FIG. 5, which is a flowchart of a method for generating a search result queue provided by an embodiment of the present application, the method may include S501-S502:
D1: When the web page search request is a time-sensitive event search request, obtain, from the search results for the web page search request, the search results that include a text keyword as second target search results.
The search results for the web page search request are web page search results related to the query terms in the search request.
The text keywords belong to the information clustering result and represent information about time-sensitive events. By taking the search results that contain a text keyword as second target search results, the more timely search results can be identified through the text keywords. In the subsequent sorting, the second target search results can then be placed higher in the search result queue, so that the user obtains the corresponding search results quickly.
D2: According to the degree of match between the query terms included in the web page search request and the second target search results, sort the second target search results and the other search results, and generate a search result queue in which the second target search results are ranked higher than the other search results.
The second target search results are the more timely search results that contain a text keyword. The second target search results and the other search results are sorted according to the degree of match between the second target search results and the query terms, so that second target search results that match the query terms better are placed higher. Search results that are both more timely and more closely related to the search request thus occupy the higher positions in the search result queue, which makes the ordering of the queue more accurate.
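Steps D1 and D2 can be sketched as a two-level sort. This is only an illustration: representing each result as a plain string, detecting keywords by substring test, passing the match degree as a caller-supplied `match` function, and sorting the remaining results by the same score are all assumptions introduced here.

```python
from typing import Callable, List, Sequence


def order_web_results(
    results: Sequence[str],
    text_keywords: Sequence[str],
    match: Callable[[str], float],
) -> List[str]:
    """Place results containing a text keyword first, best query match first within each group."""

    def is_second_target(result: str) -> bool:
        # D1: a result is a second target result if it contains any text keyword.
        return any(keyword in result for keyword in text_keywords)

    # D2: sort by (is second target, match degree), both descending.
    return sorted(
        results,
        key=lambda result: (is_second_target(result), match(result)),
        reverse=True,
    )
```

Because the group flag is the first element of the sort key, a second target search result always ranks above every other result regardless of its match score.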
In this embodiment of the present application, when the search request is a web page search request and is a time-sensitive event search request, the second target search results are first determined through the text keywords, and the second target search results and the other search results are then sorted according to the degree of match between the query terms in the search request and the second target search results. Search results that match the query terms well and are more timely are thereby ranked higher, giving a more accurately ordered search result queue from which the user can quickly obtain the corresponding search results.
Based on the search result ordering method provided by the above method embodiments, an embodiment of the present application further provides a search result ordering apparatus, which is explained below with reference to the accompanying drawings.
Referring to FIG. 5, which is a schematic structural diagram of a search result ordering apparatus provided by an embodiment of the present application, the apparatus includes:
a search result acquisition unit 501, configured to receive an image search request sent by a client and obtain the search results corresponding to the image search request;
an information clustering result acquisition unit 502, configured to obtain an information clustering result of time-sensitive events, where the information clustering result is obtained by clustering information generated on the network within a preset time period, the information clustering result includes at least one image cluster, and each image cluster includes at least one image;
a judging unit 503, configured to judge, according to the degree of match between the search results corresponding to the image search request and the image clusters, whether the image search request is a time-sensitive event search request;
a search result queue generating unit 504, configured to, when the image search request is a time-sensitive event search request, sort the search results using the information clustering result and generate a search result queue.
In a possible implementation, the judging unit includes:
a first number determination module, configured to determine, for each image cluster, a first number of images in that cluster that appear in the search results corresponding to the image search request;
an image cluster determination module, configured to determine each image cluster whose first number is greater than a first threshold to be a first target image cluster;
a first judging module, configured to judge that the image search request is a time-sensitive event search request if the number of first target image clusters to which the search results corresponding to the image search request belong is greater than a second threshold.
In a possible implementation, the search result queue generating unit 504 includes:
a cluster feature acquisition subunit, configured to, when the image search request is a time-sensitive event search request, obtain the cluster features of the second target image clusters to which the search results corresponding to the image search request belong, where a second target image cluster is an image cluster corresponding to an image in the search results corresponding to the image search request;
a sorting result generating subunit, configured to sort the second target image clusters according to their cluster features and generate a sorting result;
a first selection subunit, configured to select one image from each second target image cluster as a first target search result;
a first search result queue generating subunit, configured to sort, according to the sorting result, the first target search results and the other search results that do not belong to any second target image cluster, and generate a search result queue in which the first target search results are ranked higher than the other search results.
In a possible implementation, the cluster feature acquisition subunit includes:
a source website number acquisition module, configured to obtain the second number of images in the second target image cluster to which the search results corresponding to the image search request belong, and the number of source websites corresponding to the images in the second target image cluster;
a text feature acquisition module, configured to obtain the text features corresponding to the images in the second target image cluster.
The sorting result generating subunit is specifically configured to:
sort the second target image clusters according to the degree of match between the query terms included in the search request and the text features corresponding to the images in each second target image cluster, the second number of images in each second target image cluster, and the number of source websites corresponding to the images in each second target image cluster, and generate a sorting result.
In a possible implementation, the text feature acquisition module is specifically configured to:
obtain the term frequency of each word segment in the description text corresponding to the images in the second target image cluster, and take at least one word segment with the highest term frequency as the text feature corresponding to the images in the second target image cluster;
or obtain the feature vector of each word segment in the description text corresponding to the images in the second target image cluster, and take at least one feature vector that occurs most often among these feature vectors as the text feature corresponding to the images in the second target image cluster.
In a possible implementation, the apparatus further includes:
a spam information cluster determination unit, configured to determine spam information clusters in the information clustering result of the time-sensitive event;
a spam information cluster removal unit, configured to remove the spam information clusters from the information clustering result of the time-sensitive event;
The spam information cluster determination unit is specifically configured to:
determine a third target picture cluster to be a spam information cluster if the third number of pictures in the third target picture cluster is greater than a third threshold, or if the number of source websites corresponding to the third target picture cluster is less than a fourth threshold, where the third target picture cluster is a picture cluster included in the information clustering result of the time-sensitive event;
or, determine the third target picture cluster to be a spam information cluster if the similarities between the description texts corresponding to the pictures in the third target picture cluster are all less than a fifth threshold, where the third target picture cluster is a picture cluster included in the information clustering result of the time-sensitive event.
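The count-based spam rule above can be sketched as follows. The class `PictureCluster`, its field names, and the threshold values in the example are hypothetical; the source names the thresholds but gives no values for them.

```python
from dataclasses import dataclass


@dataclass
class PictureCluster:
    picture_count: int       # "third number" of pictures in the cluster
    source_site_count: int   # number of distinct source websites


def is_spam_by_counts(cluster, third_threshold, fourth_threshold):
    """Spam if the cluster is too large OR comes from too few websites."""
    return (cluster.picture_count > third_threshold
            or cluster.source_site_count < fourth_threshold)


def remove_spam(clusters, third_threshold, fourth_threshold):
    """Drop every cluster flagged by the count-based rule."""
    return [c for c in clusters
            if not is_spam_by_counts(c, third_threshold, fourth_threshold)]


clusters = [
    PictureCluster(picture_count=5000, source_site_count=40),  # too many pictures
    PictureCluster(picture_count=30, source_site_count=1),     # too few websites
    PictureCluster(picture_count=30, source_site_count=12),    # kept
]
kept = remove_spam(clusters, third_threshold=1000, fourth_threshold=3)
```

The two conditions are joined with "or", so either one alone is enough to flag a cluster.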
FIG. 6 shows a block diagram of an apparatus 1200 for ranking search results. For example, the apparatus 1200 may be a mobile phone, a computer, a digital broadcast terminal, a messaging device, a game console, a tablet device, a medical device, a fitness device, a personal digital assistant, or the like.
Referring to FIG. 6, the apparatus 1200 may include one or more of the following components: a processing component 1202, a memory 1204, a power component 1206, a multimedia component 1208, an audio component 1210, an input/output (I/O) interface 1212, a sensor component 1214, and a communication component 1216.
The processing component 1202 generally controls the overall operation of the apparatus 1200, such as operations associated with display, phone calls, data communications, camera operations, and recording operations. The processing component 1202 may include one or more processors 1220 that execute instructions to perform all or some of the steps of the method described above. In addition, the processing component 1202 may include one or more modules that facilitate interaction between the processing component 1202 and other components. For example, the processing component 1202 may include a multimedia module to facilitate interaction between the multimedia component 1208 and the processing component 1202.
The memory 1204 is configured to store various types of data to support operation of the apparatus 1200. Examples of such data include instructions for any application or method operating on the apparatus 1200, contact data, phone book data, messages, pictures, videos, and so on. The memory 1204 may be implemented by any type of volatile or non-volatile storage device or a combination thereof, such as static random access memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic memory, flash memory, a magnetic disk, or an optical disk.
The power component 1206 supplies power to the various components of the apparatus 1200. The power component 1206 may include a power management system, one or more power supplies, and other components associated with generating, managing, and distributing power for the apparatus 1200.
The multimedia component 1208 includes a screen that provides an output interface between the apparatus 1200 and the user. In some embodiments, the screen may include a liquid crystal display (LCD) and a touch panel (TP). If the screen includes a touch panel, the screen may be implemented as a touch screen to receive input signals from the user. The touch panel includes one or more touch sensors to sense touches, swipes, and gestures on the touch panel. The touch sensors may sense not only the boundary of a touch or swipe action but also the duration and pressure associated with the touch or swipe operation. In some embodiments, the multimedia component 1208 includes a front camera and/or a rear camera. When the apparatus 1200 is in an operating mode, such as a shooting mode or a video mode, the front camera and/or the rear camera may receive external multimedia data. Each front camera and rear camera may be a fixed optical lens system or may have focal length and optical zoom capability.
The audio component 1210 is configured to output and/or input audio signals. For example, the audio component 1210 includes a microphone (MIC) that is configured to receive external audio signals when the apparatus 1200 is in an operating mode, such as a call mode, a recording mode, or a speech recognition mode. The received audio signal may be further stored in the memory 1204 or sent via the communication component 1216. In some embodiments, the audio component 1210 also includes a speaker for outputting audio signals.
The I/O interface 1212 provides an interface between the processing component 1202 and peripheral interface modules, which may be a keyboard, a click wheel, buttons, and the like. The buttons may include, but are not limited to, a home button, a volume button, a start button, and a lock button.
The sensor component 1214 includes one or more sensors that provide status assessments of various aspects of the apparatus 1200. For example, the sensor component 1214 may detect the open/closed state of the apparatus 1200 and the relative positioning of components, such as the display and keypad of the apparatus 1200. The sensor component 1214 may also detect a change in position of the apparatus 1200 or of one of its components, the presence or absence of user contact with the apparatus 1200, the orientation or acceleration/deceleration of the apparatus 1200, and a change in the temperature of the apparatus 1200. The sensor component 1214 may include a proximity sensor configured to detect the presence of nearby objects without any physical contact. The sensor component 1214 may also include a light sensor, such as a CMOS or CCD image sensor, for use in imaging applications. In some embodiments, the sensor component 1214 may also include an acceleration sensor, a gyroscope sensor, a magnetic sensor, a pressure sensor, or a temperature sensor.
The communication component 1216 is configured to facilitate wired or wireless communication between the apparatus 1200 and other devices. The apparatus 1200 may access a wireless network based on a communication standard, such as WiFi, 2G, or 3G, or a combination thereof. In an exemplary embodiment, the communication component 1216 receives a broadcast signal or broadcast-related information from an external broadcast management system via a broadcast channel. In an exemplary embodiment, the communication component 1216 also includes a near field communication (NFC) module to facilitate short-range communication. For example, the NFC module may be implemented based on radio frequency identification (RFID) technology, Infrared Data Association (IrDA) technology, ultra-wideband (UWB) technology, Bluetooth (BT) technology, and other technologies.
In an exemplary embodiment, the apparatus 1200 may be implemented by one or more application-specific integrated circuits (ASICs), digital signal processors (DSPs), digital signal processing devices (DSPDs), programmable logic devices (PLDs), field-programmable gate arrays (FPGAs), controllers, microcontrollers, microprocessors, or other electronic components, to perform the following method:
receiving a picture search request sent by a client, and obtaining the search results corresponding to the picture search request;
obtaining an information clustering result of a time-sensitive event, where the information clustering result of the time-sensitive event is obtained by clustering the information generated on the network within a preset time period, the information clustering result includes at least one picture cluster, and each picture cluster includes at least one picture;
determining whether the picture search request is a time-sensitive event search request according to the degree of matching between the search results corresponding to the picture search request and the picture clusters;
when the picture search request is a time-sensitive event search request, sorting the search results using the information clustering result to generate a search result queue.
Optionally, determining whether the picture search request is a time-sensitive event search request according to the degree of matching between the search results corresponding to the picture search request and the picture clusters includes:
determining, for each picture cluster, a first number of pictures in that cluster that appear in the search results corresponding to the picture search request;
determining the picture clusters whose first number is greater than a first threshold to be first target picture clusters;
if the number of first target picture clusters to which the search results corresponding to the picture search request belong is greater than a second threshold, determining that the picture search request is a time-sensitive event search request.
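The three steps above amount to two nested threshold tests, which can be sketched as follows. The function name, the dictionary layout of `clusters` (cluster id mapped to picture ids), and the threshold values are assumptions made for illustration only.

```python
def is_time_sensitive_request(search_results, clusters,
                              first_threshold, second_threshold):
    """Decide whether a picture search request targets a time-sensitive event.

    search_results: iterable of picture ids returned for the request.
    clusters: dict mapping cluster id -> iterable of picture ids.
    Returns (decision, first_target_cluster_ids).
    """
    result_ids = set(search_results)
    first_target = [
        cluster_id
        for cluster_id, pictures in clusters.items()
        # "first number": cluster pictures that appear in the search results
        if len(result_ids & set(pictures)) > first_threshold
    ]
    return len(first_target) > second_threshold, first_target


clusters = {"c1": ["a", "b", "c"], "c2": ["d", "e"], "c3": ["f"]}
results = ["a", "b", "d", "e", "x"]
decision, targets = is_time_sensitive_request(
    results, clusters, first_threshold=1, second_threshold=1)
print(decision, targets)  # True ['c1', 'c2']
```

Here clusters `c1` and `c2` each contribute two pictures to the results, so both exceed the first threshold, and two first target clusters exceed the second threshold.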
Optionally, when the picture search request is a time-sensitive event search request, sorting the search results using the information clustering result to generate a search result queue includes:
when the picture search request is a time-sensitive event search request, obtaining the cluster features of the second target picture clusters to which the search results corresponding to the picture search request belong, where the second target picture clusters are the picture clusters corresponding to the individual pictures in the search results corresponding to the picture search request;
sorting the second target picture clusters according to the cluster features of the second target picture clusters to generate a sorting result;
selecting one picture from each second target picture cluster as a first target search result;
sorting, according to the sorting result, the first target search results and the other search results that do not belong to any second target picture cluster, to generate a search result queue in which the first target search results are ranked higher than the other search results.
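The queue construction above can be sketched as follows, assuming the clusters have already been ranked. The function and parameter names are hypothetical, and the rule for picking each cluster's representative (the first member in the original result order) is an assumption; the text only requires that one picture per cluster be selected.

```python
def build_result_queue(search_results, picture_to_cluster, ranked_cluster_ids):
    """Build the final queue: one picture per ranked cluster, then the rest.

    search_results: picture ids in their original ranking order.
    picture_to_cluster: dict mapping picture id -> cluster id, for pictures
        that belong to a second target picture cluster.
    ranked_cluster_ids: cluster ids in sorted order, best first.
    """
    representative = {}
    for picture in search_results:
        cluster_id = picture_to_cluster.get(picture)
        # Assumption: keep the first picture seen for each cluster.
        if cluster_id is not None and cluster_id not in representative:
            representative[cluster_id] = picture
    head = [representative[c] for c in ranked_cluster_ids if c in representative]
    # Results outside every cluster keep their original relative order.
    tail = [p for p in search_results if p not in picture_to_cluster]
    return head + tail


results = ["x", "a", "d", "b", "y"]
picture_to_cluster = {"a": "c1", "b": "c1", "d": "c2"}
queue = build_result_queue(results, picture_to_cluster, ["c2", "c1"])
print(queue)  # ['d', 'a', 'x', 'y']
```

In this sketch picture `b` is left out because `a` already represents cluster `c1`, which also deduplicates near-identical pictures of the same event.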
Optionally, obtaining the cluster features of the second target picture clusters to which the search results corresponding to the picture search request belong includes:
obtaining a second number of pictures in the second target picture cluster to which the search results corresponding to the picture search request belong, and the number of source websites corresponding to the pictures in the second target picture cluster;
obtaining the text features corresponding to the pictures in the second target picture cluster;
Sorting the second target picture clusters according to the cluster features of the second target picture clusters to generate a sorting result includes:
sorting the second target picture clusters according to the degree of matching between the query terms included in the search request and the text features corresponding to the pictures in the second target picture cluster, the second number of pictures in the second target picture cluster, and the number of source websites corresponding to the pictures in the second target picture cluster, to generate a sorting result.
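The sorting step above names three signals but does not state how they are combined. The sketch below assumes a lexicographic ordering (matching degree first, then picture count, then source website count) and measures matching degree as the fraction of query terms found among the cluster's text features; both choices, along with all names, are illustrative assumptions.

```python
def match_degree(query_terms, text_features):
    """Fraction of query terms that appear among the cluster's text features."""
    query = set(query_terms)
    if not query:
        return 0.0
    return len(query & set(text_features)) / len(query)


def rank_clusters(query_terms, clusters):
    """Sort clusters best-first by (match degree, picture count, site count)."""
    def key(cluster):
        return (
            match_degree(query_terms, cluster["text_features"]),
            cluster["picture_count"],
            cluster["source_site_count"],
        )
    return sorted(clusters, key=key, reverse=True)


clusters = [
    {"id": "c1", "text_features": ["flood"],
     "picture_count": 10, "source_site_count": 3},
    {"id": "c2", "text_features": ["flood", "rescue"],
     "picture_count": 4, "source_site_count": 2},
    {"id": "c3", "text_features": ["flood"],
     "picture_count": 10, "source_site_count": 7},
]
ranked_ids = [c["id"] for c in rank_clusters(["flood", "rescue"], clusters)]
print(ranked_ids)  # ['c2', 'c3', 'c1']
```

A weighted sum of the three signals would be an equally valid reading of the text; the tuple key simply makes the tie-breaking order explicit.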
Optionally, obtaining the text features corresponding to the pictures in the second target picture cluster includes:
obtaining the word frequency of each segmented word in the description text corresponding to the pictures in the second target picture cluster, and taking at least one segmented word with the highest word frequency as the text feature corresponding to the pictures in the second target picture cluster;
or, obtaining the feature vector of each segmented word in the description text corresponding to the pictures in the second target picture cluster, and taking at least one feature vector that occurs most often among those feature vectors as the text feature corresponding to the pictures in the second target picture cluster.
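The feature-vector alternative can be sketched in the same spirit. The sketch assumes that each segmented word has already been mapped to a vector by some embedding lookup that is not shown, and that "occurs most often" means exact equality of vectors; real-valued embeddings might need rounding or quantization before counting.

```python
from collections import Counter


def top_feature_vectors(token_vectors, k=1):
    """Return the k vectors that occur most often in `token_vectors`.

    token_vectors: one vector (sequence of numbers) per segmented word.
    Vectors are converted to tuples so they can be counted as dict keys.
    """
    counts = Counter(tuple(vector) for vector in token_vectors)
    return [list(vector) for vector, _ in counts.most_common(k)]


# Hypothetical vectors for the words of one cluster's description texts.
vectors = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]]
print(top_feature_vectors(vectors))  # [[1.0, 0.0]]
```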
Optionally, the method further includes:
determining spam information clusters in the information clustering result of the time-sensitive event, and removing the spam information clusters from the information clustering result of the time-sensitive event;
Determining spam information clusters in the information clustering result of the time-sensitive event includes:
if the third number of pictures in a third target picture cluster is greater than a third threshold, or the number of source websites corresponding to the third target picture cluster is less than a fourth threshold, determining the third target picture cluster to be a spam information cluster, where the third target picture cluster is a picture cluster included in the information clustering result of the time-sensitive event;
or, if the similarities between the description texts corresponding to the pictures in the third target picture cluster are all less than a fifth threshold, determining the third target picture cluster to be a spam information cluster, where the third target picture cluster is a picture cluster included in the information clustering result of the time-sensitive event.
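The text-similarity rule can be sketched as a check over all pairs of description texts. The source does not name a similarity measure, so Jaccard similarity over whitespace tokens is used here purely as a stand-in; treating a cluster with fewer than two descriptions as not spam is likewise an assumption.

```python
from itertools import combinations


def jaccard(text_a, text_b):
    """Jaccard similarity of the two texts' token sets (stand-in measure)."""
    tokens_a, tokens_b = set(text_a.split()), set(text_b.split())
    union = tokens_a | tokens_b
    return len(tokens_a & tokens_b) / len(union) if union else 0.0


def is_spam_by_text(descriptions, fifth_threshold):
    """Spam if every pair of description texts is less similar than the threshold."""
    pairs = list(combinations(descriptions, 2))
    if not pairs:
        # Assumption: a single description gives no evidence of incoherence.
        return False
    return all(jaccard(a, b) < fifth_threshold for a, b in pairs)


unrelated = ["cheap watches sale", "flood rescue boat", "lottery winner"]
coherent = ["flood rescue boat", "flood rescue team"]
print(is_spam_by_text(unrelated, 0.2), is_spam_by_text(coherent, 0.2))
```

The intuition is that pictures of one real event tend to share description vocabulary, whereas a cluster whose captions have nothing in common was probably grouped by visual coincidence or by spam templates.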
FIG. 7 is a schematic structural diagram of a server in an embodiment of the present invention. The server 1300 may vary considerably depending on its configuration or performance, and may include one or more central processing units (CPUs) 1322 (for example, one or more processors), a memory 1332, and one or more storage media 1330 (for example, one or more mass storage devices) storing application programs 1342 or data 1344. The memory 1332 and the storage medium 1330 may provide transient or persistent storage. The program stored in the storage medium 1330 may include one or more modules (not shown in the figure), and each module may include a series of instruction operations for the server. Furthermore, the central processing unit 1322 may be configured to communicate with the storage medium 1330 and to execute, on the server 1300, the series of instruction operations in the storage medium 1330 that carry out the search result ranking method described above.
The server 1300 may also include one or more power supplies 1326, one or more wired or wireless network interfaces 1350, one or more input/output interfaces 1356, one or more keyboards 1356, and/or one or more operating systems 1341, such as Windows Server™, Mac OS X™, Unix™, Linux™, FreeBSD™, and so on.
In addition, an embodiment of the present application provides a computer-readable medium storing instructions that, when executed by one or more processors, cause an apparatus to perform the search result ranking method described above.
It should be noted that the embodiments in this specification are described progressively: each embodiment focuses on how it differs from the others, and the parts that are the same or similar across embodiments may be cross-referenced. Because the system or apparatus disclosed in the embodiments corresponds to the method disclosed in the embodiments, its description is relatively brief; for the relevant details, refer to the description of the method.
It should be understood that, in the present application, "at least one (item)" means one or more, and "a plurality of" means two or more. "And/or" describes an association between associated objects and indicates that three relationships are possible. For example, "A and/or B" can mean three cases: only A exists, only B exists, or both A and B exist, where A and B may each be singular or plural. The character "/" generally indicates an "or" relationship between the objects before and after it. "At least one of the following (items)" or a similar expression refers to any combination of those items, including any combination of single or plural items. For example, at least one of a, b, or c can mean: a, b, c, "a and b", "a and c", "b and c", or "a and b and c", where a, b, and c may each be single or multiple.
It should also be noted that, in this document, relational terms such as "first" and "second" are used only to distinguish one entity or operation from another, and do not necessarily require or imply any actual relationship or order between those entities or operations. Moreover, the terms "include", "comprise", and any variants thereof are intended to cover non-exclusive inclusion, so that a process, method, article, or device that includes a series of elements includes not only those elements but also other elements not explicitly listed, or elements inherent to that process, method, article, or device. Without further limitation, an element qualified by the phrase "including a ..." does not exclude the presence of additional identical elements in the process, method, article, or device that includes the element.
The steps of the method or algorithm described in connection with the embodiments disclosed herein may be implemented directly in hardware, in a software module executed by a processor, or in a combination of the two. The software module may reside in random access memory (RAM), main memory, read-only memory (ROM), electrically programmable ROM, electrically erasable programmable ROM, registers, a hard disk, a removable disk, a CD-ROM, or any other form of storage medium known in the art.
The above description of the disclosed embodiments enables those skilled in the art to implement or use the present application. Various modifications to these embodiments will be apparent to those skilled in the art, and the general principles defined herein may be implemented in other embodiments without departing from the spirit or scope of the present application. Therefore, the present application is not limited to the embodiments shown herein, but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.
Claims (12)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010818963.8A CN114077712B (en) | 2020-08-14 | 2020-08-14 | Search result ordering method and device |
Publications (2)
Publication Number | Publication Date |
---|---|
CN114077712A CN114077712A (en) | 2022-02-22 |
CN114077712B true CN114077712B (en) | 2024-10-29 |
Family
ID=80279420
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202010818963.8A Active CN114077712B (en) | 2020-08-14 | 2020-08-14 | Search result ordering method and device |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN114077712B (en) |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103020066A (en) * | 2011-09-21 | 2013-04-03 | 北京百度网讯科技有限公司 | Method and device for recognizing search demand |
CN107180093A (en) * | 2017-05-15 | 2017-09-19 | 北京奇艺世纪科技有限公司 | Information search method and device and ageing inquiry word recognition method and device |
Family Cites Families (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103838877B (en) * | 2014-03-26 | 2017-04-12 | 北京奇虎科技有限公司 | Method and device for pushing timeliness information webpage results based on search |
CN105574185A (en) * | 2015-12-22 | 2016-05-11 | 北京奇虎科技有限公司 | Method and device for providing clustering type intelligent summaries |
TWI617930B (en) * | 2016-09-23 | 2018-03-11 | 李雨暹 | Method and system for sorting a search result with space objects, and a computer-readable storage device |
CN108537274B (en) * | 2018-04-08 | 2020-06-19 | 武汉大学 | A fast spatial multi-scale clustering method for enterprise POI location points based on grid |
CN110633330B (en) * | 2018-06-01 | 2022-02-22 | 北京百度网讯科技有限公司 | Event discovery method, device, equipment and storage medium |
Also Published As
Publication number | Publication date |
---|---|
CN114077712A (en) | 2022-02-22 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||