CN111309872B - Search processing method, device and equipment - Google Patents

Publication number
CN111309872B
Authority
CN
China
Prior art keywords
entity
text
searched
candidate
knowledge
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010223795.8A
Other languages
Chinese (zh)
Other versions
CN111309872A (en
Inventor
林泽南
卢佳俊
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Baidu Netcom Science and Technology Co Ltd
Original Assignee
Beijing Baidu Netcom Science and Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Baidu Netcom Science and Technology Co Ltd
Priority to CN202010223795.8A
Publication of CN111309872A
Application granted
Publication of CN111309872B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/3331 Query processing
    • G06F16/334 Query execution
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/3331 Query processing
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/38 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F2216/00 Indexing scheme relating to additional aspects of information retrieval not explicitly covered by G06F16/00 and subgroups
    • G06F2216/03 Data mining

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • Library & Information Science (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The application discloses a search processing method, device and equipment, relating to the technical field of artificial intelligence and, in particular, to the technical field of knowledge graphs. The disclosed technical solution comprises: acquiring a text to be searched, and determining at least one candidate entity according to the text to be searched; for each candidate entity, acquiring entity information corresponding to the candidate entity from a knowledge graph, and labeling keywords in the text to be searched according to the entity information corresponding to the candidate entity to obtain a labeling result corresponding to the candidate entity; and determining a target entity corresponding to the text to be searched according to the labeling result corresponding to each of the at least one candidate entity. In this process, the keywords in the text to be searched are labeled using the entity information in the knowledge graph, so that the meaning and intent of the text to be searched are understood more thoroughly and the accuracy of the search result is improved.

Description

Search processing method, device and equipment
Technical Field
The present application relates to the field of artificial intelligence technologies, and in particular, to a search processing method, apparatus, and device.
Background
When a user uses a search engine or another application with a search function, the text to be searched that the user inputs may include an entity name and entity information. The entity name indicates the entity to be searched, and the entity information further describes or limits that entity. Accurate search results can be displayed to the user only if the entity to be searched is accurately identified from the text to be searched.
In the prior art, a text to be searched is identified by matching it against a preset template to determine the entity to be searched. An entity determined in this way has low accuracy, so the search results are not accurate enough.
Disclosure of Invention
The application provides a search processing method, device and equipment, which are used for improving the accuracy of search results.
In a first aspect, an embodiment of the present application provides a search processing method, including:
acquiring a text to be searched; determining at least one candidate entity according to the text to be searched;
for each candidate entity, acquiring entity information corresponding to the candidate entity from the knowledge graph, and labeling the keywords in the text to be searched according to the entity information corresponding to the candidate entity to obtain a labeling result corresponding to the candidate entity; and determining the target entity corresponding to the text to be searched according to the labeling result corresponding to each of the at least one candidate entity.
According to this scheme, the keywords in the text to be searched are labeled using the entity information in the knowledge graph, and each keyword in the text to be searched can be labeled with its corresponding knowledge attribute, so that each keyword can be accurately understood and the meaning and intent of the text to be searched can be thoroughly understood. Based on this understanding of the text to be searched, the entity to be searched can be accurately identified, which improves the accuracy of the search result. In addition, the search processing method of this embodiment can be applied to a text to be searched of any structure and any length, and is therefore highly general.
In one possible implementation manner, entity information corresponding to one candidate entity is used for indicating an attribute value corresponding to at least one knowledge attribute of the candidate entity; and the labeling result is used for indicating knowledge attributes corresponding to the keywords in the text to be searched.
In a possible implementation manner, the labeling of the keywords in the text to be searched according to the entity information corresponding to the candidate entity to obtain a labeling result corresponding to the candidate entity includes: segmenting and matching the keywords in the text to be searched according to the attribute value corresponding to at least one knowledge attribute of the candidate entity, and determining the knowledge attribute corresponding to each keyword in the text to be searched; and generating a labeling result corresponding to the candidate entity according to the knowledge attribute corresponding to each keyword in the text to be searched.
In a possible implementation manner, the at least one knowledge attribute has a corresponding priority; the segmenting and matching of the keywords in the text to be searched according to the attribute value corresponding to at least one knowledge attribute of the candidate entity, and the determining of the knowledge attribute corresponding to each keyword in the text to be searched, comprise: segmenting and matching the keywords in the text to be searched according to the attribute value corresponding to at least one knowledge attribute of the candidate entity and the priority, and determining the knowledge attribute corresponding to each keyword in the text to be searched.
In this implementation, the priority among knowledge attributes resolves misaligned segmentation of the text to be searched and improves the accuracy of keyword labeling.
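The role of the priority can be illustrated with a minimal Python sketch. It rests on assumptions that are not in the source: the function name `label_by_priority`, the convention that a lower number means a higher priority, the rule that a span already claimed by a higher-priority attribute cannot be re-cut, and the sample values. It is one plausible reading of priority-based segmentation, not the patented procedure itself.

```python
def label_by_priority(text, entity_info, priority):
    """Cut `text` into labeled keywords, trying higher-priority attributes first."""
    taken = [False] * len(text)   # characters already claimed by a match
    spans = []                    # (start, end, attribute)
    # Assumption: a lower number means a higher priority; unlisted attributes go last.
    ordered = sorted(entity_info, key=lambda a: priority.get(a, len(priority)))
    for attr in ordered:
        value = entity_info[attr]
        start = text.find(value)
        while start != -1:
            end = start + len(value)
            if not any(taken[start:end]):      # skip overlaps with earlier matches
                taken[start:end] = [True] * (end - start)
                spans.append((start, end, attr))
            start = text.find(value, start + 1)
    labels, pos = [], 0
    for start, end, attr in sorted(spans):
        # Anything left between two matched spans is labeled "unknown".
        labels += [(word, "unknown") for word in text[pos:start].split()]
        labels.append((text[start:end], attr))
        pos = end
    labels += [(word, "unknown") for word in text[pos:].split()]
    return labels


# Hypothetical sample data: "20" also occurs inside "2001".
entity_info = {"entity name": "color set", "year": "2001", "number of episodes": "20"}
text = "color set 2001 20"

good = label_by_priority(
    text, entity_info, {"entity name": 0, "year": 1, "number of episodes": 2})
bad = label_by_priority(
    text, entity_info, {"number of episodes": 0, "year": 1, "entity name": 2})
print(good)
print(bad)
```

With the year ranked above the number of episodes, "2001" is kept whole. With the order reversed, "20" is cut out of "2001" first and the leftover "01" ends up unknown, which is the kind of misaligned segmentation the priority is meant to prevent.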
In a possible implementation manner, after the generating of the labeling result corresponding to the candidate entity, the method further includes: if the labeling result indicates that the knowledge attribute corresponding to a first keyword in the text to be searched is unknown, matching the first keyword according to a preset regular-expression matching rule, and generating the knowledge attribute corresponding to the first keyword according to the matching result.
In this implementation, when the knowledge attribute corresponding to the first keyword is labeled as unknown, the first keyword is matched a second time using the preset regular-expression matching rule, which makes the labeling of each keyword in the text to be searched more complete and more accurate.
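The second-pass matching can be sketched in Python as follows. The rule table `REGEX_RULES`, its patterns, the attribute names, and the function `relabel_unknown` are all hypothetical; the source only says that preset rules exist, not what they contain.

```python
import re

# Hypothetical preset rules: (pattern, knowledge attribute to assign).
REGEX_RULES = [
    (re.compile(r"episode \d+"), "episode number"),
    (re.compile(r"(19|20)\d{2}"), "year"),
    (re.compile(r"1080p|720p", re.IGNORECASE), "viewing intent"),
]


def relabel_unknown(labels):
    """Give keywords labeled "unknown" a second chance via the preset rules."""
    result = []
    for keyword, attr in labels:
        if attr == "unknown":
            for pattern, rule_attr in REGEX_RULES:
                if pattern.fullmatch(keyword):   # the whole keyword must match
                    attr = rule_attr
                    break
        result.append((keyword, attr))
    return result


first_pass = [("color set", "entity name"), ("episode 18", "unknown"), ("foo", "unknown")]
second_pass = relabel_unknown(first_pass)
print(second_pass)
```

Keywords that no rule matches stay unknown, so the fallback only ever adds information to the first-pass labeling.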
In a possible implementation manner, the determining, according to the labeling results corresponding to the at least one candidate entity, the target entity corresponding to the text to be searched includes: determining the matching degree of entity information corresponding to each candidate entity and the text to be searched according to the labeling result corresponding to each candidate entity; and determining the candidate entity corresponding to the highest matching degree as a target entity.
In a possible implementation manner, the at least one knowledge attribute corresponds to a weight coefficient; and determining the matching degree of the entity information corresponding to each candidate entity and the text to be searched according to the labeling result corresponding to each candidate entity, wherein the matching degree comprises the following steps: and determining the matching degree of the entity information corresponding to the candidate entity and the text to be searched according to the labeling result corresponding to each candidate entity and the weight coefficient corresponding to at least one knowledge attribute of the candidate entity.
In the implementation manner, through the evaluation process of the labeling result, the accuracy of understanding the text to be searched can be improved, so that the accuracy of the identified entity to be searched is ensured.
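The evaluation step can be sketched in Python under a loudly stated assumption: the passage above does not give a formula, so this example takes the matching degree to be the sum of the weight coefficients of the knowledge attributes that were successfully labeled, with unknown keywords contributing nothing. The weights, labels, and function names are illustrative only.

```python
def matching_degree(labels, weights):
    """Assumed score: sum of the weights of all attributes that were labeled."""
    return sum(weights.get(attr, 0.0) for _, attr in labels if attr != "unknown")


def pick_target(labeling_results, weights):
    """Return the candidate entity whose labeling result scores highest."""
    return max(labeling_results,
               key=lambda entity_id: matching_degree(labeling_results[entity_id], weights))


# Hypothetical weight coefficients per knowledge attribute.
weights = {"entity name": 1.0, "actor": 0.8, "year": 0.5, "field": 0.3}

# Hypothetical labeling results for three candidates sharing the same name.
labeling_results = {
    "ID1": [("color set", "entity name"), ("drama", "field"), ("reddish", "actor"),
            ("2001", "year"), ("episode 18", "unknown")],
    "ID2": [("color set", "entity name"), ("drama", "field"), ("reddish", "unknown"),
            ("2001", "unknown"), ("episode 18", "unknown")],
    "ID3": [("color set", "entity name"), ("drama", "unknown"), ("reddish", "unknown"),
            ("2001", "unknown"), ("episode 18", "unknown")],
}

target = pick_target(labeling_results, weights)
print(target)
```

Other scoring choices, such as normalizing by the number of keywords or penalizing unknown keywords, would fit the same interface.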
In a possible implementation manner, the determining at least one candidate entity according to the text to be searched includes: according to the entity name dictionary, carrying out matching processing on the text to be searched, and determining at least one candidate entity name from the text to be searched; the entity name dictionary comprises at least one entity name and at least one entity corresponding to each entity name; and determining the entity corresponding to each candidate entity name as the at least one candidate entity.
In the implementation mode, the text to be searched is matched by utilizing the entity name dictionary, so that at least one candidate entity is determined, and the comprehensiveness and accuracy of the determined candidate entity are ensured.
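A minimal Python sketch of dictionary-based candidate selection follows. The dictionary contents, the entity identifiers, and the plain substring test are assumptions made for illustration; a production system would more likely use a multi-pattern matcher.

```python
# Hypothetical entity name dictionary: entity name -> entities carrying that name.
ENTITY_NAME_DICT = {
    "color set": ["ID1", "ID2", "ID3"],
    "reddish": ["ID_actor_reddish"],
    "little green": ["ID_actor_little_green"],
}


def find_candidates(text, name_dict):
    """Return the entity names found in `text` and the entities they map to."""
    names = [name for name in name_dict if name in text]
    candidates = []
    for name in names:
        for entity_id in name_dict[name]:
            if entity_id not in candidates:   # keep order, drop duplicates
                candidates.append(entity_id)
    return names, candidates


names, candidates = find_candidates("color set drama reddish 2001 episode 18",
                                    ENTITY_NAME_DICT)
print(names)
print(candidates)
```

Because one name can map to several entities, a single matched name such as "color set" already yields several candidates, which is why the later labeling and scoring steps are needed.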
In a possible implementation manner, before determining at least one candidate entity according to the text to be searched, the method further includes: determining that the text to be searched comprises a first type keyword and a second type keyword according to the entity name dictionary; the first type of keywords are keywords matched with any entity name in the entity name dictionary, and the second type of keywords are keywords which are not matched with all entity names in the entity name dictionary.
In a possible implementation manner, the knowledge graph is used for indicating an entity name and entity information corresponding to at least one entity; the method further comprises the steps of: and generating the entity name dictionary according to the knowledge graph.
In a possible implementation manner, the generating of the entity name dictionary according to the knowledge graph includes: adding the entity names corresponding to the entities in the knowledge graph to the entity name dictionary; and mining the entity names corresponding to the entities in the knowledge graph, and adding the mined entity names to the entity name dictionary; wherein the mining comprises at least one of: alias mining, transform mining, abbreviation mining, and error correction mining.
In this implementation, through the above mining process, the number of entity names in the entity name dictionary will be greatly enriched. And when the candidate entity is determined by matching the text to be searched by using the entity name dictionary, the comprehensiveness and the accuracy of the determined candidate entity are ensured.
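Dictionary generation can be sketched in Python as below. The source does not describe how alias, transform, abbreviation, or error correction mining works, so the sketch takes the mined names as a ready-made input (`mined_names`); that parameter, the function name, and the sample data are hypothetical.

```python
def build_entity_name_dict(knowledge_graph, mined_names):
    """Map every original or mined entity name to the entities it may refer to."""
    name_dict = {}
    for entity_id, info in knowledge_graph.items():
        # Original name first, then whatever the mining steps produced.
        all_names = [info["entity name"]] + mined_names.get(entity_id, [])
        for name in all_names:
            entity_ids = name_dict.setdefault(name, [])
            if entity_id not in entity_ids:
                entity_ids.append(entity_id)
    return name_dict


# Hypothetical knowledge graph excerpt and mining output.
knowledge_graph = {
    "ID1": {"entity name": "color set"},
    "ID2": {"entity name": "color set"},
}
mined_names = {"ID1": ["colour set", "CS"]}   # e.g. a spelling variant and an abbreviation

name_dict = build_entity_name_dict(knowledge_graph, mined_names)
print(name_dict)
```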
In a second aspect, an embodiment of the present application provides a search processing apparatus, including:
an acquisition module, configured to acquire the text to be searched; a selecting module, configured to determine at least one candidate entity according to the text to be searched; a labeling module, configured to, for each candidate entity, acquire entity information corresponding to the candidate entity from the knowledge graph, and label the keywords in the text to be searched according to the entity information corresponding to the candidate entity to obtain a labeling result corresponding to the candidate entity; and a determining module, configured to determine the target entity corresponding to the text to be searched according to the labeling result corresponding to each of the at least one candidate entity.
In one possible implementation manner, entity information corresponding to one candidate entity is used for indicating an attribute value corresponding to at least one knowledge attribute of the candidate entity; and the labeling result is used for indicating knowledge attributes corresponding to the keywords in the text to be searched.
In a possible implementation manner, the labeling module is specifically configured to: segment and match the keywords in the text to be searched according to the attribute value corresponding to at least one knowledge attribute of the candidate entity, and determine the knowledge attribute corresponding to each keyword in the text to be searched; and generate a labeling result corresponding to the candidate entity according to the knowledge attribute corresponding to each keyword in the text to be searched.
In a possible implementation manner, the at least one knowledge attribute has a corresponding priority; the labeling module is specifically configured to: segment and match the keywords in the text to be searched according to the attribute value corresponding to at least one knowledge attribute of the candidate entity and the priority, and determine the knowledge attribute corresponding to each keyword in the text to be searched.
In a possible implementation manner, the labeling module is further configured to: if the labeling result indicates that the knowledge attribute corresponding to a first keyword in the text to be searched is unknown, match the first keyword according to a preset regular-expression matching rule, and generate the knowledge attribute corresponding to the first keyword according to the matching result.
In a possible implementation manner, the determining module is specifically configured to: determining the matching degree of entity information corresponding to each candidate entity and the text to be searched according to the labeling result corresponding to each candidate entity; and determining the candidate entity corresponding to the highest matching degree as a target entity.
In a possible implementation manner, the at least one knowledge attribute corresponds to a weight coefficient; the determining module is specifically configured to: and determining the matching degree of the entity information corresponding to the candidate entity and the text to be searched according to the labeling result corresponding to each candidate entity and the weight coefficient corresponding to at least one knowledge attribute of the candidate entity.
In a possible implementation manner, the selecting module is specifically configured to: according to the entity name dictionary, carrying out matching processing on the text to be searched, and determining at least one candidate entity name from the text to be searched; the entity name dictionary comprises at least one entity name and at least one entity corresponding to each entity name; and determining the entity corresponding to each candidate entity name as the at least one candidate entity.
In a possible implementation manner, the selecting module is further configured to: determining that the text to be searched comprises a first type keyword and a second type keyword according to the entity name dictionary; the first type of keywords are keywords matched with any entity name in the entity name dictionary, and the second type of keywords are keywords which are not matched with all entity names in the entity name dictionary.
In a possible implementation manner, the knowledge graph is used for indicating an entity name and entity information corresponding to at least one entity; the apparatus further comprises: and the generation module is used for generating the entity name dictionary according to the knowledge graph.
In a possible implementation manner, the generating module is specifically configured to: add the entity names corresponding to the entities in the knowledge graph to the entity name dictionary; and mine the entity names corresponding to the entities in the knowledge graph, and add the mined entity names to the entity name dictionary; wherein the mining comprises at least one of: alias mining, transform mining, abbreviation mining, and error correction mining.
In a third aspect, an embodiment of the present application provides an electronic device, including:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein,
the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the method of any one of the first aspects.
In a fourth aspect, embodiments of the present application provide a non-transitory computer-readable storage medium storing computer instructions for causing the computer to perform the method of any one of the first aspects.
According to the search processing method, device and equipment, the keywords in the text to be searched are labeled using the entity information in the knowledge graph, and each keyword in the text to be searched can be labeled with its corresponding knowledge attribute, so that each keyword can be accurately understood and the meaning and intent of the text to be searched can be thoroughly understood. Based on this understanding of the text to be searched, the entity to be searched can be accurately identified, which improves the accuracy of the search result. In addition, the search processing method of this embodiment can be applied to a text to be searched of any structure and any length, and is therefore highly general.
Other effects of the above alternative will be described below in connection with specific embodiments.
Drawings
The drawings are for better understanding of the present solution and do not constitute a limitation of the present application. Wherein:
FIG. 1 is a schematic diagram of a network architecture to which embodiments of the present application are applicable;
FIG. 2 is a schematic diagram of one possible search interaction process according to an embodiment of the present application;
FIG. 3 is a flowchart of a search processing method according to an embodiment of the present application;
FIG. 4A to FIG. 4C are schematic diagrams of a knowledge graph provided in an embodiment of the present application;
FIG. 5 is a flowchart of a search processing method according to another embodiment of the present application;
FIG. 6 is a schematic diagram of a search result interface of a terminal device provided in an embodiment of the present application;
FIG. 7 is a schematic structural diagram of a search processing apparatus according to an embodiment of the present application;
FIG. 8 is a schematic structural diagram of a search processing apparatus according to another embodiment of the present application;
FIG. 9 is a schematic structural diagram of an electronic device according to an embodiment of the present application.
Detailed Description
Exemplary embodiments of the present application are described below in conjunction with the accompanying drawings, which include various details of the embodiments of the present application to facilitate understanding, and should be considered as merely exemplary. Accordingly, one of ordinary skill in the art will recognize that various changes and modifications of the embodiments described herein can be made without departing from the scope and spirit of the present application. Also, descriptions of well-known functions and constructions are omitted in the following description for clarity and conciseness.
FIG. 1 is a schematic diagram of a network architecture applicable to the embodiments of the present application. As shown in FIG. 1, the network architecture includes at least one terminal device 11 and at least one server 12. The terminal device 11 provides a search portal to the user. The search portal may be a search engine installed in the terminal device 11, or may be another application having a search function. The terminal device 11 may also be called a terminal, user equipment (UE), an access terminal, a subscriber unit, a mobile device, a user terminal, a wireless communication device, or a user agent. The terminal device may be a personal digital assistant (PDA), a smart television, a handheld device with a wireless communication function (e.g., a smartphone or tablet computer), a computing device (e.g., a personal computer (PC)), a vehicle-mounted device, a wearable device, etc.
The server 12 has storage, analysis and retrieval functions. The server may be a centralized server or a distributed server. The server may also be a cloud server. The server 12 may be provided with a database in which a large amount of entity-related information is stored in advance.
Fig. 2 is a schematic diagram of one possible search interaction procedure according to an embodiment of the present application. As shown in fig. 2, the user may input text to be searched at a search portal provided by the terminal device 11. The text to be searched may include one or more keywords. After detecting that the user operates the "search" button, the terminal device 11 transmits the text to be searched to the server 12 for search processing. The server 12 identifies and analyzes the text to be searched to determine the entity to be searched, further searches the database to obtain the related information of the entity to be searched, and returns the search result to the terminal device 11. The terminal device 11 presents the search results in an interface.
In the search process, accurate search results can be displayed to the user only if the entity that the user wants to search for is accurately identified from the text to be searched. In the related art, a text to be searched is identified by matching it against a preset template to determine the entity to be searched. An entity determined in this way has low accuracy, so the search results are not accurate enough. In addition, this approach generally requires the text to be searched to meet a preset form, which is limiting and not easy to extend.
The embodiment of the application provides a search processing method, which can accurately identify an entity to be searched by accurately understanding each keyword in a text to be searched by utilizing a knowledge graph, so that the accuracy of a search result is improved.
In some scenarios, the method of the embodiments of the present application may be applied to a server as shown in fig. 1. In other scenarios, when the terminal device has relatively high computing power and storage power, the method of the embodiments of the present application may also be performed by the terminal device as in fig. 1.
The technical solutions of the present application are described in detail below in connection with several specific embodiments. The following embodiments may be combined with each other and the description may not be repeated in some embodiments for the same or similar matters.
Fig. 3 is a flow chart of a search processing method according to an embodiment of the present application. As shown in fig. 3, the method of the present embodiment includes:
S301: Acquire the text to be searched.
The text to be searched refers to text that the user inputs to express a search intent. It may be input as text, by voice, or in other forms, which is not limited in this embodiment. When the method of this embodiment is executed by the server, the terminal device sends the text to be searched to the server after receiving it from the user.
The text to be searched may include one or more keywords. Optionally, at least one keyword in the text to be searched indicates the name of the entity to be searched, and the remaining keywords further define or describe that entity.
For example, the text to be searched may be "color set drama reddish 2001 episode 18". Here, it is assumed that "color set" is the name of a television drama and "reddish" is the name of a lead actor. In the text to be searched, the keyword "color set" indicates the name of the entity to be searched, and the other keywords indicate information that further describes or defines the entity to be searched.
S302: Determine at least one candidate entity according to the text to be searched.
In the embodiment of the present application, entities refer to things that exist objectively and can be distinguished from each other. An entity may be a specific person, thing, or an abstract concept or relationship. For example, the entity may be a television show, a piece of music, a person, a place, etc.
In this embodiment, at least one candidate entity may be determined by performing preliminary recognition on the text to be searched. Where the candidate entity refers to an entity that the user may want to search for.
For example, in the above example, when the text to be searched is "color set drama reddish 2001 episode 18", the determined candidate entities may include the television drama "color set" and the actor "reddish". Of course, in some examples, when there are multiple dramas each named "color set", these dramas may all be candidate entities. In other examples, when a movie is also named "color set", that movie may also be a candidate entity.
It should be noted that S302 may be implemented in various ways. For example, the text to be searched may be processed by a named entity recognition tool to obtain at least one candidate entity; other implementations are possible and are described in detail in the following embodiments, so they are not described here.
S303: For each candidate entity, acquire entity information corresponding to the candidate entity from the knowledge graph, and label the keywords in the text to be searched according to the entity information corresponding to the candidate entity to obtain a labeling result corresponding to the candidate entity.
A knowledge graph is essentially a semantic network. It may include nodes and edges connecting the nodes, where nodes represent entities or concepts, and edges represent the semantic relationships between entities or concepts.
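One common way to hold such a graph in code is as a list of (node, edge, node) triples. The Python sketch below uses that representation and folds the triples into per-entity attribute dictionaries. The triple layout, the function name `entity_info`, and the sample values are assumptions for illustration and do not describe how the knowledge graph in this application is stored.

```python
from collections import defaultdict

# Hypothetical triples: (entity node, edge label, value node).
TRIPLES = [
    ("ID1", "entity name", "color set"),
    ("ID1", "director", "Zhang One"),
    ("ID1", "year", "2001"),
    ("ID2", "entity name", "color set"),
    ("ID2", "year", "2010"),
]


def entity_info(triples):
    """Group the outgoing edges of each entity into an attribute dictionary."""
    info = defaultdict(dict)
    for subject, relation, obj in triples:
        info[subject][relation] = obj   # last value wins for a repeated edge label
    return dict(info)


info = entity_info(TRIPLES)
print(info["ID1"])
```

An attribute with several values, such as multiple actors, would need a list per edge label instead of a single value.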
In this embodiment, the knowledge graph is used to indicate an entity name and entity information corresponding to at least one entity. The entity information is related information that describes an entity. Optionally, the entity information of an entity indicates an attribute value corresponding to at least one knowledge attribute of the entity. FIG. 4A to FIG. 4C are schematic diagrams of a knowledge graph provided in an embodiment of the present application. Taking a knowledge graph in the film and television field as an example, FIG. 4A to FIG. 4C each illustrate the related information of one entity in the knowledge graph. As shown in FIG. 4A to FIG. 4C, each entity corresponds to a television series or movie. The knowledge attributes of each entity may include: director, drama, actors, characters, year, version, region, season information, field, genre, broadcast site (site A, site B, etc.), number of episodes, and viewing intent (free, undisclosed, high definition, 1080p, full version, etc.).
For convenience of the following examples, this embodiment assumes that there are two dramas named "color set" and one movie named "color set". FIG. 4A to FIG. 4C illustrate the entity information corresponding to these three entities (two dramas and one movie). As shown in FIG. 4A to FIG. 4C, the entity names corresponding to entities ID1 to ID3 are all "color set". The entity information of entity ID1 includes: the lead actor is reddish, the director is Zhang One, the type is urban drama, the year is 2001, the number of episodes is 20, and the playing site is site A. The entity information of entity ID2 includes: the lead actor is little green, the director is Zhang Two, the type is ancient drama, the year is 2010, the number of episodes is 30, and the playing site is site B. The entity information of entity ID3 includes: the lead actor is bluish, the director is Zhang Three, the type is science fiction, the year is 2020, and the playing site is site C.
In this embodiment, after the candidate entities are determined, the entity information corresponding to each candidate entity (i.e., the attribute values of the knowledge attributes of the candidate entity) may be obtained from the knowledge graphs shown in FIG. 4A to FIG. 4C. The entity information corresponding to the candidate entity can then be used to label the keywords in the text to be searched, giving a labeling result corresponding to the candidate entity. The labeling result indicates the knowledge attributes corresponding to the keywords in the text to be searched.
In a possible implementation manner, the keywords in the text to be searched may be segmented and matched according to the attribute value corresponding to at least one knowledge attribute of the candidate entity, so as to determine the knowledge attribute corresponding to each keyword in the text to be searched. And generating a labeling result corresponding to the candidate entity according to the knowledge attribute corresponding to each keyword in the text to be searched.
For example, taking entity ID1 in the knowledge graph as an example, when the keywords in the "color set, the number 18 of the television bluish red 2001" of the text to be searched are labeled by using entity information corresponding to the entity ID1, the attribute value of each knowledge attribute of the entity ID1 is used for matching with the text to be searched. For example, if the attribute value "drama" of the domain of the entity ID1 matches successfully with the keyword "drama" in the text to be searched, the knowledge attribute corresponding to the keyword "drama" in the text to be searched is set as "domain". And if the attribute value of the actor of the entity ID1 is successfully matched with the keyword 'reddish', setting the knowledge attribute corresponding to the keyword 'reddish' in the text to be searched as the actor. The attribute value "2001" of the year of the entity ID1 is successfully matched with the keyword "2001" in the text to be searched, and then the knowledge attribute corresponding to the keyword "2001" in the text to be searched is set as "year". And if the attribute value of each knowledge attribute of the entity ID1 is not matched with the 18 th set of keywords in the text to be searched, setting the knowledge attribute corresponding to the 18 th set of keywords in the text to be searched as unknown. Thus, labeling results as shown in Table 1 can be obtained finally.
TABLE 1
Keyword | Knowledge attribute
color set | entity name
TV drama | field
reddish | actor
2001 | year
18th set | unknown
In the above matching process, any of a number of existing text segmentation and matching techniques may be adopted, which is not particularly limited in this embodiment. For example, a multi-pattern matching tree algorithm can be adopted, with which the knowledge attributes of all keywords in the text to be searched can be labeled in a single pass, thereby improving labeling efficiency.
In a possible implementation manner, after the labeling result is obtained by using the above matching process, the method may further include: if the labeling result indicates that the knowledge attribute corresponding to a first keyword in the text to be searched is unknown, matching the first keyword according to a preset regular-expression matching rule, and generating the knowledge attribute corresponding to the first keyword according to the matching result. The first keyword may be any keyword in the text to be searched.
For example, in the labeling result shown in Table 1, the knowledge attribute corresponding to the keyword "18th set" is "unknown", which indicates that the keyword was not successfully matched. Therefore, a preset regular-expression matching rule may further be used to match the keyword "18th set" against the attribute value "20 sets" of the collection number of entity ID1. For example, the number 18 is extracted from the keyword "18th set" and compared with the number 20 in the attribute value "20 sets"; since 18 is smaller than 20, the keyword can be considered to indicate a particular set, and thus the knowledge attribute of the keyword "18th set" can be set to "collection number".
When the knowledge attribute corresponding to the first keyword is labeled as unknown, performing secondary matching on the first keyword with the preset regular-expression matching rule makes the labeling of the keywords in the text to be searched more comprehensive and accurate.
Further, among the knowledge attributes of each entity in the knowledge graph, some are used to describe viewing intention, for example: broadcast site, collection number, etc. When the text to be searched input by the user includes "site A 18th set", the viewing intention of the user can be regarded as "view the 18th set through site A". Therefore, in the labeling result shown in Table 1, when the knowledge attribute corresponding to a keyword is broadcast site or collection number, the keyword may additionally be labeled as "viewing intention".
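The segmentation-and-matching step can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation of the embodiment: it swaps the multi-pattern matching tree for a simple greedy longest-match scan, and the attribute keys, the English placeholder values, and the function name `label_query` are hypothetical.

```python
def label_query(query, entity):
    """Greedy longest-match labeling of `query` against one entity's attribute values."""
    # Try longer attribute values first so that "2001" wins over "20".
    values = sorted(((v, k) for k, v in entity.items()), key=lambda p: -len(p[0]))
    labels, unknown, i = [], "", 0
    while i < len(query):
        for value, attr in values:
            if query.startswith(value, i):
                if unknown.strip():              # flush text that matched nothing
                    labels.append((unknown.strip(), "unknown"))
                unknown = ""
                labels.append((value, attr))
                i += len(value)
                break
        else:
            unknown += query[i]                  # no attribute value starts here
            i += 1
    if unknown.strip():
        labels.append((unknown.strip(), "unknown"))
    return labels


# Hypothetical entity information for entity ID1 (English placeholders).
ENTITY_ID1 = {
    "entity name": "color set",
    "field": "TV drama",
    "actor": "reddish",
    "year": "2001",
    "collection number": "20",
    "broadcast site": "site A",
}

for keyword, attribute in label_query("color set TV drama reddish 2001 18th set", ENTITY_ID1):
    print(keyword, "|", attribute)
```

This naive scan does not enforce keyword boundaries, so a short attribute value such as "20" could match inside a longer number; a production matcher would need to guard against that.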
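The secondary matching described above might look like the following sketch. The regular expression, the English keyword form "18th set", and the function name `relabel_unknown` are assumptions chosen for illustration; the embodiment does not specify the actual rule.

```python
import re


def relabel_unknown(keyword, total_sets):
    """Secondary matching for a keyword that was labeled 'unknown'.

    Hypothetical rule: a keyword such as "18th set" is relabeled as
    "collection number" when its number does not exceed the entity's set count.
    """
    match = re.fullmatch(r"(\d+)(?:st|nd|rd|th)? set", keyword)
    if match and 1 <= int(match.group(1)) <= total_sets:
        return "collection number"
    return "unknown"


print(relabel_unknown("18th set", 20))      # entity ID1 has 20 sets
print(relabel_unknown("25th set", 20))      # beyond the last set
print(relabel_unknown("reddish 2001", 30))  # not a set reference
```

The sketch accepts a number equal to the set count as well, on the assumption that the last set is a valid target.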
The following is an example. Assume that the candidate entities determined in S302 are entity ID1, entity ID2, and entity ID3 in the knowledge graph shown in fig. 4A to 4C. For entity ID1, the text to be searched "color set TV drama reddish 2001 18th set" is labeled using the entity information corresponding to entity ID1, and the resulting labeling result is shown in Table 2. For entity ID2, the same text is labeled using the entity information corresponding to entity ID2, and the result is shown in Table 3. For entity ID3, the same text is labeled using the entity information corresponding to entity ID3, and the result is shown in Table 4.
TABLE 2
Keyword | Knowledge attribute
color set | entity name
TV drama | field
reddish | actor
2001 | year
18th set | collection number (viewing intention)
TABLE 3
Keyword | Knowledge attribute
color set | entity name
TV drama | field
reddish 2001 | unknown
18th set | collection number (viewing intention)
TABLE 4
Keyword | Knowledge attribute
color set | entity name
TV drama reddish 2001 18th set | unknown
The labeling process of the embodiment has no limitation on the structure and the length of the text to be searched, and even if the text to be searched is extremely complex, the embodiment of the application can accurately label the keywords of the text to be searched, so that the semantics of the text to be searched can be accurately understood.
Optionally, in S303 of this embodiment, since the keyword labeling of the text to be searched with the entity information of each candidate entity is independent of the other candidate entities, the labeling for multiple candidate entities may be computed in parallel with multiple threads. When the number of candidate entities is large (for example, 50 or more), multi-threaded parallel computation can greatly improve labeling efficiency, improve online computing performance, and reduce search latency.
S304: determining a target entity corresponding to the text to be searched according to the labeling result corresponding to each of the at least one candidate entity.
In this step, the matching degree between the entity information corresponding to each candidate entity and the text to be searched is determined according to the labeling result corresponding to that candidate entity, and the candidate entity with the highest matching degree is determined as the target entity. For example, according to the labeling results shown in Tables 2 to 4, only the entity information of entity ID1 completely segments the text to be searched and labels a knowledge attribute for every keyword, so the meaning and intention of the text to be searched can be fully understood using entity ID1. The labeling results of the other entities still contain keywords labeled "unknown". The matching degree between the entity information corresponding to entity ID1 and the text to be searched is therefore the highest, so entity ID1 is taken as the target entity corresponding to the text to be searched. That is, the entity the user wants to search for is the television drama corresponding to entity ID1.
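Because each candidate entity is labeled independently, the work can be fanned out over a thread pool. The sketch below assumes Python's standard `concurrent.futures.ThreadPoolExecutor`; the placeholder labeler `label_one`, the entity data, and the worker count are hypothetical and only stand in for the real labeling step.

```python
from concurrent.futures import ThreadPoolExecutor


def label_one(query, entity_id, entity):
    """Placeholder labeler: reports which attribute values occur in the query."""
    hits = {attr: value for attr, value in entity.items() if value in query}
    return entity_id, hits


def label_all(query, candidates, max_workers=8):
    """Label the query against every candidate entity using parallel threads."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = [pool.submit(label_one, query, eid, info)
                   for eid, info in candidates.items()]
        return dict(future.result() for future in futures)


# Hypothetical candidate entities (English placeholders).
CANDIDATES = {
    "ID1": {"entity name": "color set", "field": "TV drama", "actor": "reddish", "year": "2001"},
    "ID3": {"entity name": "color set", "field": "movie", "actor": "bluish", "year": "2020"},
}

print(label_all("color set TV drama reddish 2001 18th set", CANDIDATES))
```

In CPython, threads mainly help when the labeling step waits on I/O (for example, a knowledge graph lookup); for CPU-bound matching, a process pool or a native matcher may be needed to see a real speed-up.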
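One way to turn labeling results into a matching degree is sketched below. The embodiment does not fix a formula at this point, so the coverage ratio used here (share of characters that received a known attribute) is an assumption, as are the function names and the English placeholder keywords.

```python
def matching_degree(labels):
    """Fraction of labeled characters whose knowledge attribute is not 'unknown'."""
    known = sum(len(keyword) for keyword, attr in labels if attr != "unknown")
    total = sum(len(keyword) for keyword, _ in labels)
    return known / total if total else 0.0


def pick_target(results):
    """Return the candidate entity ID with the highest matching degree."""
    return max(results, key=lambda entity_id: matching_degree(results[entity_id]))


# Labeling results corresponding to Tables 2 to 4 (English placeholders).
RESULTS = {
    "ID1": [("color set", "entity name"), ("TV drama", "field"), ("reddish", "actor"),
            ("2001", "year"), ("18th set", "collection number")],
    "ID2": [("color set", "entity name"), ("TV drama", "field"),
            ("reddish 2001", "unknown"), ("18th set", "collection number")],
    "ID3": [("color set", "entity name"), ("TV drama reddish 2001 18th set", "unknown")],
}

print(pick_target(RESULTS))
```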
The search processing method provided in this embodiment includes: acquiring a text to be searched, and determining at least one candidate entity according to the text to be searched; for each candidate entity, acquiring entity information corresponding to the candidate entity from the knowledge graph, and marking keywords in the text to be searched according to the entity information corresponding to the candidate entity to obtain a marking result corresponding to the candidate entity; and determining a target entity corresponding to the text to be searched according to the labeling result corresponding to each of the at least one candidate entity. In the process, the keywords in the text to be searched are marked by utilizing the entity information in the knowledge graph, and the knowledge attributes corresponding to the keywords can be marked for each keyword in the text to be searched, so that each keyword in the text to be searched can be accurately understood, and the meaning and intention of the text to be searched are thoroughly understood. Furthermore, according to the understanding result of the text to be searched, the entity to be searched can be accurately identified, and therefore the accuracy of the searching result is improved. In addition, the search processing method of the embodiment can be used for the text to be searched with any structure and any length, and has high universality.
Fig. 5 is a flowchart of a search processing method according to another embodiment of the present application. Based on the embodiment shown in fig. 3, the technical scheme of the application is further refined in this embodiment. As shown in fig. 5, the method of the present embodiment may include:
S501: acquiring the text to be searched.
The specific implementation of S501 in this embodiment is similar to S301, and will not be described here again.
S502: determining, according to the entity name dictionary, that the text to be searched includes first type keywords and second type keywords.
The entity name dictionary comprises at least one entity name and at least one entity corresponding to each entity name. Table 5 is an example of one possible entity name dictionary.
TABLE 5
Entity name | Entity ID
color set | entity ID1, entity ID2, entity ID3
reddish | entity ID4
little green | entity ID5
Optionally, the entity name dictionary may be generated from the knowledge graph. For a possible generation process of the entity name dictionary, refer to the detailed description of the following embodiments, which is not repeated here.
In this embodiment, the first type keywords are keywords that match any entity name in the entity name dictionary (in other words, keywords used to indicate the name of an entity), and the second type keywords are keywords that do not match any entity name in the entity name dictionary (in other words, keywords used to describe or qualify related information of an entity).
For example, the text to be searched may be matched against the entity name dictionary. If the text to be searched is exactly equal to an entity name in the entity name dictionary, this indicates that the text to be searched includes only first type keywords (i.e., only keywords indicating an entity name). If the text to be searched is not exactly equal to any entity name in the entity name dictionary, this indicates that the text to be searched includes both first type keywords and second type keywords.
It should be understood that, when it is determined that only the first type of keyword is included in the text to be searched, the entity ID corresponding to the first type of keyword may be directly used as the entity to be searched, which is similar to the prior art. When determining that the text to be searched includes the first type of keywords and the second type of keywords, the following S503 to S506 may be continuously executed, and the keywords in the text to be searched are labeled by using the knowledge graph, so that the entity to be searched is determined after the semantics of the text to be searched are accurately understood.
S503: performing matching processing on the text to be searched according to the entity name dictionary, determining at least one candidate entity name from the text to be searched, and determining the entities corresponding to each candidate entity name in the entity name dictionary as the at least one candidate entity.
S503 in this embodiment gives one possible implementation of S302 in the above embodiment: the text to be searched is matched against the entity name dictionary to determine at least one candidate entity. The matching process may use an existing matching algorithm.
In one example, a multi-pattern matching tree algorithm may be used: the entity name dictionary is loaded into the multi-pattern matching tree, multi-pattern matching is performed on the text to be searched, and all candidate entity names contained in the text to be searched are found in a single pass. The entity IDs corresponding to the candidate entity names are then obtained by looking up the entity name dictionary shown in Table 5, and these entity IDs are used as candidate entities. For example, assuming that the text to be searched is "color set TV drama reddish 2001 18th set", the candidate entity names obtained through the above matching process are "color set" and "reddish". By looking up Table 5, entity ID1 to entity ID4 are then used as candidate entities.
Matching the text to be searched against the entity name dictionary to determine the at least one candidate entity helps ensure that the determined candidate entities are comprehensive and accurate.
S504: for each candidate entity, acquiring entity information corresponding to the candidate entity from the knowledge graph, and labeling the keywords in the text to be searched according to the entity information corresponding to the candidate entity to obtain a labeling result corresponding to the candidate entity.
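The checks in S502 and S503 can be sketched with a plain dictionary. This is an illustration under assumptions: a substring scan replaces the multi-pattern matching tree, and the dictionary contents (taken from Table 5 with English placeholders) and the function names are hypothetical.

```python
# Entity name dictionary corresponding to Table 5 (English placeholders).
ENTITY_NAME_DICT = {
    "color set": ["ID1", "ID2", "ID3"],
    "reddish": ["ID4"],
    "little green": ["ID5"],
}


def is_entity_name_only(query, name_dict):
    """S502: True when the query consists solely of an entity name."""
    return query in name_dict


def find_candidates(query, name_dict):
    """S503: entity names contained in the query and their entity IDs."""
    names = [name for name in name_dict if name in query]
    entity_ids = [eid for name in names for eid in name_dict[name]]
    return names, entity_ids


query = "color set TV drama reddish 2001 18th set"
print(is_entity_name_only(query, ENTITY_NAME_DICT))
print(find_candidates(query, ENTITY_NAME_DICT))
```

The substring scan costs one pass over the query per dictionary entry, which is why a multi-pattern matcher becomes attractive once the dictionary is large.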
In this embodiment, the specific implementation of S504 is similar to S303 in the embodiment shown in fig. 3, and will not be described in detail here.
On the basis of the embodiment shown in fig. 3, another possible implementation of S504 is given below. In this embodiment, priority information may be set for the knowledge attributes involved in the knowledge graph. Each priority may correspond to one or more knowledge attributes. When one priority corresponds to a plurality of knowledge attributes, this indicates that these knowledge attributes have the same priority.
For example, one possible priority information is shown in table 6. In table 6, the priority levels are sequentially lowered in the order of 1 to 7.
TABLE 6
Priority | Knowledge attributes
Priority 1 | entity name
Priority 2 | field
Priority 3 | season information
Priority 4 | version, year, collection number
Priority 5 | director, screenwriter, actor, character
Priority 6 | region, type
Priority 7 | broadcast site, viewing intention
Further, in S504, when the keyword labeling is performed on the text to be searched by using the entity information of the candidate entity, the priority of each knowledge attribute shown in table 6 may be used as a basis for keyword segmentation and matching, that is, the keyword in the text to be searched is segmented and matched according to the attribute value and the priority corresponding to at least one knowledge attribute of the candidate entity, so as to determine the knowledge attribute corresponding to each keyword in the text to be searched.
The following is an illustration. Assume that the text to be searched is "Where Are the Stars Zhang Mianfei Jie" (written without spaces in the original Chinese), where "Where Are the Stars" is the name of a drama, "Zhang Mian" is an actor, and "Fei Jie" is a character; the overlapping substring "mianfei" is also the word "free". When the priority of the knowledge attributes is "actor = character > viewing intention", the text is preferentially labeled as follows: the knowledge attribute of the keyword "Zhang Mian" is "actor", the knowledge attribute of the keyword "Fei Jie" is "character", and "free" is not recognized as a viewing intention. When the priority of the knowledge attributes is "viewing intention > actor", the text is preferentially labeled as follows: the knowledge attribute of the keyword "free" is "viewing intention", and "Zhang Mian" is not identified as an actor.
Therefore, the priority among knowledge attributes resolves misaligned segmentation of the text to be searched and improves the accuracy of keyword labeling.
S505: determining a target entity corresponding to the text to be searched according to the labeling result corresponding to each of the at least one candidate entity.
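The effect of attribute priority on overlapping matches can be sketched as follows. The strings are made-up romanized placeholders in which an actor name ("zhangmian"), a character name ("feijie"), and the word for "free" ("mianfei") overlap; the function name, the numeric priorities, and the conflict rule (higher priority claims its span first) are assumptions for illustration.

```python
def segment_with_priority(query, entity, priority):
    """Keep non-overlapping attribute matches, highest priority (smallest number) first."""
    spans = []
    for attr, value in entity.items():
        start = query.find(value)
        if start != -1:
            spans.append((priority[attr], start, start + len(value), attr))
    chosen = []
    for _, start, end, attr in sorted(spans):
        # Accept the span only if it does not overlap an already chosen span.
        if all(end <= s or start >= e for s, e, _ in chosen):
            chosen.append((start, end, attr))
    return [(query[s:e], attr) for s, e, attr in sorted(chosen)]


QUERY = "zhangmianfeijie"
ENTITY = {"actor": "zhangmian", "character": "feijie", "viewing intention": "mianfei"}

# "actor = character > viewing intention"
print(segment_with_priority(QUERY, ENTITY, {"actor": 5, "character": 5, "viewing intention": 7}))
# "viewing intention > actor"
print(segment_with_priority(QUERY, ENTITY, {"actor": 5, "character": 5, "viewing intention": 1}))
```

With the first priority table the actor and character spans win and the word "free" is dropped; with the second, "free" claims its span and both names are rejected as overlapping.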
The specific implementation of S505 in this embodiment is similar to S304 in the embodiment shown in fig. 3, and will not be described in detail here.
On the basis of the embodiment shown in fig. 3, another possible implementation of S505 is given below. In this embodiment, weight coefficients may be set for the knowledge attributes involved in the knowledge graph. The weight coefficients corresponding to different knowledge attributes may all be different, or some knowledge attributes may share the same weight coefficient while others differ.
In one example, the matching degree of the entity information corresponding to the candidate entity and the text to be searched may be determined according to the labeling result corresponding to each candidate entity and the weight coefficient corresponding to at least one knowledge attribute of the candidate entity. Further, the candidate entity corresponding to the highest matching degree is determined as the target entity.
Optionally, the weight coefficient corresponding to the knowledge attribute is positively correlated with the priority corresponding to the knowledge attribute, that is, the higher the priority corresponding to a certain knowledge attribute is, the higher the weight coefficient corresponding to the knowledge attribute is; the lower the priority corresponding to a certain knowledge attribute, the lower the weight coefficient corresponding to the knowledge attribute. In this way, for the labeling result of a certain candidate entity, weighting summation can be performed according to the knowledge attribute and the weight coefficient corresponding to each keyword identified in the labeling result, so as to obtain the matching degree of the entity information corresponding to the candidate entity and the text to be searched.
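The weighted summation can be sketched as below. The weight values are hypothetical: they are chosen only so that a higher priority in Table 6 gets a larger weight, and "unknown" keywords contribute nothing.

```python
# Hypothetical weights, decreasing with the priority order of Table 6.
WEIGHTS = {
    "entity name": 7, "field": 6, "season information": 5,
    "version": 4, "year": 4, "collection number": 4,
    "director": 3, "screenwriter": 3, "actor": 3, "character": 3,
    "region": 2, "type": 2,
    "broadcast site": 1, "viewing intention": 1,
}


def weighted_degree(labels, weights):
    """Sum the weights of the knowledge attributes identified in a labeling result."""
    return sum(weights.get(attr, 0) for _, attr in labels)


TABLE_2 = [("color set", "entity name"), ("TV drama", "field"), ("reddish", "actor"),
           ("2001", "year"), ("18th set", "collection number")]
TABLE_3 = [("color set", "entity name"), ("TV drama", "field"),
           ("reddish 2001", "unknown"), ("18th set", "collection number")]

print(weighted_degree(TABLE_2, WEIGHTS), weighted_degree(TABLE_3, WEIGHTS))
```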
Through the evaluation process of the labeling result, the accuracy of understanding the text to be searched can be improved, so that the accuracy of the identified entity to be searched is ensured.
S506: determining the entity card to be displayed and the display information according to the labeling result corresponding to the target entity.
In this embodiment, the entity card to be displayed and its display information may be determined according to the labeling result corresponding to the target entity, so that the terminal device displays the entity card.
Optionally, the viewing intention indicated by the labeling result may also be considered when determining the display information, for example: a specified range of sets, free viewing, a specified site, the uncut version, 1080p, etc. For example, for the text to be searched "color set TV drama reddish 2001 18th set online viewing site A", the following keywords are labeled as viewing intentions in the labeling result:
online viewing: viewing (viewing intention)
site A: broadcast site (viewing intention)
18th set: collection number (viewing intention)
Therefore, when it is determined that site A has playback resources for the entity, the entity card is determined to be displayed. In addition, "site A" is ordered first in the display information, and only the 18th set is retained in the list of sets.
Fig. 6 is a schematic diagram of a search result interface of a terminal device according to an embodiment of the present application. Assuming that the text to be searched input by the user is "color set TV drama reddish 2001 18th set online viewing site A", the search result interface displayed by the terminal device is shown in fig. 6. In fig. 6, the entity card corresponding to entity ID1 in the knowledge graph is shown, "site A" is ordered first, and only the 18th set is retained in the list of sets. The search requirement of the user is thus met directly, which improves the user experience.
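How the viewing intention could shape the display information is sketched below. The card layout (a dict with `sites` and `sets`), the function name, and the rule of returning `None` when the requested site has no resources are assumptions for illustration only.

```python
import re


def build_display_info(card, labels):
    """Order the requested site first and keep only the requested set."""
    site = next((kw for kw, attr in labels if attr == "broadcast site"), None)
    set_kw = next((kw for kw, attr in labels if attr == "collection number"), None)
    if site is not None and site not in card["sites"]:
        return None                                   # requested site has no resources
    sites = sorted(card["sites"], key=lambda s: s != site)  # stable: requested site first
    sets = card["sets"]
    if set_kw is not None:
        wanted = int(re.search(r"\d+", set_kw).group())
        sets = [n for n in sets if n == wanted]
    return {"sites": sites, "sets": sets}


# Hypothetical card data and viewing-intention labels (English placeholders).
CARD = {"sites": ["site B", "site A"], "sets": list(range(1, 21))}
LABELS = [("online viewing", "viewing"), ("site A", "broadcast site"),
          ("18th set", "collection number")]

print(build_display_info(CARD, LABELS))
```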
The process of generating the entity name dictionary in the embodiment shown in fig. 5 is described below in connection with a specific example. The generation process of the entity name dictionary of the embodiment may be performed online or offline.
In one example, the process of generating the entity name dictionary from the knowledge graph may include: adding the entity names corresponding to the entities in the knowledge graph into the entity name dictionary. For example, in the knowledge graph shown in fig. 4A to 4C, the entity names of entity ID1 to entity ID3 are all "color set"; therefore, the entity name "color set" is added to the entity name dictionary, and the entity name "color set" is associated with entity ID1, entity ID2, and entity ID3 in the entity name dictionary. In addition, assuming that the knowledge graph further includes an entity ID4 and an entity ID5 whose entity names are "reddish" and "little green" respectively, the entity names "reddish" and "little green" may be added into the entity name dictionary, with an association relationship established between the entity name "reddish" and entity ID4 and between the entity name "little green" and entity ID5. The entity name dictionary shown in Table 5 is thus obtained.
Further, to enrich the data in the entity name dictionary, the process of generating the entity name dictionary according to the knowledge graph may further include: mining the entity names corresponding to the entities in the knowledge graph, and adding the mined entity names into the entity name dictionary. The mining includes at least one of: alias mining, transform mining, abbreviation mining, and error correction mining.
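The basic dictionary construction amounts to inverting the knowledge graph by entity name. A minimal sketch, assuming the knowledge graph is available as a mapping from entity ID to attribute dict (the data layout and the function name are hypothetical):

```python
from collections import defaultdict


def build_name_dict(knowledge_graph):
    """Map each entity name to the list of entity IDs that carry it."""
    name_dict = defaultdict(list)
    for entity_id, info in knowledge_graph.items():
        name_dict[info["entity name"]].append(entity_id)
    return dict(name_dict)


# Hypothetical knowledge graph fragment (English placeholders).
KNOWLEDGE_GRAPH = {
    "ID1": {"entity name": "color set", "field": "TV drama"},
    "ID2": {"entity name": "color set", "field": "TV drama"},
    "ID3": {"entity name": "color set", "field": "movie"},
    "ID4": {"entity name": "reddish", "field": "person"},
    "ID5": {"entity name": "little green", "field": "person"},
}

print(build_name_dict(KNOWLEDGE_GRAPH))
```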
Alias mining refers to replacing entity names in the knowledge graph with their aliases to obtain new entity names. For example: assuming that the actor "reddish" also has the alias "red red", the entity name "red red" may also be added to the entity name dictionary, and an association relationship between "red red" and entity ID4 is established in the entity name dictionary.
Transform mining refers to applying certain transformations to entity names in the knowledge graph to obtain new entity names. For example: assuming that the knowledge graph also includes an entity ID6 whose entity name is "color set 2", the entity name may be transformed into "color set two", "color set II", "color set second season", and so on. The transformed entity names may then be added to the entity name dictionary, and the association relationships between these transformed entity names and entity ID6 may be established in the entity name dictionary. The above example uses number conversion; in practical applications, entity names may be transformed in various other ways, which is not limited in this embodiment.
Abbreviation mining refers to replacing entity names in the knowledge graph with their abbreviations to obtain new entity names. For example: assuming that the television drama "color set" is also known as "color" for short, the entity name "color" may also be added to the entity name dictionary, and an association relationship between "color" and entity ID1 and entity ID2 is established in the entity name dictionary.
Error correction mining refers to correcting errors in entity names in the knowledge graph to obtain new entity names. For example: assuming that the knowledge graph also includes an entity ID7 whose entity name is "colored red bridge", that this entity name contains an error, and that the correct name is "rainbow bridge", then "rainbow bridge" may also be added to the entity name dictionary, and an association relationship between "rainbow bridge" and entity ID7 may be established in the entity name dictionary.
It can be appreciated that the mining process described above greatly enriches the entity names in the entity name dictionary. When the candidate entities are determined by matching the text to be searched against the entity name dictionary, this helps ensure that the determined candidate entities are comprehensive and accurate.
Fig. 7 is a schematic structural diagram of a search processing apparatus according to an embodiment of the present application. The apparatus of this embodiment may be in the form of software and/or hardware. As shown in fig. 7, the search processing apparatus 800 of this embodiment may include: an obtaining module 801, a selecting module 802, a labeling module 803, and a determining module 804, wherein:
The obtaining module 801 is configured to obtain a text to be searched; the selecting module 802 is configured to determine at least one candidate entity according to the text to be searched; the labeling module 803 is configured to obtain, for each candidate entity, entity information corresponding to the candidate entity from the knowledge graph, and label the keywords in the text to be searched according to the entity information corresponding to the candidate entity, so as to obtain a labeling result corresponding to the candidate entity; and the determining module 804 is configured to determine, according to the labeling result corresponding to each of the at least one candidate entity, a target entity corresponding to the text to be searched.
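Enriching the dictionary with mined names can be sketched as follows, covering alias mining and number-based transform mining only. The alias table, the number-form table, and the function names are hypothetical; real mining would draw these from data rather than hard-coded tables.

```python
import re

# Hypothetical transform table for trailing numbers.
NUMBER_FORMS = {"2": ["two", "II", "second season"]}


def transformed_names(name):
    """Transform mining: rewrite a trailing number into alternative forms."""
    match = re.fullmatch(r"(.+) (\d+)", name)
    if not match or match.group(2) not in NUMBER_FORMS:
        return []
    return [f"{match.group(1)} {form}" for form in NUMBER_FORMS[match.group(2)]]


def add_mined_names(name_dict, aliases):
    """Return a copy of the dictionary enriched with aliases and transformed names."""
    enriched = {name: list(ids) for name, ids in name_dict.items()}
    for name, ids in name_dict.items():
        for new_name in aliases.get(name, []) + transformed_names(name):
            for entity_id in ids:
                if entity_id not in enriched.setdefault(new_name, []):
                    enriched[new_name].append(entity_id)
    return enriched


NAME_DICT = {"reddish": ["ID4"], "color set 2": ["ID6"]}
ALIASES = {"reddish": ["red red"]}  # hypothetical alias table

print(add_mined_names(NAME_DICT, ALIASES))
```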
In one possible implementation manner, entity information corresponding to one candidate entity is used for indicating an attribute value corresponding to at least one knowledge attribute of the candidate entity; and the labeling result is used for indicating knowledge attributes corresponding to the keywords in the text to be searched.
In a possible implementation manner, the labeling module 803 is specifically configured to: segment and match the keywords in the text to be searched according to the attribute value corresponding to at least one knowledge attribute of the candidate entity, and determine the knowledge attribute corresponding to each keyword in the text to be searched; and generate a labeling result corresponding to the candidate entity according to the knowledge attribute corresponding to each keyword in the text to be searched.
In a possible implementation manner, the at least one knowledge attribute corresponds to a priority; the labeling module 803 is specifically configured to: segment and match the keywords in the text to be searched according to the attribute value corresponding to at least one knowledge attribute of the candidate entity and the priority, and determine the knowledge attribute corresponding to each keyword in the text to be searched.
In a possible implementation manner, the labeling module 803 is further configured to: if the labeling result indicates that the knowledge attribute corresponding to a first keyword in the text to be searched is unknown, match the first keyword according to a preset regular-expression matching rule, and generate the knowledge attribute corresponding to the first keyword according to the matching result.
In a possible implementation manner, the determining module 804 is specifically configured to: determining the matching degree of entity information corresponding to each candidate entity and the text to be searched according to the labeling result corresponding to each candidate entity; and determining the candidate entity corresponding to the highest matching degree as a target entity.
In a possible implementation manner, the at least one knowledge attribute corresponds to a weight coefficient; the determining module 804 is specifically configured to: and determining the matching degree of the entity information corresponding to the candidate entity and the text to be searched according to the labeling result corresponding to each candidate entity and the weight coefficient corresponding to at least one knowledge attribute of the candidate entity.
In a possible implementation manner, the selecting module 802 is specifically configured to: according to the entity name dictionary, carrying out matching processing on the text to be searched, and determining at least one candidate entity name from the text to be searched; the entity name dictionary comprises at least one entity name and at least one entity corresponding to each entity name; and determining the entity corresponding to each candidate entity name as the at least one candidate entity.
In a possible implementation manner, the selecting module 802 is further configured to: determine, according to the entity name dictionary, that the text to be searched includes first type keywords and second type keywords; the first type keywords are keywords that match any entity name in the entity name dictionary, and the second type keywords are keywords that do not match any entity name in the entity name dictionary.
Fig. 8 is a schematic structural diagram of a search processing apparatus according to another embodiment of the present application. In a possible implementation manner, the knowledge graph is used for indicating an entity name and entity information corresponding to at least one entity. As shown in fig. 8, the apparatus of this embodiment may further include: and a generating module 805, configured to generate the entity name dictionary according to the knowledge graph.
In a possible implementation manner, the generating module 805 is specifically configured to: add the entity names corresponding to the entities in the knowledge graph into the entity name dictionary; and mine the entity names corresponding to the entities in the knowledge graph and add the mined entity names into the entity name dictionary; wherein the mining includes at least one of: alias mining, transform mining, abbreviation mining, and error correction mining.
The search processing device provided in this embodiment may be used to execute the technical solution in any of the above method embodiments, and its implementation principle and technical effects are similar, and are not repeated here.
According to embodiments of the present application, an electronic device and a readable storage medium are also provided.
As shown in fig. 9, a block diagram of an electronic device according to a search processing method according to an embodiment of the present application is shown. Electronic devices are intended to represent various forms of digital computers, such as laptops, desktops, workstations, personal digital assistants, servers, blade servers, mainframes, and other appropriate computers. The electronic device may also represent various forms of mobile devices, such as personal digital processing, cellular telephones, smartphones, wearable devices, and other similar computing devices. The components shown herein, their connections and relationships, and their functions, are meant to be exemplary only, and are not meant to limit implementations of the application described and/or claimed herein.
As shown in fig. 9, the electronic device includes: one or more processors 701, memory 702, and interfaces for connecting the various components, including high-speed interfaces and low-speed interfaces. The various components are interconnected using different buses and may be mounted on a common motherboard or in other manners as desired. The processor may process instructions executed within the electronic device, including instructions stored in or on the memory to display graphical information of a graphical user interface (GUI) on an external input/output device, such as a display device coupled to the interface. In other embodiments, multiple processors and/or multiple buses may be used, if desired, along with multiple memories. Also, multiple electronic devices may be connected, each providing a portion of the necessary operations (e.g., as a server array, a set of blade servers, or a multiprocessor system). One processor 701 is taken as an example in fig. 9.
Memory 702 is the non-transitory computer-readable storage medium provided by the present application. The memory stores instructions executable by at least one processor to cause the at least one processor to perform the search processing method provided by the present application. The non-transitory computer-readable storage medium of the present application stores computer instructions for causing a computer to execute the search processing method provided by the present application.
The memory 702 is used as a non-transitory computer readable storage medium, and may be used to store a non-transitory software program, a non-transitory computer executable program, and modules, such as program instructions/modules corresponding to the search processing method in the embodiments of the present application (e.g., the obtaining module 801, the selecting module 802, the labeling module 803, the determining module 804, and the generating module 805 shown in fig. 8) shown in fig. 7. The processor 701 executes various functional applications of a server or a terminal device and data processing by executing a non-transitory software program, instructions, and modules stored in the memory 702, that is, implements the search processing method in the above-described method embodiment.
Memory 702 may include a program storage area and a data storage area. The program storage area may store an operating system and an application program required for at least one function; the data storage area may store data created by use of the electronic device, and the like. In addition, the memory 702 may include high-speed random access memory, and may also include non-transitory memory, such as at least one magnetic disk storage device, flash memory device, or other non-transitory solid-state storage device. In some embodiments, memory 702 may optionally include memory located remotely from processor 701, which may be connected to the electronic device via a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.
The electronic device may further include: an input device 703 and an output device 704. The processor 701, the memory 702, the input device 703 and the output device 704 may be connected by a bus or otherwise; connection by a bus is taken as an example in fig. 9.
The input device 703 may receive input numeric or character information and generate key signal inputs related to user settings and function control of the electronic device. Examples of the input device include a touch screen, keypad, mouse, trackpad, touchpad, pointer stick, one or more mouse buttons, trackball, joystick, and the like. The output device 704 may include a display device, auxiliary lighting devices (e.g., LEDs), haptic feedback devices (e.g., vibration motors), and the like. The display device may include, but is not limited to, a liquid crystal display (LCD), a light emitting diode (LED) display, and a plasma display. In some implementations, the display device may be a touch screen.
Various implementations of the systems and techniques described here can be realized in digital electronic circuitry, integrated circuitry, application-specific integrated circuits (ASICs), computer hardware, firmware, software, and/or combinations thereof. These various implementations may include implementation in one or more computer programs that are executable and/or interpretable on a programmable system including at least one programmable processor, which may be a special-purpose or general-purpose programmable processor that can receive data and instructions from, and transmit data and instructions to, a storage system, at least one input device, and at least one output device.
These computer programs (also referred to as programs, software applications, or code) include machine instructions for a programmable processor, and may be implemented in a high-level procedural and/or object-oriented programming language, and/or in assembly/machine language. As used herein, the terms "machine-readable medium" and "computer-readable medium" refer to any computer program product, apparatus, and/or device (e.g., magnetic disks, optical disks, memory, programmable logic devices (PLDs)) used to provide machine instructions and/or data to a programmable processor, including a machine-readable medium that receives machine instructions as a machine-readable signal. The term "machine-readable signal" refers to any signal used to provide machine instructions and/or data to a programmable processor.
To provide for interaction with a user, the systems and techniques described here can be implemented on a computer having: a display device (e.g., a CRT (cathode ray tube) or LCD (liquid crystal display) monitor) for displaying information to a user; and a keyboard and pointing device (e.g., a mouse or trackball) by which a user can provide input to the computer. Other kinds of devices may also be used to provide for interaction with a user; for example, feedback provided to the user may be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback); and input from the user may be received in any form, including acoustic input, speech input, or tactile input.
The systems and techniques described here can be implemented in a computing system that includes a back-end component (e.g., a data server), or that includes a middleware component (e.g., an application server), or that includes a front-end component (e.g., a user computer having a graphical user interface or a web browser through which a user can interact with an implementation of the systems and techniques described here), or any combination of such back-end, middleware, or front-end components. The components of the system can be interconnected by any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include: local area networks (LANs), wide area networks (WANs), and the internet.
The computer system may include a client and a server. The client and server are typically remote from each other and typically interact through a communication network. The relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other.
It should be appreciated that steps may be reordered, added, or deleted using the various forms of flow shown above. For example, the steps described in the present application may be performed in parallel, sequentially, or in a different order, provided that the desired results of the technical solutions disclosed in the present application can be achieved; no limitation is imposed herein.
The above embodiments do not limit the scope of the application. It will be apparent to those skilled in the art that various modifications, combinations, sub-combinations and alternatives are possible, depending on design requirements and other factors. Any modifications, equivalent substitutions and improvements made within the spirit and principles of the present application are intended to be included within the scope of the present application.

Claims (12)

1. A search processing method, comprising:
acquiring a text to be searched;
determining at least one candidate entity according to the text to be searched;
for each candidate entity, acquiring entity information corresponding to the candidate entity from the knowledge graph, and labeling the keywords in the text to be searched according to the entity information corresponding to the candidate entity to obtain a labeling result corresponding to the candidate entity; entity information corresponding to a candidate entity is used for indicating an attribute value corresponding to at least one knowledge attribute of the candidate entity; the labeling result is used for indicating knowledge attributes corresponding to the keywords in the text to be searched;
determining a target entity corresponding to the text to be searched according to the labeling result corresponding to each of the at least one candidate entity;
wherein the determining the target entity corresponding to the text to be searched according to the labeling result corresponding to each of the at least one candidate entity comprises:
determining, according to the labeling result corresponding to each candidate entity, the matching degree between the entity information corresponding to the candidate entity and the text to be searched;
and determining the candidate entity corresponding to the highest matching degree as the target entity.
2. The method of claim 1, wherein the labeling the keywords in the text to be searched according to the entity information corresponding to the candidate entity to obtain the labeling result corresponding to the candidate entity includes:
segmenting and matching the keywords in the text to be searched according to the attribute value corresponding to the at least one knowledge attribute of the candidate entity, and determining the knowledge attribute corresponding to each keyword in the text to be searched;
and generating the labeling result corresponding to the candidate entity according to the knowledge attribute corresponding to each keyword in the text to be searched.
3. The method of claim 2, wherein each of the at least one knowledge attribute corresponds to a priority; and the segmenting and matching the keywords in the text to be searched according to the attribute value corresponding to the at least one knowledge attribute of the candidate entity, and determining the knowledge attribute corresponding to each keyword in the text to be searched comprises:
segmenting and matching the keywords in the text to be searched according to the attribute value corresponding to the at least one knowledge attribute of the candidate entity and the priority, and determining the knowledge attribute corresponding to each keyword in the text to be searched.
4. The method of claim 2, further comprising, after the generating the labeling result corresponding to the candidate entity:
if the labeling result indicates that the knowledge attribute corresponding to the first keyword in the text to be searched is unknown, matching the first keyword according to a preset regular matching rule, and generating the knowledge attribute corresponding to the first keyword according to the matching result.
5. The method of any one of claims 1 to 4, wherein each of the at least one knowledge attribute corresponds to a weight coefficient; and the determining, according to the labeling result corresponding to each candidate entity, the matching degree between the entity information corresponding to the candidate entity and the text to be searched comprises:
determining the matching degree between the entity information corresponding to the candidate entity and the text to be searched according to the labeling result corresponding to the candidate entity and the weight coefficient corresponding to the at least one knowledge attribute of the candidate entity.
6. The method according to any one of claims 1 to 4, wherein said determining at least one candidate entity from said text to be searched comprises:
according to the entity name dictionary, carrying out matching processing on the text to be searched, and determining at least one candidate entity name from the text to be searched; the entity name dictionary comprises at least one entity name and at least one entity corresponding to each entity name;
and determining the entity corresponding to each candidate entity name as the at least one candidate entity.
7. The method of claim 6, further comprising, before the determining at least one candidate entity according to the text to be searched:
determining that the text to be searched comprises a first type keyword and a second type keyword according to the entity name dictionary; the first type of keywords are keywords matched with any entity name in the entity name dictionary, and the second type of keywords are keywords which are not matched with all entity names in the entity name dictionary.
8. The method of claim 6, wherein the knowledge graph is used to indicate an entity name and entity information corresponding to at least one entity; and the method further comprises:
generating the entity name dictionary according to the knowledge graph.
9. The method of claim 8, wherein generating the entity name dictionary from the knowledge-graph comprises:
adding entity names corresponding to the entities in the knowledge graph into the entity name dictionary;
mining entity names corresponding to the entities in the knowledge graph, and adding the mined entity names into the entity name dictionary; wherein the mining comprises at least one of: alias mining, transform mining, abbreviation mining, and error correction mining.
10. A search processing apparatus, comprising:
the acquisition module is used for acquiring the text to be searched;
the selection module is used for determining at least one candidate entity according to the text to be searched;
the labeling module is used for acquiring, for each candidate entity, entity information corresponding to the candidate entity from the knowledge graph, and labeling the keywords in the text to be searched according to the entity information corresponding to the candidate entity to obtain a labeling result corresponding to the candidate entity; entity information corresponding to a candidate entity is used for indicating an attribute value corresponding to at least one knowledge attribute of the candidate entity; the labeling result is used for indicating knowledge attributes corresponding to the keywords in the text to be searched;
the determining module is used for determining the target entity corresponding to the text to be searched according to the labeling result corresponding to each of the at least one candidate entity;
the determining module is specifically configured to determine, according to the labeling result corresponding to each candidate entity, the matching degree between the entity information corresponding to the candidate entity and the text to be searched; and determine the candidate entity corresponding to the highest matching degree as the target entity.
11. An electronic device, comprising:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein
the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the method of any one of claims 1 to 9.
12. A non-transitory computer readable storage medium storing computer instructions for causing a computer to perform the method of any one of claims 1 to 9.
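The flow recited in claims 1 to 6 can be pictured with the following minimal Python sketch: candidate entities are found by dictionary matching, the text is segmented and labeled per candidate in attribute priority order, unknown keywords fall back to a regular matching rule, and the candidate with the highest weighted matching degree is chosen. All data, names (`label_keywords`, `matching_degree`, `search`), the priority order, the weight values, the substring-based segmentation, and the scoring formula (a plain sum of weights) are illustrative assumptions; the claims do not fix any of them.

```python
import re

# Hypothetical knowledge graph content: entity id -> {knowledge attribute: attribute value}.
ENTITY_INFO = {
    "e1": {"name": "journey to the west", "type": "novel", "author": "wu cheng'en"},
    "e2": {"name": "journey to the west", "type": "tv series", "year": "1986"},
}
# Hypothetical entity name dictionary: entity name -> candidate entity ids.
NAME_DICTIONARY = {"journey to the west": ["e1", "e2"]}
PRIORITY = ["name", "type", "author", "year"]            # assumed attribute priority
WEIGHTS = {"name": 1.0, "type": 2.0, "author": 2.0, "year": 1.5, "episode": 0.5}
REGEX_RULES = [(re.compile(r"^episode \d+$"), "episode")]  # assumed regular matching rule


def determine_candidates(text):
    """Return the entities whose dictionary name occurs in the text."""
    found = []
    for name, entity_ids in NAME_DICTIONARY.items():
        if name in text:
            found.extend(e for e in entity_ids if e not in found)
    return found


def label_keywords(text, info):
    """Segment the text by one candidate's attribute values and label each keyword."""
    segments = [(text, None)]  # (keyword, knowledge attribute or None if unlabeled)
    for attribute in PRIORITY:
        value = info.get(attribute)
        if not value:
            continue
        next_segments = []
        for keyword, label in segments:
            if label is None and value in keyword:
                before, _, after = keyword.partition(value)
                next_segments += [(before, None), (value, attribute), (after, None)]
            else:
                next_segments.append((keyword, label))
        segments = next_segments
    result = []
    for keyword, label in segments:
        keyword = keyword.strip()
        if not keyword:
            continue
        if label is None:  # unknown keyword: try the regular matching rules
            label = next((a for rule, a in REGEX_RULES if rule.match(keyword)), "unknown")
        result.append((keyword, label))
    return result


def matching_degree(labels):
    """Assumed score: sum of the weights of the labeled attributes (unknown counts 0)."""
    return sum(WEIGHTS.get(attribute, 0.0) for _, attribute in labels)


def search(text):
    """Return the target entity id and the labeling result of every candidate."""
    text = text.lower()
    candidates = determine_candidates(text)
    if not candidates:
        return None, {}
    labels = {e: label_keywords(text, ENTITY_INFO[e]) for e in candidates}
    target = max(candidates, key=lambda e: matching_degree(labels[e]))
    return target, labels


target, labels = search("Journey to the West TV series 1986 episode 3")
```

For this query the sketch labels "tv series" and "1986" only under candidate "e2", so "e2" receives the higher matching degree and is selected, whereas under "e1" everything after the entity name remains a single unknown keyword.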
CN202010223795.8A 2020-03-26 2020-03-26 Search processing method, device and equipment Active CN111309872B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010223795.8A CN111309872B (en) 2020-03-26 2020-03-26 Search processing method, device and equipment

Publications (2)

Publication Number Publication Date
CN111309872A (en) 2020-06-19
CN111309872B (en) 2023-08-08

Family

ID=71157330

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010223795.8A Active CN111309872B (en) 2020-03-26 2020-03-26 Search processing method, device and equipment

Country Status (1)

Country Link
CN (1) CN111309872B (en)

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112905884B (en) * 2021-02-10 2024-05-31 北京百度网讯科技有限公司 Method, apparatus, medium and program product for generating sequence annotation model
CN113139033B (en) * 2021-05-13 2024-07-09 平安国际智慧城市科技股份有限公司 Text processing method, device, equipment and storage medium
CN116414998A (en) * 2022-01-05 2023-07-11 腾讯科技(深圳)有限公司 Resource feedback method, related device, equipment and storage medium
CN114741550B (en) * 2022-06-09 2023-02-10 腾讯科技(深圳)有限公司 Image searching method and device, electronic equipment and computer readable storage medium

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109388793A (en) * 2017-08-03 2019-02-26 阿里巴巴集团控股有限公司 Entity mask method, intension recognizing method and corresponding intrument, computer storage medium
WO2019057191A1 (en) * 2017-09-25 2019-03-28 腾讯科技(深圳)有限公司 Content retrieval method, terminal and server, electronic device and storage medium
CN109977233A (en) * 2019-03-15 2019-07-05 北京金山数字娱乐科技有限公司 A kind of idiom knowledge map construction method and device
CN109992689A (en) * 2019-03-26 2019-07-09 华为技术有限公司 Search method, terminal and medium
CN110245259A (en) * 2019-05-21 2019-09-17 北京百度网讯科技有限公司 Video tagging method and device based on knowledge graph, and computer readable medium
CN110516047A (en) * 2019-09-02 2019-11-29 湖南工业大学 Retrieval method and retrieval system based on knowledge graph in packaging field
CN110569367A (en) * 2019-09-10 2019-12-13 苏州大学 A method, device and device for spatial keyword query based on knowledge graph
CN110659366A (en) * 2019-09-24 2020-01-07 Oppo广东移动通信有限公司 Semantic analysis method and device, electronic equipment and storage medium

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7743046B2 (en) * 2005-04-20 2010-06-22 Tata Consultancy Services Ltd Cybernetic search with knowledge maps
US20180366013A1 (en) * 2014-08-28 2018-12-20 Ideaphora India Private Limited System and method for providing an interactive visual learning environment for creation, presentation, sharing, organizing and analysis of knowledge on subject matter
CN108268580A (en) * 2017-07-14 2018-07-10 广东神马搜索科技有限公司 The answering method and device of knowledge based collection of illustrative plates

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
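Multi-keyword streaming parallel retrieval algorithm based on an urban safety knowledge graph; 管健; 汪璟玢; 卞倩虹; Computer Science (计算机科学), No. 002; full text *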

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant