
CN114281919B - Node adding method, device, equipment and storage medium based on directory tree - Google Patents

Node adding method, device, equipment and storage medium based on directory tree

Info

Publication number
CN114281919B
CN114281919B (application CN202111095271.6A)
Authority
CN
China
Prior art keywords
vocabulary
node
matrix
text
target
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202111095271.6A
Other languages
Chinese (zh)
Other versions
CN114281919A (en)
Inventor
王苏羽晨
赵瑞辉
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tencent Technology Shenzhen Co Ltd
Original Assignee
Tencent Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tencent Technology Shenzhen Co Ltd filed Critical Tencent Technology Shenzhen Co Ltd
Priority to CN202111095271.6A priority Critical patent/CN114281919B/en
Publication of CN114281919A publication Critical patent/CN114281919A/en
Application granted granted Critical
Publication of CN114281919B publication Critical patent/CN114281919B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Landscapes

  • Machine Translation (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract


The present application discloses a directory-tree-based node adding method, apparatus, device, and storage medium. The technical solution provided by the embodiments of the present application can be applied in fields such as artificial intelligence and cloud technology. With this solution, when adding a node to a directory tree, any position in the directory tree can be selected as a candidate position, i.e., a position where the node may be added. During node addition, the paraphrase (definition) text of the target vocabulary is obtained, matching information is determined based on the paraphrase texts corresponding to the respective nodes, and the node is finally added at the candidate position based on the matching information. In this way, not only leaf nodes but also non-leaf nodes can be added to the directory tree, making the ways of expanding the directory tree more diverse, enriching the information in the directory tree, and broadening its scope of application.

Description

Node adding method, device, equipment and storage medium based on directory tree
Technical Field
The present application relates to the field of computer technologies, and in particular, to a method, an apparatus, a device, and a storage medium for adding nodes based on a directory tree.
Background
A directory tree is a knowledge graph with a hierarchical structure representing hypernym-hyponym relations. In the directory tree, each node corresponds to a vocabulary term (such as "fruit" or "apple"). For each edge in the directory tree, if there is an edge pointing from A to B, node A is the parent node of node B, indicating that A has a hypernym relationship with B; if A is "fruit" and B is "apple", this expresses "an apple is a fruit" or "fruit includes apple". As time goes by, the child nodes under a parent node may need to be expanded. For example, if the directory tree is a medical directory, then for a disease represented by a certain node, a new therapy may appear as medicine advances, and a node corresponding to the new therapy needs to be added under that node as a new child node.
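The hierarchical structure described above can be sketched as a minimal tree data type (the class and method names below are illustrative, not from the patent): each node stores a vocabulary term, and a parent-to-child edge encodes a hypernym-to-hyponym relation such as "fruit" → "apple".

```python
# Minimal directory-tree sketch: an edge parent -> child means the parent's
# word is a hypernym of the child's word.
class TreeNode:
    def __init__(self, word):
        self.word = word       # vocabulary term for this node
        self.children = []     # hyponym (child) nodes

    def add_child(self, child):
        self.children.append(child)
        return child

fruit = TreeNode("fruit")
apple = fruit.add_child(TreeNode("apple"))
print([c.word for c in fruit.children])  # ['apple']
```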
In the related art, the directory tree is often regarded as a graph network, and nodes are added to the directory tree by graph convolution to expand it. However, the graph-convolution-based method can only add leaf nodes (nodes without child nodes) to the directory tree; it cannot add non-leaf nodes, and therefore cannot fully expand the directory tree.
Disclosure of Invention
The embodiments of the present application provide a directory-tree-based node adding method, apparatus, device, and storage medium, which can fully expand a directory tree. The technical scheme is as follows:
in one aspect, there is provided a node adding method based on a directory tree, the method comprising:
Determining candidate positions in a directory tree, wherein the directory tree comprises a plurality of nodes, the nodes respectively correspond to a plurality of words, the candidate positions are positions between a first node and a second node in the directory tree, and a first word corresponding to the first node is an upper word of a second word corresponding to the second node;
Acquiring a paraphrase text of a target word, wherein the paraphrase text of the target word is used for explaining the target word;
Determining matching information between the target vocabulary and the candidate position based on the paraphrasing text of the target vocabulary, the paraphrasing text of the first vocabulary, the paraphrasing text of the second vocabulary and the paraphrasing texts corresponding to a plurality of sub-nodes of the first node, wherein the matching information is used for representing the matching degree between the target vocabulary and the candidate position;
and responding to the matching information meeting a target condition, and adding nodes corresponding to the target vocabulary in the candidate positions.
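The four claimed steps can be sketched as follows. This is a hedged outline only: `get_definition`, `match_score`, and the threshold-based target condition are illustrative stand-ins for the paraphrase-text lookup and the learned matching model described later in the document.

```python
# Sketch of the claimed flow: enumerate candidate (parent, child) edge
# positions, score each with a matching function over definition texts,
# and report the position whose score meets the target condition.
def add_node(tree_edges, target_word, get_definition, match_score, threshold=0.5):
    """tree_edges: iterable of (first_node, second_node) hypernym edges."""
    target_def = get_definition(target_word)            # step 2: paraphrase text
    best_pos, best_score = None, -1.0
    for first, second in tree_edges:                    # step 1: candidate positions
        score = match_score(target_def, first, second)  # step 3: matching info
        if score > best_score:
            best_pos, best_score = (first, second), score
    return best_pos if best_score >= threshold else None  # step 4: target condition

def toy_score(target_def, first, second):
    # Placeholder for the learned model: 1.0 if the candidate parent word
    # appears in the target's definition, else 0.0.
    return 1.0 if first in target_def else 0.0

pos = add_node([("fruit", "apple"), ("plant", "flower")], "citrus",
               lambda w: "a kind of fruit", toy_score)
# pos == ("fruit", "apple")
```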
In one aspect, there is provided a node adding apparatus based on a directory tree, the apparatus comprising:
The candidate position determining module is used for determining candidate positions in a directory tree, the directory tree comprises a plurality of nodes, the nodes respectively correspond to a plurality of words, the candidate positions are positions between a first node and a second node in the directory tree, and a first word corresponding to the first node is an upper word of a second word corresponding to the second node;
The paraphrasing text acquisition module is used for acquiring the paraphrasing text of the target vocabulary, wherein the paraphrasing text of the target vocabulary is used for explaining the target vocabulary;
The matching information determining module is used for determining matching information between the target vocabulary and the candidate position based on the paraphrasing text of the target vocabulary, the paraphrasing text of the first vocabulary, the paraphrasing text of the second vocabulary and the paraphrasing texts corresponding to a plurality of child nodes of the first node, wherein the matching information is used for representing the matching degree between the target vocabulary and the candidate position;
and the node adding module is used for responding to the matching information meeting a target condition and adding the node corresponding to the target vocabulary in the candidate position.
In one possible implementation manner, the paraphrase text obtaining module is configured to query in a paraphrase text database by using the target vocabulary to obtain a paraphrase text of the target vocabulary, where the paraphrase text database stores a plurality of vocabularies and paraphrase texts corresponding to the plurality of vocabularies respectively.
In one possible implementation manner, the paraphrase text acquisition module is configured to query the paraphrase text database with the target vocabulary; when a plurality of sense items corresponding to the target vocabulary exist in the paraphrase text database, acquire the semantic similarity between the paraphrase text of each sense item and the paraphrase texts corresponding to the plurality of nodes; and determine the paraphrase text of a first sense item as the paraphrase text of the target vocabulary, where the first sense item is the sense item whose semantic similarity with the paraphrase text corresponding to a reference node meets a first similarity condition, and the reference node is the node whose corresponding paraphrase text has a semantic similarity with the target vocabulary that meets a second similarity condition.
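The sense-disambiguation step above can be sketched as follows. The function names and the word-overlap similarity are placeholders: the patent leaves the similarity measure open, so any sentence-similarity function could be substituted.

```python
# When the target word has several dictionary senses, pick the sense whose
# definition is most similar to the definitions of nodes near the candidate
# position.
def pick_sense(senses, context_defs, similarity):
    """senses: candidate definition strings; context_defs: neighbor definitions."""
    def score(sense):
        return max(similarity(sense, d) for d in context_defs)
    return max(senses, key=score)

def jaccard(a, b):
    # Simple word-overlap similarity as a stand-in for a learned measure.
    wa, wb = set(a.split()), set(b.split())
    return len(wa & wb) / len(wa | wb)

best = pick_sense(
    ["a hard disk drive", "a round fruit of the apple tree"],
    ["fruit is an edible plant product"],
    jaccard,
)
# best == "a round fruit of the apple tree"
```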
In one possible implementation manner, the matching information determining module is configured to input the paraphrase text of the target vocabulary, the paraphrase text of the first vocabulary, the paraphrase text of the second vocabulary, and the paraphrase texts corresponding to the plurality of child nodes of the first node into a matching information determining model, and output matching information between the target vocabulary and the candidate position through the matching information determining model.
In a possible implementation manner, the matching information determining module is configured to perform the following steps through the matching information determining model:
acquiring a first relation feature based on the paraphrasing text of the target vocabulary and the paraphrasing text of the first vocabulary, wherein the first relation feature is used for representing whether the first vocabulary is an upper word of the target vocabulary or not;
acquiring second relation features based on the paraphrasing text of the target vocabulary and the paraphrasing text of the second vocabulary, wherein the second relation features are used for representing whether the target vocabulary is an upper word of the second vocabulary or not;
Determining a first child node and a second child node from the plurality of child nodes based on the paraphrasing text of the target vocabulary and the paraphrasing text corresponding to the plurality of child nodes, wherein the first child node is the child node with the highest semantic similarity between the corresponding paraphrasing text and the target vocabulary, and the second child node is the child node with the lowest semantic similarity between the corresponding paraphrasing text and the target vocabulary;
And outputting matching information between the target vocabulary and the candidate position based on the first relation feature, the second relation feature, the paraphrase text of the target vocabulary, the paraphrase text corresponding to the first sub-node and the paraphrase text corresponding to the second sub-node.
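The child-selection step above (keeping only the most similar and least similar children) can be sketched as follows; the shared-word similarity is an illustrative placeholder for the semantic similarity the model actually computes.

```python
# Among the parent's children, keep the child whose definition is most similar
# to the target word's definition and the one whose definition is least
# similar; only these two extremes feed the final matching score.
def pick_extreme_children(target_def, child_defs, similarity):
    """child_defs: dict mapping child word -> definition text."""
    ranked = sorted(child_defs, key=lambda c: similarity(target_def, child_defs[c]))
    return ranked[-1], ranked[0]   # (most similar, least similar)

def overlap(a, b):
    # Placeholder similarity: number of shared words.
    return len(set(a.split()) & set(b.split()))

best_child, worst_child = pick_extreme_children(
    "a sweet red fruit",
    {"apple": "a sweet round fruit", "fern": "a green leafy plant"},
    overlap,
)
# best_child == "apple", worst_child == "fern"
```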
In one possible implementation manner, the matching information determining module is configured to encode the paraphrase text of the target vocabulary based on an attention mechanism to obtain a paraphrase matrix of the target vocabulary, where the paraphrase matrix of the target vocabulary is used to represent the paraphrase text of the target vocabulary, encode the paraphrase text of the first vocabulary based on the attention mechanism to obtain a paraphrase matrix of the first vocabulary, where the paraphrase matrix of the first vocabulary is used to represent the paraphrase text of the first vocabulary, and obtain the first relational feature based on the paraphrase matrix of the target vocabulary and the paraphrase matrix of the first vocabulary.
In one possible implementation manner, the matching information determining module is configured to encode the paraphrase matrix of the target vocabulary with a plurality of encoding vectors to obtain a representation matrix of the target vocabulary, where the plurality of encoding vectors are used to adjust dimensions of the matrix, encode the paraphrase matrix of the first vocabulary with the plurality of encoding vectors to obtain a representation matrix of the first vocabulary, where the representation matrix of the target vocabulary has the same dimensions as the representation matrix of the first vocabulary, and encode the representation matrix of the target vocabulary and the representation matrix of the first vocabulary based on an attention mechanism to obtain the first relational feature.
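The dimension-adjusting step can be sketched in pure Python as follows (shapes and the attention form are illustrative assumptions): a variable-length paraphrase matrix, one row per token, is compressed with k coding vectors into a fixed k-row representation, so two words with definitions of different lengths become directly comparable.

```python
import math

def softmax(xs):
    m = max(xs)
    es = [math.exp(x - m) for x in xs]
    s = sum(es)
    return [e / s for e in es]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def fix_rows(paraphrase_matrix, coding_vectors):
    """Each coding vector attends over the token rows, yielding k fixed rows."""
    out = []
    for q in coding_vectors:
        weights = softmax([dot(q, row) for row in paraphrase_matrix])
        out.append([sum(w * row[j] for w, row in zip(weights, paraphrase_matrix))
                    for j in range(len(paraphrase_matrix[0]))])
    return out

tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # 3 tokens, dimension 2
codes = [[1.0, 0.0], [0.0, 1.0]]               # k = 2 coding vectors
rep = fix_rows(tokens, codes)                  # always 2 x 2, for any token count
```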
In one possible implementation manner, the matching information determining module is configured to obtain a third relationship feature based on the paraphrase text of the target vocabulary and the paraphrase text corresponding to the first child node, where the third relationship feature is used to represent whether the target vocabulary is a similar word of the vocabulary corresponding to the first child node, obtain a fourth relationship feature based on the paraphrase text of the target vocabulary and the paraphrase text corresponding to the second child node, and output matching information between the target vocabulary and the candidate position based on the first relationship feature, the second relationship feature, the third relationship feature, and the fourth relationship feature.
In one possible implementation, the matching information determining module is configured to encode a paraphrase text of the target vocabulary based on an attention mechanism to obtain a paraphrase matrix of the target vocabulary, where the paraphrase matrix of the target vocabulary is used to represent the paraphrase text of the target vocabulary, encode the paraphrase text corresponding to the first sub-node based on the attention mechanism to obtain a paraphrase matrix of the first sub-vocabulary, where the paraphrase matrix of the first sub-vocabulary is used to represent the paraphrase text corresponding to the first sub-node, and obtain the third relationship feature based on the paraphrase matrix of the target vocabulary and the paraphrase matrix of the first sub-vocabulary.
In one possible implementation manner, the matching information determining module is configured to encode the paraphrase matrix of the target vocabulary with a plurality of encoding vectors to obtain a representation matrix of the target vocabulary, where the plurality of encoding vectors are used to adjust dimensions of the matrix, encode the paraphrase matrix of the first sub-vocabulary with the plurality of encoding vectors to obtain a representation matrix of the first sub-vocabulary, where the representation matrix of the target vocabulary and the representation matrix of the first sub-vocabulary have the same dimensions, and encode the representation matrix of the target vocabulary and the representation matrix of the first sub-vocabulary based on an attention mechanism to obtain the third relational feature.
In a possible implementation manner, the matching information determining module is configured to splice the first relationship feature, the second relationship feature, the third relationship feature and the fourth relationship feature into a feature matrix, and perform full connection and normalization on the feature matrix to output matching information between the target vocabulary and the candidate position.
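The scoring head above (concatenate the four relation features, apply a fully connected layer, normalize) can be sketched as follows. The weights, bias, and sigmoid choice are toy stand-ins; in the patent these parameters are learned.

```python
import math

def matching_score(features, weights, bias):
    """features: concatenated relation features; returns a score in (0, 1)."""
    z = sum(w * f for w, f in zip(weights, features)) + bias  # fully connected
    return 1.0 / (1.0 + math.exp(-z))                         # normalization

f = [0.9, 0.8, 0.7, 0.1]   # first..fourth relation features (toy values)
w = [1.0, 1.0, 1.0, -1.0]  # toy "learned" weights
score = matching_score(f, w, 0.0)
```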
In one possible embodiment, the apparatus further comprises:
The training module is configured to acquire a sample node and a plurality of sample candidate positions from the directory tree, the sample node being a node other than the root node in the directory tree; input the sample node and the plurality of sample candidate positions into the matching information determination model; output, through the matching information determination model, predicted matching information between the sample node and the plurality of sample candidate positions; and adjust model parameters of the matching information determination model based on difference information between the predicted matching information and target matching information, the target matching information being the matching information between the sample node and the actual position of the sample node in the directory tree.
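The self-supervised signal described above can be sketched as follows (names and the squared loss are illustrative): a non-root node's true position yields a positive label, other candidate positions yield negatives, and the loss compares predicted matching scores against these labels.

```python
# Label 1.0 at the sample node's actual position, 0.0 at other candidates;
# the difference between predicted and target matching information drives
# the parameter update.
def training_pairs(sample_node, true_position, candidate_positions):
    return [(sample_node, pos, 1.0 if pos == true_position else 0.0)
            for pos in candidate_positions]

def squared_loss(predicted, target):
    return (predicted - target) ** 2

pairs = training_pairs("apple", ("fruit", None), [("fruit", None), ("plant", None)])
# pairs[0] is the positive example, pairs[1] the negative
```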
In one possible implementation manner, the paraphrase text obtaining module is further configured to: for a first sub-node of the first node, query the paraphrase text database with the first sub-word corresponding to the first sub-node; when a plurality of sense items corresponding to the first sub-word exist in the paraphrase text database, acquire the semantic similarity between each sense item and the paraphrase text of the first word; and determine the paraphrase text of a second sense item as the paraphrase text of the first sub-word, where the semantic similarity between the second sense item and the paraphrase text of the first word meets a third similarity condition.
In one aspect, a computer device is provided that includes one or more processors and one or more memories having at least one computer program stored therein, the computer program loaded and executed by the one or more processors to implement the directory tree based node addition method.
In one aspect, a computer readable storage medium having at least one computer program stored therein is provided, the computer program being loaded and executed by a processor to implement the directory tree based node addition method.
In one aspect, a computer program product or a computer program is provided, the computer program product or computer program comprising a program code, the program code being stored in a computer readable storage medium, the program code being read from the computer readable storage medium by a processor of a computer device, the program code being executed by the processor, causing the computer device to perform the above-mentioned directory tree based node addition method.
By the technical scheme provided by the embodiment of the application, when the node is added into the directory tree, any position in the directory tree can be selected as a candidate position, and the candidate position is the position where the node is possibly added. In the node adding process, the paraphrase text of the target vocabulary is obtained, the matching information is determined based on the paraphrase text corresponding to each node, and finally, the node is added at the candidate position based on the matching information, so that not only can the leaf node be added in the directory tree, but also the non-leaf node can be added in the directory tree, the expansion mode of the directory tree is more various, the information richness in the directory tree can be improved, and the application range of the directory tree is enlarged.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present application, the drawings required for the description of the embodiments will be briefly described below, and it is apparent that the drawings in the following description are only some embodiments of the present application, and other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
FIG. 1 is a schematic diagram of an implementation environment of a node addition method based on a directory tree according to an embodiment of the present application;
FIG. 2 is a flow chart of a method for adding nodes based on a directory tree according to an embodiment of the present application;
FIG. 3 is a flow chart of a method for adding nodes based on a directory tree according to an embodiment of the present application;
FIG. 4 is a schematic diagram of a directory tree according to an embodiment of the present application;
FIG. 5 is a logical block diagram of acquiring paraphrasing text provided by an embodiment of the present application;
FIG. 6 is a schematic structural diagram of a matching information determination model according to an embodiment of the present application;
FIG. 7 is a flowchart of a training method of a matching information determination model according to an embodiment of the present application;
Fig. 8 is a schematic diagram of a node adding device based on a directory tree according to an embodiment of the present application;
fig. 9 is a schematic structural diagram of a terminal according to an embodiment of the present application;
Fig. 10 is a schematic structural diagram of a server according to an embodiment of the present application.
Detailed Description
For the purpose of making the objects, technical solutions and advantages of the present application more apparent, the following detailed description of the embodiments of the present application will be given with reference to the accompanying drawings.
The terms "first," "second," and the like in the present application are used to distinguish identical or similar items whose functions are substantially the same. It should be understood that "first," "second," and "nth" imply no logical or chronological dependency and place no limit on the number of items or the order of execution.
The term "at least one" in the present application means one or more, "a plurality" means two or more, for example, a plurality of reference face images means two or more reference face images.
Artificial intelligence (AI) is the theory, method, technique, and application system that uses a digital computer, or a machine controlled by a digital computer, to simulate, extend, and expand human intelligence, perceive the environment, acquire knowledge, and use that knowledge to obtain optimal results.
Semantic features: features used to represent the semantics expressed by a text. Different texts may correspond to the same semantic features; for example, the text "How is the weather today" and the text "What's the weather like today" may correspond to the same semantic feature. The computer device may map the characters in a text to character vectors, then combine and operate on the character vectors according to the relationships between the characters to obtain the semantic features of the text. For example, the computer device may employ Bidirectional Encoder Representations from Transformers (BERT).
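A toy illustration of this idea: a sentence encoder such as BERT maps each text to a vector, and texts with the same meaning land close together. The vectors below are made up for illustration; a real encoder would produce them.

```python
import math

def cosine(u, v):
    num = sum(a * b for a, b in zip(u, v))
    den = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return num / den

v_a = [0.9, 0.1, 0.2]   # "How is the weather today?"  (made-up embedding)
v_b = [0.8, 0.2, 0.1]   # "What's the weather like today?"
v_c = [0.1, 0.9, 0.3]   # "Book a table for two"
# The two weather sentences are closer to each other than to the third text.
```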
Normalization: mapping sequences with different value ranges to the (0, 1) interval to facilitate data processing. In some cases, the normalized value can be used directly as a probability.
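A minimal example of this definition is min-max scaling (one common normalization among several; the extremes here land on the closed endpoints 0 and 1):

```python
# Min-max scaling: map arbitrary-range values into the unit interval.
def min_max(values):
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]

scores = min_max([2.0, 5.0, 8.0])   # [0.0, 0.5, 1.0]
```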
Attention weight: represents the importance of certain data in the training or prediction process, where importance refers to the magnitude of the influence of input data on output data. Data of high importance receives a higher attention weight value, and data of low importance a lower one. The importance of data differs across scenarios; training a model's attention weights is the process of determining the importance of the data.
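A tiny illustration of attention weights (a softmax over raw importance scores, which is one standard formulation): the weights sum to 1, so more important inputs contribute more to the output.

```python
import math

def attention_weights(scores):
    # Softmax: exponentiate and renormalize so the weights sum to 1.
    es = [math.exp(s) for s in scores]
    total = sum(es)
    return [e / total for e in es]

w = attention_weights([2.0, 1.0, 0.1])
# Higher raw score -> higher weight: w[0] > w[1] > w[2]
```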
Hypernym: a word that is conceptually broader than another. For example, "fruit" is the hypernym of "apple", and "plant" is the hypernym of "flower". Correspondingly, a hyponym is a word that is conceptually narrower.
Similar words: a group of words related in pronunciation, semantics, structure, origin, or word-formation material. For example, similar words representing colors include "red", "yellow", "orange", "cyan", and "green", and similar words representing character types include "pictographic characters", "ideographic characters", and "alphabetic characters".
Fig. 1 is a schematic diagram of an implementation environment of a node adding method based on a directory tree according to an embodiment of the present application, and referring to fig. 1, the implementation environment may include a terminal 110 and a server 140.
Terminal 110 is connected to server 140 via a wireless network or a wired network. Alternatively, the terminal 110 is a smart phone, a tablet computer, a notebook computer, a desktop computer, or the like, but is not limited thereto. Terminal 110 installs and runs an application that supports adding nodes in the directory tree.
The server 140 is an independent physical server, a server cluster or distributed system formed by a plurality of physical servers, or a cloud server providing basic cloud computing services such as cloud services, cloud databases, cloud computing, cloud functions, cloud storage, network services, cloud communication, middleware services, domain name services, security services, a content delivery network (CDN), big data, and an artificial intelligence platform. Server 140 provides background services for applications running on terminal 110.
Those skilled in the art will recognize that the number of terminals may be greater or smaller; for example, the implementation environment may include only one terminal, or tens or hundreds of terminals, or more. The embodiments of the present application do not limit the number or type of terminals.
After the implementation environment of the embodiment of the present application is introduced, the application scenario of the node adding method based on the directory tree provided by the embodiment of the present application is described below, and in the following description, the terminal is the terminal 110 in the implementation environment, and the server is the server 140 in the implementation environment.
The node adding method based on the directory tree can be applied to the scene of adding nodes for various directory trees, such as the scene of adding nodes for medical directory trees, the scene of adding nodes for news directory trees, the scene of adding nodes for commodity directory trees, or the scene of adding nodes for scientific directory trees.
In the scenario of adding nodes to the medical directory tree, the vocabulary corresponding to the nodes in the medical directory tree is vocabulary related to medical treatment, such as disease names, medicine names, treatment method names, or treatment equipment names. In the medical directory tree, a disease name often serves as a root node, and the child nodes below the root node indicate drug names or treatment methods related to that disease. Storing different vocabularies in the medical directory tree through hypernym-hyponym relations makes it convenient to quickly return corresponding information, such as medicines and treatment methods, when a certain disease is queried. With the development of medical technology, new medicines, treatment methods, and treatment equipment for diseases continue to appear; accordingly, the medical directory tree often needs to be updated, and updating the medical directory tree is precisely the process of adding new nodes to it. For example, suppose a root node A exists in the medical directory tree, the vocabulary corresponding to the root node A is "disease X", a child node B exists under the root node A, and the vocabulary corresponding to the child node B is the "medical equipment E" used for treating "disease X". As medical technology develops, a method for treating "disease X" with "medical equipment E", called "treatment method M", emerges, so a node corresponding to "treatment method M" needs to be added to the medical directory tree so that content related to "treatment method M" can be found in subsequent queries.
In this case, the terminal uploads the vocabulary "treatment method M" to the server, and the server adds a node corresponding to the vocabulary "treatment method M" to the medical directory tree through the directory-tree-based node adding method provided by the embodiments of the present application. When the server adds a node to the medical directory tree, it can determine a plurality of candidate positions in the medical directory tree, that is, positions where the node corresponding to the vocabulary "treatment method M" may be added. For example, one candidate position is a position between a first node and a second node in the medical directory tree, where the first vocabulary corresponding to the first node is an upper word of the second vocabulary corresponding to the second node. Continuing the above example, the first vocabulary is "disease X", the first node is the node corresponding to "disease X", the second vocabulary is "medical equipment E", and the second node is the node corresponding to "medical equipment E". The server acquires the paraphrase text of the vocabulary "treatment method M" and determines matching information between the vocabulary "treatment method M" and the candidate position based on the paraphrase text of "treatment method M", the paraphrase text of the first vocabulary "disease X", the paraphrase text of the second vocabulary "medical equipment E", and the paraphrase texts corresponding to a plurality of child nodes of the first node.
If the matching information indicates a high matching degree between the vocabulary "treatment method M" and the candidate position, the server can add a node corresponding to the vocabulary "treatment method M" (denoted "node N") at the candidate position. That is, the server adds a directed edge between the first node and node N, pointing from the first node to node N and indicating that the first vocabulary corresponding to the first node is an upper word of the vocabulary "treatment method M" corresponding to node N; and adds a directed edge between node N and the second node, pointing from node N to the second node and indicating that the vocabulary "treatment method M" corresponding to node N is an upper word of the second vocabulary corresponding to the second node. After the medical directory tree is updated in this way, the vocabulary "treatment method M" can be queried in the medical directory tree, and "treatment method M" can also be found by querying "disease X", thereby completing the update of the medical directory tree.
In the scenario of adding nodes to the news directory tree, the vocabulary corresponding to the nodes in the news directory tree is vocabulary related to news types. In the news directory tree, a news type is often taken as a root node, and the child nodes below the root node indicate subtypes of the news type corresponding to the root node. Storing different vocabularies in the news directory tree through hypernym-hyponym relations makes it convenient to quickly return the corresponding news type when a certain news story is queried, or to search for news stories by type. Over time, news types may be continually subdivided; accordingly, the news directory tree often needs to be updated, i.e., new nodes need to be added to it. For example, there is a node C in the news directory tree whose corresponding word is "football", and under node C there is a child node D whose corresponding word is the team name "team T". If "team T" joins the newly established "tournament K", then a node corresponding to "tournament K" needs to be added to the news directory tree so that content related to "tournament K" can be found in subsequent queries. In this case, the terminal uploads the word "tournament K" to the server, and the server adds the node corresponding to "tournament K" to the news directory tree through the directory-tree-based node adding method provided by the embodiments of the present application. When the server adds a node to the news directory tree, it can determine a plurality of candidate positions, i.e., positions where the node corresponding to the word "tournament K" may be added.
For example, continuing the above example, one candidate position is the position between a first node and a second node in the news directory tree, where the first vocabulary corresponding to the first node is a hypernym of the second vocabulary corresponding to the second node; the first vocabulary is "football" and the first node is the node corresponding to it, while the second vocabulary is "team T" and the second node is the node corresponding to it. The server acquires the paraphrase text of the word "tournament K" and determines matching information between the word "tournament K" and the candidate position based on the paraphrase text of "tournament K", the paraphrase text of the first word "football", the paraphrase text of the second word "team T", and the paraphrase texts corresponding to a plurality of child nodes of the first node. If the matching information indicates a high matching degree between the vocabulary "tournament K" and the candidate location, the server can add a node corresponding to the vocabulary "tournament K", denoted node N, at the candidate location. Adding the node at the candidate location means adding, in the news directory tree, a directed edge between the first node and node N, where the directed edge points from the first node to node N and indicates that the first vocabulary corresponding to the first node is a hypernym of the vocabulary "tournament K" corresponding to node N, and adding a directed edge between node N and the second node, where the directed edge points from node N to the second node and indicates that the vocabulary "tournament K" corresponding to node N is a hypernym of the second vocabulary corresponding to the second node.
After the news directory tree is updated in this way, the word "tournament K" can be queried in the news directory tree, and "tournament K" can also be found by querying "football", completing the update of the news directory tree.
In the scenario of adding nodes to the commodity directory tree, the vocabularies corresponding to the nodes in the commodity directory tree are vocabularies related to commodity types. In the commodity directory tree, a commodity type is often taken as a root node, and the sub-nodes below the root node indicate the sub-types under the commodity type corresponding to the root node. Different vocabularies are stored in the commodity directory tree through the hypernym-hyponym relationship, so that the corresponding commodity type can be conveniently and quickly given when a certain commodity is queried, or the commodity can be searched by type. Over time, commodity types are continuously subdivided, and accordingly the commodity directory tree often needs to be updated, i.e., new nodes need to be added to it. For example, there is a node E in the commodity directory tree whose corresponding vocabulary is "beverage", and a sub-node F under node E whose corresponding vocabulary is the beverage name "latte". If the vocabulary "soft drink" becomes popular and "latte" belongs to "soft drink", then a node corresponding to "soft drink" needs to be added to the commodity directory tree so that content related to "soft drink" can be queried in subsequent query processes. In this case, the terminal uploads the vocabulary "soft drink" to the server, and the server adds the node corresponding to the vocabulary "soft drink" to the commodity directory tree through the directory-tree-based node adding method provided by the embodiment of the application. When the server adds a node to the commodity directory tree, a plurality of candidate positions can be determined in the commodity directory tree, where the candidate positions are positions at which the node corresponding to the vocabulary "soft drink" may be added.
For example, continuing the above example, one candidate position is the position between a first node and a second node in the commodity directory tree, where the first vocabulary corresponding to the first node is a hypernym of the second vocabulary corresponding to the second node; the first vocabulary is "beverage" and the first node is the node corresponding to it, while the second vocabulary is "latte" and the second node is the node corresponding to it. The server acquires the paraphrase text of the vocabulary "soft drink" and determines matching information between the vocabulary "soft drink" and the candidate position based on the paraphrase text of "soft drink", the paraphrase text of the first vocabulary "beverage", the paraphrase text of the second vocabulary "latte", and the paraphrase texts corresponding to the plurality of child nodes of the first node. If the matching information indicates a high matching degree between the vocabulary "soft drink" and the candidate position, the server can add a node corresponding to the vocabulary "soft drink", denoted node N, at the candidate position. Adding the node at the candidate position means adding, in the commodity directory tree, a directed edge between the first node and node N, where the directed edge points from the first node to node N and indicates that the first vocabulary corresponding to the first node is a hypernym of the vocabulary "soft drink" corresponding to node N, and adding a directed edge between node N and the second node, where the directed edge points from node N to the second node and indicates that the vocabulary "soft drink" corresponding to node N is a hypernym of the second vocabulary corresponding to the second node.
After the commodity directory tree is updated in this way, the vocabulary "soft drink" can be queried in the commodity directory tree, and "soft drink" can also be found by querying "beverage", completing the update of the commodity directory tree.
In the above description, the node adding method based on the directory tree provided by the embodiment of the present application is described by taking as an example that the node adding method based on the directory tree is applied to a scenario of adding nodes to a medical directory tree, a scenario of adding nodes to a news directory tree, and a scenario of adding nodes to a commodity directory tree, respectively.
In the embodiment of the application, as described in the application scenario, the terminal can upload the vocabulary to the server, and the server adds the node corresponding to the vocabulary into the directory tree. Or the terminal may directly add the node corresponding to the vocabulary to the directory tree, or the server may obtain the vocabulary, and based on the vocabulary, add the corresponding node to the directory tree, which is not limited in the embodiment of the present application.
After the implementation environment and the application scenario of the embodiment of the present application are introduced, the node adding method based on the directory tree provided by the embodiment of the present application is described below.
Fig. 2 is a flowchart of a node adding method based on a directory tree according to an embodiment of the present application, taking an execution body as a terminal, referring to fig. 2, the method includes:
201. The terminal determines candidate positions in a directory tree, wherein the directory tree comprises a plurality of nodes, the nodes respectively correspond to a plurality of words, the candidate positions are positions between a first node and a second node in the directory tree, and the first word corresponding to the first node is an upper word of the second word corresponding to the second node.
The candidate positions are positions in the directory tree where nodes may be added, and the first node and the second node are used to represent a candidate position in the directory tree. Because the first vocabulary corresponding to the first node is the hypernym of the second vocabulary corresponding to the second node, the first node is the parent node of the second node; in other words, a parent-child relationship exists between the first node and the second node. In some embodiments, when a terminal adds a node to a directory tree, a plurality of candidate locations can be obtained, and the terminal processes each of the plurality of candidate locations in the same way, so the processing of the plurality of candidate locations belongs to the same inventive concept.
202. The terminal acquires the paraphrase text of the target vocabulary, and the paraphrase text of the target vocabulary is used for explaining the target vocabulary.
Paraphrase text is text used to interpret the meaning of a vocabulary. For some more obscure vocabularies, the paraphrase text can express the obscure vocabulary as a combination of simple words. For example, for the term "Fourier transform", its paraphrase text is "a transformation method that represents a function satisfying certain conditions as a linear combination of trigonometric functions (sine and/or cosine functions) or their integrals"; that is, the paraphrase text disassembles an obscure, complex word into an easy-to-understand combination of simple words.
203. The terminal determines matching information between the target vocabulary and the candidate position based on the paraphrasing text of the target vocabulary, the paraphrasing text of the first vocabulary, the paraphrasing text of the second vocabulary and the paraphrasing text corresponding to a plurality of child nodes of the first node, wherein the matching information is used for representing the matching degree between the target vocabulary and the candidate position.
For a plurality of sub-nodes of the first node, the first vocabulary corresponding to the first node is the upper word of the sub-vocabulary corresponding to the plurality of sub-nodes respectively.
204. In response to the matching information meeting the target condition, the terminal adds the node corresponding to the target vocabulary at the candidate position.
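The edge additions described in step 204 and in the scenarios above can be sketched as follows. This is a minimal illustrative sketch, not the embodiment's implementation; the class and method names are invented, and, as in the text, only the two new directed edges are added (the existing edge between the first and second node is left untouched).

```python
class DirectoryTree:
    """Toy directory tree: nodes plus directed (parent, child) edges."""

    def __init__(self):
        self.nodes = set()
        self.edges = set()  # directed edges, each pointing parent -> child

    def add_edge(self, parent, child):
        self.nodes.update((parent, child))
        self.edges.add((parent, child))

    def insert_at_candidate(self, first_node, second_node, new_node):
        # first_node -> new_node: the first vocabulary is a hypernym of
        # the target vocabulary; new_node -> second_node: the target
        # vocabulary is a hypernym of the second vocabulary.
        self.add_edge(first_node, new_node)
        self.add_edge(new_node, second_node)


tree = DirectoryTree()
tree.add_edge("disease X", "drug W")  # existing parent-child edge
tree.insert_at_candidate("disease X", "drug W", "treatment method M")
print(("disease X", "treatment method M") in tree.edges)  # True
print(("treatment method M", "drug W") in tree.edges)     # True
```

After the insertion, "treatment method M" is reachable from "disease X", matching the query behavior described for the medical directory tree.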
Through the technical scheme provided by the embodiment of the application, when a node is added to the directory tree, any position in the directory tree can be selected as a candidate position, the candidate position being a position where the node may be added. In the node adding process, the paraphrase text of the target vocabulary is obtained, the matching information is determined based on the paraphrase texts corresponding to the respective nodes, and finally the node is added at the candidate position based on the matching information. In this way, not only leaf nodes but also non-leaf nodes can be added to the directory tree, so that the directory tree can be expanded in more diverse ways, the richness of the information in the directory tree can be improved, and the application range of the directory tree is enlarged.
The foregoing steps 201 to 204 are a brief description of the directory-tree-based node adding method provided by the embodiment of the present application; the method will be described in more detail below with reference to some examples. It should be noted that, when a terminal adds a node to a directory tree by using the directory-tree-based node adding method provided by the embodiment of the present application, a plurality of candidate positions are determined, matching information between each of the plurality of candidate positions and the target vocabulary is obtained, and the position at which the node corresponding to the target vocabulary is added to the directory tree is determined based on that matching information. Since the methods by which the terminal determines the matching information between each candidate position and the target vocabulary belong to the same inventive concept, the following description takes the matching information between one candidate position and the target vocabulary as an example. Also taking the execution body as a terminal for example, referring to fig. 3, the method includes:
301. The terminal determines candidate positions in a directory tree, wherein the directory tree comprises a plurality of nodes, the nodes respectively correspond to a plurality of words, the candidate positions are positions between a first node and a second node in the directory tree, and the first word corresponding to the first node is an upper word of the second word corresponding to the second node.
Fig. 4 shows a schematic structure of a directory tree 400. Referring to fig. 4, the directory tree 400 comprises a plurality of nodes, each node corresponding to a vocabulary. The nodes are connected through directed edges; two nodes connected by a directed edge form a parent-child relationship, and the direction of the directed edge indicates which of the two nodes is the parent node and which is the child node. Two nodes forming a parent-child relationship means that a hypernym-hyponym relationship exists between the vocabularies corresponding to the two nodes. A parent node may correspond to a plurality of child nodes, i.e., a vocabulary may have a plurality of hyponyms. For example, the vocabulary "color" has hyponyms such as "red", "yellow", "blue", "green", and "white"; if there is a node in the directory tree corresponding to the vocabulary "color", then the nodes corresponding to the vocabularies "red", "yellow", "blue", "green", "white", etc. are child nodes of the node corresponding to "color". In some embodiments, the directed edge points from the parent node to the child node of the two nodes; e.g., node 401 in directory tree 400 is a parent node, node 402 is a child node of node 401, and node 401 and node 402 are connected by a directed edge pointing from node 401 to node 402. The nodes forming a parent-child relationship in the directory tree can be quickly determined through the directed edges. In some embodiments, the vocabulary corresponding to a node is also referred to as a "concept", in which case each node is used to represent a "concept".
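As a rough illustration of this structure, the parent-to-child (hypernym-to-hyponym) directed edges can be held in an adjacency list; this is only an illustrative sketch using the "color" example from the text, not the embodiment's data structure:

```python
# Adjacency list: each key is a parent node (hypernym), each value lists
# its child nodes (hyponyms). A directed edge points parent -> child.
children = {
    "color": ["red", "yellow", "blue", "green", "white"],
}


def is_hypernym(parent, child):
    """True if the vocabulary of `parent` is a hypernym of that of `child`,
    i.e. a directed edge parent -> child exists in the tree."""
    return child in children.get(parent, [])


print(is_hypernym("color", "red"))  # True
print(is_hypernym("red", "color"))  # False: edges are directed
```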
For example, for a directory tree τ0 = (N0, ε0), a candidate location may be represented by (p, c), where N0 is the set of nodes in the directory tree, ε0 is the set of directed edges between the nodes, p is any non-leaf node in the directory tree, i.e., the first node, and c is a child node of p in the directory tree, i.e., the second node.
In addition, referring to the previous description of the application scenario of the embodiment of the present application, in different scenarios, the directory tree is a directory tree of different types, for example, in a scenario in which a node is added to a medical directory tree, the directory tree is a medical directory tree, in a scenario in which a node is added to a news directory tree, the directory tree is a news directory tree, and in a scenario in which a node is added to a commodity directory tree, the directory tree is a commodity directory tree. In the following explanation, the medical directory tree will be exemplified as the directory tree.
In one possible embodiment, the terminal determines a non-leaf node in the directory tree, i.e., the first node, and then determines a descendant node of the first node in the directory tree, i.e., the second node; the candidate location is the location between the first node and the second node. A non-leaf node is a node in the directory tree that has child nodes. For example, three non-leaf nodes "disease X", "disease Y", and "disease Z" exist in the medical directory tree, where a node is referred to by the vocabulary corresponding to it, that is, the concept the node represents. The non-leaf node "disease X" includes three child nodes "medical device E", "drug W", and "treatment method M", and the terminal determines the non-leaf node "disease X" as the first node. If the terminal determines the child node "drug W" as the second node, a candidate position is the position between the first node "disease X" and the second node "drug W", and the vocabulary corresponding to a node added at the candidate position is a hyponym of "disease X" and a hypernym of "drug W".
For example, when adding a node to a directory tree, the terminal traverses the directory tree, acquires a plurality of candidate locations from the directory tree, and records the plurality of candidate locations in a candidate location list. The terminal then obtains candidate locations from the candidate location list. In some embodiments, each candidate location stored in the candidate location list corresponds to a number, and the terminal obtains candidate locations from the candidate location list in order of number from smallest to largest.
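The traversal above can be sketched by enumerating every (p, c) pair, where p is a non-leaf node and c one of its children, into a list whose index serves as the candidate number. The adjacency list below is invented for illustration; it is not the embodiment's actual tree:

```python
# Illustrative tree: parent node -> list of its children.
children = {
    "disease X": ["medical device E", "drug W", "treatment method M"],
    "root": ["disease X"],
}


def candidate_positions(children):
    """Enumerate candidate positions (p, c): p is a non-leaf first node,
    c a child of p (the second node). List index = candidate number."""
    positions = []
    for p, cs in children.items():
        for c in cs:
            positions.append((p, c))
    return positions


positions = candidate_positions(children)
print(len(positions))                      # 4
print(("disease X", "drug W") in positions)  # True
```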
302. The terminal acquires the paraphrase text of the target vocabulary, and the paraphrase text of the target vocabulary is used for explaining the target vocabulary.
In one possible implementation manner, the terminal uses the target vocabulary to query a paraphrase text database to obtain the paraphrase text of the target vocabulary, where the paraphrase text database stores a plurality of vocabularies and the paraphrase texts corresponding to those vocabularies. In some embodiments, the paraphrase text database is a real-time updated database, for example an encyclopedia database; a real-time updated database can ensure that the latest vocabulary, or the latest paraphrase text of a vocabulary, is queried. Of course, the paraphrase text database may be another type of database besides an encyclopedia database, which is not limited in this embodiment of the present application. When querying the paraphrase text database, the terminal inputs the vocabulary to be queried into the paraphrase text database. In some embodiments, this way of obtaining the paraphrase text of the target vocabulary is referred to as a dynamic programming algorithm. By obtaining paraphrase text from a real-time updated encyclopedia, newly emerging vocabularies can be recognized in a more timely manner.
For example, the terminal uses the target vocabulary to query in the paraphrase text database to obtain a page corresponding to the target vocabulary, wherein the page comprises a description text of the target vocabulary, and the description text comprises a plurality of sentences related to the target vocabulary. The terminal acquires a first sentence in the descriptive text of the target vocabulary as a paraphrase text of the target vocabulary.
If the paraphrase text database is an encyclopedia database, the target vocabulary is a treatment method M, and the terminal adopts the target vocabulary treatment method M to query in the encyclopedia database to obtain an encyclopedia page corresponding to the treatment method M in the encyclopedia, wherein the encyclopedia page comprises a description text of the treatment method M. In some embodiments, this descriptive text is used to describe the definition, origin, and details of "treatment method M," etc. The terminal acquires a first sentence from the descriptive text displayed in the encyclopedia page as a paraphrase text of "treatment method M".
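The "first sentence of the description text" step can be sketched as below. The sentence-splitting rule and the sample description text are invented for illustration; a real encyclopedia page would supply the description text:

```python
import re


def first_sentence(description):
    """Return the first sentence of a description text, to be used as the
    paraphrase text. Splits after sentence-ending punctuation
    (English or Chinese)."""
    parts = re.split(r"(?<=[.!?\u3002\uff01\uff1f])\s*", description.strip())
    return parts[0] if parts and parts[0] else ""


# Hypothetical description text for "treatment method M".
desc = "Treatment method M is a therapy for disease X. It originated in ..."
print(first_sentence(desc))  # Treatment method M is a therapy for disease X.
```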
In some embodiments, if the target vocabulary is an ambiguous word, the target vocabulary may correspond to a plurality of sense items in the paraphrase text database. A sense item is the description content of one of the different concepts denoted by the same term name, or the content of the same term in different fields, and is the content that best represents the attributes and features of the object the term describes. For example, the term "apple" has multiple sense items, including a fruit tree, a fruit, a company, a movie, etc. For the same target vocabulary, the description texts under different sense items are different, and correspondingly the paraphrase texts obtained from the description texts under different sense items are also different; the terminal selects the sense item corresponding to the target vocabulary in the following way. The terminal uses the target vocabulary to query the paraphrase text database, and in the case that a plurality of sense items corresponding to the target vocabulary exist in the paraphrase text database, obtains the semantic similarity between the target vocabulary and the paraphrase texts corresponding to the nodes. The paraphrase text of a first sense item is determined as the paraphrase text of the target vocabulary, where the first sense item is the sense item whose semantic similarity to the paraphrase text corresponding to a reference node meets a first similarity condition, and the reference node is the node whose corresponding paraphrase text has a semantic similarity to the target vocabulary that meets a second similarity condition.
In this embodiment, the vocabularies corresponding to different nodes in the directory tree belong to the same domain; for example, in the medical directory tree, the vocabularies corresponding to different nodes all belong to the medical domain. Because different sense items represent the target vocabulary in different fields, the terminal can compare the target vocabulary with the paraphrase texts corresponding to a plurality of nodes in the directory tree, select from the directory tree the node whose semantic similarity meets the second similarity condition, and then compare the paraphrase text corresponding to that node with the plurality of sense items of the target vocabulary to select the first sense item whose semantic similarity meets the first similarity condition. The terminal obtains the paraphrase text of the first sense item as the paraphrase text of the target vocabulary, which can improve the accuracy of sense item selection.
The semantic similarity meeting the first similarity condition means that the semantic similarity is greater than or equal to a first similarity threshold, or the semantic similarity is the highest of the plurality of semantic similarities. The semantic similarity meeting the second similarity condition means that the semantic similarity is greater than or equal to a second similarity threshold, or that the semantic similarity is the highest of the plurality of semantic similarities. The first similarity threshold and the second similarity threshold are set by a technician according to actual conditions, and the first similarity threshold and the second similarity threshold may be the same or different, which is not limited in the embodiment of the present application.
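The two similarity conditions can be sketched as a single check: a similarity satisfies a condition either by reaching a threshold or by being the highest among the computed similarities. The function name and threshold values here are illustrative, not from the embodiment:

```python
def satisfies(similarities, index, threshold=None):
    """True if similarities[index] meets a similarity condition: either
    it is >= the given threshold, or (with no threshold) it is the
    highest among all computed similarities."""
    if threshold is not None:
        return similarities[index] >= threshold
    return similarities[index] == max(similarities)


sims = [0.42, 0.91, 0.63]
print(satisfies(sims, 1))        # True: 0.91 is the highest similarity
print(satisfies(sims, 0, 0.5))   # False: 0.42 is below the 0.5 threshold
```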
For example, the terminal inputs the target vocabulary into a semantic feature extraction model, and the semantic feature extraction model performs semantic feature extraction on the target vocabulary to obtain a first semantic feature of the target vocabulary. The terminal inputs the paraphrase texts corresponding to the plurality of nodes into the semantic feature extraction model, and the semantic feature extraction model extracts semantic features from the paraphrase texts corresponding to the plurality of nodes respectively to obtain the second semantic feature of each node. The terminal obtains the semantic similarity between the target vocabulary and the paraphrase texts corresponding to the plurality of nodes based on the first semantic feature and the plurality of second semantic features, and determines the node corresponding to the paraphrase text whose semantic similarity meets the second similarity condition as the reference node. The terminal inputs the plurality of sense items corresponding to the target vocabulary into the semantic feature extraction model, and the semantic feature extraction model performs feature extraction on the plurality of sense items respectively to obtain the third semantic feature of each sense item. The terminal obtains the semantic similarity between the paraphrase text corresponding to the reference node and the plurality of sense items based on the second semantic feature of the reference node and the third semantic feature of each sense item, and determines the sense item whose semantic similarity meets the first similarity condition as the first sense item. The terminal acquires the first sentence in the description text under the first sense item as the paraphrase text of the target vocabulary.
The semantic feature extraction model is a BERT model, or SpaCy (a natural language text processing library of Python and CPython) and other types of semantic feature extraction models, which are not limited in this embodiment of the present application, and it should be noted that the BERT model includes a basic BERT model and variants of various BERT models, such as RoBERTa or ALBERT. In some embodiments, the semantic feature extraction model is a pre-trained model, and the terminal can not only train the semantic feature extraction model in advance, but also directly obtain the pre-trained semantic feature extraction model from the network, which is not limited in the embodiments of the present application. The pretrained language model BERT with excellent performance can greatly improve the accuracy of model judgment.
For example, the semantic feature extraction model is a BERT model. The terminal inputs the target vocabulary into the semantic feature extraction model, and the semantic feature extraction model encodes the target vocabulary based on an attention mechanism to obtain a first semantic vector of the target vocabulary, where the first semantic vector is used to represent the first semantic feature of the target vocabulary. The terminal inputs the paraphrase texts corresponding to the plurality of nodes into the semantic feature extraction model, and the semantic feature extraction model encodes the paraphrase texts corresponding to the plurality of nodes respectively based on the attention mechanism to obtain the second semantic vector of each node, where the second semantic vector of each node is used to represent the second semantic feature of that node. The terminal obtains the cosine similarity between the first semantic vector and the plurality of second semantic vectors, where the cosine similarity is used to represent the semantic similarity between the target vocabulary and the paraphrase texts corresponding to the plurality of nodes. The terminal determines the node corresponding to the paraphrase text with the highest semantic similarity to the target vocabulary as the reference node. The terminal inputs the plurality of sense items corresponding to the target vocabulary into the semantic feature extraction model, and the semantic feature extraction model performs attention encoding on the plurality of sense items based on the attention mechanism to obtain the third semantic vector of each sense item, where the third semantic vector of each sense item is used to represent the third semantic feature of that sense item.
The terminal obtains the cosine similarity between the second semantic vector of the reference node and the third semantic vector of each sense item, where the cosine similarity is used to represent the semantic similarity between the paraphrase text corresponding to the reference node and the plurality of sense items. The terminal determines the sense item with the highest semantic similarity to the paraphrase text corresponding to the reference node as the first sense item. The terminal acquires the first sentence in the description text under the first sense item as the paraphrase text of the target vocabulary.
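The two-stage cosine-similarity selection can be sketched as follows. A toy bag-of-characters `embed()` stands in for the BERT encoder (which would be used in practice); all function names and inputs are illustrative:

```python
import math


def embed(text):
    # Toy bag-of-characters embedding; a BERT encoder would replace this.
    vec = {}
    for ch in text:
        vec[ch] = vec.get(ch, 0) + 1
    return vec


def cosine(a, b):
    """Cosine similarity of two sparse vectors (dicts)."""
    dot = sum(v * b.get(k, 0) for k, v in a.items())
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def pick_sense(target, node_paraphrases, sense_items):
    # Stage 1: reference node = node whose paraphrase text is most
    # similar to the target vocabulary.
    t = embed(target)
    ref = max(node_paraphrases, key=lambda p: cosine(t, embed(p)))
    # Stage 2: first sense item = sense item most similar to the
    # reference node's paraphrase text.
    r = embed(ref)
    return max(sense_items, key=lambda s: cosine(r, embed(s)))


# Deliberately simple strings so the winner is unambiguous.
print(pick_sense("aa", ["aaa", "zzz"], ["aab", "zzz"]))  # aab
```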
In the above description, the terminal performs semantic feature extraction on the paraphrasing text corresponding to the node in the directory tree in real time through the semantic feature extraction model, and in other possible embodiments, the terminal may perform semantic feature extraction on the paraphrasing text corresponding to each node in the directory tree in advance to obtain the second semantic feature of each node, so when determining the semantic item corresponding to the target vocabulary, only the first semantic feature of the target vocabulary and the third semantic feature of each semantic item of the target vocabulary need to be extracted, and the second semantic features of a plurality of nodes do not need to be extracted again, thereby improving the efficiency of determining the semantic item of the target vocabulary.
In one possible implementation, before the terminal uses the target vocabulary to query the paraphrase text database, a word segmentation tool (such as jieba word segmentation) can further be used to segment the target vocabulary to obtain multiple sub-vocabularies of the target vocabulary, where the multiple sub-vocabularies contain different numbers of characters and may overlap with one another. The terminal queries the paraphrase text database with the multiple sub-vocabularies of the target vocabulary. In response to querying the paraphrase text corresponding to any sub-vocabulary, the terminal acquires the paraphrase text corresponding to that sub-vocabulary and scores it based on the number of characters contained in the sub-vocabulary, obtaining the score of the paraphrase text corresponding to the sub-vocabulary. The score is positively correlated with the number of characters contained in the sub-vocabulary: the fewer characters the sub-vocabulary contains, the lower the score of its corresponding paraphrase text, and the more characters the sub-vocabulary contains, the higher the score of its corresponding paraphrase text. The terminal determines the paraphrase text with the highest score as the paraphrase text of the target vocabulary. Of course, if the terminal obtains the paraphrase text corresponding to only one sub-vocabulary of the target vocabulary, that paraphrase text is directly determined as the paraphrase text of the target vocabulary. In this way, a complex vocabulary (concept) can be disassembled into simpler, complete fragments, improving the model's recognition of the complex vocabulary (concept).
In this embodiment, the terminal can segment the target vocabulary and query with the sub-vocabularies of the target vocabulary, which improves the probability of acquiring the paraphrase text of the target vocabulary. Meanwhile, for the same target vocabulary, if the paraphrase texts of multiple sub-vocabularies are obtained, the paraphrase texts can be scored according to the number of characters in the sub-vocabularies: the more characters a sub-vocabulary contains, the closer it is to the target vocabulary, and the higher the accuracy of using the paraphrase text corresponding to the higher-scoring sub-vocabulary as the paraphrase text of the target vocabulary.
For example, for a target word "ABCD", the terminal uses a word segmentation tool to segment the target word "ABCD" into "a", "AB", "ABC", and "ABCD", and the like, and the terminal uses multiple sub-words of the target word "ABCD" to query in the paraphrasing text database, to determine whether the paraphrasing text database stores paraphrasing texts corresponding to the sub-words of the target word "ABCD", respectively, and if the paraphrasing text database is an encyclopedia database, that is, to determine whether there are encyclopedia pages corresponding to the sub-words of the target word "ABCD", respectively. In response to any sub-word of the target word "ABCD" corresponding to an encyclopedia page, the terminal obtains a first sentence from descriptive text displayed in the encyclopedia page as paraphrased text of "ABCD".
If the terminal obtains the paraphrase text corresponding to the sub-vocabulary "A" and the paraphrase text corresponding to the sub-vocabulary "ABC", the terminal scores the paraphrase text corresponding to the sub-vocabulary "A" according to the number of characters, 1, contained in "A", for example, a score of 1. The terminal scores the paraphrase text corresponding to the sub-vocabulary "ABC" according to the number of characters, 3, contained in "ABC", for example, a score of 3. Since 3 > 1, the terminal determines the paraphrase text corresponding to the sub-vocabulary "ABC" as the paraphrase text corresponding to the target vocabulary "ABCD".
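The scoring rule above amounts to a longest-match lookup over the sub-vocabularies. A minimal sketch (the in-memory database, the prefix-style segmentation function, and all names are hypothetical stand-ins for the encyclopedia database and word segmentation tool described above):

```python
def lookup_paraphrase(target, paraphrase_db, segment):
    """Query every sub-vocabulary of `target` in the paraphrase text database
    and keep the paraphrase whose sub-vocabulary has the most characters."""
    best_score, best_text = -1, None
    for sub in segment(target):
        text = paraphrase_db.get(sub)
        if text is None:
            continue  # no paraphrase text stored for this sub-vocabulary
        score = len(sub)  # score is positively correlated with character count
        if score > best_score:
            best_score, best_text = score, text
    return best_text

# Toy example mirroring the "ABCD" walkthrough: "ABC" (score 3) beats "A" (score 1).
db = {"A": "paraphrase of A", "ABC": "paraphrase of ABC"}
prefixes = lambda w: [w[:i] for i in range(1, len(w) + 1)]
print(lookup_paraphrase("ABCD", db, prefixes))  # → paraphrase of ABC
```

If no sub-vocabulary has an entry, the lookup returns nothing, matching the case where the paraphrase text database stores no page for any fragment of the target vocabulary.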
The above step 302 will be described with reference to fig. 5.
Referring to fig. 5, the terminal inputs a vocabulary n into the word segmentation tool, and segments the vocabulary n into a plurality of sub-vocabularies C through the word segmentation tool. The terminal determines whether the number of characters in the first sub-vocabulary is greater than or equal to the length of the vocabulary, and if so, the process is ended. If the number of characters in the first sub-vocabulary is smaller than the length of the vocabulary, the terminal determines whether the first sub-vocabulary is a noun through SpaCy, and if the first sub-vocabulary is not a noun, the paraphrase text of the vocabulary is determined to be the sub-vocabulary itself. If the first sub-vocabulary is a noun, it is determined whether the first sub-vocabulary corresponds to a plurality of sense items in the encyclopedia page. If the first sub-vocabulary corresponds to a single sense item in the encyclopedia page, the first sentence of the descriptive text under that sense item is obtained as the paraphrase text of the vocabulary. If the first sub-vocabulary corresponds to a plurality of sense items in the encyclopedia page, the terminal determines whether the vocabulary is an existing vocabulary in the directory tree. If the vocabulary is not an existing vocabulary in the directory tree, the terminal determines, from the directory tree, a reference node whose paraphrase text has the highest similarity to the vocabulary, and selects, from the plurality of sense items, a first sense item whose semantics are most similar to the paraphrase text corresponding to the reference node. The terminal acquires the first sentence of the descriptive text under the first sense item as the paraphrase text of the vocabulary. If the vocabulary is an existing vocabulary in the directory tree, it is determined whether the vocabulary corresponds to the root node of the directory tree.
If the vocabulary does not correspond to the root node of the directory tree, the terminal selects from the plurality of sense items based on the parent node corresponding to the vocabulary. If the vocabulary corresponds to the root node of the directory tree, the sense item corresponding to the vocabulary is manually selected by a technician.
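The branch logic of fig. 5 for an ambiguous vocabulary can be sketched as follows. The tree encoding (a parent mapping), the similarity callable, and the word-overlap stand-in are hypothetical simplifications; the real system compares BERT-encoded paraphrase texts as described later in step 303:

```python
def select_sense(word, senses, parent_of, similarity, reference_text):
    """Sense-selection cascade of fig. 5. `parent_of` maps an existing
    vocabulary in the directory tree to its parent (None for the root);
    words absent from the mapping are not yet in the tree."""
    if len(senses) == 1:
        return senses[0]                    # unambiguous: single sense item
    if word not in parent_of:
        # Not an existing vocabulary: compare the sense items against the
        # paraphrase text of the most similar reference node (passed in here).
        return max(senses, key=lambda s: similarity(s, reference_text))
    parent = parent_of[word]
    if parent is None:
        raise ValueError("root-node senses are selected manually by a technician")
    # Existing non-root vocabulary: select via the parent node.
    return max(senses, key=lambda s: similarity(s, parent))

# Word overlap as a stand-in similarity measure:
overlap = lambda a, b: len(set(a.split()) & set(b.split()))
tree = {"disease": None, "pneumonia": "disease"}
senses = ["an inflammatory disease of the lung", "a 2011 drama film"]
print(select_sense("pneumonia", senses, tree, overlap, ""))  # picks the medical sense
```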
303. The terminal inputs the paraphrase text of the target vocabulary, the paraphrase text of the first vocabulary, the paraphrase text of the second vocabulary and the paraphrase text corresponding to a plurality of child nodes of the first node into a matching information determining model, and outputs the matching information between the target vocabulary and the candidate position through the matching information determining model, wherein the matching information is used for representing the matching degree between the target vocabulary and the candidate position.
For a clearer description of the above step 303, the following description of step 303 is divided into 3031-3037.
3031. The terminal inputs the paraphrase text of the target vocabulary, the paraphrase text of the first vocabulary, the paraphrase text of the second vocabulary and the paraphrase text corresponding to the plurality of child nodes of the first node into a matching information determining model.
Referring to fig. 6, the matching information determination model 600 includes a sequence encoding unit (Sequence Encoding Unit) 601, a coding attention unit (Code Attention Unit) 602, a parent-child cross attention unit (Parental Cross Attention Unit) 603, a sibling cross attention unit (Sibling Cross Attention Unit) 604, and a scoring unit (Score Unit) 605. The sequence encoding unit 601 is used for encoding paraphrase text into a paraphrase matrix, the coding attention unit 602 is used for encoding the paraphrase matrix into a representation matrix, the parent-child cross attention unit 603 is used for acquiring a first relationship feature and a second relationship feature, the sibling cross attention unit 604 is used for acquiring a third relationship feature and a fourth relationship feature, and the scoring unit 605 is used for acquiring matching information. In some embodiments, the matching information determination model further comprises a paraphrase text acquisition unit for performing step 302 described above. Wherein q represents the paraphrase text of the target vocabulary, p represents the paraphrase text of the first vocabulary, c represents the paraphrase text of the second vocabulary, s represents the paraphrase text of the first sub-vocabulary, and w represents the paraphrase text of the second sub-vocabulary.
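The wiring of these five units can be sketched as follows. All callables are hypothetical stand-ins for the units of fig. 6; the trivial stubs only exercise the data flow, not the real attention computations:

```python
def matching_information(q, p, c, siblings, units):
    """Data flow of the matching information determination model 600:
    paraphrase texts -> paraphrase matrices -> representation matrices ->
    relationship features -> matching information."""
    enc = units["sequence_encoding"]     # unit 601: text -> paraphrase matrix
    att = units["coding_attention"]      # unit 602: paraphrase -> representation matrix
    Vq, Vp, Vc = att(enc(q)), att(enc(p)), att(enc(c))
    Vs = [att(enc(s)) for s in siblings]
    r1 = units["parent_child"](Vp, Vq)   # unit 603: is p a hypernym of q?
    r2 = units["parent_child"](Vq, Vc)   # unit 603: is q a hypernym of c?
    r3, r4 = units["sibling"](Vq, Vs)    # unit 604: sibling relationship features
    return units["score"](r1, r2, r3, r4)  # unit 605: matching information

# Trivial stubs so the wiring can be exercised end to end:
units = {
    "sequence_encoding": lambda text: len(text),
    "coding_attention": lambda m: m,
    "parent_child": lambda a, b: a + b,
    "sibling": lambda v, vs: (v, sum(vs)),
    "score": lambda *feats: sum(feats),
}
print(matching_information("qq", "p", "cc", ["s", "ww"], units))  # → 12
```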
In a possible implementation manner, for the paraphrase text corresponding to each node in the directory tree, the terminal can acquire the paraphrase text in real time, or can acquire it in advance and store it in the hard disk, so that the paraphrase text corresponding to each node can subsequently be called directly from the hard disk without real-time acquisition, thereby improving the operation efficiency.
Taking the terminal to obtain the paraphrasing text corresponding to a plurality of nodes in the directory tree in advance as an example, the terminal directly obtains the paraphrasing text of the first vocabulary, the paraphrasing text of the second vocabulary and the paraphrasing text corresponding to a plurality of child nodes of the first node from the hard disk.
In the process of acquiring the paraphrase text corresponding to a plurality of nodes in the directory tree, if the vocabulary corresponding to a certain node in the directory tree is an ambiguous word, that is, the vocabulary corresponds to a plurality of sense items in the paraphrase text database, the terminal can select the sense item corresponding to the vocabulary in the following manner.
In one possible implementation manner, for a first sub-node of the first node, the terminal queries in the paraphrase text database by using the first sub-vocabulary corresponding to the first sub-node, and in the case that a plurality of sense items corresponding to the first sub-vocabulary exist in the paraphrase text database, the terminal obtains the semantic similarity between each of the plurality of sense items and the paraphrase text of the first vocabulary. The terminal determines the paraphrase text of a second sense item as the paraphrase text of the first sub-vocabulary, wherein the second sense item is a sense item whose semantic similarity with the paraphrase text of the first vocabulary meets a third similarity condition.
In this embodiment, when a sub-vocabulary has multiple sense items, the sense item is selected by performing semantic similarity calculation using the paraphrase text corresponding to the parent node instead of using the sub-vocabulary itself, which can improve the accuracy of sense-item selection.
For example, the terminal uses the first sub-vocabulary corresponding to the first sub-node to query in the paraphrase text database, and obtains a page corresponding to the first sub-vocabulary, where the page includes a plurality of sense items of the first sub-vocabulary. The terminal inputs the first vocabulary into a semantic feature extraction model, and the semantic feature extraction model extracts semantic features of the first vocabulary to obtain a fourth semantic feature of the first vocabulary, wherein the fourth semantic feature is the semantic feature of the first node. The terminal inputs the plurality of sense items corresponding to the first sub-vocabulary into the semantic feature extraction model, and the semantic feature extraction model performs feature extraction on the plurality of sense items respectively to obtain a fifth semantic feature of each sense item. The terminal obtains the semantic similarity between the paraphrase text corresponding to the first node and the plurality of sense items based on the fourth semantic feature of the first node and the fifth semantic feature of each sense item, and determines the sense item whose semantic similarity meets a fourth similarity condition as the second sense item. The terminal acquires the first sentence of the descriptive text under the second sense item as the paraphrase text of the first sub-vocabulary.
For example, the semantic feature extraction model is a BERT model. The terminal uses the first sub-vocabulary corresponding to the first sub-node to query in the paraphrase text database to obtain a page corresponding to the first sub-vocabulary, where the page includes a plurality of sense items of the first sub-vocabulary. The terminal inputs the first vocabulary into the semantic feature extraction model, and the semantic feature extraction model encodes the first vocabulary based on an attention mechanism to obtain a fourth semantic vector of the first vocabulary, the fourth semantic vector being used for representing the fourth semantic feature of the first vocabulary. The terminal inputs the plurality of sense items corresponding to the first sub-vocabulary into the semantic feature extraction model, and the semantic feature extraction model encodes the plurality of sense items based on the attention mechanism to obtain a fifth semantic vector of each sense item, the fifth semantic vector of each sense item being used for representing the fifth semantic feature of that sense item. The terminal obtains the cosine similarity between the fourth semantic vector of the first node and the fifth semantic vector of each sense item, wherein the cosine similarity is used for representing the semantic similarity between the paraphrase text corresponding to the first node and the plurality of sense items. The terminal determines the sense item with the highest semantic similarity to the paraphrase text corresponding to the first node as the second sense item. The terminal acquires the first sentence of the descriptive text under the second sense item as the paraphrase text of the first sub-vocabulary.
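The cosine-similarity selection in this example can be sketched with plain NumPy. The toy vectors stand in for the fourth and fifth semantic vectors that the BERT model would actually produce:

```python
import numpy as np

def cosine(a, b):
    """Cosine similarity between two semantic vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def pick_sense_item(parent_vec, sense_vecs):
    """Return the index of the sense item whose semantic vector is most
    similar (by cosine similarity) to the parent vocabulary's vector."""
    sims = [cosine(parent_vec, v) for v in sense_vecs]
    return int(np.argmax(sims))

parent = np.array([1.0, 0.2, 0.0])            # fourth semantic vector (first node)
sense_vectors = [np.array([0.9, 0.3, 0.1]),   # fifth semantic vectors (sense items)
                 np.array([0.0, 0.1, 1.0])]
print(pick_sense_item(parent, sense_vectors))  # → 0
```

The sense item at index 0 points in nearly the same direction as the parent vector, so it is chosen as the second sense item.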
Through the matching information determination model, the terminal performs the following steps 3032-3037.
3032. The terminal obtains a first relation feature based on the paraphrase text of the target vocabulary and the paraphrase text of the first vocabulary, wherein the first relation feature is used for indicating whether the first vocabulary is a hypernym of the target vocabulary.
In one possible implementation, the terminal encodes the paraphrase text of the target vocabulary based on the attention mechanism to obtain a paraphrase matrix of the target vocabulary, where the paraphrase matrix of the target vocabulary is used to represent the paraphrase text of the target vocabulary. The terminal encodes the paraphrase text of the first vocabulary based on the attention mechanism to obtain a paraphrase matrix of the first vocabulary, wherein the paraphrase matrix of the first vocabulary is used for representing the paraphrase text of the first vocabulary. The terminal obtains the first relationship feature based on the paraphrase matrix of the target vocabulary and the paraphrase matrix of the first vocabulary.
In order to more clearly describe the above embodiments, the above embodiments will be described below in three sections.
In the first part, the terminal encodes the paraphrase text of the target vocabulary based on the attention mechanism to obtain the paraphrase matrix of the target vocabulary, wherein the paraphrase matrix of the target vocabulary is used for representing the paraphrase text of the target vocabulary. This part is implemented by the sequence encoding unit 601 of the matching information determination model 600.
In one possible implementation, the terminal divides the paraphrasing text of the target vocabulary into a plurality of first phrases, the plurality of first phrases constituting the paraphrasing text of the target vocabulary. The terminal performs embedded coding on the plurality of first phrases to obtain embedded vectors of the first phrases. The terminal adopts an attention mechanism to encode based on the embedded vectors of the first word groups to obtain the paraphrasing matrix of the target word.
For example, the terminal employs a word segmentation tool to segment the paraphrase text of the target vocabulary into a plurality of first phrases. The terminal performs embedded coding on the plurality of first phrases by means of Word2Vec (Word to Vector) to obtain the embedded vector of each first phrase, and the combination of the embedded vectors of the first phrases is the representation sequence of the paraphrase text of the target vocabulary. The terminal inputs the representation sequence of the paraphrase text of the target vocabulary, namely the embedded vector of each first phrase, into a semantic feature extraction model, and obtains a query matrix, a key matrix, and a value matrix of each first phrase through the semantic feature extraction model. The terminal obtains the paraphrase matrix of the target vocabulary based on the query matrix, the key matrix, and the value matrix of each first phrase. In some embodiments, the word segmentation tool is SpaCy or jieba and the semantic feature extraction model is a BERT model. The above process can be expressed by formula (1).

D q = BERT k (X q ) (1)

Wherein D q is the paraphrase matrix of the target vocabulary, BERT k () is the processing function of the BERT model, X q is the representation sequence of the paraphrase text of the target vocabulary, l q is the length of the representation sequence of the paraphrase text of the target vocabulary, and k is the number of layers of the BERT model.
The method for obtaining the paraphrase matrix of the target vocabulary by the terminal through the semantic feature extraction model is described below.
Taking a semantic feature extraction model as a BERT model as an example, the terminal inputs the embedded vectors of each first phrase into the semantic feature extraction model, and the semantic feature extraction model adopts three linear transformation matrixes to process the embedded vectors of each first phrase so as to obtain a query matrix, a key matrix and a value matrix of each first phrase. Wherein the three linear transformation matrices are respectively a query transformation matrix WQ 1, a key transformation matrix WK 1, and a value transformation matrix WV 1, parameters in the three linear transformation matrices are determined during training of the semantic feature extraction model, and in some embodiments, the semantic feature extraction model is a pre-trained BERT model, then the terminal can directly use the semantic feature extraction model without additional training. For a first phrase in the paraphrase text of the target word, the terminal performs dot multiplication on the query matrix of the first phrase and key matrices of other first phrases in the paraphrase text of the target word respectively to obtain the attention weight between the first phrase and the other first phrases. The terminal adopts the attention weight between the first phrase and other first phrases to multiply with the corresponding value matrix of the first phrase respectively to obtain the initial attention matrix of the other first phrases to the first phrase. And the terminal fuses the initial attention matrix of the first phrase with other first phrases to obtain the attention matrix of the first phrase. And the terminal fuses the attention matrixes of the first word groups to obtain the paraphrasing matrix of the target word. The above procedure can be expressed by the formula (2) -formula (4).
Q = x·WQ 1 , K = x·WK 1 , V = x·WV 1 (2)

Wherein x is the embedded vector of the first phrase, Q is the query matrix of the first phrase, K is the key matrix of the first phrase, V is the value matrix of the first phrase, WQ 1 is the query transformation matrix, WK 1 is the key transformation matrix, and WV 1 is the value transformation matrix.
A = softmax(Q·K 1 T /√D) (3)

Wherein A is the attention weight, softmax is the normalization function, K 1 is the key matrix of the other first phrases, and D is a scaling constant.
S=A·V (4)
Wherein S is the paraphrase matrix of the target vocabulary.
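Formulas (2)-(4) together describe single-head scaled dot-product self-attention over the embedded phrase vectors. A minimal NumPy sketch (the matrix shapes and random projections are illustrative assumptions, not the trained BERT parameters):

```python
import numpy as np

def self_attention(X, WQ, WK, WV):
    """X holds one embedded vector per first phrase (one row each).
    Computes the Q, K, V projections (formula (2)), the softmax attention
    weights A (formula (3)), and the output S = A·V (formula (4))."""
    Q, K, V = X @ WQ, X @ WK, X @ WV
    logits = Q @ K.T / np.sqrt(K.shape[-1])       # scale by the constant D
    A = np.exp(logits - logits.max(axis=-1, keepdims=True))
    A = A / A.sum(axis=-1, keepdims=True)         # softmax normalization
    return A @ V

rng = np.random.default_rng(0)
X = rng.normal(size=(5, 8))                       # 5 first phrases, dimension 8
WQ, WK, WV = (rng.normal(size=(8, 8)) for _ in range(3))
S = self_attention(X, WQ, WK, WV)
print(S.shape)  # (5, 8): one attended row per first phrase
```

Each row of S fuses the value matrices of all phrases weighted by attention, which is the "fusing the attention matrixes of the first word groups" step described above.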
In the second part, the terminal encodes the paraphrase text of the first vocabulary based on the attention mechanism to obtain the paraphrase matrix of the first vocabulary, wherein the paraphrase matrix of the first vocabulary is used for representing the paraphrase text of the first vocabulary. This part is implemented by the sequence encoding unit 601 of the matching information determination model 600.
In one possible implementation, the terminal divides the paraphrasing text of the first vocabulary into a plurality of second phrases, the plurality of second phrases constituting the paraphrasing text of the first vocabulary. And the terminal performs embedded coding on the plurality of second phrases to obtain embedded vectors of the second phrases. The terminal adopts an attention mechanism to encode based on the embedded vectors of the second word groups to obtain the paraphrasing matrix of the first word.
For example, the terminal employs the word segmentation tool to segment the paraphrase text of the first vocabulary into a plurality of second phrases. The terminal performs embedded coding on the plurality of second phrases by means of Word2Vec (Word to Vector) to obtain the embedded vector of each second phrase, and the combination of the embedded vectors of the second phrases is the representation sequence of the paraphrase text of the first vocabulary. The terminal inputs the representation sequence of the paraphrase text of the first vocabulary, namely the embedded vector of each second phrase, into the semantic feature extraction model, and obtains a query matrix, a key matrix, and a value matrix of each second phrase through the semantic feature extraction model. The terminal obtains the paraphrase matrix of the first vocabulary based on the query matrix, the key matrix, and the value matrix of each second phrase. The above process can be expressed by formula (5).

D p = BERT k (X p ) (5)

Wherein D p is the paraphrase matrix of the first vocabulary, X p is the representation sequence of the paraphrase text of the first vocabulary, and l p is the length of the representation sequence of the paraphrase text of the first vocabulary.
The method for obtaining the paraphrase matrix of the first vocabulary by the terminal through the semantic feature extraction model is described below.
The terminal inputs the embedded vectors of the second phrases into a semantic feature extraction model, and the semantic feature extraction model adopts three linear transformation matrixes to process the embedded vectors of the second phrases so as to obtain a query matrix, a key matrix and a value matrix of the second phrases. The three linear transformation matrices are respectively a query transformation matrix WQ 1, a key transformation matrix WK 1, and a value transformation matrix WV 1. For a second phrase in the paraphrase text of the first vocabulary, the terminal respectively performs dot multiplication on the query matrix of the second phrase and key matrixes of other second phrases in the paraphrase text of the first vocabulary to obtain the attention weight between the second phrase and other second phrases. The terminal multiplies the attention weight between the second phrase and other second phrases by the corresponding value matrix of the second phrase to obtain the initial attention matrix of the other second phrases to the second phrase. And the terminal fuses the initial attention matrix of the second phrase with other second phrases to obtain the attention matrix of the second phrase. And the terminal fuses the attention matrixes of the second word groups to obtain the paraphrasing matrix of the first word.
In the third part, the terminal acquires the first relationship feature based on the paraphrase matrix of the target vocabulary and the paraphrase matrix of the first vocabulary. This part is implemented by the parent-child cross attention unit 603.
The first relationship feature is used for indicating whether the first vocabulary is an upper word of the target vocabulary, that is, whether a parent-child relationship can be formed between a node corresponding to the target vocabulary and the first node, or whether the first node is a parent node of the node corresponding to the target vocabulary.
In one possible implementation, the terminal encodes the paraphrase matrix of the target vocabulary with a plurality of encoding vectors to obtain a representation matrix of the target vocabulary, where the plurality of encoding vectors are used to adjust the dimensions of the matrix. The terminal encodes the paraphrase matrix of the first vocabulary by adopting the plurality of encoding vectors to obtain a representation matrix of the first vocabulary, wherein the representation matrix of the target vocabulary and the representation matrix of the first vocabulary have the same dimensionality. The terminal encodes the representation matrix of the target vocabulary and the representation matrix of the first vocabulary based on an attention mechanism to obtain the first relation feature. Since the lengths of the paraphrasing text of the target vocabulary and the paraphrasing text of the first vocabulary are often different, the representation sequence of the paraphrasing text of the target vocabulary and the representation sequence of the paraphrasing text of the first vocabulary obtained in the processing of the first part and the second part have different lengths, and the dimensions of the finally obtained representation matrix are different. After the processing is performed by adopting the embodiment, the paraphrasing matrix with different dimensions can be processed into the representing matrix with the same dimension.
In this embodiment, the terminal is able to encode the paraphrase matrix of the target vocabulary and the paraphrase matrix of the first vocabulary by using a plurality of encoding vectors, that is, to further extract useful information from the paraphrase matrix of the target vocabulary and the paraphrase matrix of the first vocabulary using a plurality of encoding vectors. The representation matrix of the target vocabulary and the representation matrix of the first vocabulary are obtained after the plurality of encoding vectors are adopted for encoding, and have the same dimension, so that the subsequent operation efficiency can be improved, and the storage cost of the model in the offline calculation process can be reduced.
Taking as an example the terminal encoding the paraphrase matrix of the target vocabulary and the paraphrase matrix of the first vocabulary by using one coding vector u, the terminal uses the coding vector u to encode the plurality of row vectors in the paraphrase matrix of the target vocabulary based on the attention mechanism, so as to obtain the attention weight of each row vector in the paraphrase matrix of the target vocabulary. The terminal adopts the attention weights of the row vectors in the paraphrase matrix of the target vocabulary to perform a weighted summation of the corresponding row vectors to obtain a representation vector of the target vocabulary, and a plurality of representation vectors form the representation matrix of the target vocabulary. The terminal adopts the coding vector u to encode the plurality of row vectors in the paraphrase matrix of the first vocabulary based on the attention mechanism, so as to obtain the attention weight of each row vector in the paraphrase matrix of the first vocabulary. The terminal adopts the attention weights of the row vectors in the paraphrase matrix of the first vocabulary to perform a weighted summation of the corresponding row vectors to obtain a representation vector of the first vocabulary, and a plurality of representation vectors form the representation matrix of the first vocabulary. The terminal inputs the representation matrix of the target vocabulary and the representation matrix of the first vocabulary into a Transformer encoder, encodes them through the Transformer encoder, and outputs the first relation vector, where the first relation vector is used to represent the first relationship feature. In some embodiments, the number of layers of the Transformer encoder is 3.
For example, the terminal can acquire the attention weights of the respective row vectors in the paraphrase matrix of the target vocabulary using the following formula (6).
a i j = exp(e j ·d i )/Σ k=1 l n exp(e j ·d k ) (6)

Wherein a i j is the attention weight of the row vector numbered i with respect to the coding vector numbered j, e j is the coding vector numbered j, d i is the row vector numbered i in D n , D n is the paraphrase matrix of the target vocabulary, d k is the row vector numbered k, and l n is the number of rows of D n .
In addition, the terminal can also obtain the attention weight of each row vector in the paraphrase matrix of the first vocabulary through the above formula (6), by changing the paraphrase matrix D n of the target vocabulary in the formula into the paraphrase matrix G n of the first vocabulary.
The terminal can acquire the representation vector of the target vocabulary by the following formula (7).
v j = Σ i=1 l n a i j ·d i (7)

Wherein v j is the representation vector numbered j.

The plurality of representation vectors v j form the representation matrix of the target vocabulary.
Similarly, the terminal can obtain the representation vector of the first vocabulary through the above formula (7), by changing the paraphrase matrix D n of the target vocabulary in the formula into the paraphrase matrix G n of the first vocabulary.
The terminal inputs the representation matrix of the target vocabulary and the representation matrix of the first vocabulary into the Transformer encoder, encodes them through the Transformer encoder based on the following formula (8), and outputs the first relation vector, namely the parent-child relation vector.

r 1 = Transformer h (V q , V p ) (8)

Wherein r 1 is the first relation vector, Transformer h () is the processing function of the attention encoder, V q is the representation matrix of the target vocabulary, and V p is the representation matrix of the first vocabulary.
In some embodiments, the terminal concatenates the representation matrix of the target vocabulary, the representation matrix of the first vocabulary, a start matrix, and two separation matrices to obtain an input matrix, where the start matrix and the separation matrices are both obtained during model training. Denoting the representation matrix of the target vocabulary by V q , the representation matrix of the first vocabulary by V p , the start matrix by M s , and the separation matrix by M e , the terminal concatenates them to obtain the input matrix X = [M s ; V q ; M e ; V p ; M e ]. The terminal inputs the input matrix X into the Transformer encoder, encodes the input matrix X through the Transformer encoder, and determines the output corresponding to the start matrix M s as the first relation vector.
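The fixed-size pooling described by formulas (6)-(7) can be sketched as follows: each coding vector attends over the rows of a paraphrase matrix and pools them into one representation vector, so paraphrase matrices of different lengths yield representation matrices of the same dimension. Shapes and random inputs are illustrative assumptions:

```python
import numpy as np

def attention_pool(D, E):
    """Formula (6): softmax attention weight of each row of D with respect
    to every coding vector in E; formula (7): weighted sums of the rows of D.
    D is (l, d) with l varying per text; the result is always (m, d)."""
    logits = E @ D.T                                   # (m, l): e_j · d_i
    A = np.exp(logits - logits.max(axis=1, keepdims=True))
    A = A / A.sum(axis=1, keepdims=True)               # each row sums to 1
    return A @ D                                       # one pooled vector per e_j

rng = np.random.default_rng(1)
E = rng.normal(size=(4, 16))                           # 4 shared coding vectors
V_q = attention_pool(rng.normal(size=(9, 16)), E)      # target vocabulary, 9 phrases
V_p = attention_pool(rng.normal(size=(5, 16)), E)      # first vocabulary, 5 phrases
print(V_q.shape, V_p.shape)  # (4, 16) (4, 16): same dimension despite different lengths
```

Because both outputs have identical shape, they can be concatenated (with the start and separation matrices) into a single input matrix for the Transformer encoder, as described above.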
3033. And the terminal acquires a second relation characteristic based on the paraphrasing text of the target vocabulary and the paraphrasing text of the second vocabulary, wherein the second relation characteristic is used for indicating whether the target vocabulary is the upper word of the second vocabulary.
In one possible implementation, the terminal encodes the paraphrase text of the second vocabulary based on an attention mechanism to obtain a paraphrase matrix of the second vocabulary, where the paraphrase matrix of the second vocabulary is used to represent the paraphrase text of the second vocabulary. The terminal obtains the second relationship feature based on the paraphrase matrix of the target vocabulary and the paraphrase matrix of the second vocabulary.
In order to more clearly describe the above embodiments, the above embodiments will be described below in two parts.
In the first part, the terminal encodes the paraphrase text of the second vocabulary based on the attention mechanism to obtain the paraphrase matrix of the second vocabulary, wherein the paraphrase matrix of the second vocabulary is used for representing the paraphrase text of the second vocabulary. The terminal's acquisition of the paraphrase matrix of the second vocabulary is implemented by the sequence encoding unit 601 of the matching information determination model 600.
In one possible implementation, the terminal divides the paraphrase text of the second vocabulary into a plurality of second phrases, and the plurality of second phrases constitute the paraphrase text of the second vocabulary. And the terminal performs embedded coding on the plurality of second phrases to obtain embedded vectors of the second phrases. And the terminal adopts an attention mechanism to encode based on the embedded vectors of the second word groups to obtain a paraphrasing matrix of the second word.
For example, the terminal employs the word segmentation tool to segment the paraphrase text of the second vocabulary into a plurality of second phrases. The terminal performs embedded coding on the plurality of second phrases by means of Word2Vec (Word to Vector) to obtain the embedded vector of each second phrase, and the combination of the embedded vectors of the second phrases is the representation sequence of the paraphrase text of the second vocabulary. The terminal inputs the representation sequence of the paraphrase text of the second vocabulary, namely the embedded vector of each second phrase, into the semantic feature extraction model, and obtains a query matrix, a key matrix, and a value matrix of each second phrase through the semantic feature extraction model. The terminal obtains the paraphrase matrix of the second vocabulary based on the query matrix, the key matrix, and the value matrix of each second phrase. The above process can also be expressed by formula (9).

D c = BERT k (X c ) (9)

Wherein D c is the paraphrase matrix of the second vocabulary, X c is the representation sequence of the paraphrase text of the second vocabulary, and l c is the length of the representation sequence of the paraphrase text of the second vocabulary.
The method for obtaining the paraphrase matrix of the second vocabulary by the terminal through the semantic feature extraction model is described below.
The terminal inputs the embedded vectors of the second phrases into a semantic feature extraction model, and the semantic feature extraction model adopts three linear transformation matrixes to process the embedded vectors of the second phrases so as to obtain a query matrix, a key matrix and a value matrix of the second phrases. The three linear transformation matrices are respectively a query transformation matrix WQ 1, a key transformation matrix WK 1, and a value transformation matrix WV 1. For a second phrase in the paraphrase text of the second word, the terminal respectively performs dot multiplication on the query matrix of the second phrase and key matrixes of other second phrases in the paraphrase text of the second word to obtain the attention weight between the second phrase and other second phrases. The terminal multiplies the attention weight between the second phrase and other second phrases by the corresponding value matrix of the second phrase to obtain the initial attention matrix of the other second phrases to the second phrase. And the terminal fuses the initial attention matrix of the second phrase with other second phrases to obtain the attention matrix of the second phrase. And the terminal fuses the attention matrixes of the second word groups to obtain a paraphrasing matrix of the second word.
And the second part and the terminal acquire the second relation characteristic based on the paraphrasing matrix of the target vocabulary and the paraphrasing matrix of the second vocabulary. Wherein the acquisition of the second relational feature is performed by the parent-child cross attention unit 603 of the matching information determination model 600.
The second relationship feature is used to indicate whether the second vocabulary is a hyponym of the target vocabulary, that is, whether a parent-child relationship can be formed between a node corresponding to the target vocabulary and the second node, or whether the second node is a child node of the node corresponding to the target vocabulary.
In one possible implementation, the terminal encodes the paraphrase matrix of the second vocabulary using the plurality of encoding vectors to obtain a representation matrix of the second vocabulary, where the representation matrix of the target vocabulary has the same dimension as the representation matrix of the second vocabulary. The terminal encodes the representation matrix of the target vocabulary and the representation matrix of the second vocabulary based on the attention mechanism to obtain the second relation feature.
In this embodiment, the terminal is able to encode the paraphrase matrix of the target vocabulary and the paraphrase matrix of the second vocabulary by using a plurality of encoding vectors, that is, to further extract useful information from the paraphrase matrix of the target vocabulary and the paraphrase matrix of the second vocabulary using a plurality of encoding vectors. The representation matrix of the target vocabulary and the representation matrix of the second vocabulary are obtained after the plurality of encoding vectors are adopted for encoding, and have the same dimension, so that the subsequent operation efficiency can be improved, and the storage cost of the model in the offline calculation process can be reduced.
Taking the terminal encoding the paraphrase matrix of the target vocabulary and the paraphrase matrix of the second vocabulary with one encoding vector u as an example: the terminal uses the encoding vector u to encode the plurality of row vectors in the paraphrase matrix of the target vocabulary based on the attention mechanism, so as to obtain the attention weight of each row vector in the paraphrase matrix of the target vocabulary. The terminal performs a weighted summation on the corresponding row vectors by using the attention weights of the row vectors in the paraphrase matrix of the target vocabulary to obtain a representation vector of the target vocabulary; a plurality of such representation vectors form the representation matrix of the target vocabulary. Likewise, the terminal uses the encoding vector u to encode the plurality of row vectors in the paraphrase matrix of the second vocabulary based on the attention mechanism, so as to obtain the attention weight of each row vector in the paraphrase matrix of the second vocabulary, and performs a weighted summation on the corresponding row vectors by using these attention weights to obtain a representation vector of the second vocabulary; a plurality of such representation vectors form the representation matrix of the second vocabulary. The terminal inputs the representation matrix of the target vocabulary and the representation matrix of the second vocabulary into a Transformer encoder, encodes the two representation matrices through the Transformer encoder, and outputs the second relation vector, where the second relation vector is used to represent the second relation feature.
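The pooling step described above can be sketched as follows. This is a minimal illustrative sketch under assumed dimensions: score each row of a paraphrase matrix against one encoding vector u, softmax the scores into attention weights, and take the weighted sum of the rows; repeating this with several encoding vectors stacks the pooled vectors into a fixed-size representation matrix.

```python
import math

def attention_pool(D, u):
    """D: paraphrase matrix (l x d), u: encoding vector (d). Returns a d-dim vector."""
    scores = [sum(r * w for r, w in zip(row, u)) for row in D]
    m = max(scores)
    e = [math.exp(s - m) for s in scores]
    z = sum(e)
    weights = [x / z for x in e]  # attention weight of each row vector
    # weighted summation of the row vectors -> one representation vector
    return [sum(w * row[c] for w, row in zip(weights, D)) for c in range(len(D[0]))]

def representation_matrix(D, coding_vectors):
    # one pooled vector per encoding vector -> representation matrix whose size
    # no longer depends on the length of the paraphrase text
    return [attention_pool(D, u) for u in coding_vectors]
```

Because every representation matrix has one row per encoding vector, the target vocabulary and the second vocabulary end up with matrices of the same dimension, which is what makes the subsequent joint encoding straightforward.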
For example, the terminal can use the above formula (6) to obtain the attention weights of the respective row vectors in the paraphrase matrix of the target vocabulary.
In addition, the terminal can also obtain the attention weight of each row vector in the paraphrase matrix of the second vocabulary through the above formula (6), and change the paraphrase matrix D n of the target vocabulary in the formula into the paraphrase matrix T n of the second vocabulary.
The terminal can acquire the representation vector of the target vocabulary through the above formula (7).
Of course, the terminal can also obtain the representation vector of the second vocabulary through the above formula (7), by changing the paraphrase matrix D n of the target vocabulary in the formula to the paraphrase matrix T n of the second vocabulary.
The terminal inputs the representation matrix of the target vocabulary and the representation matrix of the second vocabulary into the Transformer encoder, encodes the two representation matrices through the Transformer encoder based on the following formula (10), and outputs the second relation vector, that is, a parent-child relation vector.
In formula (10), the output is the second relation vector, and the representation matrix of the second vocabulary is one of the inputs.
In some embodiments, the terminal concatenates the representation matrix of the target vocabulary, the representation matrix of the second vocabulary, the start matrix, and the two separation matrices to obtain an input matrix, where the start matrix and the separation matrices are both obtained during model training. The terminal inputs the input matrix into the Transformer encoder, encodes the input matrix through the Transformer encoder, and determines the output corresponding to the start matrix as the second relation vector.
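The concatenation described above can be sketched structurally as follows. This is an assumed layout for illustration only (start matrix first, a separator after each representation matrix, and the relation vector read off at the start-matrix position); `encoder` stands in for the Transformer encoder, which the sketch does not implement.

```python
def build_input(start, rep_a, sep, rep_b):
    # rows concatenated along the sequence dimension:
    # [start; rep_a; sep; rep_b; sep]
    return start + rep_a + sep + rep_b + sep

def relation_vector(start, rep_a, sep, rep_b, encoder):
    x = build_input(start, rep_a, sep, rep_b)
    y = encoder(x)   # the encoder keeps the sequence length unchanged
    return y[0]      # output row aligned with the start matrix
```

With a real encoder, `y[0]` is the contextualized output at the start position, which serves as the relation vector between the two concatenated representation matrices.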
If the second node does not exist, the terminal may directly execute step 3034 described below after step 3032.
3034. The terminal determines a first child node and a second child node from the plurality of child nodes based on the paraphrase text of the target vocabulary and the paraphrase texts corresponding to the plurality of child nodes, where the first child node is the child node whose corresponding paraphrase text has the highest semantic similarity with the paraphrase text of the target vocabulary, and the second child node is the child node whose corresponding paraphrase text has the lowest semantic similarity with the paraphrase text of the target vocabulary.
In one possible implementation manner, the terminal inputs the paraphrase text of the target vocabulary into a semantic feature extraction model, and performs feature extraction on the paraphrase text of the target vocabulary through the semantic feature extraction model to obtain a sixth semantic feature of the paraphrase text of the target vocabulary. The terminal inputs the paraphrase texts corresponding to the plurality of child nodes into the semantic feature extraction model, and performs feature extraction on these paraphrase texts through the semantic feature extraction model, so as to obtain a seventh semantic feature for each child node of the first node. The terminal acquires, based on the sixth semantic feature and each seventh semantic feature, the semantic similarity between the paraphrase text of the target vocabulary and the paraphrase text corresponding to each child node. The terminal determines the child node with the highest semantic similarity as the first child node, and determines the child node with the lowest semantic similarity as the second child node.
For example, the terminal inputs the paraphrase text of the target vocabulary to a semantic feature extraction model, and the semantic feature extraction model encodes the paraphrase text of the target vocabulary based on an attention mechanism to obtain a sixth semantic vector of the paraphrase text of the target vocabulary, where the sixth semantic vector is used to represent a sixth semantic feature of the paraphrase text of the target vocabulary. The terminal inputs the paraphrasing texts corresponding to the plurality of sub-nodes into the semantic feature extraction model, the semantic feature extraction model respectively carries out attention coding on the paraphrasing texts corresponding to the plurality of sub-nodes based on an attention mechanism to obtain seventh semantic vectors of all the sub-nodes, and the seventh semantic vectors of all the sub-nodes are used for representing the seventh semantic features of all the sub-nodes. The terminal obtains cosine similarity between a sixth semantic vector of the paraphrasing text of the target vocabulary and a seventh semantic vector of each child node, wherein the cosine similarity is used for representing semantic similarity between the paraphrasing text of the target vocabulary and the paraphrasing texts corresponding to the child nodes. The terminal determines the child node with the highest semantic similarity with the paraphrasing text of the target vocabulary as a first child node, and determines the child node with the lowest semantic similarity with the paraphrasing text of the target vocabulary as a second child node.
For example, the terminal acquires the first child node by the following formula (11), and acquires the second child node by the following formula (12).
Wherein s is a first child node, w is a second child node, C p is a set of child nodes of the first node, X q is a paraphrasing text of the target vocabulary, X s is a paraphrasing text corresponding to the first child node, and X w is a paraphrasing text corresponding to the second child node.
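The selection in formulas (11) and (12) can be sketched as follows: cosine similarity between the target vocabulary's semantic vector and each child node's semantic vector, then an argmax for the first child node and an argmin for the second. The function names and the use of plain vectors are illustrative assumptions.

```python
import math

def cosine(a, b):
    # cosine similarity between two semantic vectors
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def pick_children(target_vec, child_vecs):
    """Returns (index of the most similar child, index of the least similar child)."""
    sims = [cosine(target_vec, c) for c in child_vecs]
    first = max(range(len(sims)), key=lambda i: sims[i])   # formula (11)
    second = min(range(len(sims)), key=lambda i: sims[i])  # formula (12)
    return first, second
```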
It should be noted that, when there are no child nodes under the first node, that is, when the set C p of child nodes of the first node is empty, the following step 3037 may be directly performed.
3035. And the terminal acquires a third relation characteristic based on the paraphrasing text of the target vocabulary and the paraphrasing text corresponding to the first child node, wherein the third relation characteristic is used for indicating whether the target vocabulary is the same kind of word of the vocabulary corresponding to the first child node.
In one possible implementation, the terminal encodes the paraphrase text corresponding to the first child node based on the attention mechanism to obtain a paraphrase matrix of the first child node, where the paraphrase matrix of the first child node is used to represent the paraphrase text corresponding to the first child node. The terminal obtains the third relationship feature based on the paraphrase matrix of the target vocabulary and the paraphrase matrix of the first child node.
In order to more clearly describe the above embodiments, the above embodiments will be described below in two parts.
The first part and the terminal encode the paraphrasing text corresponding to the first sub-node based on the attention mechanism to obtain a paraphrasing matrix of the first sub-node, wherein the paraphrasing matrix of the first sub-node is used for representing the paraphrasing text corresponding to the first sub-node. Wherein the part is implemented by the sequence encoding unit 601 of the matching information determination model 600.
In one possible implementation, the terminal divides the paraphrasing text corresponding to the first sub-node into a plurality of third phrases, and the plurality of third phrases form the paraphrasing text corresponding to the first sub-node. And the terminal respectively performs embedded coding on the plurality of third phrases to obtain embedded vectors of the third phrases. And the terminal adopts an attention mechanism, and codes based on the embedded vectors of the third word groups to obtain the paraphrasing matrix of the first child node.
For example, the terminal adopts a word segmentation tool to segment the paraphrasing text corresponding to the first child node into a plurality of third word groups. The terminal performs embedded coding on a plurality of third phrases in a Word Vector (Word to Vector) mode to obtain embedded vectors of the third phrases, and the combination of the embedded vectors of the third phrases is the expression sequence of the paraphrasing text corresponding to the first child node. The terminal inputs the representing sequence of the paraphrase text corresponding to the first child node, namely the embedded vector of each third phrase into a semantic feature extraction model, and obtains a query matrix, a key matrix and a value matrix of each third phrase through the semantic feature extraction model. The terminal obtains the paraphrase matrix of the first sub-node based on the query matrix, the key matrix and the value matrix of each third phrase.
The above process can also be expressed by equation (13).
Wherein D s is the paraphrase matrix of the first child node, X s is the representation sequence of the paraphrase text corresponding to the first child node, and l s is the length of the representation sequence of the paraphrase text corresponding to the first child node.
The method for obtaining the paraphrasing matrix of the first child node by the terminal through the semantic feature extraction model is described below.
The terminal inputs the embedded vectors of the third phrases into a semantic feature extraction model, and the semantic feature extraction model adopts three linear transformation matrixes to process the embedded vectors of the third phrases so as to obtain a query matrix, a key matrix and a value matrix of the third phrases. The three linear transformation matrices are respectively a query transformation matrix WQ 2, a key transformation matrix WK 2, and a value transformation matrix WV 2. And for a third phrase in the paraphrasing text corresponding to the first child node, the terminal respectively performs dot multiplication on the query matrix of the third phrase and key matrixes of other third phrases in the paraphrasing text corresponding to the first child node to obtain the attention weight between the third phrase and the other third phrases. The terminal multiplies the attention weight between the third phrase and other third phrases by the corresponding value matrix of the third phrase to obtain the initial attention matrix of the other third phrases to the third phrase. And the terminal fuses the initial attention matrix of the third phrase with other third phrases to obtain the attention matrix of the third phrase. And the terminal fuses the attention matrixes of the plurality of third phrases to obtain the paraphrasing matrix of the first child node.
And the second part and the terminal acquire the third relation characteristic based on the paraphrase matrix of the target vocabulary and the paraphrase matrix of the first child node. Wherein the portion is implemented by sibling cross-attention unit 604.
The third relationship feature is used to indicate whether the vocabulary corresponding to the first child node is a like word or a co-located word of the target vocabulary, that is, whether a sibling relationship can be formed between the node corresponding to the target vocabulary and the first child node, or whether the first child node is a sibling node of the node corresponding to the target vocabulary. What a sibling node is can be illustrated by an example: for the word "color", there are a plurality of hyponyms such as "red", "yellow", and "blue"; "red", "yellow", and "blue" are also referred to as like words or co-located words, and the nodes respectively corresponding to "red", "yellow", and "blue" are sibling nodes of one another.
In one possible implementation manner, the terminal encodes the paraphrase matrix of the first child node by using the plurality of encoding vectors to obtain a representation matrix of the first child node, where the representation matrix of the target vocabulary and the representation matrix of the first child node have the same dimension. The terminal encodes the representation matrix of the target vocabulary and the representation matrix of the first child node based on an attention mechanism to obtain the third relation feature. Judging the sibling-node relationship enables the model to model the relationships among the nodes in the directory tree more fully, which improves the accuracy of the model. The best potential sibling s (the first child node) assists in an all-around evaluation of the candidate position.
In this embodiment, the terminal is able to encode the paraphrase matrix of the target vocabulary and the paraphrase matrix of the first sub-node by using a plurality of encoding vectors, that is, to further extract useful information from the paraphrase matrix of the target vocabulary and the paraphrase matrix of the first sub-node using a plurality of encoding vectors. The representation matrix of the target vocabulary and the representation matrix of the first child node obtained after the encoding is carried out by adopting a plurality of encoding vectors have the same dimension, so that the subsequent operation efficiency can be improved, and the storage cost of the model in the offline calculation process can be reduced.
Taking the terminal encoding the paraphrase matrix of the target vocabulary and the paraphrase matrix of the first child node with one encoding vector u as an example: the terminal uses the encoding vector u to encode the plurality of row vectors in the paraphrase matrix of the target vocabulary based on the attention mechanism, so as to obtain the attention weight of each row vector in the paraphrase matrix of the target vocabulary. The terminal performs a weighted summation on the corresponding row vectors by using the attention weights of the row vectors in the paraphrase matrix of the target vocabulary to obtain a representation vector of the target vocabulary; a plurality of such representation vectors form the representation matrix of the target vocabulary. Likewise, the terminal uses the encoding vector u to encode the plurality of row vectors in the paraphrase matrix of the first child node based on the attention mechanism, so as to obtain the attention weight of each row vector in the paraphrase matrix of the first child node, and performs a weighted summation on the corresponding row vectors by using these attention weights to obtain a representation vector of the first child node; a plurality of such representation vectors form the representation matrix of the first child node. The terminal inputs the representation matrix of the target vocabulary and the representation matrix of the first child node into a Transformer encoder, encodes the two representation matrices through the Transformer encoder, and outputs the third relation vector, where the third relation vector is used to represent the third relation feature.
For example, the terminal can use the above formula (6) to obtain the attention weights of the respective row vectors in the paraphrase matrix of the target vocabulary.
In addition, the terminal can also obtain the attention weight of each row vector in the paraphrase matrix of the first child node through the above formula (6), and change the paraphrase matrix D n of the target word in the formula into the paraphrase matrix O n of the first child node.
The terminal can acquire the representation vector of the target vocabulary through the above formula (7).
Of course, the terminal can also obtain the representation vector of the first child node through the above formula (7), by changing the paraphrase matrix D n of the target vocabulary in the formula to the paraphrase matrix O n of the first child node.
The terminal inputs the representation matrix of the target vocabulary and the representation matrix of the first child node into the Transformer encoder, encodes the two representation matrices through the Transformer encoder based on the following formula (14), and outputs the third relation vector, that is, a sibling relation vector.
In formula (14), the output is the third relation vector, Transformer b () is the processing function of the attention encoder, and the representation matrix of the first child node is one of the inputs.
In some embodiments, the terminal concatenates the representation matrix of the target vocabulary, the representation matrix of the first child node, the start matrix, and the two separation matrices to obtain an input matrix, where the start matrix and the separation matrices are both obtained during model training. The terminal inputs the input matrix into the Transformer encoder, encodes the input matrix through the Transformer encoder, and determines the output corresponding to the start matrix as the third relation vector.
3036. And the terminal acquires a fourth relation characteristic based on the paraphrasing text of the target vocabulary and the paraphrasing text corresponding to the second sub-node, wherein the fourth relation characteristic is used for indicating whether the target vocabulary is the same kind of word of the vocabulary corresponding to the second sub-node.
In one possible implementation, the terminal encodes the paraphrase text corresponding to the second child node based on the attention mechanism to obtain a paraphrase matrix of the second child node, where the paraphrase matrix of the second child node is used to represent the paraphrase text corresponding to the second child node. The terminal obtains the fourth relationship feature based on the paraphrase matrix of the target vocabulary and the paraphrase matrix of the second child node.
In order to more clearly describe the above embodiments, the above embodiments will be described below in two parts.
The first part and the terminal encode the paraphrasing text corresponding to the second sub-node based on the attention mechanism to obtain a paraphrasing matrix of the second sub-node, wherein the paraphrasing matrix of the second sub-node is used for representing the paraphrasing text corresponding to the second sub-node. Wherein the part is implemented by the sequence encoding unit 601 of the matching information determination model 600.
In one possible implementation, the terminal divides the paraphrasing text corresponding to the second sub-node into a plurality of fourth phrases, and the plurality of fourth phrases form the paraphrasing text corresponding to the second sub-node. And the terminal respectively performs embedded coding on the plurality of fourth phrases to obtain embedded vectors of the fourth phrases. And the terminal adopts an attention mechanism to encode based on the embedded vectors of the fourth word groups to obtain the paraphrasing matrix of the second child node.
For example, the terminal uses a word segmentation tool to segment the paraphrasing text corresponding to the second sub-node into a plurality of fourth word groups. The terminal performs embedded coding on a plurality of fourth phrases in a Word Vector (Word to Vector) mode to obtain embedded vectors of the fourth phrases, and the combination of the embedded vectors of the fourth phrases is the expression sequence of the paraphrasing text corresponding to the second child node. The terminal inputs the representing sequence of the paraphrase text corresponding to the second child node, namely the embedded vector of each fourth phrase into a semantic feature extraction model, and obtains a query matrix, a key matrix and a value matrix of each fourth phrase through the semantic feature extraction model. The terminal obtains the paraphrase matrix of the second sub-node based on the query matrix, the key matrix and the value matrix of each fourth phrase.
The above process can also be expressed by the formula (15).
Wherein D w is the paraphrase matrix of the second child node, X w is the representation sequence of the paraphrase text corresponding to the second child node, and l w is the length of the representation sequence of the paraphrase text corresponding to the second child node.
The method for obtaining the paraphrasing matrix of the second child node by the terminal through the semantic feature extraction model is described below.
The terminal inputs the embedded vectors of the fourth phrases into a semantic feature extraction model, and the semantic feature extraction model adopts three linear transformation matrixes to process the embedded vectors of the fourth phrases so as to obtain a query matrix, a key matrix and a value matrix of the fourth phrases. The three linear transformation matrices are respectively a query transformation matrix WQ 2, a key transformation matrix WK 2, and a value transformation matrix WV 2. And for a fourth phrase in the paraphrase text corresponding to the second child node, the terminal respectively performs dot multiplication on the query matrix of the fourth phrase and key matrixes of other fourth phrases in the paraphrase text corresponding to the second child node to obtain the attention weight between the fourth phrase and other fourth phrases. The terminal multiplies the attention weight between the fourth phrase and other fourth phrases by the corresponding value matrix of the fourth phrase to obtain the initial attention matrix of the other fourth phrases to the fourth phrase. And the terminal fuses the initial attention moment matrix of the fourth phrase with other fourth phrases to obtain the attention matrix of the fourth phrase. And the terminal fuses the attention matrixes of the fourth word groups to obtain the paraphrasing matrix of the second child node.
And the second part and the terminal acquire the fourth relation characteristic based on the paraphrasing matrix of the target vocabulary and the paraphrasing matrix of the second child node. Wherein the portion is implemented by sibling cross-attention unit 604.
The fourth relationship feature is used to indicate whether the vocabulary corresponding to the second child node is a similar word or a co-located word of the target vocabulary, that is, whether a sibling relationship can be formed between the node corresponding to the target vocabulary and the second child node, or whether the second child node is a sibling node of the node corresponding to the target vocabulary.
In one possible implementation manner, the terminal encodes the paraphrase matrix of the second child node by using the plurality of encoding vectors to obtain a representation matrix of the second child node, where the representation matrix of the target vocabulary and the representation matrix of the second child node have the same dimension. The terminal encodes the representation matrix of the target vocabulary and the representation matrix of the second child node based on an attention mechanism to obtain the fourth relation feature. Introducing the worst potential sibling node w (the second child node) provides a basis for judging the node corresponding to the target vocabulary, and avoids meaningless relationship judgments against pseudo leaf nodes.
In this embodiment, the terminal is able to encode the paraphrase matrix of the target vocabulary and the paraphrase matrix of the second child node by using a plurality of encoding vectors, that is, to further extract useful information from the paraphrase matrix of the target vocabulary and the paraphrase matrix of the second child node using a plurality of encoding vectors. The representation matrix of the target vocabulary and the representation matrix of the second child node obtained after the encoding is carried out by adopting a plurality of encoding vectors have the same dimension, so that the subsequent operation efficiency can be improved, and the storage cost of the model in the offline calculation process can be reduced.
Taking the terminal encoding the paraphrase matrix of the target vocabulary and the paraphrase matrix of the second child node with one encoding vector u as an example: the terminal uses the encoding vector u to encode the plurality of row vectors in the paraphrase matrix of the target vocabulary based on the attention mechanism, so as to obtain the attention weight of each row vector in the paraphrase matrix of the target vocabulary. The terminal performs a weighted summation on the corresponding row vectors by using the attention weights of the row vectors in the paraphrase matrix of the target vocabulary to obtain a representation vector of the target vocabulary; a plurality of such representation vectors form the representation matrix of the target vocabulary. Likewise, the terminal uses the encoding vector u to encode the plurality of row vectors in the paraphrase matrix of the second child node based on the attention mechanism, so as to obtain the attention weight of each row vector in the paraphrase matrix of the second child node, and performs a weighted summation on the corresponding row vectors by using these attention weights to obtain a representation vector of the second child node; a plurality of such representation vectors form the representation matrix of the second child node. The terminal inputs the representation matrix of the target vocabulary and the representation matrix of the second child node into a Transformer encoder, encodes the two representation matrices through the Transformer encoder, and outputs the fourth relation vector, where the fourth relation vector is used to represent the fourth relation feature.
For example, the terminal can use the above formula (6) to obtain the attention weights of the respective row vectors in the paraphrase matrix of the target vocabulary.
In addition, the terminal can also obtain the attention weight of each row vector in the paraphrase matrix of the second child node through the above formula (6), and change the paraphrase matrix D n of the target word in the formula into the paraphrase matrix O n of the second child node.
The terminal can acquire the representation vector of the target vocabulary through the above formula (7).
Of course, the terminal can also obtain the representation vector of the second child node through the above formula (7), by changing the paraphrase matrix D n of the target vocabulary in the formula to the paraphrase matrix O n of the second child node.
The terminal inputs the representation matrix of the target vocabulary and the representation matrix of the second child node into the Transformer encoder, encodes the two representation matrices through the Transformer encoder based on the following formula (16), and outputs the fourth relation vector, that is, a sibling relation vector.
In formula (16), the output is the fourth relation vector, and the representation matrix of the second child node is one of the inputs.
In some embodiments, the terminal concatenates the representation matrix of the target vocabulary, the representation matrix of the second child node, the start matrix, and the two separation matrices to obtain an input matrix, where the start matrix and the separation matrices are both obtained during model training. The terminal inputs the input matrix into the Transformer encoder, encodes the input matrix through the Transformer encoder, and determines the output corresponding to the start matrix as the fourth relation vector.
3037. The terminal outputs matching information between the target vocabulary and the candidate position based on the first relationship feature, the second relationship feature, the third relationship feature and the fourth relationship feature.
In one possible implementation, the terminal concatenates the first relationship feature, the second relationship feature, the third relationship feature, and the fourth relationship feature into a feature matrix. And the terminal performs full connection and normalization on the feature matrix and outputs matching information between the target vocabulary and the candidate position. Wherein this part is implemented by the scoring unit 605 of the matching information determination model 600.
For example, the terminal splices the first relationship feature, the second relationship feature, the third relationship feature and the fourth relationship feature into a feature matrix, inputs the feature matrix into a multi-layer perceptron (MLP, Multilayer Perceptron), performs at least one full connection on the feature matrix through the MLP, and normalizes the fully connected feature matrix through a Sigmoid function, so as to obtain a matching score between the target vocabulary and the candidate position, where the matching score is used to represent the matching information.
For example, the terminal can acquire the matching score between the target vocabulary and the candidate position by the following formula (17).
Where q is the target vocabulary, p is the first node, c is the second node, (p, c) represents the candidate location, f (q, (p, c)) is the matching score, σ () is the Sigmoid function, and MLP () is the function of the multi-layer perceptron. In some embodiments, MLP () comprises two fully connected networks.
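A minimal sketch of formula (17) follows: two fully connected layers followed by a Sigmoid, producing a matching score in (0, 1). The weight names and the ReLU between the layers are illustrative assumptions, not taken from the patent.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def match_score(features, w1, b1, w2, b2):
    """f(q, (p, c)) = sigmoid(MLP(features)), with MLP() comprising two fully
    connected layers as described for formula (17). Weights are illustrative."""
    hidden = np.maximum(features @ w1 + b1, 0.0)  # first fully connected layer + ReLU
    logit = hidden @ w2 + b2                      # second fully connected layer
    return sigmoid(logit)                         # normalize to a score in (0, 1)
```

In use, `features` would be the spliced relationship features for the target vocabulary q and the candidate position (p, c).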
It should be noted that, if any of the second node, the first child node and the second child node does not exist, the terminal may substitute a placeholder e na for the missing representation matrix when calculating the matching score, the placeholder being trainable.
304. In response to the matching information meeting the target condition, the terminal adds the node corresponding to the target vocabulary at the candidate position.
The matching information meets the target condition when it is greater than or equal to a matching information threshold, or when it is among the highest R of a plurality of pieces of matching information, where R is a positive integer and the plurality of pieces of matching information are obtained by the terminal for the target vocabulary at different candidate positions in the directory tree. In some embodiments, the node corresponding to the target vocabulary is also referred to as the requesting node.
In one possible implementation, in response to the matching information meeting the target condition, the terminal adds the node corresponding to the target vocabulary to the directory tree, adds a directed edge between the first node of the candidate position and the new node, and adds a directed edge between the new node and the second node of the candidate position. If a directed edge already exists between the first node and the second node of the candidate position, the terminal deletes that directed edge.
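The edge rewiring in step 304 can be sketched on a directory tree held as a set of directed (parent, child) edges. The representation and names are illustrative assumptions; the logic follows the description above.

```python
def add_node(edges, parent, child, new_node):
    """Insert `new_node` at candidate position (parent, child): add the edges
    parent->new_node and new_node->child, and delete the edge parent->child
    if it exists, as described for step 304."""
    edges.add((parent, new_node))
    edges.add((new_node, child))
    edges.discard((parent, child))  # no-op when the edge is absent
    return edges
```

For example, inserting a "respiratory disease" node between "disease" and "flu" replaces the edge disease->flu with disease->respiratory disease->flu, which is exactly how a non-leaf node is added.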
In some embodiments, there may be multiple correct candidate locations for a target vocabulary, and when there are multiple candidate locations, these candidate locations typically share the same p (first node) or the same c (second node).
Any combination of the above optional solutions may be adopted to form an optional embodiment of the present application, which is not described herein.
By the technical scheme provided by the embodiment of the application, when the node is added into the directory tree, any position in the directory tree can be selected as a candidate position, and the candidate position is the position where the node is possibly added. In the node adding process, the paraphrase text of the target vocabulary is obtained, the matching information is determined based on the paraphrase text corresponding to each node, and finally, the node is added at the candidate position based on the matching information, so that not only can the leaf node be added in the directory tree, but also the non-leaf node can be added in the directory tree, the expansion mode of the directory tree is more various, the information richness in the directory tree can be improved, and the application range of the directory tree is enlarged.
The technical solution provided by the embodiments of the application can support paraphrase-text-based natural language understanding of newly emerging diseases and therapies, and can be used to accurately update a medical directory tree, so as to continuously enrich the content of the medical directory tree and broaden its range of application.
In the above step 303, the terminal performs the corresponding steps using the matching information determination model, and a training method of the matching information determination model is described below, referring to fig. 7, and the method includes:
701. The terminal acquires a sample node and a plurality of sample candidate positions from the directory tree, wherein the sample node is a node except a root node in the directory tree.
The plurality of sample candidate positions include both a positive sample candidate position (positive example) of the sample node, which refers to the actual position of the sample node in the directory tree, and negative sample candidate positions (negative examples) of the sample node, which refer to positions in the directory tree where the sample node is not located. In some embodiments, positive sample candidate positions are labeled 1 and negative sample candidate positions are labeled 0.
In one possible implementation, the terminal adds a pseudo-leaf node in the directory tree, the pseudo-leaf node being connected under each node in the directory tree as a child node of each node in the directory tree, the pseudo-leaf node corresponding to a meaningless blank vocabulary. The terminal obtains a plurality of sample candidate locations based on a plurality of nodes in the directory tree and the pseudo-leaf node, wherein the sample node is not the pseudo-leaf node. Since the training of the matching information determination model includes a plurality of rounds (epochs), the processing procedures of the plurality of rounds belong to the same inventive concept, and in the following description, one round is taken as an example.
For example, if the directory tree before adding the pseudo-leaf nodes is denoted τ_0 = (N_0, E_0), then the directory tree after adding the pseudo-leaf nodes is denoted accordingly with the pseudo-leaf nodes and their edges included. In some embodiments, τ_0 = (N_0, E_0) is referred to as the seed directory tree. For the sample node, the terminal samples K sample candidate positions in the directory tree after the pseudo-leaf nodes are added, where the K sample candidate positions include a positive sample candidate position and negative sample candidate positions and form a mini-batch {(p_1, c_1, y_1), (p_2, c_2, y_2), ..., (p_K, c_K, y_K)}, where y_i, i ∈ {1, 2, ..., K}, is the label indicating whether the i-th sample candidate position is a positive or a negative one.
Because both the sample node and the candidate positions are obtained from the directory tree, whether a candidate position is a positive or a negative sample candidate position can also be determined from the directory tree itself. This is a self-supervised process, which avoids the extra overhead of labeling data.
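The self-supervised sampling can be sketched as follows. This is a simplified assumption-based illustration: the true (parent, child) positions of the sample node are positives, and edges not incident to it serve as negatives; the patent additionally attaches a pseudo-leaf node under every node so that leaf positions can be sampled too, which this sketch omits.

```python
import random

def sample_candidates(edges, sample_node, k, seed=0):
    """Sample up to k candidate positions (parent, child, label) for one
    sample node from a directory tree given as a set of directed edges.
    Label 1 marks the node's actual positions; label 0 marks negatives."""
    rng = random.Random(seed)
    parents = {u for (u, v) in edges if v == sample_node}
    children = {v for (u, v) in edges if u == sample_node}
    positives = [(p, c, 1) for p in parents for c in children]
    # edges not incident to the sample node act as negative positions
    negatives = [(u, v, 0) for (u, v) in edges
                 if u != sample_node and v != sample_node]
    rng.shuffle(negatives)
    batch = positives + negatives[:max(0, k - len(positives))]
    return batch[:k]
```

No manual annotation is needed: the labels come directly from the tree structure, which is the self-supervision described above.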
702. The terminal inputs the sample node and the plurality of sample candidate positions into the matching information determination model, and outputs predicted matching information of the sample node and the plurality of sample candidate positions from the matching information determination model.
The method for outputting the predicted matching information of the sample node and the candidate positions of the plurality of samples by the terminal through the matching information determining model belongs to the same inventive concept as the step 303, and the implementation process is described in the step 303, and is not repeated here.
703. The terminal adjusts model parameters of the matching information determination model based on difference information between the predicted matching information and target matching information, where the target matching information is the matching information between the sample node and its actual position in the directory tree.
In one possible implementation, the terminal constructs a binary cross entropy (Binary Cross Entropy) loss function based on the difference information between the predicted matching information and the target matching information, and adjusts the model parameters of the matching information determination model based on the binary cross entropy loss function. In some embodiments, the binary cross entropy loss function takes the form of formula (18) below.
where the left-hand side of formula (18) is the binary cross entropy loss, K is the number of sample candidate positions, q is the vocabulary corresponding to the sample node, (p_i, c_i) is the sample candidate position numbered i, and y_i is the label of the sample candidate position numbered i, indicating whether it is a positive or a negative sample candidate position.
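A standard binary cross entropy over a mini-batch of K candidate positions, as described for formula (18), can be sketched directly; the epsilon term is a common numerical-stability assumption not stated in the patent.

```python
import math

def bce_loss(scores, labels, eps=1e-12):
    """Binary cross entropy over a mini-batch of K candidate positions:
    -(1/K) * sum_i [ y_i * log f_i + (1 - y_i) * log(1 - f_i) ],
    where f_i is the predicted matching score and y_i the 0/1 label."""
    k = len(scores)
    return -sum(y * math.log(f + eps) + (1 - y) * math.log(1 - f + eps)
                for f, y in zip(scores, labels)) / k
```

For instance, two predictions of 0.5 with labels 1 and 0 give a loss of ln 2 ≈ 0.693, the value of an uninformative classifier.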
In some embodiments, in order for the parent-child and sibling cross-attention modules to behave as expected during training, regularization terms for these modules are added to the loss function used in model training. For example, a two-layer multi-layer perceptron (MLP) is appended after the parent-child and sibling cross-attention modules, and four scores for the regularization terms are computed with this MLP; the computation process is given in formula (19) below.
where q is the vocabulary corresponding to the sample node, p is the vocabulary corresponding to the parent node in the candidate position, c is the vocabulary corresponding to the child node in the candidate position, c_1 is the first child node of the parent node p, and c_2 is the second child node of the parent node p; r_1 is the first relationship vector between q and p, r_2 is the second relationship vector between q and c, r_3 is the third relationship vector between q and c_1, and r_4 is the fourth relationship vector between q and c_2, where the first and second relationship vectors represent parent-child relationships and the third and fourth relationship vectors represent sibling relationships; s_1, s_2, s_3 and s_4 are the scores for r_1, r_2, r_3 and r_4, respectively.
The third and fourth scores are calculated if and only if the first child node and the second child node exist, respectively, and they are used to compute the regularization-term loss. This loss also uses a binary cross entropy loss function; for a mini-batch, see formula (20) below.
where the left-hand side of formula (20) is a binary cross entropy loss function over the regularization-term scores.
The final loss function of the matching information determination model is a weighted sum of the above five loss functions; see formula (21).
where the left-hand side of formula (21) is the final loss function and λ is a hyperparameter, also called a weight.
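The weighted combination in formula (21) reduces to a one-liner; the exact weighting scheme in the patent may differ (e.g. per-term weights), so this is a sketch under the assumption of a single shared λ.

```python
def final_loss(main_loss, reg_losses, lam):
    """Final loss = main matching BCE loss + λ * (sum of the four
    regularization-term losses), following formula (21) with a single
    shared hyperparameter λ (an assumption of this sketch)."""
    return main_loss + lam * sum(reg_losses)
```

During training, `main_loss` would come from formula (18) and `reg_losses` from the four regularization-term losses of formula (20).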
In some embodiments, besides the binary cross entropy loss function, the terminal can employ other types of loss functions to train the matching information determination model, such as an InfoNCE (Noise Contrastive Estimation) loss function or a Margin Ranking loss function, which is not limited in the embodiments of the present application.
The matching information determining model is trained in a self-supervision mode, so that the additional overhead of labeling data can be avoided, and the training efficiency of the matching information determining model is improved.
Fig. 8 is a schematic structural diagram of a node adding device based on a directory tree according to an embodiment of the present application, and referring to fig. 8, the device includes a candidate location determining module 801, a paraphrase text obtaining module 802, a matching information determining module 803, and a node adding module 804.
The candidate position determining module 801 is configured to determine a candidate position in a directory tree, where the directory tree includes a plurality of nodes, the plurality of nodes respectively correspond to a plurality of vocabularies, the candidate position is a position between a first node and a second node in the directory tree, and a first vocabulary corresponding to the first node is an upper word of a second vocabulary corresponding to the second node.
The paraphrasing text obtaining module 802 is configured to obtain paraphrasing text of a target vocabulary, where the paraphrasing text of the target vocabulary is used to interpret the target vocabulary.
The matching information determining module 803 is configured to determine matching information between the target vocabulary and the candidate location based on the paraphrase text of the target vocabulary, the paraphrase text of the first vocabulary, the paraphrase text of the second vocabulary, and the paraphrase texts corresponding to the plurality of child nodes of the first node, where the matching information is used to represent a matching degree between the target vocabulary and the candidate location.
And the node adding module 804 is configured to add a node corresponding to the target vocabulary at the candidate position in response to the matching information meeting a target condition.
In one possible implementation, the paraphrase text obtaining module 802 is configured to query in a paraphrase text database using the target vocabulary to obtain a paraphrase text of the target vocabulary, where the paraphrase text database stores a plurality of vocabularies and paraphrase texts corresponding to the vocabularies respectively.
In one possible implementation, the paraphrase text obtaining module 802 is configured to query the paraphrase text database with the target vocabulary, and obtain semantic similarity between the target vocabulary and the paraphrase text corresponding to the plurality of nodes when a plurality of terms corresponding to the target vocabulary exist in the paraphrase text database. And determining the paraphrase text of the first term as the paraphrase text of the target vocabulary, wherein the first term is a term with the semantic similarity between the paraphrase texts corresponding to the reference nodes meeting the first similarity condition, and the reference nodes are nodes with the semantic similarity between the corresponding paraphrase text and the target vocabulary meeting the second similarity condition.
In one possible implementation, the matching information determining module 803 is configured to input the paraphrase text of the target vocabulary, the paraphrase text of the first vocabulary, the paraphrase text of the second vocabulary, and the paraphrase texts corresponding to the plurality of child nodes of the first node into a matching information determining model, and output the matching information between the target vocabulary and the candidate location through the matching information determining model.
In a possible implementation manner, the matching information determining module 803 is configured to perform the following steps through the matching information determining model:
Based on the paraphrase text of the target vocabulary and the paraphrase text of the first vocabulary, a first relation feature is obtained, and the first relation feature is used for indicating whether the first vocabulary is a superword of the target vocabulary.
And acquiring a second relation feature based on the paraphrase text of the target vocabulary and the paraphrase text of the second vocabulary, wherein the second relation feature is used for indicating whether the target vocabulary is a hypernym of the second vocabulary.
And determining a first child node and a second child node from the plurality of child nodes based on the paraphrasing text of the target vocabulary and the paraphrasing text corresponding to the plurality of child nodes, wherein the first child node is the child node with the highest semantic similarity between the corresponding paraphrasing text and the target vocabulary, and the second child node is the child node with the lowest semantic similarity between the corresponding paraphrasing text and the target vocabulary.
And outputting matching information between the target vocabulary and the candidate position based on the first relation feature, the second relation feature, the paraphrase text of the target vocabulary, the paraphrase text corresponding to the first sub-node and the paraphrase text corresponding to the second sub-node.
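The child-selection step above (picking the most and least semantically similar children) can be sketched as follows. Cosine similarity over embedding vectors is an assumption of this sketch; the patent only requires some semantic similarity measure between the target vocabulary and the children's paraphrase texts.

```python
import numpy as np

def pick_children(target_vec, child_vecs):
    """Return (index of first child node, index of second child node):
    the child whose paraphrase embedding is most similar to the target
    vocabulary, and the child whose embedding is least similar."""
    def cos(a, b):
        return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))
    sims = [cos(target_vec, v) for v in child_vecs]
    first = int(np.argmax(sims))   # highest semantic similarity
    second = int(np.argmin(sims))  # lowest semantic similarity
    return first, second
```

The selected pair then feeds the sibling cross-attention step that produces the third and fourth relationship features.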
In one possible implementation, the matching information determining module 803 is configured to encode the paraphrase text of the target vocabulary based on the attention mechanism to obtain a paraphrase matrix of the target vocabulary, where the paraphrase matrix of the target vocabulary is used to represent the paraphrase text of the target vocabulary. And encoding the paraphrase text of the first vocabulary based on the attention mechanism to obtain a paraphrase matrix of the first vocabulary, wherein the paraphrase matrix of the first vocabulary is used for representing the paraphrase text of the first vocabulary. And acquiring the first relation feature based on the paraphrase matrix of the target vocabulary and the paraphrase matrix of the first vocabulary.
In one possible implementation, the matching information determining module 803 is configured to encode the paraphrase matrix of the target vocabulary with a plurality of encoding vectors to obtain a representation matrix of the target vocabulary, where the plurality of encoding vectors are used to adjust dimensions of the matrix. And encoding the paraphrase matrix of the first vocabulary by adopting the plurality of encoding vectors to obtain a representation matrix of the first vocabulary, wherein the representation matrix of the target vocabulary and the representation matrix of the first vocabulary have the same dimensionality. And encoding the representation matrix of the target vocabulary and the representation matrix of the first vocabulary based on an attention mechanism to obtain the first relation feature.
In a possible implementation manner, the matching information determining module 803 is configured to obtain a third relationship feature based on the paraphrase text of the target vocabulary and the paraphrase text corresponding to the first child node, where the third relationship feature is used to indicate whether the target vocabulary is a similar word of the vocabulary corresponding to the first child node. And acquiring a fourth relation characteristic based on the paraphrasing text of the target vocabulary and the paraphrasing text corresponding to the second sub-node, wherein the fourth relation characteristic is used for indicating whether the target vocabulary is the same kind of word of the vocabulary corresponding to the second sub-node. And outputting matching information between the target vocabulary and the candidate position based on the first relation feature, the second relation feature, the third relation feature and the fourth relation feature.
In one possible implementation, the matching information determining module 803 is configured to encode the paraphrase text of the target vocabulary based on the attention mechanism to obtain a paraphrase matrix of the target vocabulary, where the paraphrase matrix of the target vocabulary is used to represent the paraphrase text of the target vocabulary. And encoding the paraphrase text corresponding to the first child node based on the attention mechanism to obtain a paraphrase matrix of the first sub-vocabulary, wherein the paraphrase matrix of the first sub-vocabulary is used for representing the paraphrase text corresponding to the first child node. And acquiring the third relation feature based on the paraphrase matrix of the target vocabulary and the paraphrase matrix of the first sub-vocabulary.
In one possible implementation, the matching information determining module 803 is configured to encode the paraphrase matrix of the target vocabulary with a plurality of encoding vectors to obtain a representation matrix of the target vocabulary, where the plurality of encoding vectors are used to adjust dimensions of the matrix. And encoding the paraphrase matrix of the first sub-vocabulary by adopting the plurality of encoding vectors to obtain a representation matrix of the first sub-vocabulary, wherein the representation matrix of the target vocabulary and the representation matrix of the first sub-vocabulary have the same dimension. And encoding the representation matrix of the target vocabulary and the representation matrix of the first sub-vocabulary based on an attention mechanism to obtain the third relationship feature.
In a possible implementation manner, the matching information determining module 803 is configured to stitch the first relationship feature, the second relationship feature, the third relationship feature and the fourth relationship feature into a feature matrix. And carrying out full connection and normalization on the feature matrix, and outputting matching information between the target vocabulary and the candidate position.
In one possible embodiment, the apparatus further comprises:
And the adjustment module is used for acquiring sample nodes and a plurality of sample candidate positions from the directory tree, wherein the sample nodes are nodes except for the root node in the directory tree. The sample node and the plurality of sample candidate locations are input to the matching information determination model, and predicted matching information of the sample node and the plurality of sample candidate locations is output by the matching information determination model. And adjusting model parameters of the model determined by the matching information based on difference information between the predicted matching information and target matching information, wherein the target matching information is the matching information between the sample node and the actual position of the sample node in the directory tree.
In a possible implementation manner, the paraphrase text obtaining module 802 is further configured to query, for a first child node of the first node, a paraphrase text database with a first sub-vocabulary corresponding to the first child node, and obtain semantic similarity between a plurality of terms and the paraphrase text of the first vocabulary when the plurality of terms corresponding to the first sub-vocabulary exist in the paraphrase text database. And determining the paraphrase text of a second term as the paraphrase text of the first sub-vocabulary, wherein the second term is a term whose semantic similarity with the paraphrase text of the first vocabulary meets a third similarity condition.
It should be noted that, when the node adding device based on the directory tree provided in the foregoing embodiment adds a node in the directory tree, only the division of the foregoing functional modules is used for illustrating, in practical application, the foregoing functional allocation may be completed by different functional modules according to needs, that is, the internal structure of the computer device is divided into different functional modules, so as to complete all or part of the functions described above. In addition, the node adding device based on the directory tree provided in the above embodiment and the node adding method embodiment based on the directory tree belong to the same concept, and the specific implementation process of the node adding device based on the directory tree is detailed in the method embodiment, which is not described herein again.
By the technical scheme provided by the embodiment of the application, when the node is added into the directory tree, any position in the directory tree can be selected as a candidate position, and the candidate position is the position where the node is possibly added. In the node adding process, the paraphrase text of the target vocabulary is obtained, the matching information is determined based on the paraphrase text corresponding to each node, and finally, the node is added at the candidate position based on the matching information, so that not only can the leaf node be added in the directory tree, but also the non-leaf node can be added in the directory tree, the expansion mode of the directory tree is more various, the information richness in the directory tree can be improved, and the application range of the directory tree is enlarged.
The embodiment of the application provides a computer device, which is used for executing the method, and can be realized as a terminal or a server, and the structure of the terminal is described below:
Fig. 9 is a schematic structural diagram of a terminal according to an embodiment of the present application. The terminal 900 may be a smart phone, a tablet computer, a notebook computer or a desktop computer. The terminal 900 may also be called by other names such as user device, portable terminal, laptop terminal or desktop terminal.
In general, terminal 900 can include one or more processors 901 and one or more memories 902.
The processor 901 may include one or more processing cores, such as a 4-core processor or an 8-core processor. The processor 901 may be implemented in at least one hardware form of a DSP (Digital Signal Processor), an FPGA (Field-Programmable Gate Array) and a PLA (Programmable Logic Array). The processor 901 may also include a main processor and a coprocessor; the main processor is a processor for processing data in the wake-up state, also referred to as a CPU (Central Processing Unit), and the coprocessor is a low-power processor for processing data in the standby state. In some embodiments, the processor 901 may be integrated with a GPU (Graphics Processing Unit) for rendering and drawing the content that the display screen needs to display. In some embodiments, the processor 901 may also include an AI (Artificial Intelligence) processor for processing computing operations related to machine learning.
The memory 902 may include one or more computer-readable storage media, which may be non-transitory. The memory 902 may also include high-speed random access memory, as well as non-volatile memory, such as one or more magnetic disk storage devices, flash memory storage devices. In some embodiments, a non-transitory computer readable storage medium in memory 902 is used to store at least one computer program for execution by processor 901 to implement the directory tree based node addition method provided by the method embodiments of the present application.
In some embodiments, terminal 900 can optionally further include a peripheral interface 903 and at least one peripheral. The processor 901, memory 902, and peripheral interface 903 may be connected by a bus or signal line. The individual peripheral devices may be connected to the peripheral device interface 903 via buses, signal lines, or circuit boards. Specifically, the peripheral devices include at least one of a radio frequency circuit 904, a display 905, a camera assembly 906, an audio circuit 907, a positioning assembly 908, and a power source 909.
The peripheral interface 903 may be used to connect at least one peripheral device associated with an I/O (Input/Output) to the processor 901 and the memory 902. In some embodiments, the processor 901, the memory 902, and the peripheral interface 903 are integrated on the same chip or circuit board, and in some other embodiments, either or both of the processor 901, the memory 902, and the peripheral interface 903 may be implemented on separate chips or circuit boards, which is not limited in this embodiment.
The Radio Frequency circuit 904 is configured to receive and transmit RF (Radio Frequency) signals, also known as electromagnetic signals. The radio frequency circuit 904 communicates with a communication network and other communication devices via electromagnetic signals. The radio frequency circuit 904 converts an electrical signal into an electromagnetic signal for transmission, or converts a received electromagnetic signal into an electrical signal. Optionally, the radio frequency circuitry 904 includes an antenna system, an RF transceiver, one or more amplifiers, tuners, oscillators, digital signal processors, codec chipsets, subscriber identity module cards, and so forth.
The display 905 is used to display a UI (User Interface). The UI may include graphics, text, icons, video, and any combination thereof. When the display 905 is a touch display, the display 905 also has the ability to capture touch signals at or above the surface of the display 905. The touch signal may be input as a control signal to the processor 901 for processing. At this time, the display 905 may also be used to provide virtual buttons and/or a virtual keyboard, also referred to as soft buttons and/or a soft keyboard.
The camera assembly 906 is used to capture images or video. Optionally, the camera assembly 906 includes a front camera and a rear camera. Typically, the front camera is disposed on the front panel of the terminal and the rear camera is disposed on the rear surface of the terminal.
The audio circuit 907 may include a microphone and a speaker. The microphone is used for collecting sound waves of users and the environment, converting the sound waves into electric signals, and inputting the electric signals to the processor 901 for processing, or inputting the electric signals to the radio frequency circuit 904 for voice communication.
The location component 908 is used to locate the current geographic location of the terminal 900 to enable navigation or LBS (Location Based Service, location-based services).
The power supply 909 is used to supply power to the various components in the terminal 900. The power supply 909 may be an alternating current, a direct current, a disposable battery, or a rechargeable battery.
In some embodiments, terminal 900 can further include one or more sensors 910. The one or more sensors 910 include, but are not limited to, an acceleration sensor 911, a gyroscope sensor 912, a pressure sensor 913, a fingerprint sensor 914, an optical sensor 915, and a proximity sensor 916.
The acceleration sensor 911 can detect the magnitudes of accelerations on three coordinate axes of the coordinate system established with the terminal 900.
The gyro sensor 912 may be configured to sense a body direction and a rotation angle of the terminal 900, and the gyro sensor 912 may be configured to collect a 3D motion of the terminal 900 by a user in cooperation with the acceleration sensor 911.
The pressure sensor 913 may be provided at a side frame of the terminal 900 and/or at a lower layer of the display 905. When the pressure sensor 913 is provided at a side frame of the terminal 900, a grip signal of the user to the terminal 900 may be detected, and the processor 901 performs left-right hand recognition or shortcut operation according to the grip signal collected by the pressure sensor 913. When the pressure sensor 913 is provided at the lower layer of the display 905, the processor 901 performs control of the operability control on the UI interface according to the pressure operation of the user on the display 905.
The fingerprint sensor 914 is used for collecting the fingerprint of the user, and the processor 901 identifies the identity of the user according to the fingerprint collected by the fingerprint sensor 914 or the fingerprint sensor 914 identifies the identity of the user according to the collected fingerprint.
The optical sensor 915 is used to collect the intensity of ambient light. In one embodiment, the processor 901 may control the display brightness of the display 905 based on the ambient light intensity collected by the optical sensor 915.
The proximity sensor 916 is used to collect the distance between the user and the front face of the terminal 900.
Those skilled in the art will appreciate that the structure shown in Fig. 9 is not limiting; the terminal may include more or fewer components than shown, combine certain components, or employ a different arrangement of components.
The computer device may also be implemented as a server, and the following describes the structure of the server:
Fig. 10 is a schematic structural diagram of a server according to an embodiment of the present application. The server 1000 may vary considerably in configuration and performance, and may include one or more processors (central processing units, CPU) 1001 and one or more memories 1002, where the one or more memories 1002 store at least one computer program that is loaded and executed by the one or more processors 1001 to implement the methods provided in the foregoing method embodiments. Of course, the server 1000 may also have a wired or wireless network interface, a keyboard, an input/output interface, and other components for implementing device functions, which are not described herein.
In an exemplary embodiment, a computer-readable storage medium is also provided, for example, a memory storing a computer program that is executable by a processor to perform the directory-tree-based node addition method of the above embodiments. For example, the computer-readable storage medium may be a read-only memory (ROM), a random-access memory (RAM), a compact disc read-only memory (CD-ROM), a magnetic tape, a floppy disk, an optical data storage device, or the like.
In an exemplary embodiment, a computer program product or a computer program is also provided. The computer program product or computer program comprises program code stored in a computer-readable storage medium; a processor of a computer device reads the program code from the computer-readable storage medium and executes it, causing the computer device to perform the above directory-tree-based node addition method.
In some embodiments, a computer program according to an embodiment of the present application may be deployed to be executed on one computer device or on multiple computer devices located at one site or on multiple computer devices distributed across multiple sites and interconnected by a communication network, where the multiple computer devices distributed across multiple sites and interconnected by a communication network may constitute a blockchain system.
It will be understood by those skilled in the art that all or part of the steps for implementing the above embodiments may be implemented by hardware, or may be implemented by a program for instructing relevant hardware, where the program may be stored in a computer readable storage medium, and the above storage medium may be a read-only memory, a magnetic disk or an optical disk, etc.
The foregoing description of the preferred embodiments of the present application is not intended to be limiting, but rather is intended to cover all modifications, equivalents, alternatives, and improvements falling within the spirit and principles of the present application.

Claims (17)

1. A node adding method based on a directory tree, the method comprising:

determining a candidate position in a directory tree, the directory tree comprising a plurality of nodes, the plurality of nodes respectively corresponding to a plurality of vocabularies, the candidate position being a position between a first node and a second node in the directory tree, and a first vocabulary corresponding to the first node being a hypernym of a second vocabulary corresponding to the second node;

obtaining definition text of a target vocabulary, the definition text of the target vocabulary being used to explain the target vocabulary;

determining matching information between the target vocabulary and the candidate position based on the definition text of the target vocabulary, the definition text of the first vocabulary, the definition text of the second vocabulary, and the definition texts corresponding to a plurality of child nodes of the first node, the matching information being used to indicate a degree of matching between the target vocabulary and the candidate position; and

in response to the matching information meeting a target condition, adding a node corresponding to the target vocabulary at the candidate position.

2. The method according to claim 1, wherein obtaining the definition text of the target vocabulary comprises:

querying a definition text database using the target vocabulary to obtain the definition text of the target vocabulary, the definition text database storing a plurality of vocabularies and the definition texts respectively corresponding to the plurality of vocabularies.

3. The method according to claim 2, wherein querying the definition text database using the target vocabulary to obtain the definition text of the target vocabulary comprises:

querying the definition text database using the target vocabulary, and in a case where a plurality of senses corresponding to the target vocabulary exist in the definition text database, obtaining semantic similarities between the target vocabulary and the definition texts corresponding to the plurality of nodes; and

determining the definition text of a first sense as the definition text of the target vocabulary, the first sense being a sense whose semantic similarity to the definition text corresponding to a reference node meets a first similarity condition, and the reference node being a node whose corresponding definition text has a semantic similarity to the target vocabulary that meets a second similarity condition.

4. The method according to claim 1, wherein determining the matching information between the target vocabulary and the candidate position based on the definition text of the target vocabulary, the definition text of the first vocabulary, the definition text of the second vocabulary, and the definition texts corresponding to the plurality of child nodes of the first node comprises:

inputting the definition text of the target vocabulary, the definition text of the first vocabulary, the definition text of the second vocabulary, and the definition texts corresponding to the plurality of child nodes of the first node into a matching information determination model, and outputting the matching information between the target vocabulary and the candidate position through the matching information determination model.

5. The method according to claim 4, wherein outputting the matching information between the target vocabulary and the candidate position through the matching information determination model comprises performing, by the matching information determination model, the following steps:

obtaining a first relationship feature based on the definition text of the target vocabulary and the definition text of the first vocabulary, the first relationship feature being used to indicate whether the first vocabulary is a hypernym of the target vocabulary;

obtaining a second relationship feature based on the definition text of the target vocabulary and the definition text of the second vocabulary, the second relationship feature being used to indicate whether the target vocabulary is a hypernym of the second vocabulary;

determining a first child node and a second child node from the plurality of child nodes based on the definition text of the target vocabulary and the definition texts corresponding to the plurality of child nodes, the first child node being the child node whose corresponding definition text has the highest semantic similarity to the target vocabulary, and the second child node being the child node whose corresponding definition text has the lowest semantic similarity to the target vocabulary; and

outputting the matching information between the target vocabulary and the candidate position based on the first relationship feature, the second relationship feature, the definition text of the target vocabulary, the definition text corresponding to the first child node, and the definition text corresponding to the second child node.

6. The method according to claim 5, wherein obtaining the first relationship feature based on the definition text of the target vocabulary and the definition text of the first vocabulary comprises:

encoding the definition text of the target vocabulary based on an attention mechanism to obtain a definition matrix of the target vocabulary, the definition matrix of the target vocabulary being used to represent the definition text of the target vocabulary;

encoding the definition text of the first vocabulary based on the attention mechanism to obtain a definition matrix of the first vocabulary, the definition matrix of the first vocabulary being used to represent the definition text of the first vocabulary; and

obtaining the first relationship feature based on the definition matrix of the target vocabulary and the definition matrix of the first vocabulary.

7. The method according to claim 6, wherein obtaining the first relationship feature based on the definition matrix of the target vocabulary and the definition matrix of the first vocabulary comprises:

encoding the definition matrix of the target vocabulary using a plurality of encoding vectors to obtain a representation matrix of the target vocabulary, the plurality of encoding vectors being used to adjust the dimensions of a matrix;

encoding the definition matrix of the first vocabulary using the plurality of encoding vectors to obtain a representation matrix of the first vocabulary, the representation matrix of the target vocabulary having the same dimensions as the representation matrix of the first vocabulary; and

encoding the representation matrix of the target vocabulary and the representation matrix of the first vocabulary based on an attention mechanism to obtain the first relationship feature.

8. The method according to claim 5, wherein outputting the matching information between the target vocabulary and the candidate position based on the first relationship feature, the second relationship feature, the definition text of the target vocabulary, the definition text corresponding to the first child node, and the definition text corresponding to the second child node comprises:

obtaining a third relationship feature based on the definition text of the target vocabulary and the definition text corresponding to the first child node, the third relationship feature being used to indicate whether the target vocabulary is a co-hyponym of the vocabulary corresponding to the first child node;

obtaining a fourth relationship feature based on the definition text of the target vocabulary and the definition text corresponding to the second child node, the fourth relationship feature being used to indicate whether the target vocabulary is a co-hyponym of the vocabulary corresponding to the second child node; and

outputting the matching information between the target vocabulary and the candidate position based on the first relationship feature, the second relationship feature, the third relationship feature, and the fourth relationship feature.

9. The method according to claim 8, wherein obtaining the third relationship feature based on the definition text of the target vocabulary and the definition text corresponding to the first child node comprises:

encoding the definition text of the target vocabulary based on an attention mechanism to obtain a definition matrix of the target vocabulary, the definition matrix of the target vocabulary being used to represent the definition text of the target vocabulary;

encoding the definition text corresponding to the first child node based on the attention mechanism to obtain a definition matrix of a first sub-vocabulary, the definition matrix of the first sub-vocabulary being used to represent the definition text corresponding to the first child node; and

obtaining the third relationship feature based on the definition matrix of the target vocabulary and the definition matrix of the first sub-vocabulary.

10. The method according to claim 9, wherein obtaining the third relationship feature based on the definition matrix of the target vocabulary and the definition matrix of the first sub-vocabulary comprises:

encoding the definition matrix of the target vocabulary using a plurality of encoding vectors to obtain a representation matrix of the target vocabulary, the plurality of encoding vectors being used to adjust the dimensions of a matrix;

encoding the definition matrix of the first sub-vocabulary using the plurality of encoding vectors to obtain a representation matrix of the first sub-vocabulary, the representation matrix of the target vocabulary having the same dimensions as the representation matrix of the first sub-vocabulary; and

encoding the representation matrix of the target vocabulary and the representation matrix of the first sub-vocabulary based on an attention mechanism to obtain the third relationship feature.

11. The method according to claim 8, wherein outputting the matching information between the target vocabulary and the candidate position based on the first relationship feature, the second relationship feature, the third relationship feature, and the fourth relationship feature comprises:

concatenating the first relationship feature, the second relationship feature, the third relationship feature, and the fourth relationship feature into a feature matrix; and

applying full connection and normalization to the feature matrix, and outputting the matching information between the target vocabulary and the candidate position.

12. The method according to claim 4, further comprising:

obtaining a sample node and a plurality of sample candidate positions from the directory tree, the sample node being a node other than the root node in the directory tree;

inputting the sample node and the plurality of sample candidate positions into the matching information determination model, and outputting, by the matching information determination model, predicted matching information of the sample node and the plurality of sample candidate positions; and

adjusting model parameters of the matching information determination model based on difference information between the predicted matching information and target matching information, the target matching information being matching information between the sample node and the actual position of the sample node in the directory tree.

13. The method according to claim 1, further comprising:

for a first child node of the first node, querying a definition text database using a first sub-vocabulary corresponding to the first child node, and in a case where a plurality of senses corresponding to the first sub-vocabulary exist in the definition text database, obtaining semantic similarities between the plurality of senses and the definition text of the first vocabulary; and

determining the definition text of a second sense as the definition text of the target vocabulary, the second sense being a sense whose semantic similarity to the definition text of the first vocabulary meets a third similarity condition.

14. A node adding apparatus based on a directory tree, the apparatus comprising:

a candidate position determination module, configured to determine a candidate position in a directory tree, the directory tree comprising a plurality of nodes, the plurality of nodes respectively corresponding to a plurality of vocabularies, the candidate position being a position between a first node and a second node in the directory tree, and a first vocabulary corresponding to the first node being a hypernym of a second vocabulary corresponding to the second node;

a definition text obtaining module, configured to obtain definition text of a target vocabulary, the definition text of the target vocabulary being used to explain the target vocabulary;

a matching information determination module, configured to determine matching information between the target vocabulary and the candidate position based on the definition text of the target vocabulary, the definition text of the first vocabulary, the definition text of the second vocabulary, and the definition texts corresponding to a plurality of child nodes of the first node, the matching information being used to indicate a degree of matching between the target vocabulary and the candidate position; and

a node adding module, configured to add, in response to the matching information meeting a target condition, a node corresponding to the target vocabulary at the candidate position.

15. A computer device, comprising one or more processors and one or more memories, the one or more memories storing at least one computer program, the computer program being loaded and executed by the one or more processors to implement the directory-tree-based node adding method according to any one of claims 1 to 13.

16. A computer-readable storage medium, storing at least one computer program, the computer program being loaded and executed by a processor to implement the directory-tree-based node adding method according to any one of claims 1 to 13.

17. A computer program product, comprising a computer program, wherein when the computer program is executed by a processor, the directory-tree-based node adding method according to any one of claims 1 to 13 is implemented.
CN202111095271.6A 2021-09-17 2021-09-17 Node adding method, device, equipment and storage medium based on directory tree Active CN114281919B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111095271.6A CN114281919B (en) 2021-09-17 2021-09-17 Node adding method, device, equipment and storage medium based on directory tree


Publications (2)

Publication Number Publication Date
CN114281919A CN114281919A (en) 2022-04-05
CN114281919B true CN114281919B (en) 2025-10-28

Family

ID=80868606

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111095271.6A Active CN114281919B (en) 2021-09-17 2021-09-17 Node adding method, device, equipment and storage medium based on directory tree

Country Status (1)

Country Link
CN (1) CN114281919B (en)

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115455935B (en) * 2022-09-14 2025-08-12 华东师范大学 Text information intelligent processing system
CN115510103B (en) * 2022-09-23 2026-01-02 北京百度网讯科技有限公司 Processing method, device, equipment, medium and product of search flow
CN116759099B (en) * 2023-08-21 2024-07-02 潍坊医学院 Data processing method, device and equipment for medical insurance foundation auditing system
CN121116915A (en) * 2024-06-11 2025-12-12 华为技术有限公司 Query method, query device, processor and computing equipment of data catalogue

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110619066A (en) * 2019-08-30 2019-12-27 视联动力信息技术股份有限公司 Information acquisition method and device based on directory tree

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7305400B2 (en) * 2000-03-09 2007-12-04 The Web Access, Inc. Method and apparatus for performing a research task by interchangeably utilizing a multitude of search methodologies
US8234309B2 (en) * 2005-01-31 2012-07-31 International Business Machines Corporation Method for automatically modifying a tree structure
CN109635281B (en) * 2018-11-22 2023-01-31 创新先进技术有限公司 Method and device for updating nodes in business map



Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant