
CN114443824B - Data processing method, device, electronic equipment and computer storage medium - Google Patents


Info

Publication number
CN114443824B
CN114443824B (application CN202210078998.1A)
Authority
CN
China
Prior art keywords
node; vector corresponding; information; text information; user
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202210078998.1A
Other languages
Chinese (zh)
Other versions
CN114443824A (en)
Inventor
高莘
张寓弛
王永亮
董扬
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Alipay Hangzhou Information Technology Co Ltd
Original Assignee
Alipay Hangzhou Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Alipay Hangzhou Information Technology Co Ltd filed Critical Alipay Hangzhou Information Technology Co Ltd
Priority to CN202210078998.1A priority Critical patent/CN114443824B/en
Publication of CN114443824A publication Critical patent/CN114443824A/en
Application granted granted Critical
Publication of CN114443824B publication Critical patent/CN114443824B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33Querying
    • G06F16/332Query formulation
    • G06F16/3329Natural language query formulation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33Querying
    • G06F16/3331Query processing
    • G06F16/334Query execution
    • G06F16/3347Query execution using vector based model
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N5/00Computing arrangements using knowledge-based models
    • G06N5/04Inference or reasoning models

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Artificial Intelligence (AREA)
  • Computational Linguistics (AREA)
  • Evolutionary Computation (AREA)
  • Databases & Information Systems (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Evolutionary Biology (AREA)
  • Computing Systems (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Software Systems (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Human Computer Interaction (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The embodiments of the specification disclose a data processing method, a data processing apparatus, an electronic device, and a computer storage medium. The method comprises the following steps: querying a preset question-answer data set with a received user question to obtain N information sources related to the user question, and inputting the user question and the N information sources into a question-answering model to obtain a target answer to the user question. N is a positive integer greater than or equal to 2; at least two of the N information sources are associated with each other; and the question-answering model is trained on a plurality of user questions, the N information sources corresponding to each user question, and the standard answers corresponding to the user questions.

Description

Data processing method, device, electronic equipment and computer storage medium
Technical Field
The present disclosure relates to the field of communications technologies, and in particular to a data processing method, a data processing apparatus, an electronic device, and a computer storage medium.
Background
Community question-answering systems are mostly built by manually writing answers, by selecting the correct answer from a set of given answers, or by extracting a sentence or passage from a given article as the answer using information-extraction methods. Because all of these approaches generate answers from a single data source, the resulting answers are often poorly related to the question.
Disclosure of Invention
The embodiments of the specification provide a data processing method, a data processing apparatus, an electronic device, and a computer storage medium, which understand the knowledge contained in multiple information sources by combining the associations among those sources and then integrate the learned knowledge into the process of answering user questions. This enables reasoning across multiple information sources, improves the consistency between the generated answers and the user questions, and thereby improves user stickiness and user experience. The technical scheme is as follows:
in a first aspect, embodiments of the present disclosure provide a data processing method, including:
receiving a user question input by a user;
querying a preset question-answer data set based on the user question and N preset information source types to obtain N information sources; N is a positive integer greater than or equal to 2; at least two of the N information sources are associated with each other;
inputting the user question and the N information sources into a question-answering model, and outputting a target answer; the question-answering model is trained on a plurality of user questions, the N information sources corresponding to each user question, and the standard answers corresponding to the user questions.
In one possible implementation, the inputting the user question and the N information sources into a question-answering model and outputting a target answer includes:
inputting the user question and the N information sources;
constructing a heterogeneous graph from the user question and the N information sources according to a preset rule; the heterogeneous graph comprises a user question node and N information source nodes, and characterizes the relationship between the user question and the N information sources;
encoding the text information corresponding to each node in the heterogeneous graph to obtain a vector corresponding to the text information of each node;
updating the vector corresponding to the text information of each node based on the heterogeneous graph to obtain an updated vector corresponding to each node;
and decoding the vector corresponding to the text information of the user question node based on the updated vector corresponding to each node to obtain the target answer.
In one possible implementation, the text information corresponding to each node in the heterogeneous graph includes at least one word;
the encoding the text information corresponding to each node in the heterogeneous graph to obtain a vector corresponding to the text information of each node includes:
encoding each word in the text information corresponding to each node to obtain a vector corresponding to each word;
and average-pooling the vectors corresponding to the words in each node to obtain the vector corresponding to the text information of each node.
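The per-word encoding and average-pooling step can be sketched as follows. This is a minimal NumPy illustration; the patent does not specify the word encoder, so the randomly initialized word embeddings here merely stand in for a real encoder's output:

```python
import numpy as np

def encode_node_text(words, embedding_table, dim=4):
    """Look up (or, here, randomly initialize) a vector for each word of a
    node's text, then average-pool the word vectors into one node vector."""
    rng = np.random.default_rng(0)
    for w in words:
        if w not in embedding_table:
            embedding_table[w] = rng.standard_normal(dim)
    word_vectors = [embedding_table[w] for w in words]
    # Average pooling over the word axis yields the node's text vector.
    return np.mean(word_vectors, axis=0)

table = {}
node_vec = encode_node_text(["reset", "my", "password"], table, dim=4)
```

Average pooling makes the node vector invariant to text length, so nodes with short questions and long articles land in the same vector space.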
In one possible implementation, the updating the vector corresponding to the text information of each node based on the heterogeneous graph to obtain an updated vector corresponding to each node includes:
calculating a first attention score between two adjacent nodes based on the heterogeneous graph and the vector corresponding to the text information of each node; the two adjacent nodes comprise a source node and a target node;
readjusting the first attention score based on the vector corresponding to the text information of the question node and the vector corresponding to the text information of the source node to obtain a second attention score;
and determining the updated vector corresponding to each node based on the second attention score, the vector corresponding to the text information of the source node, the vector corresponding to the text information of the target node, and the edge type between the source node and the target node.
In one possible implementation, the calculating a first attention score between two adjacent nodes based on the heterogeneous graph and the vector corresponding to the text information of each node includes:
projecting the vector corresponding to the text information of each node to obtain a first vector and a second vector corresponding to each node, where the first vector and the second vector corresponding to each node are in one-to-one correspondence;
and calculating a first attention score between every two adjacent nodes based on the heterogeneous graph and the first vector and the second vector corresponding to each node.
In one possible implementation, before the calculating a first attention score between two adjacent nodes based on the heterogeneous graph and the vector corresponding to the text information of each node, the method further includes:
determining a source node and a target node based on the heterogeneous graph; the target node is adjacent to the source node;
and the calculating a first attention score between two adjacent nodes based on the heterogeneous graph and the vector corresponding to the text information of each node includes:
projecting the vector corresponding to the text information of the source node into a first space to obtain a first vector corresponding to the source node, and projecting the vector corresponding to the text information of the target node into a second space to obtain a second vector corresponding to the target node; a mapping relationship exists between the first vector and the second vector;
and determining a first attention score between the source node and the target node based on the first vector corresponding to the source node, the second vector corresponding to the target node, and the edge type between the source node and the target node.
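One way to realize this projection-and-score step is a scaled dot product through an edge-type-specific matrix. The patent fixes only that the two projections and the edge type enter the score, so the exact form below is an assumption:

```python
import numpy as np

def first_attention_score(h_src, h_tgt, W_first, W_second, W_edge):
    """Project the source node's vector into a first space and the target
    node's vector into a second space, then score the pair through an
    edge-type-specific matrix (scaled dot product, an assumed form)."""
    d = h_src.shape[0]
    first_vec = W_first @ h_src      # first vector (source node)
    second_vec = W_second @ h_tgt    # second vector (target node)
    return float(second_vec @ W_edge @ first_vec / np.sqrt(d))

d = 4
score_1 = first_attention_score(np.ones(d), np.ones(d),
                                np.eye(d), np.eye(d), np.eye(d))
```

Keeping a separate `W_edge` per edge type is what lets one attention mechanism treat "article comment to user article" edges differently from "related question to user question" edges.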
In one possible implementation, the readjusting the first attention score based on the vector corresponding to the text information of the question node and the vector corresponding to the text information of the source node to obtain a second attention score includes:
determining a correlation between the source node and the question node based on the vector corresponding to the text information of the question node, the vector corresponding to the text information of the source node, and the type of the edge between the question node and the source node;
and determining a second attention score based on the correlation between the source node and the question node.
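A simple realization of this question-aware readjustment rescales the first score by a gate computed from the source node's relevance to the question. The sigmoid gate is an assumed form; the patent states only that relevance to the question modulates the score:

```python
import numpy as np

def second_attention_score(score_1, h_question, h_src, W_edge):
    """Rescale the first attention score by the source node's relevance to
    the user question (sigmoid gate is an illustrative choice)."""
    relevance = float(h_question @ W_edge @ h_src)
    gate = 1.0 / (1.0 + np.exp(-relevance))   # squash relevance to (0, 1)
    return score_1 * gate

# An irrelevant source (zero correlation) has its score halved by the gate.
score_2 = second_attention_score(2.0, np.zeros(3), np.ones(3), np.eye(3))
```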
In one possible implementation, the target node corresponds to M source nodes, where M is a positive integer;
the determining the updated vector corresponding to each node based on the second attention score, the vector corresponding to the text information of the source node, the vector corresponding to the text information of the target node, and the edge type between the source node and the target node includes:
determining a first message transmitted from the source node to the target node based on the vector corresponding to the text information of the source node and the edge type between the source node and the target node;
determining a second message transmitted from the source node to the target node based on the first message and the second attention score;
performing a weighted summation of the M second messages transmitted to the target node by its M corresponding source nodes to obtain a third message transmitted to the target node by all source nodes;
and performing a residual-connected nonlinear activation and linear projection on the third message and the vector corresponding to the text information of the target node to obtain the updated vector corresponding to each node.
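The message-passing update for one target node can be sketched as below. The softmax weighting, ReLU activation, and identity output projection are illustrative choices; the patent names the steps but not their exact forms:

```python
import numpy as np

def softmax(x):
    e = np.exp(x - np.max(x))
    return e / e.sum()

def update_target(h_tgt, sources):
    """sources: for one target node, a list of (h_src, W_edge, score_2)
    triples over its M source nodes."""
    first_msgs = [W_e @ h for h, W_e, _ in sources]          # first messages
    weights = softmax(np.array([s for _, _, s in sources]))  # from second scores
    # Second messages are the attention-weighted first messages; their sum
    # is the third message carried to the target by all of its sources.
    third_msg = sum(w * m for w, m in zip(weights, first_msgs))
    W_out = np.eye(h_tgt.shape[0])                     # linear projection
    return np.maximum(W_out @ third_msg, 0.0) + h_tgt  # activation + residual

h_tgt = np.zeros(2)
sources = [(np.array([1.0, 0.0]), np.eye(2), 0.0),
           (np.array([0.0, 1.0]), np.eye(2), 0.0)]
updated = update_target(h_tgt, sources)
```

The residual connection keeps each node's original text vector in its updated representation, so aggregation adds question-relevant context rather than overwriting the node.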
In one possible implementation, the decoding the vector corresponding to the text information of the user question node based on the updated vector corresponding to each node to obtain the target answer includes:
decoding the vector corresponding to the text information of the question node and the updated vector corresponding to each node, respectively, to obtain a question decoding vector and a decoding vector corresponding to each updated node;
fusing the question decoding vector with the decoding vector corresponding to each updated node to obtain a target vector;
and querying a preset vocabulary according to the target vector to obtain the target answer.
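The fuse-then-query step can be sketched as follows, using mean fusion and an argmax over vocabulary scores; both choices are assumptions, since the patent does not fix the fusion function or the scoring of the preset vocabulary:

```python
import numpy as np

def generate_answer_token(q_dec, node_decs, vocab_matrix, vocab):
    """Fuse the question decoding vector with the updated node decoding
    vectors (mean fusion, an assumed form) into a target vector, then
    return the preset-vocabulary word whose row scores it highest."""
    target_vec = np.mean([q_dec] + node_decs, axis=0)
    logits = vocab_matrix @ target_vec   # one score per vocabulary word
    return vocab[int(np.argmax(logits))]

vocab = ["yes", "no"]
token = generate_answer_token(np.array([1.0, 0.0]),
                              [np.array([1.0, 0.0])],
                              np.eye(2), vocab)
```

A full answer would be produced by repeating this per decoding step; the single-token call above only illustrates the vocabulary lookup.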
In a second aspect, embodiments of the present disclosure provide a data processing apparatus, including:
a receiving module for receiving a user question input by a user;
a query module for querying a preset question-answer data set based on the user question and N preset information source types to obtain N information sources; N is a positive integer greater than or equal to 2; at least two of the N information sources are associated with each other;
and a question-answering module for inputting the user question and the N information sources into a question-answering model and outputting a target answer; the question-answering model is trained on a plurality of user questions, the N information sources corresponding to each user question, and the standard answers corresponding to the user questions.
In one possible implementation, the question-answering module includes:
an input unit for inputting the user question and the N information sources;
a construction unit for constructing a heterogeneous graph from the user question and the N information sources according to a preset rule; the heterogeneous graph comprises a user question node and N information source nodes, and characterizes the relationship between the user question and the N information sources;
an encoding unit for encoding the text information corresponding to each node in the heterogeneous graph to obtain a vector corresponding to the text information of each node;
an updating unit for updating the vector corresponding to the text information of each node based on the heterogeneous graph to obtain an updated vector corresponding to each node;
and a decoding unit for decoding the vector corresponding to the text information of the user question node based on the updated vector corresponding to each node to obtain a target answer.
In one possible implementation, the text information corresponding to each node in the heterogeneous graph includes at least one word;
the encoding unit includes:
an encoding subunit for encoding each word in the text information corresponding to each node to obtain a vector corresponding to each word;
and an average-pooling subunit for average-pooling the vectors corresponding to the words in each node to obtain the vector corresponding to the text information of each node.
In one possible implementation, the updating unit includes:
a calculating subunit configured to calculate a first attention score between two adjacent nodes based on the heterogeneous graph and the vector corresponding to the text information of each node; the two adjacent nodes comprise a source node and a target node;
an adjusting subunit configured to readjust the first attention score based on the vector corresponding to the text information of the question node and the vector corresponding to the text information of the source node to obtain a second attention score;
and a first determining subunit configured to determine the updated vector corresponding to each node based on the second attention score, the vector corresponding to the text information of the source node, the vector corresponding to the text information of the target node, and the edge type between the source node and the target node.
In one possible implementation, the calculating subunit is specifically configured to:
project the vector corresponding to the text information of each node to obtain a first vector and a second vector corresponding to each node, where the first vector and the second vector corresponding to each node are in one-to-one correspondence;
and calculate a first attention score between every two adjacent nodes based on the heterogeneous graph and the first vector and the second vector corresponding to each node.
In one possible implementation, the updating unit further includes:
a second determining subunit configured to determine a source node and a target node based on the heterogeneous graph; the target node is adjacent to the source node;
and the calculating subunit is specifically configured to:
project the vector corresponding to the text information of the source node into a first space to obtain a first vector corresponding to the source node, and project the vector corresponding to the text information of the target node into a second space to obtain a second vector corresponding to the target node; a mapping relationship exists between the first vector and the second vector;
and determine a first attention score between the source node and the target node based on the first vector corresponding to the source node, the second vector corresponding to the target node, and the edge type between the source node and the target node.
In one possible implementation, the adjusting subunit is specifically configured to:
determine a correlation between the source node and the question node based on the vector corresponding to the text information of the question node, the vector corresponding to the text information of the source node, and the type of the edge between the question node and the source node;
and determine a second attention score based on the correlation between the source node and the question node.
In one possible implementation, the target node corresponds to M source nodes, where M is a positive integer;
the first determining subunit is specifically configured to:
determine a first message transmitted from the source node to the target node based on the vector corresponding to the text information of the source node and the edge type between the source node and the target node;
determine a second message transmitted from the source node to the target node based on the first message and the second attention score;
perform a weighted summation of the M second messages transmitted to the target node by its M corresponding source nodes to obtain a third message transmitted to the target node by all source nodes;
and perform a residual-connected nonlinear activation and linear projection on the third message and the vector corresponding to the text information of the target node to obtain the updated vector corresponding to each node.
In one possible implementation, the decoding unit includes:
a decoding subunit configured to decode the vector corresponding to the text information of the question node and the updated vector corresponding to each node, respectively, to obtain a question decoding vector and a decoding vector corresponding to each updated node;
a fusion subunit configured to fuse the question decoding vector with the decoding vector corresponding to each updated node to obtain a target vector;
and a query subunit configured to query a preset vocabulary according to the target vector to obtain a target answer.
In a third aspect, embodiments of the present disclosure provide an electronic device, including: a processor and a memory;
the processor is connected with the memory;
The memory is used for storing executable program codes;
the processor reads the executable program code stored in the memory and runs the corresponding program, so as to perform the method provided by the first aspect of the embodiments of the present specification or by any possible implementation of the first aspect.
In a fourth aspect, embodiments of the present specification provide a computer storage medium having stored thereon a plurality of instructions adapted to be loaded by a processor and to carry out the method provided by the first aspect of embodiments of the present specification or any one of the possible implementations of the first aspect.
According to the embodiments of the specification, the received user question and the N information sources related to it, obtained by querying the preset question-answer data set, are input into a question-answering model to obtain a target answer to the user question. N is a positive integer greater than or equal to 2; at least two of the N information sources are associated with each other; and the question-answering model is trained on a plurality of user questions, the N information sources corresponding to each user question, and the standard answers corresponding to the user questions. In other words, in the embodiments of the specification, the information contained in multiple information sources is understood by combining the associations among those sources, and the learned information is then integrated into the process of answering the user question. This enables reasoning across multiple heterogeneous information sources, improves the consistency between the generated target answer and the user question, and thereby improves the accuracy of the answers to user questions.
Drawings
To more clearly illustrate the technical solutions in the embodiments of the present specification, the drawings required by the embodiments are briefly described below. It is apparent that the drawings in the following description are only some embodiments of the present specification, and that other drawings may be obtained from them by a person skilled in the art without inventive effort.
FIG. 1 is a schematic diagram of a data processing system according to an exemplary embodiment of the present disclosure;
FIG. 2 is a schematic diagram of a question-answering model according to an exemplary embodiment of the present disclosure;
FIG. 3 is a schematic diagram of a heterogeneous graph provided in an exemplary embodiment of the present disclosure;
FIG. 4 is a flow chart of a data processing method according to an exemplary embodiment of the present disclosure;
FIG. 5A is a schematic diagram of a terminal interface according to an exemplary embodiment of the present disclosure;
FIG. 5B is a schematic diagram of information sources queried according to a user question, provided in an exemplary embodiment of the present disclosure;
FIG. 5C is a schematic diagram of another terminal interface provided in an exemplary embodiment of the present disclosure;
FIG. 6 is a schematic diagram of a question-answering model implementation flow according to an exemplary embodiment of the present disclosure;
FIG. 7 is a schematic diagram of another heterogeneous graph provided in an exemplary embodiment of the present disclosure;
FIG. 8 is a flow chart illustrating a decoding implementation process according to an exemplary embodiment of the present disclosure;
FIG. 9 is a schematic diagram of an implementation flow of updating a heterogeneous graph according to an exemplary embodiment of the present disclosure;
FIG. 10 is a flowchart of an embodiment of updating a heterogeneous graph according to an exemplary embodiment of the present disclosure;
Fig. 11 is a schematic diagram of a data processing apparatus according to an exemplary embodiment of the present disclosure;
fig. 12 is a schematic structural diagram of an electronic device according to an exemplary embodiment of the present disclosure.
Detailed Description
The technical solutions in the embodiments of the present specification will be clearly and completely described below with reference to the drawings in the embodiments of the present specification.
The terms "first", "second", "third", and the like in the description, in the claims, and in the above drawings are used for distinguishing between different objects and not necessarily for describing a particular sequential or chronological order. Furthermore, the terms "comprise" and "have", as well as any variations thereof, are intended to cover a non-exclusive inclusion. For example, a process, method, system, article, or apparatus that comprises a list of steps or elements is not limited to only those listed steps or elements, but may include other steps or elements not listed or inherent to such process, method, article, or apparatus.
With reference to FIG. 1, FIG. 1 is a schematic diagram illustrating an architecture of a data processing system according to an exemplary embodiment of the present disclosure. As shown in fig. 1, a data processing system may include: a terminal cluster and a server 120. Wherein:
The terminal cluster includes one or more user terminals, where the plurality of user terminals may include a user terminal 110a, a user terminal 110b, a user terminal 110c, and so on. User-side software may be installed on the terminals to implement functions such as entering questions online and viewing the target answers corresponding to those questions. Any user terminal in the terminal cluster can connect to the network and establish a data connection with the server 120 through the network, for example to send a user question or receive a target answer. Any user terminal in the terminal cluster may be, but is not limited to, a mobile phone, a tablet computer, a notebook computer, or another device on which the user-side software is installed.
The server 120 may be a server capable of providing various kinds of data processing. It may receive data such as a user question sent by any user terminal in the terminal cluster over the network, query a preset question-answer data set based on the user question and N preset information source types to obtain N information sources, and then input the user question and the N information sources into a question-answering model to obtain the target answer corresponding to the user question. The server 120 may also send data such as the target answer to any user terminal over the network. The server 120 may be, but is not limited to, a hardware server, a virtual server, a cloud server, and so on.
The network may be a medium that provides a communication link between the server 120 and any user terminal in the terminal cluster, or may be the Internet including network devices and transmission media, which is not limited here. The transmission medium may be a wired link (for example, but not limited to, coaxial cable, optical fiber, or a digital subscriber line (DSL)) or a wireless link (for example, but not limited to, wireless fidelity (WiFi), Bluetooth, or a mobile device network).
It will be appreciated that the numbers of user terminals and servers 120 in the data processing system shown in FIG. 1 are by way of example only; in a particular implementation, the data processing system may include any number of user terminals and servers, which the embodiments of this specification do not limit. For example, but not by way of limitation, the server 120 may be a server cluster composed of multiple servers.
Next, the question-answering model provided in the embodiments of this specification is described with reference to fig. 1. Referring specifically to fig. 2, a schematic structural diagram of a question-answering model according to an exemplary embodiment of the present disclosure is shown. As shown in fig. 2, the question-answering model includes: an input module 210, a mapping module 220, an encoding module 230, a question-aware graph transformer 240, an answer generator 250, and an output module 260. Wherein:
The input module 210 may be a network interface, and is specifically configured to input text information such as the user question received by the server 120 over the network, together with the N types of information sources queried from the preset question-answer data set according to the user question and the N preset information source types. N is a positive integer greater than or equal to 2, and at least two of the N information sources are associated with each other. The preset information source types include user articles, article comments, answers to related questions, and the like, which this specification does not limit. The input module 210 may also receive voice information sent from a user terminal via the server 120 over the network, and convert the voice information into text information such as the corresponding user question.
The mapping module 220 is configured to construct a heterogeneous graph according to a preset rule from the user question input by the input module 210 and the N information sources related to the user question. The heterogeneous graph comprises a user question node and N information source nodes, and characterizes the relationship between the user question and the N information sources.
Illustratively, N=4, and the 4 information sources input by the input module 210 are a user article, article comments, a related question, and the answer to the related question. The article comments are the comments corresponding to the user article, i.e. the article comments are associated with the user article; likewise, the answer to the related question corresponds to, and is therefore associated with, the related question. As shown in fig. 3, the mapping module 220 may take the 4 information sources and the user question as nodes, and connect the user question node 310 with the user article node 320 by a preset edge type α1, where α1 is "user article to user question"; connect the user question node 310 with the related question node 330 by a preset edge type α2, where α2 is "related question to user question"; connect the article comment node 340 with the user article node 320 by a preset edge type α3, where α3 is "article comment to user article"; and connect the related question's answer node 350 with the related question node 330 by a preset edge type α4, where α4 is "answer to related question". After the mapping module 220 connects the 5 nodes by the 4 preset edge types, the heterogeneous graph shown in fig. 3 is obtained, from which the relationships between the user question and the 4 information sources can be seen intuitively.
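As a minimal sketch, the heterogeneous graph of this example can be represented as plain node and typed-edge lists (all identifiers below are illustrative, not from the patent):

```python
# The question node plus N = 4 information-source nodes, connected by the
# four preset edge types from the example above.
nodes = ["question", "article", "related_q", "comment", "related_a"]
edges = [("article", "question", "alpha1"),     # user article -> user question
         ("related_q", "question", "alpha2"),   # related question -> user question
         ("comment", "article", "alpha3"),      # article comment -> user article
         ("related_a", "related_q", "alpha4")]  # answer -> related question

# Index incoming edges per target node: the direction along which messages
# will later be passed toward each node.
incoming = {}
for src, tgt, edge_type in edges:
    incoming.setdefault(tgt, []).append((src, edge_type))
```

Note that the edges form a tree rooted at the question node: every information source can reach the user question either directly or through another source, which is what later allows question-relevant information to be aggregated across sources.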
The encoding module 230 may include at least one node encoder, configured to encode the text information corresponding to each node in the heterogeneous graph constructed by the mapping module 220, so as to obtain a vector corresponding to the text information of each node.
A question-aware graph transformer (QGT) 240 is configured to update the vector corresponding to the text information of each node in the heterogeneous graph according to the correlations among the nodes, so as to obtain an updated vector corresponding to each node. By passing messages between the heterogeneous nodes, the question-aware graph transformer 240 aggregates information related to the user question from the different types of information source nodes, so that each updated node aggregates all of the question-related information that can be transferred to it.
The answer generator 250 includes a masked output embedding layer (Output-Embedding), a self-attention layer (Self-Attention), a question attention layer (Question-Attention), a graph attention layer (Graph-Attention), and a feed-forward layer (Feed-Forward), and is configured to decode the vector corresponding to the text information of the user question node based on the updated vector corresponding to each node, thereby generating the target answer.
An output module 260 is configured to output the target answer generated by the answer generator 250.
Next, a data processing method provided in the embodiments of the present specification will be described with reference to fig. 1 to 3. Referring to fig. 4, a flow chart of a data processing method according to an exemplary embodiment of the present disclosure is shown. As shown in fig. 4, the data processing method includes the following steps:
Step 402, receiving a user question input by a user.
Specifically, after the user inputs a user question in client software installed on a terminal, the terminal transmits the user question to a server through a network, so that the server receives, through the network, the user question input in the client software.
Illustratively, as shown in fig. 5A, the server can receive, via the network, the user question 510 "How do I repay a loan borrowed from JD Finance?".
Optionally, when the user inputs speech in the client software installed on the terminal, the terminal may convert the speech into text information such as a user question through a speech conversion device and then send the text information to the server through the network, so that the server receives, through the network, the text information of the user question corresponding to the speech input by the user in the client software.
Optionally, the terminal may also send the speech directly to the server through the network; after receiving the speech, the server converts it into text information such as a user question through the input module 210 in the question-answering model.
Step 404, querying N information sources from a preset question-answer dataset based on the user question and N preset information source types.
Specifically, the N information sources related to the user question may be queried, using the user question, from a preset question-answer dataset (database) that has been constructed according to the N preset information source types. A preset retrieval algorithm is used to retrieve, from the preset question-answer dataset and according to the N preset information source types, N information sources whose relevance to the user question is greater than a preset threshold. N is a positive integer greater than or equal to 2. The preset information source types may include user articles, article comments, related questions, answers to related questions, and the like, which is not limited in this specification. At least two of the N information sources are associated; two associated information sources can pass messages to each other, for example a user article and the article comments corresponding to it, or a related question and the answers corresponding to it, which is not limited in this specification. The preset question-answer dataset includes the question-answer dataset MS-MARCO, the question-answer dataset AntQA, etc., which is not limited in this specification. The preset retrieval algorithm includes the BM25 algorithm, the edit distance algorithm, and the like, which is not limited in this specification. Each of the N information sources includes at least one piece of information, and the relevance of each piece of information in each information source to the user question is greater than the preset threshold. The preset threshold may be 0.99, 0.90, 0.80, 0.60, etc., which is not limited in this specification.
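As a rough sketch of this retrieval step (the whitespace tokenizer, the corpus layout, and the 0.5 threshold are illustrative assumptions, not the implementation in this specification), a minimal BM25 scorer and threshold filter might look like:

```python
import math
from collections import Counter

def bm25_scores(query_tokens, docs_tokens, k1=1.5, b=0.75):
    """Score each tokenized document against the query with Okapi BM25."""
    n = len(docs_tokens)
    avgdl = sum(len(d) for d in docs_tokens) / n
    # document frequency of each query term
    df = {t: sum(1 for d in docs_tokens if t in d) for t in set(query_tokens)}
    scores = []
    for d in docs_tokens:
        tf = Counter(d)
        score = 0.0
        for t in query_tokens:
            if tf[t] == 0:
                continue
            idf = math.log(1 + (n - df[t] + 0.5) / (df[t] + 0.5))
            norm = tf[t] + k1 * (1 - b + b * len(d) / avgdl)
            score += idf * tf[t] * (k1 + 1) / norm
        scores.append(score)
    return scores

def retrieve(question, corpus, threshold=0.5):
    """Keep only pieces of information whose relevance exceeds the threshold."""
    q = question.lower().split()
    docs = [doc.lower().split() for doc in corpus]
    return [doc for doc, s in zip(corpus, bm25_scores(q, docs)) if s > threshold]
```

In practice the same `retrieve` call would be run once per preset information source type, yielding the N groups of information sources.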
Illustratively, given the user question 510 shown in fig. 5A and the four preset information source types of user article, article comment, related question, and answer to a related question, the BM25 algorithm may be used to retrieve, from the already constructed question-answer datasets MS-MARCO and AntQA, six pieces of information whose relevance to the user question is greater than a preset threshold of 0.5, as shown in fig. 5B: the user article 521, "Starting March 26, JD Finance credit card repayments will charge a 0.1% commission on the portion above 2,000 yuan; the news has been boiling on every major platform since yesterday. In fact, after other applications began charging, people were already psychologically prepared for this."; the article comment 522 corresponding to the user article 521, "Since the other applications started charging, I have been repaying through JD Finance"; the related question 523, "How do I repay a JD Finance Baitiao bill?"; the answer 524 to that related question, "Enter the JD Finance Baitiao page to see the borrowed and repaid amounts, click repayment to see the repayment amount, and deposit sufficient funds in the JD Finance balance in advance"; the related question 525, "Does repaying a loan early incur a commission?"; and the answer 526 to that related question, "No. Baitiao bills can be repaid early without any fee; you can operate directly on the Baitiao page: log in to JD Finance, tap the lower-right corner, select Baitiao on the home page, and follow the page prompts."
Step 406, inputting the user question and the N information sources into a question-answering model, and outputting a target answer.
Specifically, after the user question input by the user is received and the N information sources related to it are queried from the preset question-answer dataset according to the user question, the user question and the N information sources may be input into a question-answering model, which outputs the target answer corresponding to the user question. The question-answering model is trained based on a plurality of user questions, the N information sources corresponding to each user question, and the standard answers corresponding to the user questions. After the question-answering model outputs the target answer, the server can send the target answer to the terminal, so as to answer the user who input the question in the client software installed on the terminal.
Illustratively, the server may input the user question 510 shown in fig. 5A and the 4 information sources shown in fig. 5B into the question-answering model, and the model may output the target answer corresponding to the user question 510 "How do I repay a loan borrowed from JD Finance?", namely "You can repay within JD Finance". As shown in fig. 5C, the server then feeds the target answer output by the question-answering model back to the terminal, i.e. the client software QA in fig. 5C outputs the target answer 530.
Optionally, after the question-answering model outputs the target answer corresponding to the user question, the cross-entropy loss of the question-answering process is also computed, and the parameters of the question-answering model are further optimized with minimizing this cross-entropy loss as the training objective, thereby further improving the accuracy with which the model answers user questions.
Next, the specific process by which the question-answering model in step 406 of the data processing method provided in the embodiments of the present disclosure outputs the target answer from the input user question and N information sources is described with reference to fig. 2 to 3. Referring to fig. 6, a flow chart of an implementation of a question-answering model according to an exemplary embodiment of the present disclosure is shown. As shown in fig. 6, the implementation flow of the question-answering model includes the following steps:
Step 602, inputting the user question and the N information sources.
Specifically, the text information of the user question received by the server through the network, together with the N information sources queried from the preset question-answer dataset according to the user question and the N preset information source types, may be used as the input of the question-answering model. N is a positive integer greater than or equal to 2, and at least two of the N information sources are associated. The preset information source types include user articles, article comments, related questions, answers to related questions, and the like, which is not limited in this specification.
Step 604, constructing a heterogeneous graph according to a preset rule based on the user question and the N information sources.
Specifically, the input user question and each piece of information in the N information sources related to it may be used as nodes, and the nodes may be connected according to preset edge types, thereby completing the construction of the heterogeneous graph. The heterogeneous graph comprises a user question node and N information source nodes, and characterizes the relationships between the user question and the N information sources. A preset edge type represents the connection relationship between two associated nodes in the heterogeneous graph, and may include user article to user question, article comment to user article, related question to user question, answer to related question, and so on, which is not limited in this specification.
Illustratively, N=4, and the 4 input information sources are user articles, article comments, related questions, and answers to the related questions. The input user articles include a user article A and a user article B; the article comments a1 and a2 are the comments corresponding to the user article A; the input related questions include a related question C and a related question D; and the answers c1 and c2 are the answers corresponding to the related question C. As shown in fig. 7, the mapping module 220 in the question-answering model may take the 4 information sources and the input user question as nodes, and connect the user question node 710 with the user article A node 720 and the user article B node 730 through the preset edge type α1 (user article to user question); connect the user question node 710 with the related question C node 740 and the related question D node 750 through the preset edge type α2 (related question to user question); connect the article comment a1 node 760 and the article comment a2 node 770 with the user article A node 720 through the preset edge type α3 (article comment to user article); and connect the answer c1 node 780 and the answer c2 node 790 with the related question C node 740 through the preset edge type α4 (answer to related question). After the mapping module 220 connects these 9 nodes through the 4 preset edge types, the heterogeneous graph shown in fig. 7 is obtained, from which the relationships between the user question and the 4 information sources can be read directly.
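The construction described above can be sketched as follows (the node identifiers, the dict-based source layout, and the edge-type labels a1 to a4 are illustrative assumptions):

```python
def build_hetero_graph(question, sources):
    """Build a heterogeneous graph from a user question and N information sources.

    sources maps a node type to a list of (node_id, text, parent_id) triples;
    parent_id links an article comment to its user article, or an answer to
    its related question, and is None for articles and related questions,
    which connect directly to the user question node.
    """
    nodes = {"q": ("user_question", question)}
    edges = []  # (source_node_id, edge_type, target_node_id)
    edge_type_of = {
        "user_article": "a1",      # user article     -> user question
        "related_question": "a2",  # related question -> user question
        "article_comment": "a3",   # article comment  -> its user article
        "related_answer": "a4",    # answer           -> its related question
    }
    for ntype, items in sources.items():
        for node_id, text, parent in items:
            nodes[node_id] = (ntype, text)
            target = "q" if parent is None else parent
            edges.append((node_id, edge_type_of[ntype], target))
    return nodes, edges
```

Running this on the fig. 7 example (two articles, two comments on article A, two related questions, two answers to question C) yields 9 nodes and 8 directed edges.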
Step 606, encoding the text information corresponding to each node in the heterogeneous graph to obtain a vector corresponding to the text information of each node.
Specifically, the text information corresponding to each node in the heterogeneous graph includes at least one word. Each word in the text information of each node may be encoded by the encoding module 230 in the question-answering model to obtain a vector corresponding to that word. The vectors corresponding to the words in each node of the heterogeneous graph are then average-pooled, yielding the vector corresponding to the text information of that node.
Illustratively, let a node in the heterogeneous graph be u = {w_1, w_2, ..., w_|u|}, where |u| denotes the number of words included in node u and w_j denotes the j-th word of the node. Each word w_j may be encoded with a pretrained BART encoder to obtain the vector x_j corresponding to that word. The vectors x_1, ..., x_|u| of the words in the node are then average-pooled, h_u = (1/|u|) Σ_j x_j, and the resulting vector h_u corresponding to the text information of node u is taken as the initial node representation of u.
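Numerically, the average-pooling step amounts to the following (the random lookup table is a stand-in for a pretrained BART encoder, and the 8-dimensional vectors are an illustrative assumption):

```python
import numpy as np

rng = np.random.default_rng(0)
_embeddings = {}  # stand-in lookup table; a real system would run a BART encoder

def encode_word(word, dim=8):
    """Return a fixed vector per word, mimicking per-token encoder output x_j."""
    if word not in _embeddings:
        _embeddings[word] = rng.normal(size=dim)
    return _embeddings[word]

def init_node_representation(words, dim=8):
    """Encode each word of a node, then average-pool into the node vector h_u."""
    vectors = np.stack([encode_word(w, dim) for w in words])
    return vectors.mean(axis=0)
```

The returned vector serves as the initial node representation that the question-aware graph transformer later updates.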
Step 608, updating the vector corresponding to the text information of each node based on the heterogeneous graph, obtaining an updated vector corresponding to each node.
Specifically, the vector corresponding to the text information of each node in the heterogeneous graph, i.e. the initial node representation, may be updated by the question-aware graph transformer 240 in the question-answering model, so as to obtain the updated vector corresponding to each node. Each updated node aggregates all of the question-related information that can be transferred to it.
Step 610, decoding the vector corresponding to the text information of the user question node based on the updated vector corresponding to each node to obtain the target answer.
Specifically, the answer generator 250 shown in fig. 2 may decode the vector corresponding to the text information of the user question node using the updated vector corresponding to each node, thereby generating the target answer corresponding to the user question.
Specifically, the answer generator 250 includes a masked output embedding layer (Output-Embedding), a self-attention layer (Self-Attention), a question attention layer (Question-Attention), a graph attention layer (Graph-Attention), and a feed-forward layer (Feed-Forward). As shown in fig. 8, the process of decoding the vector corresponding to the text information of the user question node includes the following steps:
Step 802, decoding the vector corresponding to the text information of the question node and the updated vector corresponding to each node, respectively, to obtain a question decoding vector and an updated decoding vector corresponding to each node.
Specifically, self-attention is first applied to the masked output embeddings, producing the decoder state of each decoding step. The vector corresponding to the text information of the question node q is then input into the question attention layer (Question-Attention) to obtain the question decoding vector. Next, useful knowledge is aggregated from the nodes of the updated heterogeneous graph according to the state of each decoding step: the updated vectors corresponding to the T nodes are input into the graph attention layer (Graph-Attention) to obtain the updated decoding vector corresponding to each node.
Step 804, fusing the question decoding vector with the updated decoding vector corresponding to each node to obtain a target vector.
Specifically, the information included in the user question and the updated information included in each node can be dynamically fused through the feed-forward layer (Feed-Forward); that is, the question decoding vector and the updated decoding vector corresponding to each node are fused to obtain the target vector. Concretely, a sigmoid function is applied to the fused user question and N information sources to obtain, for each updated node in the heterogeneous graph, the probability that its decoding vector appears in the target answer; a weighted sum is then taken over these probabilities, the updated decoding vectors, and the question decoding vector, thereby obtaining the target vector.
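One plausible form of this gated fusion is the following sketch (the gate parameterization, the renormalized weighted sum, and the final mixing rule are assumptions for illustration, not the exact formula of this specification):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def fuse_target_vector(question_dec, node_decs, w_gate):
    """Fuse the question decoding vector with the updated node decoding vectors.

    question_dec: (d,) question decoding vector for the current decoding step
    node_decs:    (T, d) updated decoding vectors of the T graph nodes
    w_gate:       (2 * d,) trainable gate parameters
    """
    # sigmoid gate: probability that each node's content appears in the answer
    gates = sigmoid(np.array([w_gate @ np.concatenate([question_dec, nd])
                              for nd in node_decs]))
    # weighted sum of node decoding vectors, normalized by the gate mass
    graph_part = (gates[:, None] * node_decs).sum(axis=0) / gates.sum()
    lam = gates.mean()  # overall weight given to graph knowledge
    return lam * graph_part + (1.0 - lam) * question_dec
```

The returned target vector is what step 806 then matches against the preset vocabulary.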
Step 806, querying a preset vocabulary according to the target vector to obtain the target answer.
Specifically, the answer generator 250 may query a plurality of words corresponding to the target vector from the preset vocabulary according to the target vector, and arrange those words in the order of their corresponding positions in the target vector, so as to obtain the target answer.
As shown in fig. 9, the specific process by which the question-aware graph transformer 240 updates the vector corresponding to the text information of each node in step 608 includes the following steps:
Step 902, calculating a first attention score between every two adjacent nodes based on the heterogeneous graph and the vector corresponding to the text information of each node.
Specifically, the two adjacent nodes comprise a source node and a target node connected through a preset edge type, so that the source node can pass a message to the target node. The question-aware graph transformer 240 may calculate the correlation between every two adjacent nodes, i.e. the first attention score, according to the connection relationships between adjacent nodes in the heterogeneous graph and the vector corresponding to the text information of each node.
Optionally, the question-aware graph transformer 240 may first project the vector corresponding to the text information of each node in the heterogeneous graph into a first space and a second space, obtaining a first vector and a second vector corresponding to each node, and then calculate the first attention score between every two adjacent nodes according to the heterogeneous graph and these first and second vectors. Concretely, the second vector of the target node and the first vector of each source node are combined through a parameter matrix specific to the preset edge type to compute the correlation between the target node and that source node; each raw attention score is then mapped to a positive number through the exponential in a normalized exponential (softmax) function and divided by the sum of all mapped scores, yielding the first attention score between the target node and each source node. The first attention score is a number greater than or equal to 0 and less than or equal to 1. The first space may be a key space and the second space a value space, which is not limited in this specification. The first vectors of the nodes correspond one-to-one with the second vectors: the first vector serves as an index of an element, and the second vector represents the specific content corresponding to the first vector. The first vector may be a key vector and the second vector a value vector, which is not limited in this specification.
Optionally, before calculating the first attention score between two adjacent nodes, the source node and the target node are determined from the heterogeneous graph: for an edge connecting two adjacent nodes, the node at the start of the arrow is the source node and the node the arrow points to is the target node. The vector corresponding to the text information of the source node is then projected into the first space to obtain the first vector of the source node, and the vector corresponding to the text information of the target node is projected into the second space to obtain the second vector of the target node. Finally, the first attention score between the source node and the target node is determined according to the first vector of the source node, the second vector of the target node, and the edge type between them. That is, if the source node s corresponds to the first vector k_s (the projection of its initial representation h_s) and the target node t corresponds to the second vector v_t (the projection of its initial representation h_t), the first attention score between s and t may be computed as

att(s, t) = softmax over s ∈ N(t) of ( k_s · W_e^att · v_t )

where h_s, the vector corresponding to the text information of the source node s, is its initial node representation, and h_t that of the target node t; e denotes the edge type between s and t; W_e^att is a parameter matrix specific to the edge type e; and N(t) is the set of neighbor nodes of the target node t, i.e. all source nodes corresponding to t. The first attention score is a number greater than or equal to 0 and less than or equal to 1. The first space may be a key space and the second space a value space; there is a one-to-one mapping between the first and second vectors, the first vector serving as an index of an element and the second vector as the specific content corresponding to it; the first vector may be a key vector and the second vector a value vector, none of which is limited in this specification. The calculation of the first attention score between two adjacent nodes is not limited to the above; other calculation methods are also possible in specific implementations, which is not limited in the embodiments of the present specification.
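A minimal numeric sketch of this edge-type-specific attention follows (the separate key/value projection matrices, their shapes, and the order of combination are illustrative assumptions):

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def first_attention_scores(h_sources, h_target, W_key, W_val, W_edge):
    """Attention of a target node over its source neighbours for one edge type.

    h_sources: (M, d) initial vectors of the M source nodes
    h_target:  (d,)   initial vector of the target node
    W_key, W_val: (d, d) projections into the first (key) and second (value) space
    W_edge:    (d, d) parameter matrix specific to the edge type e
    """
    keys = h_sources @ W_key.T        # first vectors, one per source node
    value = W_val @ h_target          # second vector of the target node
    logits = keys @ (W_edge @ value)  # edge-type-specific combination
    return softmax(logits)            # scores in [0, 1] that sum to 1
```

The softmax guarantees the property stated above: each first attention score lies between 0 and 1, and the scores over all source nodes of a target sum to 1.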
Step 904, readjusting the first attention score based on the vector corresponding to the text information of the question node and the vector corresponding to the text information of the source node, obtaining a second attention score.
Specifically, since the purpose of passing messages between nodes is to extract from the heterogeneous information sources knowledge useful for answering the user question, only question-related knowledge should be preserved during message passing. The first attention score between the source node and the target node therefore needs to be re-weighted using the relationship score, i.e. the relevance, between the source node and the user question node. This semantic relevance can be determined from the vector corresponding to the text information of the user question node, the vector corresponding to the text information of the source node, and the edge type between them. The semantic relevance between the source node s and the user question node q may be computed through a bilinear layer as rel(s, q) = h_q · W_rel · h_s, where W_rel is a trainable parameter and h_q, the vector corresponding to the text information of the user question node q, is its initial node representation. After computing the semantic relevance rel(s, q) between the source node and the user question node, the question-aware graph transformer 240 combines it with the first attention score att(s, t) to determine the question-aware second attention score between the source node and the target node.
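This re-weighting step might be sketched as follows (the bilinear form follows the text above; passing the relevance through a sigmoid and renormalizing over the source nodes is an assumption made to keep the second scores in [0, 1]):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def second_attention_scores(first_scores, h_sources, h_question, W_rel):
    """Re-weight neighbour attention by each source's relevance to the question.

    first_scores: (M,) first attention scores of the target over its M sources
    h_sources:    (M, d) initial source-node vectors
    h_question:   (d,) initial vector of the user question node
    W_rel:        (d, d) trainable bilinear parameters
    """
    # bilinear semantic relevance rel(s, q) = h_q . W_rel . h_s, squashed to (0, 1)
    rel = sigmoid(np.array([h_question @ W_rel @ h_s for h_s in h_sources]))
    reweighted = first_scores * rel
    return reweighted / reweighted.sum()  # renormalize over the source nodes
```

Source nodes that are barely relevant to the question receive small gates, so their share of the target node's attention shrinks accordingly.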
Step 906, determining the updated vector corresponding to each node based on the second attention score, the vector corresponding to the text information of the source node, the vector corresponding to the text information of the target node, and the edge type between the source node and the target node.
Specifically, the message-passing behavior should differ across edge types. For example, the message passed from an answer node to its related question node should be the content, related to both that related question and the user question, extracted from the answer according to the user question; therefore, the message that a source node passes, relevant both to the target node and to the user question node, needs to be computed per edge type. The target node corresponds to M source nodes, where M is a positive integer.
As shown in fig. 10, the process of determining the updated vector corresponding to each node according to the second attention score, the vector corresponding to the text information of the source node, the vector corresponding to the text information of the target node, and the edge type between the source node and the target node includes the following steps:
Step 1002, determining the first message delivered from the source node to the target node based on the vector corresponding to the text information of the source node and the edge type between the source node and the target node.
Specifically, messages delivered from different source nodes to their target node should differ, so a useful message must be extracted from the source node according to the edge type, i.e. the first message the source node passes to the target node. The first message transmitted from the source node s to the target node t may be obtained as the output of a multi-layer perceptron, M(s, t) = W_e^msg · h_s, where W_e^msg is a parameter matrix specific to the edge type e and M(s, t) is the message, related to the target node t, that the source node s delivers to t through the edge e.
Step 1004, determining the second message delivered by the source node to the target node based on the first message and the second attention score.
Specifically, the first message M(s, t), related to the target node t, that the source node s delivers to t through the edge e and the question-aware second attention score between the source node and the target node may be combined according to a preset formula to obtain the second message, related to both the target node and the user question node, that the source node passes to the target node. The preset formula may be m(s, t) = att'(s, t) · M(s, t), where att'(s, t) denotes the second attention score. The calculation of the second message is not limited to this preset formula; other calculation methods may be used in specific implementations, which is not limited in the embodiments of the present disclosure.
Step 1006, performing a weighted summation over the M second messages delivered by the M source nodes corresponding to the target node, so as to obtain the third message delivered to the target node by all of its source nodes.
Specifically, since one target node corresponds to M source nodes, when the target node is updated, the M second messages delivered by its M source nodes are summed with their weights, obtaining the third message delivered to the target node by all M of its source nodes, i.e. m(t) = Σ over s ∈ N(t) of m(s, t).
Step 1008, performing a non-linear activation and a residually connected linear projection based on the third message and the vector corresponding to the text information of the target node, obtaining the updated vector corresponding to each node.
Specifically, after the third message delivered by all source nodes to the target node is obtained, a linear projection with non-linear activation is applied to this third message (the sum of the messages delivered by all source nodes), and the vector corresponding to the text information of the target node (the information already held by the target node) is connected to the projected vector through a residual connection. The target node thereby fuses the question-related information from its adjacent source nodes, yielding the updated vector corresponding to the target node. The calculation of the updated vector of each node is not limited to the above; other calculation methods are also possible in specific implementations, which is not limited in the embodiments of the present disclosure. When a node in the heterogeneous graph does not correspond to any source node, for example the user article B node 730 or the answer c2 node 790 of the related question in fig. 7, the vector corresponding to the text information of that node, i.e. its initial representation, is used directly as its updated vector, which is equivalent to the node not being updated.
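Putting steps 1002 to 1008 together, one plausible update of a single target node looks like the following (the tanh activation, the projection-after-activation order, and the matrix shapes are assumptions for illustration):

```python
import numpy as np

def update_target_node(h_target, h_sources, second_scores, W_msg, W_out):
    """Update one target node by aggregating question-aware messages.

    h_target:      (d,) initial vector of the target node
    h_sources:     (M, d) initial vectors of its M source nodes
    second_scores: (M,) question-aware second attention scores
    W_msg:         (d, d) edge-type-specific message matrix
    W_out:         (d, d) output projection
    """
    if len(h_sources) == 0:
        # a node with no source nodes keeps its initial representation
        return h_target
    first_msgs = h_sources @ W_msg.T                   # step 1002: per-source messages
    second_msgs = second_scores[:, None] * first_msgs  # step 1004: attention weighting
    third_msg = second_msgs.sum(axis=0)                # step 1006: weighted sum
    # step 1008: non-linear activation, linear projection, residual connection
    return h_target + W_out @ np.tanh(third_msg)
```

The residual connection keeps the node's original content intact while adding the aggregated question-related knowledge, and the empty-source branch reproduces the not-updated case described above.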
According to the embodiments of the present specification, the received user question and the N information sources related to it, queried from the preset question-answer dataset, are input into the question-answering model provided herein, so as to obtain the target answer to the user question. N is a positive integer greater than or equal to 2; at least two of the N information sources are associated; and the question-answering model is trained based on a plurality of user questions, the N information sources corresponding to each user question, and the standard answers corresponding to the user questions. In other words, the embodiments of the present specification understand and merge the information contained in the various heterogeneous information sources by exploiting the associations among them, and fuse that information into the process of answering the user question, thereby realizing reasoning across the heterogeneous information sources, improving the consistency of the generated target answer with the user question, and improving the accuracy of answering user questions.
Referring to fig. 11, fig. 11 is a schematic diagram of a data processing apparatus according to an exemplary embodiment of the present disclosure. The data processing apparatus 1100 includes:
A receiving module 1110, configured to receive a user question input by a user;
The query module 1120 is configured to query a preset question-answer dataset based on the user question and N preset information source types to obtain N information sources; N is a positive integer greater than or equal to 2; at least two of the N information sources are associated;
The question and answer module 1130 is configured to input the user question and the N information sources into a question and answer model, and output a target answer; the question-answering model is trained based on a plurality of user questions, N information sources corresponding to the user questions, and a plurality of standard answers corresponding to the user questions.
In one possible implementation, the question and answer module 1130 includes:
An input unit for inputting the user questions and the N information sources;
The construction unit is configured to construct a heterogeneous graph according to a preset rule based on the user question and the N information sources; the heterogeneous graph includes a user question node and N information source nodes; the heterogeneous graph characterizes the relationships between the user question and the N information sources;
The coding unit is configured to encode the text information corresponding to each node in the heterogeneous graph to obtain a vector corresponding to the text information of each node;
The updating unit is configured to update the vector corresponding to the text information of each node based on the heterogeneous graph to obtain an updated vector corresponding to each node;
and the decoding unit is used for decoding the vector corresponding to the text information of the user question node based on the updated vector corresponding to each node to obtain a target answer.
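The construction, encoding, updating, and decoding units described above all operate on the heterogeneous graph. As an illustrative sketch only (the function name, node dictionary layout, and edge-type strings below are hypothetical, not taken from the specification), the construction step could be outlined as:

```python
# Hypothetical sketch of the construction unit: one question node plus one
# node per information source, with typed edges recording how each source
# relates to the question and how associated sources relate to each other.
def build_hetero_graph(question, sources, source_links):
    """question: str; sources: list of (source_type, text) pairs;
    source_links: (i, j) index pairs of associated sources."""
    nodes = [{"type": "question", "text": question}]
    for src_type, text in sources:
        nodes.append({"type": src_type, "text": text})
    edges = []  # (source_index, target_index, edge_type)
    for i in range(1, len(nodes)):
        # The edge type encodes the (source-node-type, target-node-type) pair.
        edges.append((i, 0, nodes[i]["type"] + "->question"))
        edges.append((0, i, "question->" + nodes[i]["type"]))
    for i, j in source_links:  # associations among information sources
        edges.append((i + 1, j + 1, sources[i][0] + "->" + sources[j][0]))
    return nodes, edges

nodes, edges = build_hetero_graph(
    "How do I reset my password?",
    [("faq", "Go to account settings to reset."), ("doc", "Password rules.")],
    [(0, 1)],
)
```

The typed edges are what later let the attention and message-passing steps treat different source-to-target relations with different learned parameters.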
In one possible implementation manner, the text information corresponding to each node in the heterogeneous graph includes at least one word;
The above-mentioned coding unit includes:
The coding subunit is used for coding each word in the text information corresponding to each node to obtain a vector corresponding to each word;
and the average pooling subunit is used for carrying out average pooling on the vector corresponding to each word in each node to obtain the vector corresponding to the text information of each node.
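As a minimal sketch of this encode-then-pool step (the random embedding table stands in for a trained word encoder, purely for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
EMBED_DIM = 8
embedding = {}  # stand-in for a learned word-embedding table

def encode_word(word):
    # Each word gets a (cached) vector; a trained model would look this up.
    if word not in embedding:
        embedding[word] = rng.standard_normal(EMBED_DIM)
    return embedding[word]

def node_vector(text):
    # Average-pool the word vectors into one vector for the node's text.
    word_vecs = np.stack([encode_word(w) for w in text.split()])
    return word_vecs.mean(axis=0)

v = node_vector("reset my password")
```

Average pooling keeps the node representation the same size regardless of how many words the node's text contains.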
In one possible implementation manner, the updating unit includes:
A calculating subunit, configured to calculate a first attention score between two adjacent nodes based on the heterogeneous graph and the vector corresponding to the text information of each node; the two adjacent nodes include a source node and a target node;
An adjustment subunit, configured to readjust the first attention score based on the vector corresponding to the text information of the question node and the vector corresponding to the text information of the source node, to obtain a second attention score;
And a first determining subunit, configured to determine a vector corresponding to each updated node based on the second attention score, a vector corresponding to the text information of the source node, a vector corresponding to the text information of the target node, and an edge type between the source node and the target node.
In one possible implementation manner, the computing subunit is specifically configured to:
projecting the vector corresponding to the text information of each node to obtain a first vector and a second vector corresponding to each node; the first vector corresponding to each node corresponds to the second vector one by one;
and calculating a first attention score between every two adjacent nodes based on the heterogeneous graph and the first vector and the second vector corresponding to each node.
In one possible implementation manner, the updating unit further includes:
a second determining subunit, configured to determine a source node and a target node based on the heterogeneous graph; the target node is adjacent to the source node;
The above-mentioned calculation subunit is specifically configured to:
projecting a vector corresponding to the text information of the source node to a first space to obtain a first vector corresponding to the source node, and projecting a vector corresponding to the text information of the target node to a second space to obtain a second vector corresponding to the target node; a mapping relation exists between the first vector and the second vector;
and determining a first attention score between the source node and the target node based on the first vector corresponding to the source node, the second vector corresponding to the target node and the edge type between the source node and the target node.
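A sketch of this two-space projection with an edge-type-dependent score might look as follows; the projection and edge matrices are random stand-ins for trained parameters, and the exact scoring form (bilinear score scaled by the square root of the dimension) is an assumption:

```python
import numpy as np

rng = np.random.default_rng(1)
D = 8
W_first = rng.standard_normal((D, D))   # projects source vectors into the first space
W_second = rng.standard_normal((D, D))  # projects target vectors into the second space
W_edge = {"faq->question": rng.standard_normal((D, D))}  # one matrix per edge type

def first_attention(h_src, h_tgt, edge_type):
    k = W_first @ h_src    # first vector (source node)
    q = W_second @ h_tgt   # second vector (target node)
    # The score depends on both projections and on the edge type between them.
    return float(q @ W_edge[edge_type] @ k) / np.sqrt(D)

score = first_attention(rng.standard_normal(D), rng.standard_normal(D), "faq->question")
```

Using a separate matrix per edge type is what lets adjacent nodes of different types (e.g. an FAQ node versus a document node) be scored differently against the same target.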
In one possible implementation, the above-mentioned adjustment subunit is specifically configured to:
determining a correlation between the source node and the question node based on the vector corresponding to the text information of the question node, the vector corresponding to the text information of the source node, and the type of the edge between the question node and the source node;
a second attention score is determined based on the correlation between the source node and the question node and the first attention score.
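One way to realize this readjustment, sketched under assumptions (the sigmoid relevance function and the per-edge-type matrix are illustrative choices, not the specification's exact formulation):

```python
import numpy as np

rng = np.random.default_rng(2)
D = 8
W_rel = {"question->faq": rng.standard_normal((D, D))}  # per-edge-type relevance matrix

def second_attention(first_score, h_question, h_src, edge_type):
    # Correlation between the source node and the question node, squashed to (0, 1).
    relevance = 1.0 / (1.0 + np.exp(-(h_question @ W_rel[edge_type] @ h_src)))
    return first_score * relevance  # readjusted (second) attention score

s2 = second_attention(0.5, rng.standard_normal(D), rng.standard_normal(D), "question->faq")
```

The effect is that sources judged more relevant to the user question contribute more strongly during the subsequent node update.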
In one possible implementation manner, the target node corresponds to M source nodes; m is a positive integer;
the first determining subunit is specifically configured to:
Determining a first message transmitted from the source node to the target node based on a vector corresponding to the text information of the source node and an edge type between the source node and the target node;
Determining a second message communicated by the source node to the target node based on the first message and the second attention score;
performing weighted summation on the M second messages transmitted to the target node by the M source nodes corresponding to the target node, to obtain a third message transmitted to the target node by all the source nodes;
and performing nonlinear activation and linear projection with a residual connection based on the third message and the vector corresponding to the text information of the target node, to obtain the updated vector corresponding to each node.
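The message-passing steps above can be sketched as follows; the softmax weighting of the M scores and the exact order of the residual, activation, and projection are assumptions made for illustration:

```python
import numpy as np

rng = np.random.default_rng(3)
D = 8
W_msg = rng.standard_normal((D, D))  # per-edge-type message transform (first message)
W_out = rng.standard_normal((D, D))  # final linear projection

def update_node(h_tgt, sources):
    """sources: list of (h_src, second_attention_score) for the M source nodes."""
    scores = np.array([s for _, s in sources])
    weights = np.exp(scores) / np.exp(scores).sum()  # normalize the M scores
    # Third message: weighted sum of the M transformed (second) messages.
    third = sum(w * (W_msg @ h) for (h, _), w in zip(sources, weights))
    # Residual connection with the target's own vector, then nonlinearity and projection.
    return W_out @ np.tanh(third + h_tgt)

h_new = update_node(rng.standard_normal(D),
                    [(rng.standard_normal(D), 0.2), (rng.standard_normal(D), 0.7)])
```

After this update, each target node's vector has absorbed the question-weighted information of all its neighbors in the heterogeneous graph.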
In one possible implementation manner, the decoding unit includes:
a decoding subunit, configured to decode the vector corresponding to the text information of the question node and the updated vector corresponding to each node, respectively, to obtain a question decoding vector and a decoding vector corresponding to each updated node;
a fusion subunit, configured to fuse the question decoding vector with the decoding vector corresponding to each updated node, to obtain a target vector;
and the query subunit is configured to query a preset vocabulary according to the target vector to obtain the target answer.
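A toy sketch of this decode-fuse-query sequence; the attention-based fusion and the highest-scoring-embedding vocabulary lookup are simplifications assumed for illustration, not the specification's exact decoder:

```python
import numpy as np

rng = np.random.default_rng(4)
D = 8
vocab = ["yes", "no", "maybe"]                      # stand-in preset vocabulary
vocab_embed = rng.standard_normal((len(vocab), D))  # stand-in vocabulary embeddings

def decode_answer(q_vec, node_vecs):
    # Fuse the question decoding vector with the per-node decoding vectors.
    weights = np.array([q_vec @ v for v in node_vecs])
    weights = np.exp(weights) / np.exp(weights).sum()
    target = q_vec + sum(w * v for w, v in zip(weights, node_vecs))  # target vector
    # Query the vocabulary: pick the entry whose embedding scores highest.
    return vocab[int(np.argmax(vocab_embed @ target))]

answer = decode_answer(rng.standard_normal(D), [rng.standard_normal(D) for _ in range(3)])
```

A full generative decoder would repeat such a vocabulary query per output token; the single-token lookup here only illustrates the target-vector-to-vocabulary step.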
The above division of the modules in the data processing apparatus is merely for illustration; in other embodiments, the data processing apparatus may be divided into different modules as needed to perform all or part of the functions of the data processing apparatus. Each module in the data processing apparatus provided in the embodiments of the present specification may be implemented in the form of a computer program. The computer program may run on a terminal or a server. Program modules of the computer program may be stored in the memory of the terminal or server. When executed by a processor, the computer program performs all or part of the steps of the data processing method described in the embodiments of the present specification.
Referring to fig. 12, fig. 12 is a schematic structural diagram of an electronic device according to an exemplary embodiment of the present disclosure. As shown in fig. 12, the electronic device 1200 may include: at least one processor 1210, at least one communication bus 1220, a user interface 1230, at least one network interface 1240, a memory 1250. Wherein the communication bus 1220 may be used to facilitate the connection communication of the various components described above.
The user interface 1230 may include a display screen (Display) and a camera (Camera); optionally, the user interface 1230 may further include a standard wired interface and a wireless interface.
The network interface 1240 may optionally include a Bluetooth module, a near field communication (Near Field Communication, NFC) module, a wireless fidelity (Wireless Fidelity, Wi-Fi) module, and the like.
The processor 1210 may include one or more processing cores. The processor 1210 connects various parts of the electronic device 1200 using various interfaces and lines, and performs various functions of the electronic device 1200 and processes data by running or executing instructions, programs, code sets, or instruction sets stored in the memory 1250 and invoking data stored in the memory 1250. Alternatively, the processor 1210 may be implemented in at least one hardware form of digital signal processing (Digital Signal Processing, DSP), field-programmable gate array (Field-Programmable Gate Array, FPGA), or programmable logic array (Programmable Logic Array, PLA). The processor 1210 may integrate one or a combination of a central processing unit (Central Processing Unit, CPU), a graphics processing unit (Graphics Processing Unit, GPU), a modem, and the like. The CPU mainly handles the operating system, the user interface, application programs, and the like; the GPU is used for rendering and drawing the content to be displayed by the display screen; the modem is used to handle wireless communications. It will be appreciated that the modem may also not be integrated into the processor 1210 and may instead be implemented by a single chip.
The Memory 1250 may include a random access Memory (Random Access Memory, RAM) or a Read-Only Memory (ROM). Optionally, the memory 1250 includes a non-transitory computer readable medium. Memory 1250 may be used to store instructions, programs, code, sets of codes, or sets of instructions. The memory 1250 may include a storage program area and a storage data area, wherein the storage program area may store instructions for implementing an operating system, instructions for at least one function (such as a receiving function, a query function, a question-answer function, etc.), instructions for implementing the various method embodiments described above, and the like; the storage data area may store data or the like referred to in the above respective method embodiments. Memory 1250 may also optionally be at least one storage device located remotely from the aforementioned processor 1210. As shown in fig. 12, an operating system, network communication modules, user interface modules, and program instructions may be included in memory 1250, which is a type of computer storage medium.
In particular, processor 1210 may be configured to invoke the program instructions stored in memory 1250 and to perform the following operations in particular:
user questions entered by a user are received.
Querying a preset question-answer dataset based on the user question and N preset information source types to obtain N information sources; N is a positive integer greater than or equal to 2; at least two of the N information sources are associated with each other.
Inputting the user questions and the N information sources into a question-answer model, and outputting target answers; the question-answering model is trained based on a plurality of user questions, N information sources corresponding to the user questions, and a plurality of standard answers corresponding to the user questions.
In some possible embodiments, when inputting the user question and the N information sources into the question-answer model and outputting the target answer, the processor 1210 is specifically configured to perform:
the user questions and the N information sources are input.
Constructing a heterogeneous graph according to a preset rule based on the user question and the N information sources; the heterogeneous graph includes a user question node and N information source nodes; the heterogeneous graph characterizes the relationships between the user question and the N information sources.
Encoding the text information corresponding to each node in the heterogeneous graph to obtain a vector corresponding to the text information of each node.
Updating the vector corresponding to the text information of each node based on the heterogeneous graph to obtain an updated vector corresponding to each node.
And decoding the vector corresponding to the text information of the user question node based on the updated vector corresponding to each node to obtain a target answer.
In some possible embodiments, the text information corresponding to each node in the heterogeneous graph includes at least one word;
the processor 1210 is specifically configured to perform:
And encoding each word in the text information corresponding to each node to obtain a vector corresponding to each word.
And carrying out average pooling on the vectors corresponding to each word in each node to obtain the vectors corresponding to the text information of each node.
In some possible embodiments, when the processor 1210 updates the vector corresponding to the text information of each node based on the heterogeneous graph to obtain the updated vector corresponding to each node, the processor 1210 is specifically configured to perform:
Calculating a first attention score between two adjacent nodes based on the heterogeneous graph and the vector corresponding to the text information of each node; the two adjacent nodes include a source node and a target node.
Readjusting the first attention score based on the vector corresponding to the text information of the question node and the vector corresponding to the text information of the source node, to obtain a second attention score.
And determining a vector corresponding to each updated node based on the second attention score, the vector corresponding to the text information of the source node, the vector corresponding to the text information of the target node and the edge type between the source node and the target node.
In some possible embodiments, the processor 1210 is specifically configured to perform:
projecting the vector corresponding to the text information of each node to obtain a first vector and a second vector corresponding to each node; the first vector corresponding to each node corresponds to the second vector one by one.
Calculating a first attention score between every two adjacent nodes based on the heterogeneous graph and the first vector and the second vector corresponding to each node.
In some possible embodiments, before the processor 1210 calculates the first attention score between two adjacent nodes based on the heterogeneous graph and the vector corresponding to the text information of each node, the method further includes:
determining a source node and a target node based on the heterogeneous graph; the target node is adjacent to the source node;
When calculating the first attention score between two adjacent nodes based on the heterogeneous graph and the vector corresponding to the text information of each node, the processor 1210 is specifically configured to perform:
Projecting a vector corresponding to the text information of the source node to a first space to obtain a first vector corresponding to the source node, and projecting a vector corresponding to the text information of the target node to a second space to obtain a second vector corresponding to the target node; there is a mapping relationship between the first vector and the second vector.
And determining a first attention score between the source node and the target node based on the first vector corresponding to the source node, the second vector corresponding to the target node and the edge type between the source node and the target node.
In some possible embodiments, when the processor 1210 readjusts the first attention score based on the vector corresponding to the text information of the question node and the vector corresponding to the text information of the source node to obtain the second attention score, the processor 1210 is specifically configured to perform:
Determining the correlation between the source node and the question node based on the vector corresponding to the text information of the question node, the vector corresponding to the text information of the source node, and the edge type between the question node and the source node.
Determining a second attention score based on the correlation between the source node and the question node and the first attention score.
In some possible embodiments, the target node corresponds to M source nodes; m is a positive integer;
The processor 1210 is specifically configured to perform:
and determining a first message transmitted by the source node to the target node based on the vector corresponding to the text information of the source node and the edge type between the source node and the target node.
And determining a second message transmitted by the source node to the target node based on the first message and the second attention score.
Performing weighted summation on the M second messages transmitted to the target node by the M source nodes corresponding to the target node, to obtain a third message transmitted to the target node by all the source nodes.
Performing nonlinear activation and linear projection with a residual connection based on the third message and the vector corresponding to the text information of the target node, to obtain the updated vector corresponding to each node.
In some possible embodiments, when the processor 1210 decodes the vector corresponding to the text information of the user question node based on the updated vector corresponding to each node to obtain the target answer, the processor 1210 is specifically configured to perform:
Decoding the vector corresponding to the text information of the question node and the updated vector corresponding to each node, respectively, to obtain a question decoding vector and a decoding vector corresponding to each updated node.
Fusing the question decoding vector with the decoding vector corresponding to each updated node to obtain a target vector.
And inquiring from a preset vocabulary according to the target vector to obtain a target answer.
The present description also provides a computer-readable storage medium having instructions stored therein, which when executed on a computer or processor, cause the computer or processor to perform one or more steps of the above embodiments. The respective constituent modules of the above-described data processing apparatus may be stored in the above-described computer-readable storage medium if implemented in the form of software functional units and sold or used as independent products.
In the above embodiments, implementation may be in whole or in part by software, hardware, firmware, or any combination thereof. When implemented in software, the embodiments may be implemented in whole or in part in the form of a computer program product. The computer program product includes one or more computer instructions. When the computer program instructions are loaded and executed on a computer, the processes or functions described in accordance with the embodiments of the present specification are produced in whole or in part. The computer may be a general-purpose computer, a special-purpose computer, a computer network, or another programmable apparatus. The computer instructions may be stored in a computer-readable storage medium or transmitted through a computer-readable storage medium. The computer instructions may be transmitted from one website, computer, server, or data center to another website, computer, server, or data center by wired (e.g., coaxial cable, optical fiber, digital subscriber line (Digital Subscriber Line, DSL)) or wireless (e.g., infrared, radio, microwave, etc.) means. The computer-readable storage medium may be any available medium that can be accessed by a computer, or a data storage device such as a server or data center that integrates one or more available media. The available medium may be a magnetic medium (e.g., a floppy disk, a hard disk, a magnetic tape), an optical medium (e.g., a digital versatile disc (Digital Versatile Disc, DVD)), a semiconductor medium (e.g., a solid state disk (Solid State Disk, SSD)), or the like.
Those skilled in the art will appreciate that all or part of the flows of the above embodiment methods may be implemented by a computer program instructing relevant hardware; the program may be stored in a computer-readable storage medium, and when executed, may include the flows of the above method embodiments. The aforementioned storage medium includes various media capable of storing program code, such as a ROM, a RAM, a magnetic disk, or an optical disk. The technical features in the above examples and embodiments may be combined arbitrarily provided no conflict arises.
The above-described embodiments are merely preferred embodiments of the present disclosure, and do not limit the scope of the disclosure, and various modifications and improvements made by those skilled in the art to the technical solution of the disclosure should fall within the scope of protection defined by the claims.
The foregoing describes specific embodiments of the present disclosure. Other embodiments are within the scope of the following claims. In some cases, the actions or steps recited in the claims can be performed in a different order than in the embodiments and still achieve desirable results. In addition, the processes depicted in the accompanying figures do not necessarily require the particular order shown, or sequential order, to achieve desirable results. In some embodiments, multitasking and parallel processing are also possible or may be advantageous.

Claims (12)

1. A method of data processing, the method comprising:
receiving user questions input by a user;
Inquiring from a preset question-answer data set based on the user questions and N preset information source types to obtain N information sources; the N is a positive integer greater than or equal to 2; at least two information sources in the N information sources are associated; the N information sources are information sets corresponding to the N preset information source types respectively, and each information set comprises at least one piece of information;
Inputting the user questions and the N information sources into a question-answer model, and outputting target answers; the question-answering model is trained and obtained based on a plurality of user questions, N information sources corresponding to the user questions and a plurality of standard answers corresponding to the user questions.
2. The method of claim 1, wherein inputting the user questions and the N information sources into a question-answer model, outputting target answers, comprises:
Inputting the user questions and the N information sources;
constructing a heterogeneous graph according to a preset rule based on the user question and the N information sources; the heterogeneous graph comprises a user question node and N information source nodes; the heterogeneous graph characterizes the relationship between the user question and the N information sources;
encoding the text information corresponding to each node in the heterogeneous graph to obtain a vector corresponding to the text information of each node;
updating the vector corresponding to the text information of each node based on the heterogeneous graph to obtain an updated vector corresponding to each node; each updated node aggregates all the information related to the user question that can be transmitted to that node;
And decoding the vector corresponding to the text information of the user question node based on the updated vector corresponding to each node to obtain a target answer.
3. The method of claim 2, the text information corresponding to each node in the heterogeneous graph comprising at least one word;
the encoding the text information corresponding to each node in the heterogeneous graph to obtain a vector corresponding to the text information of each node comprises:
encoding each word in the text information corresponding to each node to obtain a vector corresponding to each word;
And carrying out average pooling on the vector corresponding to each word in each node to obtain the vector corresponding to the text information of each node.
4. The method of claim 2, wherein the updating the vector corresponding to the text information of each node based on the heterogeneous graph to obtain the updated vector corresponding to each node comprises:
calculating a first attention score between two adjacent nodes based on the heterogeneous graph and the vector corresponding to the text information of each node; the two adjacent nodes comprise a source node and a target node;
readjusting the first attention score based on the vector corresponding to the text information of the question node and the vector corresponding to the text information of the source node to obtain a second attention score;
and determining a vector corresponding to each updated node based on the second attention score, the vector corresponding to the text information of the source node, the vector corresponding to the text information of the target node and the edge type between the source node and the target node.
5. The method of claim 4, wherein the calculating the first attention score between two adjacent nodes based on the heterogeneous graph and the vector corresponding to the text information of each node comprises:
Projecting vectors corresponding to the text information of each node to obtain a first vector and a second vector corresponding to each node; the first vector corresponding to each node corresponds to the second vector one by one;
and calculating a first attention score between every two adjacent nodes based on the heterogeneous graph, the first vector corresponding to each node, and the second vector.
6. The method of claim 4, further comprising, prior to the calculating the first attention score between two adjacent nodes based on the heterogeneous graph and the vector corresponding to the text information of each node:
determining a source node and a target node based on the heterogeneous graph; the target node is adjacent to the source node;
the calculating the first attention score between two adjacent nodes based on the heterogeneous graph and the vector corresponding to the text information of each node comprises:
projecting a vector corresponding to the text information of the source node to a first space to obtain a first vector corresponding to the source node, and projecting a vector corresponding to the text information of the target node to a second space to obtain a second vector corresponding to the target node; a mapping relation exists between the first vector and the second vector;
A first attention score between the source node and the target node is determined based on a first vector corresponding to the source node, a second vector corresponding to the target node, and an edge type between the source node and the target node.
7. The method of claim 4, wherein readjusting the first attention score based on the vector corresponding to the text information of the question node and the vector corresponding to the text information of the source node to obtain a second attention score, comprising:
determining the correlation between the source node and the question node based on the vector corresponding to the text information of the question node, the vector corresponding to the text information of the source node, and the edge type between the question node and the source node;
determining a second attention score based on the correlation between the source node and the question node and the first attention score.
8. The method of claim 4, the target node corresponding to M source nodes; m is a positive integer;
The determining the updated vector corresponding to each node based on the second attention score, the vector corresponding to the text information of the source node, the vector corresponding to the text information of the target node, and the edge type between the source node and the target node includes:
Determining a first message transmitted by the source node to the target node based on a vector corresponding to the text information of the source node and an edge type between the source node and the target node;
Determining a second message communicated by the source node to the target node based on the first message and the second attention score;
performing weighted summation on the M second messages transmitted to the target node by the M source nodes corresponding to the target node to obtain a third message transmitted to the target node by all the source nodes;
and performing nonlinear activation and linear projection with a residual connection based on the third message and the vector corresponding to the text information of the target node to obtain the updated vector corresponding to each node.
9. The method of claim 2, wherein the decoding the vector corresponding to the text information of the user question node based on the updated vector corresponding to each node to obtain the target answer includes:
decoding the vector corresponding to the text information of the question node and the updated vector corresponding to each node, respectively, to obtain a question decoding vector and a decoding vector corresponding to each updated node;
fusing the question decoding vector with the decoding vector corresponding to each updated node to obtain a target vector;
and inquiring from a preset vocabulary according to the target vector to obtain a target answer.
10. A data processing apparatus, the apparatus comprising:
the receiving module is used for receiving user problems input by a user;
The query module is used for querying from a preset question-answer data set based on the user questions and N preset information source types to obtain N information sources; the N is a positive integer greater than or equal to 2; at least two information sources in the N information sources are associated; the N information sources are information sets corresponding to the N preset information source types respectively, and each information set comprises at least one piece of information;
The question-answering module is used for inputting the user questions and the N information sources into a question-answering model and outputting target answers; the question-answering model is trained and obtained based on a plurality of user questions, N information sources corresponding to the user questions and a plurality of standard answers corresponding to the user questions.
11. An electronic device, comprising: a processor and a memory;
The processor is connected with the memory;
the memory is used for storing executable program codes;
The processor runs a program corresponding to executable program code stored in the memory by reading the executable program code for performing the method according to any one of claims 1-9.
12. A computer storage medium storing a plurality of instructions adapted to be loaded by a processor and to perform the method steps of any of claims 1-9.
CN202210078998.1A 2022-01-24 2022-01-24 Data processing method, device, electronic equipment and computer storage medium Active CN114443824B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210078998.1A CN114443824B (en) 2022-01-24 2022-01-24 Data processing method, device, electronic equipment and computer storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210078998.1A CN114443824B (en) 2022-01-24 2022-01-24 Data processing method, device, electronic equipment and computer storage medium

Publications (2)

Publication Number Publication Date
CN114443824A CN114443824A (en) 2022-05-06
CN114443824B true CN114443824B (en) 2024-08-02

Family

ID=81370097

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210078998.1A Active CN114443824B (en) 2022-01-24 2022-01-24 Data processing method, device, electronic equipment and computer storage medium

Country Status (1)

Country Link
CN (1) CN114443824B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114817512B (en) * 2022-06-28 2023-03-14 清华大学 Question-answer reasoning method and device
CN119761336B (en) * 2024-12-11 2025-07-04 北京中科闻歌科技股份有限公司 Comprehensive evaluation method, equipment and medium for chart question-answer model

Citations (2)

Publication number Priority date Publication date Assignee Title
CN104216913A (en) * 2013-06-04 2014-12-17 SAP SE Problem answering frame
CN112749265A (en) * 2021-01-08 2021-05-04 哈尔滨工业大学 Intelligent question-answering system based on multiple information sources

Family Cites Families (8)

Publication number Priority date Publication date Assignee Title
US20160217209A1 (en) * 2015-01-22 2016-07-28 International Business Machines Corporation Measuring Corpus Authority for the Answer to a Question
US10678822B2 (en) * 2018-06-29 2020-06-09 International Business Machines Corporation Query expansion using a graph of question and answer vocabulary
US11037049B2 (en) * 2018-10-29 2021-06-15 International Business Machines Corporation Determining rationale of cognitive system output
US11507069B2 (en) * 2019-05-03 2022-11-22 Chevron U.S.A. Inc. Automated model building and updating environment
US20200403945A1 (en) * 2019-06-19 2020-12-24 International Business Machines Corporation Methods and systems for managing chatbots with tiered social domain adaptation
CN112115381B (en) * 2020-09-28 2024-08-02 北京百度网讯科技有限公司 Construction method, device, electronic equipment and medium of fusion relation network
CN112948546B (en) * 2021-01-15 2021-11-23 中国科学院空天信息创新研究院 Intelligent question and answer method and device for multi-source heterogeneous data source
CN113449038B (en) * 2021-06-29 2024-04-26 东北大学 A mine intelligent question-answering system and method based on autoencoder

Similar Documents

Publication Publication Date Title
CN111368219B (en) Information recommendation method, device, computer equipment and storage medium
CN112905868B (en) Event extraction method, device, equipment and storage medium
CN113704460B (en) Text classification method and device, electronic equipment and storage medium
WO2022188534A1 (en) Information pushing method and apparatus
CN114443824B (en) Data processing method, device, electronic equipment and computer storage medium
CN113779225B (en) Training method of entity link model, entity link method and device
CN109886699A (en) Behavior recognition method and device, electronic device, storage medium
CN113688232A (en) Method and device for classifying bidding texts, storage medium and terminal
CN112598039A (en) Method for acquiring positive sample in NLP classification field and related equipment
CN113569017A (en) Model processing method and device, electronic equipment and storage medium
CN116975267A (en) Information processing method and device, computer equipment, medium and product
CN112906361A (en) Text data labeling method and device, electronic equipment and storage medium
CN114358023A (en) Intelligent question-answer recall method and device, computer equipment and storage medium
WO2024152686A1 (en) Method and apparatus for determining recommendation index of resource information, device, storage medium and computer program product
CN117573842A (en) Document retrieval methods and automatic question and answer methods
CN115098700B (en) Knowledge graph embedding representation method and device
EP4318375A1 (en) Graph data processing method and apparatus, computer device, storage medium and computer program product
CN115905518A (en) Sentiment classification method, device, equipment and storage medium based on knowledge graph
CN114880991A (en) Knowledge map question-answer entity linking method, device, equipment and medium
CN114429384A (en) Intelligent product recommendation method and system based on e-commerce platform
CN112765481B (en) Data processing method, device, computer and readable storage medium
CN113254788A (en) Big data based recommendation method and system and readable storage medium
CN111460113A (en) Data interaction method and related equipment
CN115510203B (en) Method, device, equipment, storage medium and program product for determining answers to questions
CN117291722A (en) Object management method, related device and computer readable medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant