
CN117408255A - A dual-graph neural network medical named entity recognition method based on multi-feature fusion - Google Patents


Info

Publication number
CN117408255A
CN117408255A
Authority
CN
China
Prior art keywords
representing, word, graph, vector, feature
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202311317519.8A
Other languages
Chinese (zh)
Inventor
马甲林
古汉钊
韩庆宾
李澳繁
谢乾
汪涛
Current Assignee
Huaiyin Institute of Technology
Original Assignee
Huaiyin Institute of Technology
Priority date
Filing date
Publication date
Application filed by Huaiyin Institute of Technology
Priority to CN202311317519.8A
Publication of CN117408255A
Legal status: Pending


Classifications

    • G06F 40/295 — Named entity recognition (under G06F 40/20 Natural language analysis; G06F 40/279 Recognition of textual entities; G06F 40/289 Phrasal analysis, e.g. finite state techniques or chunking)
    • G06F 40/14 — Tree-structured documents (under G06F 40/10 Text processing; G06F 40/12 Use of codes for handling textual entities)
    • G06F 40/211 — Syntactic parsing, e.g. based on context-free grammar [CFG] or unification grammars (under G06F 40/205 Parsing)
    • G06F 40/30 — Semantic analysis
    • G06N 3/042 — Knowledge-based neural networks; logical representations of neural networks
    • G06N 3/0442 — Recurrent networks characterised by memory or gating, e.g. long short-term memory [LSTM] or gated recurrent units [GRU]
    • G06N 3/0464 — Convolutional networks [CNN, ConvNet]
    • G06N 3/048 — Activation functions
    • G06N 3/084 — Backpropagation, e.g. using gradient descent
    • G16H 50/70 — ICT specially adapted for mining of medical data, e.g. analysing previous cases of other patients

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Computational Linguistics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Biomedical Technology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biophysics (AREA)
  • Software Systems (AREA)
  • Mathematical Physics (AREA)
  • Computing Systems (AREA)
  • Molecular Biology (AREA)
  • Evolutionary Computation (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Public Health (AREA)
  • Medical Informatics (AREA)
  • Databases & Information Systems (AREA)
  • Primary Health Care (AREA)
  • Epidemiology (AREA)
  • Pathology (AREA)
  • Machine Translation (AREA)

Abstract

The invention provides a dual-graph neural network medical named entity recognition method based on multi-feature fusion. The text sequence is first preprocessed. On one hand, it is fed into a bidirectional long short-term memory network to obtain contextual features; on the other hand, a word co-occurrence graph and a dependency syntax graph are constructed. A graph convolutional neural network learns the dependencies and interactions between nodes from the graph structure to obtain global feature information. The contextual feature information and the global feature information are fused, and a multi-head self-attention mechanism computes the dependencies within the features to obtain comprehensive feature information. Finally, a CRF model scores the comprehensive feature information and performs sequence labeling to obtain the optimal label sequence. Compared with the prior art, the invention extracts text features from a dual-graph perspective by constructing the word co-occurrence graph and the dependency syntax graph, and introduces a multi-head self-attention mechanism to compute the dependencies within the features, thereby effectively improving the accuracy of medical named entity recognition.

Description

Dual-graph neural network medical named entity recognition method based on multi-feature fusion
Technical Field
The invention relates to the technical field of natural language processing, and in particular to a dual-graph neural network medical named entity recognition method based on multi-feature fusion.
Background
Named entity recognition (NER) in the medical field is an important natural language processing task that aims to identify specific entities in text, such as disease names, drug names, surgical procedures, and diagnoses. With the continuous development of medical research and practice, a large volume of Chinese electronic medical record text is generated during clinical treatment, and this data contains rich medical information. Medical named entity recognition automatically extracts key entity information from these massive texts; the extracted information can support the construction of personalized medical service systems and clinical auxiliary decision support, and is of great significance for professional research in the medical field.
Compared with general-domain named entity recognition, entities in the medical field are diverse and complex: the same drug may have multiple aliases and variant names, the same disease may follow different naming schemes, and disease names are often long and compositional. Moreover, complex relationships exist between entities in electronic medical records, such as doctor-patient relationships and diagnosis-treatment relationships, which make medical named entity recognition more difficult.
Disclosure of Invention
The invention aims to: address the technical problems described in the background by providing a dual-graph neural network medical named entity recognition method based on multi-feature fusion, in which contextual feature information of the sequence is extracted by a bidirectional long short-term memory network, global feature information of the sequence is obtained by a graph convolutional neural network, the contextual and global information are fused, and a multi-head self-attention mechanism computes the dependencies within the features, thereby improving the accuracy of medical named entity recognition.
The technical scheme is as follows: the invention provides a dual-graph neural network medical named entity recognition method based on multi-feature fusion, comprising the following steps:
Step 1: preprocess the text sequence and divide it into a training set and a test set;
Step 2: input the text sequence into a bidirectional long short-term memory network module to strengthen contextual semantic associations and obtain the hidden-layer feature vector H_t containing contextual information;
Step 3: construct a word co-occurrence graph and a dependency syntax graph from the text sequence;
Step 4: use a graph convolutional neural network to learn the dependencies and interactions between nodes from the graph structure, obtaining global feature information H_C that fuses word co-occurrence node features and dependency syntax features;
Step 5: fuse the contextual feature information from step 2 with the global feature information from step 4, and compute the dependencies within the features with a multi-head self-attention mechanism to obtain comprehensive feature information;
Step 6: score the comprehensive feature information from step 5 with a CRF model and perform sequence labeling to obtain the optimal label sequence.
Further, the specific method of step 1 is as follows:
Step 1.1: perform word segmentation, text normalization, text cleaning, and removal of stop words and low-frequency words on the text sequence;
Step 1.2: shuffle the data set and divide it into a training set and a test set in a 7:3 ratio.
Further, the specific method of step 2 is as follows:
Step 2.1: use a bidirectional long short-term memory network to strengthen contextual semantic associations and obtain a hidden-layer feature vector containing contextual information. Let L = (l_1, l_2, …, l_i, …, l_n) denote the text sequence after the preprocessing of step 1, and represent each word l_i with the pre-trained word vector model GloVe, where l_i ∈ R^d, n is the sentence length and d is the word vector dimension;
Step 2.2: the bidirectional network runs a forward and a backward LSTM over the sequence to capture context-dependent information in both directions; the computed forward and backward vectors are concatenated as the hidden-layer output, expressed as:
f_t = σ(W_f · [h_{t-1}, x_t] + b_f)
i_t = σ(W_i · [h_{t-1}, x_t] + b_i)
o_t = σ(W_o · [h_{t-1}, x_t] + b_o)
c̃_t = tanh(W_c · [h_{t-1}, x_t] + b_c)
C_t = f_t * C_{t-1} + i_t * c̃_t
h_t = o_t * tanh(C_t)
where f_t, i_t and o_t denote the forget, input and output gates at time t, c̃_t is the candidate memory cell vector at time t, C_t is the memory cell vector at time t, h_t is the hidden-layer output vector, * denotes element-wise product, W a weight matrix, b a bias vector, and σ(·) and tanh(·) the sigmoid and hyperbolic tangent activation functions, respectively;
the forward LSTM output and the backward LSTM output of the bidirectional long short-term memory network are concatenated to obtain the final hidden-layer feature H_t.
Further, the specific process of step 3 is as follows:
Step 3.1: construct the word co-occurrence graph;
Step 3.1.1: represent the text sequence of length n as W = {w_1, w_2, …, w_i, …, w_n}, where w_i is the i-th word, and set a specified sliding window size;
Step 3.1.2: slide the window along the text sequence from left to right with center word w_i; if w_j falls in the same window as w_i, construct a connecting edge between the word nodes w_i and w_j;
Step 3.1.3: construct the word co-occurrence adjacency matrix A_m, whose entry A_m(i, j) = c_ij if w_i and w_j co-occur within a window and 0 otherwise, where c_ij is the number of times the two words co-occur within each window;
Step 3.2: construct the dependency syntax graph;
Step 3.2.1: parse the text sequence with the dependency parser Stanford NLP to obtain the syntactic dependency tree of the sentence;
Step 3.2.2: construct a graph structure G = (V, E) with the words of the dependency tree as nodes and the dependency relations as directed edges, where each edge e(i, j) ∈ E represents a dependency from head node x_i to dependent node x_j; V is the set of nodes and E the set of edges;
Step 3.2.3: construct the adjacency matrix A_s from the syntactic dependency graph, with A_s(i, j) = 1 if there is a dependency edge from x_i to x_j and 0 otherwise.
further, the specific process of the step 4 is as follows:
step 4.1, aggregating and updating the nodes in the word co-occurrence graph and the dependency syntax graph according to the nodes, meanwhile, the graph convolution operation considers the neighbor node characteristics of the nodes, updates the characteristic representation of the nodes by utilizing the information of the neighbor nodes, and aims at the first-layer word nodeAnd syntax node->Layer 1 word node->And syntax node->The update mode is defined as follows:
wherein W is l Representing model training weight parameters, RELU (·) representing activation functions,representing a normalized symmetric adjacency matrix expressed as:Wherein D represents a degree matrix;
step 4.2, splicing the updated word node vector representation and the dependency syntax node vector representation to obtain global feature information H C Expressed as:
further, the specific process of the step 5 is as follows:
step 5.1, performing feature fusion on the context feature information obtained in the step 2 and the global feature information obtained in the step 4 to obtain a fusion vector, and marking the fusion vector as R= [ H ] t ,H C ];
Step 5.2, performing linear transformation on the fusion vector R to obtain 3 different vector matrixes Q (query), K (Key) and V (Value), wherein q=k=v=r; and (3) carrying out dot product operation on each keyword matrix K by using Q, scaling an operation result, and normalizing by using a softmax function to obtain attention weight, wherein a calculation formula is expressed as follows:
wherein,representing the dimension of the key, softmax (·) represents probability mapping of the input element;
and 5.3, performing multiple parallel attention calculations on the high-dimensional feature vectors through a multi-head self-attention mechanism, wherein the number of heads is e, each attention head has different feature parameters, after attention information of one head is obtained through calculation, performing linear transformation on the e heads after splicing to finally obtain comprehensive feature information of fused attention information, and the comprehensive feature information is expressed as:
M i =SelfAttention(QW Q ,KW K ,VW V )
MulAtt(Q,K,V)=Concat(M 1 ,M 2 ,…,M e )W
wherein W is Q ,W K ,W V The weight matrix learned by the model in training is represented, concat (·) represents the feature vector extracted by the e attention heads is spliced, and W represents the weight matrix generated in the implementation process.
Further, the specific process of step 6 is as follows:
for a given input sequence x = {x_1, x_2, …, x_n} and predicted sequence y = {y_1, y_2, …, y_n}, the score is computed as:
s(x, y) = Σ_{i=0}^{n} A_{y_i, y_{i+1}} + Σ_{i=1}^{n} P_{i, y_i}
where A_{y_i, y_{i+1}} is the probability of transferring from label y_i to label y_{i+1}, P_{i, y_i} is the probability that the i-th word is labeled y_i, and s(x, y) is the score of labeling the input sentence sequence x with the label sequence y.
The beneficial effects are as follows:
The invention obtains topological structure information and syntactic dependency information between words by constructing the word co-occurrence graph and the dependency syntax graph, solving the problem that the polysemy of named entities and the dependency information between words cannot be fully exploited. Contextual feature information is captured by the bidirectional long short-term memory network, dual-graph update learning is performed by the graph convolutional neural network, and a multi-head self-attention mechanism is introduced for feature fusion, effectively improving the accuracy of medical named entity recognition.
Drawings
FIG. 1 is a flowchart of the dual-graph neural network medical named entity recognition method based on multi-feature fusion in an embodiment of the invention;
FIG. 2 is a schematic diagram of an overall architecture in an embodiment of the present invention;
FIG. 3 is a schematic diagram of a word co-occurrence diagram structure according to an embodiment of the present invention;
FIG. 4 is a schematic diagram of a dependency syntax graph structure in an embodiment of the present invention.
Detailed Description
The present invention is further illustrated below in conjunction with specific embodiments. It should be understood that these embodiments are merely illustrative of the invention and do not limit its scope; equivalent modifications that a person skilled in the art may make after reading the invention fall within the scope defined by the appended claims.
The invention discloses a dual-graph neural network medical named entity recognition method based on multi-feature fusion, comprising the following steps:
step 1, preprocessing data of a text sequence; the data preprocessing comprises the steps of performing text word segmentation, capitalization and lowercase, removing non-text content, removing stop words and low-frequency words on a text sequence; and then, carrying out a shuffle operation on the data set, and dividing the data set into a training set and a testing set according to the proportion of 7:3.
Step 2: input the text sequence into the bidirectional long short-term memory network module to strengthen contextual semantic associations and obtain a hidden-layer feature vector containing contextual information.
Step 2.1: let L = (l_1, l_2, …, l_i, …, l_n) denote the text sequence after the preprocessing of step 1, and represent each word l_i with the pre-trained word vector model GloVe, where l_i ∈ R^d, n is the sentence length and d is the word vector dimension.
Step 2.2: the bidirectional long short-term memory network runs a forward and a backward LSTM over the text sequence to capture context-dependent information in both directions, ensuring each word obtains rich semantic information; the computed forward and backward vectors are concatenated as the hidden-layer output, which can be expressed as:
f_t = σ(W_f · [h_{t-1}, x_t] + b_f)
i_t = σ(W_i · [h_{t-1}, x_t] + b_i)
o_t = σ(W_o · [h_{t-1}, x_t] + b_o)
c̃_t = tanh(W_c · [h_{t-1}, x_t] + b_c)
C_t = f_t * C_{t-1} + i_t * c̃_t
h_t = o_t * tanh(C_t)
where f_t, i_t and o_t denote the forget, input and output gates at time t, c̃_t is the candidate memory cell vector at time t, C_t is the memory cell vector at time t, h_t is the hidden-layer output vector, * denotes element-wise product, W a weight matrix, b a bias vector, and σ(·) and tanh(·) the sigmoid and hyperbolic tangent activation functions, respectively.
The forward LSTM output and the backward LSTM output of the bidirectional long short-term memory network are concatenated to obtain the final hidden-layer feature H_t.
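A minimal pure-Python sketch of a single LSTM time step implementing the gate equations above. The scalar toy weight shared across gates is an illustrative simplification; the real model learns separate weight matrices W_f, W_i, W_o, W_c.

```python
import math

def lstm_step(x_t, h_prev, c_prev, w=0.5, b=0.1):
    """One LSTM time step with a scalar toy weight.
    Computes f_t, i_t, o_t, the candidate cell, C_t and h_t."""
    sigmoid = lambda z: 1.0 / (1.0 + math.exp(-z))
    z = w * (h_prev + x_t) + b          # shared toy pre-activation
    f_t = sigmoid(z)                    # forget gate
    i_t = sigmoid(z)                    # input gate
    o_t = sigmoid(z)                    # output gate
    c_tilde = math.tanh(z)              # candidate memory cell
    c_t = f_t * c_prev + i_t * c_tilde  # new memory cell C_t
    h_t = o_t * math.tanh(c_t)          # hidden-layer output
    return h_t, c_t

h, c = lstm_step(1.0, 0.0, 0.0)
```

A bidirectional network runs this recurrence once left-to-right and once right-to-left and concatenates the two hidden states per position.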
Step 3: construct a word co-occurrence graph and a dependency syntax graph from the text sequence.
Step 3.1: construct the word co-occurrence graph.
Step 3.1.1: represent the text sequence of length n as W = {w_1, w_2, …, w_i, …, w_n}, where w_i is the i-th word, and set a specified sliding window size.
Step 3.1.2: slide the window along the text sequence from left to right with center word w_i; if w_j falls in the same window as w_i, construct a connecting edge between the word nodes w_i and w_j, as shown in FIG. 3.
Step 3.1.3: construct the word co-occurrence adjacency matrix A_m, which can be expressed as A_m(i, j) = c_ij if w_i and w_j co-occur within a window and 0 otherwise, where c_ij is the number of times the two words co-occur within each window.
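Step 3.1 can be sketched as follows, counting co-occurrences within a sliding window to fill A_m; the window size and sample words are illustrative.

```python
def cooccurrence_matrix(words, window=2):
    """Build the word co-occurrence adjacency matrix A_m:
    A_m[i][j] counts how often words i and j fall in the same
    sliding window (step 3.1)."""
    vocab = sorted(set(words))
    idx = {w: k for k, w in enumerate(vocab)}
    n = len(vocab)
    A = [[0] * n for _ in range(n)]
    for start in range(len(words) - window + 1):
        span = words[start:start + window]   # current window
        for a in span:
            for b in span:
                if a != b:                   # no self-loops
                    A[idx[a]][idx[b]] += 1
    return vocab, A

vocab, A = cooccurrence_matrix(["fever", "cough", "fever"], window=2)
```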
Step 3.2: construct the dependency syntax graph.
Step 3.2.1: parse the text sequence with the dependency parser Stanford NLP to obtain the syntactic dependency tree of the sentence.
Step 3.2.2: construct a graph structure G = (V, E) with the words of the dependency tree as nodes and the dependency relations as directed edges, where each edge e(i, j) ∈ E represents a dependency from head node x_i to dependent node x_j, as shown in FIG. 4.
V is the set of nodes of the graph and E the set of edges.
Step 3.2.3: construct the adjacency matrix A_s from the syntactic dependency graph, which can be expressed as A_s(i, j) = 1 if there is a dependency edge from x_i to x_j and 0 otherwise.
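The directed adjacency matrix A_s of step 3.2 can be built from a parser's (head, dependent) edge list; the toy edge list below is a hypothetical parse, not actual Stanford NLP output.

```python
def dependency_adjacency(n_words, edges):
    """Build the adjacency matrix A_s of the dependency syntax
    graph: A_s[i][j] = 1 when there is a dependency edge from
    head word i to dependent word j (step 3.2).
    Word indices are assumed 0-based over the sentence."""
    A = [[0] * n_words for _ in range(n_words)]
    for head, dep in edges:
        A[head][dep] = 1
    return A

# hypothetical parse of a 3-word sentence: word 1 heads words 0 and 2
A_s = dependency_adjacency(3, [(1, 0), (1, 2)])
```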
Step 4: the graph convolutional neural network learns the dependencies and interactions between nodes from the graph structure to obtain global feature information fusing word co-occurrence node features and dependency syntax features.
Step 4.1: aggregate and update the nodes of the word co-occurrence graph and the dependency syntax graph. The graph convolution operation considers the neighbour nodes of each node and updates its feature representation with their information. For the layer-l word node H_m^l and syntax node H_s^l, the layer-(l+1) word node H_m^{l+1} and syntax node H_s^{l+1} are updated as:
H_m^{l+1} = ReLU(Â_m H_m^l W^l)
H_s^{l+1} = ReLU(Â_s H_s^l W^l)
where W^l is a trainable weight parameter, ReLU(·) is the activation function, and Â is the normalized symmetric adjacency matrix, which can be expressed as Â = D^{-1/2}(A + I)D^{-1/2}, where D is the degree matrix.
Step 4.2: concatenate the updated word node vector representation and the dependency syntax node vector representation to obtain the global feature information H_C, which can be expressed as H_C = [H_m ; H_s].
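One graph-convolution update of step 4.1 can be sketched in pure Python. The 2-node graph and weights are illustrative, and the symmetric normalization Â = D^{-1/2}(A + I)D^{-1/2} with added self-loops is an assumption consistent with the description.

```python
import math

def gcn_layer(A, H, W):
    """One graph-convolution update H' = ReLU(Â · H · W) with
    Â = D^{-1/2}(A + I)D^{-1/2}; self-loops are added so every
    node keeps its own features (step 4.1)."""
    n = len(A)
    A_hat = [[A[i][j] + (1 if i == j else 0) for j in range(n)]
             for i in range(n)]                    # A + I
    d = [sum(row) for row in A_hat]                # degree matrix diagonal
    A_norm = [[A_hat[i][j] / math.sqrt(d[i] * d[j]) for j in range(n)]
              for i in range(n)]                   # D^{-1/2}(A+I)D^{-1/2}
    def matmul(X, Y):
        return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
                 for j in range(len(Y[0]))] for i in range(len(X))]
    Z = matmul(matmul(A_norm, H), W)
    return [[max(0.0, v) for v in row] for row in Z]   # ReLU

# two connected nodes, 1-dimensional features, identity-like weight
H1 = gcn_layer([[0, 1], [1, 0]], [[1.0], [2.0]], [[1.0]])
```

Running this layer on each graph and concatenating the two outputs gives the global feature H_C of step 4.2.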
Step 5: fuse the contextual feature information from step 2 with the global feature information from step 4, and compute the dependencies within the features with a multi-head self-attention mechanism to obtain comprehensive feature information.
Step 5.1: to improve the degree of association and the dependency relations between words in the sentence, fuse the contextual feature information from step 2 with the global feature information from step 4 to obtain the fusion vector R = [H_t, H_C].
Step 5.2: linearly transform the fusion vector R into three different matrices Q (query), K (key) and V (value), where Q = K = V = R. Take the dot product of Q with each key matrix K, scale the result, and normalize with a softmax function to obtain the attention weights, which can be expressed as:
Attention(Q, K, V) = softmax(QK^T / √d_k) V
where d_k is the dimension of the key and softmax(·) maps the input elements to probabilities.
Step 5.3: apply multiple parallel attention computations to the high-dimensional feature vectors through a multi-head self-attention mechanism with e heads, each head having its own feature parameters. After the attention information of each head is computed, the e heads are concatenated and linearly transformed to obtain the comprehensive feature information that fuses the attention information, which can be expressed as:
M_i = SelfAttention(Q W^Q, K W^K, V W^V)
MulAtt(Q, K, V) = Concat(M_1, M_2, …, M_e) W
where W^Q, W^K, W^V are weight matrices learned by the model during training, Concat(·) concatenates the feature vectors extracted by the e attention heads, and W is a weight matrix generated in the process.
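The scaled dot-product attention of step 5.2 can be sketched for a single head as follows; multi-head attention (step 5.3) repeats this e times with different learned projections and concatenates the results. The toy matrices are illustrative.

```python
import math

def scaled_dot_attention(Q, K, V):
    """Single-head scaled dot-product attention:
    softmax(Q K^T / sqrt(d_k)) V, as in step 5.2."""
    d_k = len(K[0])
    out = []
    for q in Q:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k)
                  for k in K]                       # scaled dot products
        m = max(scores)                             # stabilize softmax
        exps = [math.exp(s - m) for s in scores]
        total = sum(exps)
        weights = [e / total for e in exps]         # attention weights
        out.append([sum(w * v[j] for w, v in zip(weights, V))
                    for j in range(len(V[0]))])     # weighted sum of values
    return out

O = scaled_dot_attention([[1.0, 0.0]],
                         [[1.0, 0.0], [0.0, 1.0]],
                         [[1.0], [0.0]])
```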
Step 6: score the comprehensive feature information from step 5 with the CRF model and perform sequence labeling to obtain the optimal label sequence.
Step 6.1: the CRF is a discriminative probabilistic model that simultaneously exploits the correlations between input features and labels to predict the globally optimal label sequence, greatly improving named entity recognition performance. For a given sequence x = {x_1, x_2, …, x_n} and predicted sequence y = {y_1, y_2, …, y_n}, the score can be expressed as:
s(x, y) = Σ_{i=0}^{n} A_{y_i, y_{i+1}} + Σ_{i=1}^{n} P_{i, y_i}
where A_{y_i, y_{i+1}} is the probability of transferring from label y_i to label y_{i+1}, P_{i, y_i} is the probability that the i-th word is labeled y_i, and s(x, y) is the score of labeling the input sentence sequence x with the label sequence y.
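The linear-chain CRF score s(x, y) of step 6 (sum of per-position emission scores plus label-to-label transition scores) can be sketched as follows; the toy emission and transition tables are illustrative, not learned parameters.

```python
def crf_score(emissions, transitions, tags):
    """Score s(x, y) of a tag sequence under a linear-chain CRF:
    the sum of emission scores P[i][y_i] over positions plus
    transition scores A[y_i][y_{i+1}] over adjacent tags (step 6)."""
    s = sum(emissions[i][t] for i, t in enumerate(tags))       # emissions
    s += sum(transitions[tags[i]][tags[i + 1]]
             for i in range(len(tags) - 1))                    # transitions
    return s

# toy example: 2 positions, 2 possible tags
em = [[1.0, 0.2],
      [0.3, 0.9]]          # em[i][t]: score of tag t at position i
tr = [[0.1, 0.5],
      [0.4, 0.1]]          # tr[a][b]: score of transition a -> b
score = crf_score(em, tr, [0, 1])
```

Decoding picks the tag sequence maximizing this score (Viterbi), yielding the optimal label sequence.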
The foregoing embodiments merely illustrate the technical concept and features of the present invention and are intended to enable those skilled in the art to understand and implement it, not to limit its scope. All equivalent changes or modifications made according to the spirit of the present invention shall fall within the scope of the present invention.

Claims (7)

1. A dual-graph neural network medical named entity recognition method based on multi-feature fusion, characterized by comprising the following steps:
Step 1: preprocess the text sequence and divide it into a training set and a test set;
Step 2: input the text sequence into a bidirectional long short-term memory network module to strengthen contextual semantic associations and obtain the hidden-layer feature vector H_t containing contextual information;
Step 3: construct a word co-occurrence graph and a dependency syntax graph from the text sequence;
Step 4: use a graph convolutional neural network to learn the dependencies and interactions between nodes from the graph structure, obtaining global feature information H_C that fuses word co-occurrence node features and dependency syntax features;
Step 5: fuse the contextual feature information from step 2 with the global feature information from step 4, and compute the dependencies within the features with a multi-head self-attention mechanism to obtain comprehensive feature information;
Step 6: score the comprehensive feature information from step 5 with a CRF model and perform sequence labeling to obtain the optimal label sequence.
2. The dual-graph neural network medical named entity recognition method based on multi-feature fusion according to claim 1, characterized in that the specific method of step 1 is as follows:
Step 1.1: perform word segmentation, text normalization, text cleaning, and removal of stop words and low-frequency words on the text sequence;
Step 1.2: shuffle the data set and divide it into a training set and a test set in a 7:3 ratio.
3. The dual-graph neural network medical named entity recognition method based on multi-feature fusion according to claim 1, characterized in that the specific method of step 2 is as follows:
Step 2.1: use a bidirectional long short-term memory network to strengthen contextual semantic associations and obtain a hidden-layer feature vector containing contextual information. Let L = (l_1, l_2, …, l_i, …, l_n) denote the text sequence after the preprocessing of step 1, and represent each word l_i with the pre-trained word vector model GloVe, where l_i ∈ R^d, n is the sentence length and d is the word vector dimension;
Step 2.2: the bidirectional network runs a forward and a backward LSTM over the sequence to capture context-dependent information in both directions; the computed forward and backward vectors are concatenated as the hidden-layer output, expressed as:
f_t = σ(W_f · [h_{t-1}, x_t] + b_f)
i_t = σ(W_i · [h_{t-1}, x_t] + b_i)
o_t = σ(W_o · [h_{t-1}, x_t] + b_o)
c̃_t = tanh(W_c · [h_{t-1}, x_t] + b_c)
C_t = f_t * C_{t-1} + i_t * c̃_t
h_t = o_t * tanh(C_t)
where f_t, i_t and o_t denote the forget, input and output gates at time t, c̃_t is the candidate memory cell vector at time t, C_t is the memory cell vector at time t, h_t is the hidden-layer output vector, * denotes element-wise product, W a weight matrix, b a bias vector, and σ(·) and tanh(·) the sigmoid and hyperbolic tangent activation functions, respectively;
the forward LSTM output and the backward LSTM output of the bidirectional long short-term memory network are concatenated to obtain the final hidden-layer feature H_t.
4. The method for identifying the medical named entity of the double-graph neural network based on the multi-feature fusion according to claim 1, wherein the specific process of the step 3 is as follows:
step 3.1, constructing a word co-occurrence diagram;
step 3.1.1, the text sequence of length n is represented by w= { W 1 ,w 2 ,...,w i ,...,x n Represented by }, w i Representing an ith word in the text sequence, and setting a designated sliding window size;
step 3.1.2 sliding window along text sequence from left to right, window center word is w i If w j And w i Within a window, w is constructed i And w j Connecting edges between word nodes;
step 3.1.3, the word co-occurrence graph adjacency matrix A_m is constructed as A_m = [c_ij]_{n×n}, wherein c_ij represents the number of times the two words w_i and w_j co-occur within a window;
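A minimal sketch of the window-based construction in steps 3.1.1–3.1.3; treating the window size as the maximum index distance between two co-occurring words is an assumption about the exact window convention, which the claim does not pin down:

```python
from collections import defaultdict

def cooccurrence_adjacency(words, window=2):
    """Slide a window left to right and count co-occurrences c_ij,
    yielding the symmetric adjacency structure A_m of step 3.1.3."""
    counts = defaultdict(int)
    for i in range(len(words)):
        # words within the same window as the center word words[i]
        for j in range(i + 1, min(i + window, len(words))):
            if words[i] != words[j]:
                counts[tuple(sorted((words[i], words[j])))] += 1
    vocab = sorted(set(words))
    A = {u: {v: 0 for v in vocab} for u in vocab}
    for (u, v), c in counts.items():
        A[u][v] = A[v][u] = c  # undirected connecting edges
    return A
```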
step 3.2, constructing a dependency syntax diagram;
step 3.2.1, analyzing the text sequence by using a dependency syntax analyzer Stanford NLP to obtain a syntax dependency tree of the sentence;
step 3.2.2, a graph structure G = (V, E) is constructed by taking the words in the syntactic dependency tree as nodes and the dependency relations as directed edges, wherein each edge e(i, j) ∈ E represents a dependency from head node x_i to dependent node x_j; V represents the set of nodes of the graph and E represents the set of edges between nodes;
step 3.2.3, the adjacency matrix A_s is constructed from the syntactic dependency graph, expressed as: A_s(i, j) = 1 if e(i, j) ∈ E, otherwise A_s(i, j) = 0;
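Steps 3.2.2–3.2.3 reduce to filling a directed adjacency matrix from (head, dependent) pairs produced by the parser; the 0/1 encoding shown is an assumption, since the claim's matrix expression is not reproduced in the text:

```python
def dependency_adjacency(n, edges):
    """Build the n x n directed adjacency matrix A_s of step 3.2.3,
    where edges are (head_index, dependent_index) pairs taken from
    the syntactic dependency tree."""
    A = [[0] * n for _ in range(n)]
    for head, dep in edges:
        A[head][dep] = 1  # edge e(i, j): dependency from x_i to x_j
    return A
```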
5. The dual-graph neural network medical named entity recognition method based on multi-feature fusion according to claim 1, wherein the specific process of step 4 is as follows:
step 4.1, the nodes in the word co-occurrence graph and the dependency syntax graph are aggregated and updated node by node; the graph convolution operation takes the features of each node's neighbor nodes into account and updates the node's feature representation with the information of its neighbor nodes; for the word node and the syntax node at layer l, the word node and the syntax node at layer l+1 are updated as follows: H^{(l+1)} = ReLU(Â H^{(l)} W^{(l)})
wherein W^{(l)} represents the model's trainable weight parameters, ReLU(·) represents the activation function, and Â represents the normalized symmetric adjacency matrix, expressed as: Â = D^{-1/2} A D^{-1/2}, wherein D represents the degree matrix;
step 4.2, the updated word node vector representation and the updated dependency syntax node vector representation are spliced to obtain the global feature information H_C;
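The layer update of step 4.1 and the splicing of step 4.2 can be sketched as follows; adding self-loops (A + I) before the symmetric normalization is a common convention assumed here, not something the claim text states explicitly:

```python
import numpy as np

def gcn_layer(A, H, W):
    """One graph-convolution update H(l+1) = ReLU(A_norm @ H(l) @ W(l)),
    with A_norm = D^{-1/2} (A + I) D^{-1/2} (self-loops assumed)."""
    A_hat = A + np.eye(A.shape[0])
    d_inv_sqrt = np.diag(1.0 / np.sqrt(A_hat.sum(axis=1)))
    A_norm = d_inv_sqrt @ A_hat @ d_inv_sqrt
    return np.maximum(0.0, A_norm @ H @ W)  # ReLU activation

def fuse_global(H_word, H_syn):
    """Step 4.2: splice word-graph and syntax-graph node features
    into the global feature information H_C."""
    return np.concatenate([H_word, H_syn], axis=1)
```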
6. The dual-graph neural network medical named entity recognition method based on multi-feature fusion according to claim 1, wherein the specific process of step 5 is as follows:
step 5.1, feature fusion is performed on the context feature information obtained in step 2 and the global feature information obtained in step 4 to obtain a fusion vector, denoted as R = [H_t, H_C];
step 5.2, linear transformation is performed on the fusion vector R to obtain 3 different vector matrices Q (Query), K (Key) and V (Value), wherein Q = K = V = R; Q is used to perform a dot-product operation with each key matrix K, the operation result is scaled, and the softmax function is used for normalization to obtain the attention weights, with the calculation formula expressed as: Attention(Q, K, V) = softmax(QK^T / √d_k)V
wherein d_k represents the dimension of the key, and softmax(·) represents the probability mapping of the input elements;
step 5.3, multiple parallel attention calculations are performed on the high-dimensional feature vectors through a multi-head self-attention mechanism, wherein the number of heads is e and each attention head has different feature parameters; after the attention information of each head is obtained, the e heads are spliced and then linearly transformed to finally obtain the comprehensive feature information fused with the attention information, expressed as:
M i =SelfAttention(QW Q ,KW K ,VW V )
MulAtt(Q,K,V)=Concat(M 1 ,M 2 ,...,M e )W
wherein W^Q, W^K and W^V represent the weight matrices learned by the model during training, Concat(·) represents splicing the feature vectors extracted by the e attention heads, and W represents the weight matrix of the final linear transformation.
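Steps 5.2–5.3 amount to scaled dot-product attention followed by head-wise splicing; this NumPy sketch uses illustrative per-head weight triples (W_Q, W_K, W_V) and an output matrix W_out, with Q = K = V = R as in the claim:

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(Q, K, V):
    """Step 5.2: Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V."""
    d_k = K.shape[-1]
    return softmax(Q @ K.T / np.sqrt(d_k)) @ V

def multi_head_attention(R, head_weights, W_out):
    """Step 5.3: run e heads with distinct parameters, splice the
    results, and apply the final linear transformation W_out."""
    heads = [self_attention(R @ Wq, R @ Wk, R @ Wv)
             for Wq, Wk, Wv in head_weights]
    return np.concatenate(heads, axis=-1) @ W_out
```

Each softmax row sums to 1, so every output position is a convex combination of the value vectors.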
7. The dual-graph neural network medical named entity recognition method based on multi-feature fusion according to claim 6, wherein the specific process of step 6 is as follows:
for a given sequence x = {x_1, x_2, ..., x_n} and predicted sequence y = {y_1, y_2, ..., y_n}, the calculated score is expressed as: s(x, y) = Σ_{i=0}^{n} A_{y_i, y_{i+1}} + Σ_{i=1}^{n} P_{i, y_i}
wherein A_{y_i, y_{i+1}} represents the probability of transferring from y_i to y_{i+1}, P_{i, y_i} represents the probability that the i-th word is labeled y_i, and s(x, y) represents the probability that the input sentence sequence x is labeled with the label sequence y.
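The sequence score of step 6 is a sum of per-word emission terms and label-transition terms; this sketch omits the start/stop transition terms, an assumption about the claim's exact indexing:

```python
def crf_score(emissions, transitions, tags):
    """s(x, y): sum of P[i][y_i] (word i labeled y_i) plus
    A[y_i][y_{i+1}] (transition from label y_i to y_{i+1})."""
    score = sum(emissions[i][t] for i, t in enumerate(tags))
    score += sum(transitions[tags[i]][tags[i + 1]]
                 for i in range(len(tags) - 1))
    return score
```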
CN202311317519.8A 2023-10-11 2023-10-11 A dual-graph neural network medical named entity recognition method based on multi-feature fusion Pending CN117408255A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311317519.8A CN117408255A (en) 2023-10-11 2023-10-11 A dual-graph neural network medical named entity recognition method based on multi-feature fusion

Publications (1)

Publication Number Publication Date
CN117408255A true CN117408255A (en) 2024-01-16

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN119673429A (en) * 2024-12-05 2025-03-21 济南凯信医疗科技有限公司 An Internet-based online gynecological disease intelligent consultation system

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination