
CN113255366B - An Aspect-level Text Sentiment Analysis Method Based on Heterogeneous Graph Neural Network - Google Patents

An Aspect-level Text Sentiment Analysis Method Based on Heterogeneous Graph Neural Network

Info

Publication number: CN113255366B
Application number: CN202110593991.9A
Authority: CN (China)
Prior art keywords: text, node, embedding, evaluation, word
Legal status: Active (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Other languages: Chinese (zh)
Other versions: CN113255366A
Inventors: 田锋, 安文斌, 陈妍, 徐墨, 高瞻, 郭倩, 文华, 郑庆华
Assignee (current and original): Xian Jiaotong University (the listed assignees may be inaccurate; Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list)
Application filed by Xian Jiaotong University with priority to CN202110593991.9A; published as CN113255366A and, upon grant, as CN113255366B

Classifications

    • G06F 40/30 Handling natural language data; Semantic analysis
    • G06F 40/253 Natural language analysis; Grammatical analysis; Style critique
    • G06F 18/2415 Classification techniques relating to the classification model, based on parametric or probabilistic models, e.g. based on likelihood ratio or false acceptance rate versus a false rejection rate
    • G06N 3/044 Neural network architectures; Recurrent networks, e.g. Hopfield networks
    • G06N 3/045 Neural network architectures; Combinations of networks
    • G06N 3/084 Neural network learning methods; Backpropagation, e.g. using gradient descent

Abstract

The invention discloses an aspect-level text sentiment analysis method based on a heterogeneous graph neural network, belonging to the field of natural language processing. A three-level word-sentence-evaluation-aspect graph network is constructed from the co-occurrence relations between words and sentences in the text and the evaluation aspects contained in the sentences. An initial embedding vector representation is then obtained for each node. A graph attention network is used to train the model parameters: a multi-head attention mechanism continually updates the node embeddings according to the connection relations among the nodes in the graph network, and the aspect-level sentiment orientation of the text is finally predicted. From the resulting embedding representations of the sentence nodes and evaluation-aspect nodes, a self-attention mechanism computes the correlation between the two, yielding the predicted aspect-level sentiment orientation of the text. The invention effectively improves the expressive power and generalization ability of the model.

Description

An Aspect-level Text Sentiment Analysis Method Based on Heterogeneous Graph Neural Network

Technical Field

The invention belongs to the field of natural language processing, and in particular relates to an aspect-level text sentiment analysis method based on a heterogeneous graph neural network.

Background Art

Aspect-Based Sentiment Analysis (ABSA) is a fine-grained text sentiment analysis technique that provides more detailed and richer sentiment information than traditional sentiment analysis. It analyzes the sentiment orientation (positive, neutral, or negative) of a text with respect to different aspects, and comprises two kinds of subtasks: aspect-term sentiment analysis (ATSA) and aspect-category sentiment analysis (ACSA). For example, for the text "The food is delicious, but the waiter's attitude is terrible", ACSA should output positive sentiment for the evaluation aspect "food" and negative sentiment for the evaluation aspect "service".

Most current models are based on attention mechanisms and neural networks: a neural network captures the semantic information of the text, while an attention mechanism captures the evaluation-aspect information and strengthens the association between the text semantics and the evaluation aspect. Classified by network structure, existing methods fall roughly into four categories: (1) methods based on recurrent neural networks (RNNs), e.g. Wang et al. proposed an attention-based long short-term memory network (LSTM) to generate embedding representations of evaluation aspects; (2) methods based on convolutional neural networks (CNNs), e.g. Xue et al. proposed convolutional gating units to extract text features and evaluation-aspect features; (3) methods based on graph neural networks (GNNs), e.g. Li et al. proposed modeling the syntactic structure of the text with a graph neural network to assist classification; (4) methods based on pre-trained models, e.g. Sun et al. used the pre-trained model BERT to model the semantic relationship between the text and the evaluation aspect. These four classes of technical solutions have the following shortcomings. First, existing models assume that input texts are independent and identically distributed; however, review-style texts, the main object of aspect-level sentiment analysis, are often strongly correlated, and ignoring this correlation loses a large amount of information and degrades model performance. Second, existing models ignore the structural similarity between texts that express the same sentiment toward the same evaluation aspect, so information cannot be shared between texts and the expressive power of the model suffers. Third, existing models ignore the diversity of semantic expression among texts that express the same sentiment toward the same evaluation aspect, and the loss of this diversity information degrades the generalization ability of the model.

Summary of the Invention

The purpose of the present invention is to overcome the above shortcomings of the prior art and to provide an aspect-level text sentiment analysis method based on a heterogeneous graph neural network.

To achieve the above purpose, the present invention adopts the following technical solution:

An aspect-level text sentiment analysis method based on a heterogeneous graph neural network, comprising the following steps:

(1) Construct a three-level word-sentence-evaluation-aspect graph network structure according to the co-occurrence relations between words and texts and the evaluation aspects involved in the texts;

(2) Use pre-trained models to initialize the embedding vector representation of each node in the graph network structure, obtaining the initial embedding matrix of the word nodes X_w^(0), the initial embedding matrix of the text nodes X_s^(0), and the initial embedding matrix of the evaluation aspects X_a^(0);

(3) Train the model with the graph attention network GAT: according to the graph network structure and the semantic relations between texts, continually update the embedding representation of each node in the graph network structure through the multi-head self-attention mechanism, so that the nodes continually exchange information; this yields the embedding matrix H^(t+1) of each node at step (t+1) and, finally, the text embedding matrix X_s^(T) and the evaluation-aspect embedding matrix X_a^(T);

(4) Using the text embedding matrix X_s^(T) and the evaluation-aspect embedding matrix X_a^(T), compute, through the self-attention mechanism, the correlation between the text and each sentiment orientation on the evaluation aspect, and take the sentiment with the greatest correlation as the predicted sentiment of the text on that evaluation aspect; compute the difference between the predicted sentiment orientation and the true sentiment orientation of the text with a loss function; finally, optimize the model parameters by backpropagation until the closeness of the predicted sentiment orientation to the true sentiment orientation of the text is within a preset range, obtaining the trained model;

(5) Feed the text to be classified into the trained model for feature extraction, compute the correlation between the extracted text feature vector and the trained evaluation-aspect vectors with the self-attention mechanism, and finally classify with a softmax classifier.

Further, the graph network structure G is expressed as: G = {V_w, V_s, V_a, E_ws, E_sa};

where V_w denotes the word nodes contained in the texts; V_s denotes the text nodes; V_a denotes the evaluation-aspect nodes; E_ws denotes the edges between word nodes and text nodes, whose weights represent the positions at which the words appear in the texts; and E_sa denotes the edges between text nodes and evaluation-aspect nodes.

Further, the specific operation in step (2) of initializing the word-node embedding vectors is:

For the word nodes in the graph network structure, initialize said word nodes with the pre-trained GloVe word vector library to obtain the word embedding vectors, and concatenate all word embedding vectors to obtain the initial word embedding matrix.

Further, the specific operation in step (2) of initializing the text-node embedding vectors is:

For the text nodes in the graph network structure, initialize said text nodes with the pre-trained language model BERT to obtain the initial embedding vectors, and concatenate all initial text embedding vectors to obtain the initial text embedding matrix.

Further, the specific operation in step (2) of initializing the evaluation-aspect-node embedding vectors is:

For the evaluation-aspect nodes in the graph network structure, encode said evaluation-aspect nodes with one-hot encoding, and map the encoded vectors into the feature space with a single-layer fully connected network FCN with learnable parameters, obtaining the initial embedding vectors of the evaluation-aspect nodes; concatenate all initial evaluation-aspect embedding vectors to obtain the initial evaluation-aspect embedding matrix.

Further, the specific operation in step (3) of updating the embedding representation of each node in the graph network structure is:

For the embedding vector h_i of a given node in the graph network structure and the neighbors N_i connected to the given node, use the multi-head attention mechanism to obtain the node's new embedding vector representation h_i'.

Denote the embedding of a given node n at step t as h_n^(t), the embeddings of the given node's neighbors at step t as H_{N_n}^(t), and the embedding of node n at step (t+1) as h_n^(t+1). Based on the new embedding vector representation h_i', the relationship among the three is constructed as:

h_n^(t+1) = GAT( h_n^(t), H_{N_n}^(t) )    (5)

Based on the initial embedding matrices corresponding to the word nodes, text nodes and evaluation-aspect nodes in the graph network structure, formula (5) is iterated repeatedly to obtain the embedding matrix H^(t+1) of each node at step (t+1).

Further, in step (4), the correlation between the text and each sentiment orientation on the evaluation aspect is computed as:

β_ij = softmax( h_s_i^T · h_a_j )

h̃_ij = β_ij · h_s_i

ŷ_ij = softmax( W_a h̃_ij + b_a )

where h_s_i is the embedding vector corresponding to the i-th text node in X_s^(T); h_a_j is the embedding vector corresponding to the j-th evaluation aspect in X_a^(T); β_ij is the attention weight between the text node vector and the evaluation-aspect node vector; h̃_ij is the text node embedding weighted by the attention weight; ŷ_ij denotes the probability distribution of the predicted sentiment orientation of the text on the current evaluation aspect; W_a and b_a are learnable parameters; softmax() is the exponential normalization function, computed as:

softmax(x_i) = exp(x_i) / Σ_j exp(x_j)

Further, in step (4), the predicted text sentiment distribution ŷ_ij is compared with the true sentiment label y_ij of the text, and the difference between the two is computed with the cross-entropy loss function; the loss over all samples is the sum over all text nodes i and all evaluation-aspect nodes j:

L = - Σ_i Σ_j y_ij log( ŷ_ij )

Finally, the model parameters are continually updated by the backpropagation algorithm until the closeness of the predicted sentiment orientation to the true sentiment orientation of the text is within the preset range.

Compared with the prior art, the present invention has the following beneficial effects:

The aspect-level text sentiment analysis method based on a heterogeneous graph neural network of the present invention constructs a three-level word-sentence-evaluation-aspect graph structure network according to the co-occurrence relations between words and sentences in the text and the evaluation aspects contained in the sentences. It then generates the embedding representation of each node in the graph network, using pre-trained language models to initialize the word nodes, sentence nodes and evaluation-aspect nodes respectively, thereby obtaining the initial embedding vector representation of each node. Next, it trains the model parameters with a graph attention network: through the multi-head attention mechanism, the embedding vector representations of the nodes are continually updated according to the connection relations among the nodes in the graph network, and the aspect-level sentiment orientation of the text is finally predicted. From the resulting embedding vector representations of the sentence nodes and evaluation-aspect nodes, the self-attention mechanism computes the correlation between the two, yielding the predicted aspect-level sentiment orientation of the text. By means of the graph neural network, the present invention captures the structural-similarity information and the semantic-expression-diversity information among texts with the same evaluation aspect and sentiment orientation, and obtains the embedding vector representations of the text and evaluation-aspect nodes through model training, effectively improving the expressive power and generalization ability of the model.

Brief Description of the Drawings

FIG. 1 is a schematic diagram of the network structure of the present invention.

Detailed Description

In order to enable those skilled in the art to better understand the solution of the present invention, the technical solution in the embodiments of the present invention will be described clearly and completely below with reference to the accompanying drawings. Obviously, the described embodiments are only a part of the embodiments of the present invention, not all of them. Based on the embodiments of the present invention, all other embodiments obtained by persons of ordinary skill in the art without creative effort shall fall within the protection scope of the present invention.

It should be noted that the terms "first", "second", etc. in the specification, claims and drawings of the present invention are used to distinguish similar objects, and are not necessarily used to describe a specific order or sequence. It should be understood that data so used are interchangeable where appropriate, so that the embodiments of the invention described herein can be practiced in orders other than those illustrated or described herein. Furthermore, the terms "comprising" and "having", and any variations thereof, are intended to cover a non-exclusive inclusion; for example, a process, method, system, product or device comprising a series of steps or units is not necessarily limited to the steps or units expressly listed, but may include other steps or units not expressly listed or inherent to such process, method, product or device.

In view of the problems of existing models, the present invention proposes to use a graph neural network to model the relations between texts with the same evaluation aspect and sentiment orientation, as well as the relations between texts and evaluation aspects, so that the model can learn the structural-similarity features between similar texts, improving its expressive power; at the same time, the model can also learn the semantic-diversity features between texts, improving its generalization ability.

The present invention is described in further detail below with reference to the accompanying drawings:

Step 1: Construct a three-level word-sentence-evaluation-aspect graph network structure according to the co-occurrence relations between words and texts and the evaluation aspects involved in the texts.

The graph structure G is expressed as: G = {V_w, V_s, V_a, E_ws, E_sa}, where V_w denotes the word nodes contained in the texts, V_s denotes the text nodes, V_a denotes the evaluation-aspect nodes; E_ws denotes the edges between word nodes and text nodes, whose weights represent the positions at which the words appear in the text; and E_sa denotes the edges between text nodes and evaluation-aspect nodes.
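
By way of illustration, the following Python sketch assembles such a graph from a toy corpus. It is a minimal sketch rather than the patented implementation: the helper name build_graph and the sample review are hypothetical, and the positional weight of an E_ws edge is simplified to the 1-based index of the word's first occurrence in the text.

    def build_graph(texts, aspects_per_text):
        """Build G = {V_w, V_s, V_a, E_ws, E_sa} from raw texts and their aspects."""
        word_nodes, text_nodes, aspect_nodes = set(), [], set()
        edges_ws = {}     # (word, text_id) -> positional edge weight
        edges_sa = set()  # (text_id, aspect)
        for text_id, text in enumerate(texts):
            text_nodes.append(text_id)
            for pos, token in enumerate(text.lower().split(), start=1):
                word_nodes.add(token)
                edges_ws.setdefault((token, text_id), pos)  # first-occurrence position
            for aspect in aspects_per_text[text_id]:
                aspect_nodes.add(aspect)
                edges_sa.add((text_id, aspect))
        return word_nodes, text_nodes, aspect_nodes, edges_ws, edges_sa

    texts = ["the food is delicious but the service is terrible"]
    Vw, Vs, Va, Ews, Esa = build_graph(texts, {0: ["food", "service"]})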

Step 2: For the word nodes in the graph network structure, initialize each node with the pre-trained GloVe word vectors to obtain its word embedding vector, and concatenate all word embedding vectors to obtain the initial word embedding matrix.

For a word w, its index in the dictionary is used to look it up in the GloVe word vector library, giving the initial embedding vector x_w ∈ R^(d_w), where d_w is the dimension of the word embedding vectors. Stacking all word embedding vectors yields the initial word embedding matrix X_w^(0) ∈ R^(n×d_w), where n is the number of words.
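
A sketch of this lookup, assuming a GloVe file in the standard text format (one word followed by its values per line); the file name glove.6B.300d.txt and the zero-vector fallback for out-of-vocabulary words are illustrative assumptions.

    import numpy as np

    def load_glove(path, dim=300):
        vectors = {}
        with open(path, encoding="utf-8") as f:
            for line in f:                       # each line: "<word> v1 v2 ... v_dim"
                parts = line.rstrip().split(" ")
                vectors[parts[0]] = np.asarray(parts[1:], dtype=np.float32)
        return vectors

    glove = load_glove("glove.6B.300d.txt")
    vocab = sorted(Vw)                           # word nodes from the sketch above
    # X_w^(0): one d_w-dimensional row per word node, zeros if out of vocabulary
    Xw0 = np.stack([glove.get(w, np.zeros(300, dtype=np.float32)) for w in vocab])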

Step 3: For the text nodes in the graph network structure, initialize them with the pre-trained language model BERT to obtain the initial text embedding vectors, and concatenate all initial text embedding vectors to obtain the initial text embedding matrix.

For a text s = w_1, w_2, …, w_l, where the w_i (i ∈ 1…l) are the words constituting the text and l is the sentence length, the initial embedding vector of text s is:

X_s = MeanPooling(BERT(s))    (1)

where MeanPooling denotes average pooling over the final output of the BERT model, and X_s ∈ R^(d_s) with d_s the dimension of the text embedding vectors. Stacking all initial text embedding vectors yields the initial text embedding matrix X_s^(0) ∈ R^(m×d_s), where m is the number of texts.
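
A sketch of this initialization using the Hugging Face transformers library; the checkpoint name bert-base-uncased is an assumption (the patent does not name a checkpoint), and masked mean pooling over the last hidden state is one natural reading of MeanPooling(BERT(s)).

    import torch
    from transformers import AutoModel, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
    bert = AutoModel.from_pretrained("bert-base-uncased").eval()  # BERT weights stay fixed

    @torch.no_grad()
    def text_embedding(s):
        enc = tokenizer(s, return_tensors="pt", truncation=True)
        hidden = bert(**enc).last_hidden_state      # (1, l, d_s)
        mask = enc["attention_mask"].unsqueeze(-1)  # zero out padding positions
        return (hidden * mask).sum(1) / mask.sum(1)  # X_s = MeanPooling(BERT(s))

    Xs0 = torch.cat([text_embedding(t) for t in texts])  # X_s^(0), shape (m, d_s)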

Step 4: For the evaluation-aspect nodes in the graph network structure, encode each node with one-hot encoding and map the encoded vector into the feature space with a single-layer fully connected network (FCN) with learnable parameters, obtaining its initial embedding vector; concatenate all initial evaluation-aspect embedding vectors to obtain the initial evaluation-aspect embedding matrix.

For an evaluation-aspect node a, the initial embedding vector is:

X_a = FCN(OneHot(a))    (2)

where OneHot denotes one-hot encoding and FCN denotes the fully connected network, with X_a ∈ R^(d_a) and d_a the dimension of the evaluation-aspect embedding vectors. Stacking all initial evaluation-aspect embedding vectors yields the initial evaluation-aspect embedding matrix X_a^(0) ∈ R^(k×d_a), where k is the number of evaluation aspects.
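
A sketch of this step; the embedding dimension d_a = 300 is an assumption chosen to match the word vectors. Passing the k×k identity matrix through a single nn.Linear layer applies FCN(OneHot(a)) to all k aspect nodes at once.

    import torch.nn as nn

    k, d_a = len(Va), 300          # number of evaluation aspects; dimension is assumed
    fcn = nn.Linear(k, d_a)        # single-layer FCN with learnable parameters
    Xa0 = fcn(torch.eye(k))        # X_a^(0) = FCN(OneHot(a)) for every aspect, (k, d_a)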

Step 5: For the embedding vector h_i of a given node in the graph network structure and its connected neighbor nodes N_i, use the multi-head attention mechanism to obtain the node's new embedding vector h_i':

h_i' = ||_(n=1)^N σ( Σ_(j∈N_i) α_ij^n W^n h_j )    (3)

where || denotes the concatenation of vectors, σ denotes the ReLU activation function, W^n is a learnable parameter, and α_ij^n is the attention score of node i and node j in the n-th head, computed as:

α_ij^n = softmax_j( σ( a_n^T [ W^n h_i || W^n h_j ] ) + e_ij )    (4)

where a_n is a learnable parameter and e_ij denotes the weight of the edge between node i and node j, which depends on the position at which the word node appears in the text.

Given the embedding h_n^(t) of node n at step t and the embeddings H_{N_n}^(t) of its neighbors at step t, the embedding h_n^(t+1) of node n at step (t+1) is computed as in formula (3) and recorded as:

h_n^(t+1) = GAT( h_n^(t), H_{N_n}^(t) )    (5)
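
The sketch below implements one such update in PyTorch, following formulas (3) to (5) over a dense adjacency matrix. Two details are assumptions where the source is ambiguous: the edge weight e_ij is added to the attention logits before the softmax over neighbors, and σ is taken as ReLU, as stated for formula (3).

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class MultiHeadGATLayer(nn.Module):
        """One step h_n^(t+1) = GAT(h_n^(t), H_{N_n}^(t)) over a dense adjacency."""

        def __init__(self, dim, heads=4):
            super().__init__()
            assert dim % heads == 0
            self.heads, self.dh = heads, dim // heads
            self.W = nn.Linear(dim, dim, bias=False)                 # all W^n stacked
            self.a = nn.Parameter(torch.randn(heads, 2 * self.dh))   # all a_n stacked

        def forward(self, h, adj, e):
            # h: (N, dim) embeddings; adj: (N, N) 0/1 neighbor mask; e: (N, N) weights e_ij
            N = h.size(0)
            z = self.W(h).view(N, self.heads, self.dh)
            zi = z.unsqueeze(1).expand(N, N, self.heads, self.dh)
            zj = z.unsqueeze(0).expand(N, N, self.heads, self.dh)
            scores = F.relu((torch.cat([zi, zj], -1) * self.a).sum(-1))  # σ(a_n^T[W h_i || W h_j])
            scores = scores + e.unsqueeze(-1)                        # positional edge weight e_ij
            scores = scores.masked_fill(adj.unsqueeze(-1) == 0, float("-inf"))
            alpha = torch.softmax(scores, dim=1)                     # α_ij^n over neighbors j
            out = torch.einsum("ijn,jnd->ind", alpha, z)             # Σ_j α_ij^n W^n h_j
            return F.relu(out).reshape(N, -1)                        # σ, then concatenate heads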

Step 6: For the word nodes, text nodes and evaluation-aspect nodes in the graph network structure, the corresponding initial embedding matrices have been given. Formula (5) is iterated repeatedly: once the step-t embedding H^(t) of a node type has been obtained, its embedding at step (t+1) is computed, for the word nodes, text nodes and evaluation-aspect nodes in turn, as:

Z^(t+1) = GAT( H^(t), H_N^(t) )

H^(t+1) = FFN( Z^(t+1) + H^(t) )

where Z^(t+1) is an intermediate value of the computation, and FFN() is a feed-forward network whose input is the residual connection of the computed intermediate value and the embedding at step t; the benefit of this is that it substantially improves the expressive power of the model and helps the model converge quickly. Its computation formula is:

FFN(x) = max(0, xW_1 + b_1)W_2 + b_2

where max() takes the larger of the two elements, and W_1, W_2, b_1 and b_2 are learnable parameters.
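
Continuing the sketch above, the feed-forward network and one full update step might look as follows; the hidden width of 512 is an assumption, and update composes formula (5) with the residual FFN exactly as described.

    class FFN(nn.Module):
        """FFN(x) = max(0, x W_1 + b_1) W_2 + b_2."""

        def __init__(self, dim, hidden=512):
            super().__init__()
            self.w1, self.w2 = nn.Linear(dim, hidden), nn.Linear(hidden, dim)

        def forward(self, x):
            return self.w2(F.relu(self.w1(x)))

    def update(gat, ffn, H, adj, e):
        Z = gat(H, adj, e)      # Z^(t+1) = GAT(H^(t), H_N^(t))
        return ffn(Z + H)       # H^(t+1) = FFN(Z^(t+1) + H^(t)), residual connection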

Step 7: Given the final text embedding matrix X_s^(T) and the final evaluation-aspect embedding matrix X_a^(T), compute the correlation between the two with the self-attention mechanism:

β_ij = softmax( h_s_i^T · h_a_j )

h̃_ij = β_ij · h_s_i

ŷ_ij = softmax( W_a h̃_ij + b_a )

where h_s_i is the embedding vector corresponding to the i-th text node in X_s^(T); h_a_j is the embedding vector corresponding to the j-th evaluation aspect in X_a^(T); β_ij is the attention weight between the text node vector and the evaluation-aspect node vector, whose value represents the correlation weight distribution between the two; h̃_ij is the text node embedding weighted by the attention weight; ŷ_ij denotes the probability distribution of the predicted sentiment orientation of the text on the current evaluation aspect; W_a and b_a are learnable parameters; softmax() is the exponential normalization function, computed as:

softmax(x_i) = exp(x_i) / Σ_j exp(x_j)
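
A sketch of this scoring step under the reconstruction above; the three output classes follow the positive/neutral/negative setting, and scoring all text-aspect pairs at once is an implementation convenience, not part of the claim.

    class AspectPredictor(nn.Module):
        def __init__(self, dim, n_classes=3):            # positive / neutral / negative
            super().__init__()
            self.out = nn.Linear(dim, n_classes)         # W_a, b_a

        def forward(self, Hs, Ha):
            # Hs: (m, d) rows of X_s^(T); Ha: (k, d) rows of X_a^(T)
            beta = torch.softmax(Hs @ Ha.T, dim=-1)      # β_ij over evaluation aspects
            weighted = beta.unsqueeze(-1) * Hs.unsqueeze(1)   # h̃_ij = β_ij · h_s_i, (m, k, d)
            return torch.softmax(self.out(weighted), dim=-1)  # ŷ_ij, (m, k, n_classes)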

Step 8: Compare the predicted text sentiment distribution ŷ_ij with the true sentiment label y_ij of the text, and compute the difference between the two with the cross-entropy loss function; the loss over all samples is the sum over all text nodes i and all evaluation-aspect nodes j:

L = - Σ_i Σ_j y_ij log( ŷ_ij )

Finally, the model parameters are continually updated by the backpropagation algorithm, so that the predicted sentiment orientation of the text approaches its true sentiment orientation ever more closely.
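
Written out directly as a sketch: because ŷ here is already a probability distribution, the cross entropy is computed explicitly rather than through nn.CrossEntropyLoss, which expects unnormalized logits; the small constant guards against log(0).

    def aspect_cross_entropy(y_hat, y):
        # y_hat: (m, k, c) predicted distributions ŷ_ij; y: (m, k, c) one-hot labels y_ij
        return -(y * torch.log(y_hat + 1e-12)).sum()  # sum over texts i and aspects j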

Step 9: Model training

The gradients are updated with the Adam optimizer. The learning rate is set to 0.001, Adam's first-order momentum parameter is 0.1 and its second-order momentum parameter is 0.999, the number of training epochs is set to 200, the parameters of the pre-trained BERT model are kept fixed, and the pre-trained GloVe word vectors are 300-dimensional.
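
Those settings map onto a PyTorch training loop roughly as sketched below, composing the earlier sketches; the toy sizes, random initial embeddings and node-index slices are placeholders, and note that torch.optim.Adam takes the momentum parameters as betas=(β1, β2), with β1 = 0.1 copied from the text as stated (far below the usual default of 0.9).

    dim = 300
    gat, ffn, predictor = MultiHeadGATLayer(dim), FFN(dim), AspectPredictor(dim)
    params = list(gat.parameters()) + list(ffn.parameters()) + list(predictor.parameters())
    optimizer = torch.optim.Adam(params, lr=0.001, betas=(0.1, 0.999))

    n, m, k = 10, 4, 2                       # toy counts: words, texts, aspects
    N = n + m + k
    H0 = torch.randn(N, dim)                 # stand-in for the stacked initial embeddings
    adj, e = torch.ones(N, N), torch.zeros(N, N)
    text_rows, aspect_rows = slice(n, n + m), slice(n + m, N)
    y_true = torch.eye(3)[torch.randint(3, (m, k))]  # random one-hot labels

    for epoch in range(200):                 # 200 training epochs, as specified
        optimizer.zero_grad()
        H = update(gat, ffn, H0, adj, e)     # one round of graph-attention updates
        y_hat = predictor(H[text_rows], H[aspect_rows])
        loss = aspect_cross_entropy(y_hat, y_true)
        loss.backward()                      # backpropagation
        optimizer.step()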

Model use:

The text to be classified is fed into the model for feature extraction; the correlation between the extracted text feature vector and the trained evaluation-aspect vectors is computed with the self-attention mechanism; finally, a softmax classifier performs the classification.
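
A hypothetical end-to-end use of the pieces above. Here proj stands for a linear map from the BERT output dimension to the shared node dimension (a detail the text leaves implicit), Ha for the trained evaluation-aspect embeddings X_a^(T), and the class-name order is an assumption.

    @torch.no_grad()
    def classify(text, aspect_index, proj, predictor, Ha):
        h = proj(text_embedding(text))             # feature extraction + projection
        probs = predictor(h, Ha)[0, aspect_index]  # self-attention scoring + softmax
        return ("positive", "neutral", "negative")[probs.argmax().item()]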

Referring to FIG. 1, a schematic diagram of the network model of the present invention: the network model mainly comprises a node-embedding initialization module, a graph attention module, and a prediction module. The node-embedding initialization module initializes the embedding representations of the word, text, and evaluation-aspect nodes; the graph attention module iteratively updates the embedding representations of the network nodes; and the prediction module uses the final node embeddings to predict the sentiment orientation of the text.

To measure model performance, comparative experiments were conducted on five widely used public datasets; the training/test splits of the datasets and the numbers of texts with different sentiments are shown in Table 1. Table 2 gives the results of the comparative experiments, comparing against thirteen commonly used models on accuracy (Acc.) and F1. As the table shows, the models of the present invention, HAGNN-GloVe and HAGNN-BERT, achieve the best results on most metrics, a considerable improvement in model performance over traditional methods.

Table 1. Statistics of the datasets used to measure model performance (training/test splits and the number of texts per sentiment class). [Table rendered as an image in the source.]

Table 2 reports the accuracy (Acc.) and F1 of the compared models on the different datasets; HAGNN-GloVe and HAGNN-BERT are the two methods of the present invention, using different initialization data.

Table 2. Accuracy (Acc.) and F1 of the compared models on different datasets. [Table rendered as an image in the source.]

The above content is only intended to illustrate the technical idea of the present invention and cannot limit its protection scope. Any change made on the basis of the technical solution according to the technical idea proposed by the present invention falls within the protection scope of the claims of the present invention.

Claims (8)

1. An aspect-level text sentiment analysis method based on a heterogeneous graph neural network, characterized in that it comprises the following steps:

(1) constructing a three-level word-sentence-evaluation-aspect graph network structure according to the co-occurrence relations between words and texts and the evaluation aspects involved in the texts;

(2) initializing the embedding vector representation of each node in the graph network structure with pre-trained models, obtaining the initial embedding matrix of the word nodes X_w^(0), the initial embedding matrix of the text nodes X_s^(0), and the initial embedding matrix of the evaluation aspects X_a^(0);

(3) training the model with the graph attention network GAT: according to the graph network structure and the semantic relations between texts, continually updating the embedding representation of each node in the graph network structure through the multi-head self-attention mechanism so that the nodes continually exchange information, obtaining the embedding matrix H^(t+1) of each node at step (t+1) and, finally, the text embedding matrix X_s^(T) and the evaluation-aspect embedding matrix X_a^(T);

(4) using the text embedding matrix X_s^(T) and the evaluation-aspect embedding matrix X_a^(T), computing, through the self-attention mechanism, the correlation between the text and each sentiment orientation on the evaluation aspect, taking the sentiment with the greatest correlation as the predicted sentiment of the text on said evaluation aspect, computing the difference between the predicted sentiment orientation and the true sentiment orientation of the text with a loss function, and finally optimizing the model parameters by backpropagation until the closeness of the predicted sentiment orientation to the true sentiment orientation of the text is within a preset range, obtaining the trained model;

(5) feeding the text to be classified into the trained model for feature extraction, computing the correlation between the extracted text feature vector and the trained evaluation-aspect vectors with the self-attention mechanism, and finally classifying with a softmax classifier.

2. The aspect-level text sentiment analysis method based on a heterogeneous graph neural network according to claim 1, characterized in that the graph network structure G is expressed as: G = {V_w, V_s, V_a, E_ws, E_sa};

where V_w denotes the word nodes contained in the texts; V_s denotes the text nodes; V_a denotes the evaluation-aspect nodes; E_ws denotes the edges between word nodes and text nodes, whose weights represent the positions at which the words appear in the texts; and E_sa denotes the edges between text nodes and evaluation-aspect nodes.

3. The aspect-level text sentiment analysis method based on a heterogeneous graph neural network according to claim 1, characterized in that the specific operation in step (2) of initializing the word-node embedding vectors is: for the word nodes in the graph network structure, initializing said word nodes with the pre-trained GloVe word vector library to obtain the word embedding vectors, and concatenating all word embedding vectors to obtain the initial word embedding matrix.

4. The aspect-level text sentiment analysis method based on a heterogeneous graph neural network according to claim 1, characterized in that the specific operation in step (2) of initializing the text-node embedding vectors is: for the text nodes in the graph network structure, initializing said text nodes with the pre-trained language model BERT to obtain the initial embedding vectors, and concatenating all initial text embedding vectors to obtain the initial text embedding matrix.

5. The aspect-level text sentiment analysis method based on a heterogeneous graph neural network according to claim 1, characterized in that the specific operation in step (2) of initializing the evaluation-aspect-node embedding vectors is: for the evaluation-aspect nodes in the graph network structure, encoding said evaluation-aspect nodes with one-hot encoding, mapping the encoded vectors into the feature space with a single-layer fully connected network FCN with learnable parameters to obtain the initial embedding vectors of the evaluation-aspect nodes, and concatenating all initial evaluation-aspect embedding vectors to obtain the initial evaluation-aspect embedding matrix.

6. The aspect-level text sentiment analysis method based on a heterogeneous graph neural network according to claim 1, characterized in that the specific operation in step (3) of updating the embedding representation of each node in the graph network structure is:

for the embedding vector h_i of a given node in the graph network structure and the neighbors N_i connected to said given node, using the multi-head attention mechanism to obtain the node's new embedding vector representation h_i';

denoting the embedding of a given node n at step t as h_n^(t), the embeddings of the given node's neighbors at step t as H_{N_n}^(t), and the embedding of node n at step (t+1) as h_n^(t+1), the relationship among the three is constructed on the basis of the new embedding vector representation h_i':

h_n^(t+1) = GAT( h_n^(t), H_{N_n}^(t) )    (5)

based on the initial embedding matrices corresponding to the word nodes, text nodes and evaluation-aspect nodes in the graph network structure, formula (5) is iterated repeatedly to obtain the embedding matrix H^(t+1) of each node at step (t+1).

7. The aspect-level text sentiment analysis method based on a heterogeneous graph neural network according to claim 1, characterized in that in step (4) the correlation between the text and each sentiment orientation on the evaluation aspect is computed as:

β_ij = softmax( h_s_i^T · h_a_j )

h̃_ij = β_ij · h_s_i

ŷ_ij = softmax( W_a h̃_ij + b_a )

where h_s_i is the embedding vector corresponding to the i-th text node in X_s^(T); h_a_j is the embedding vector corresponding to the j-th evaluation aspect in X_a^(T); β_ij is the attention weight between the text node vector and the evaluation-aspect node vector; h̃_ij is the text node embedding weighted by the attention weight; ŷ_ij denotes the probability distribution of the predicted sentiment orientation of the text on the current evaluation aspect; W_a and b_a are learnable parameters; softmax() is the exponential normalization function, computed as:

softmax(x_i) = exp(x_i) / Σ_j exp(x_j)

8. The aspect-level text sentiment analysis method based on a heterogeneous graph neural network according to claim 7, characterized in that in step (4) the predicted text sentiment distribution ŷ_ij is compared with the true sentiment label y_ij of the text, the difference between the two is computed with the cross-entropy loss function, and the loss over all samples is the sum over all text nodes i and all evaluation-aspect nodes j:

L = - Σ_i Σ_j y_ij log( ŷ_ij )

finally, the model parameters are continually updated by the backpropagation algorithm until the closeness of the predicted sentiment orientation to the true sentiment orientation of the text is within the preset range.
Application CN202110593991.9A (priority date and filing date: 2021-05-28): An Aspect-level Text Sentiment Analysis Method Based on Heterogeneous Graph Neural Network; status: Active; granted as CN113255366B.

Priority Applications (1)

Application Number: CN202110593991.9A; Priority Date: 2021-05-28; Filing Date: 2021-05-28; Title: An Aspect-level Text Sentiment Analysis Method Based on Heterogeneous Graph Neural Network (granted as CN113255366B)

Publications (2)

Publication Number Publication Date
CN113255366A CN113255366A (en) 2021-08-13
CN113255366B true CN113255366B (en) 2022-12-09

Family

ID=77185252

Legal Events

PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant