
CN111353040A - GRU-based attribute level emotion analysis method - Google Patents


Info

Publication number
CN111353040A
CN111353040A (application CN201910459539.6A)
Authority
CN
China
Prior art keywords
sentence
layer
word
vector
model
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201910459539.6A
Other languages
Chinese (zh)
Inventor
邢永平
禹晶
肖创柏
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing University of Technology
Original Assignee
Beijing University of Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing University of Technology filed Critical Beijing University of Technology
Priority to CN201910459539.6A priority Critical patent/CN111353040A/en
Publication of CN111353040A publication Critical patent/CN111353040A/en
Pending legal-status Critical Current

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 - Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 - Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/35 - Clustering; Classification
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 - Pattern recognition
    • G06F18/20 - Analysing
    • G06F18/21 - Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 - Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 - Pattern recognition
    • G06F18/20 - Analysing
    • G06F18/24 - Classification techniques
    • G06F18/241 - Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2411 - Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on the proximity to a decision surface, e.g. support vector machines

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Databases & Information Systems (AREA)
  • Machine Translation (AREA)

Abstract

The invention discloses an attribute-level sentiment analysis method. Sentiment analysis is a fundamental task in natural language processing, and attribute-level (aspect-level) sentiment analysis is an important topic within it. Different words in a sentence affect the sentiment polarity of an aspect in the sentence differently; the key to the problem is modeling both the relationship between the aspect and the words of the sentence and the meaning of the sentence as a whole. Two recurrent networks are used to model the sentence information, and an attention mechanism is introduced to fuse in the aspect information, with the aim of achieving better results. Experiments on a public dataset show that the proposed algorithm achieves better results without complicated feature engineering.

Description

Attribute-Level Sentiment Analysis Method Based on GRU

Technical Field

The present invention relates to the field of the Internet and, more particularly, to a GRU-based attribute-level sentiment analysis method.

Background

With the rapid development of the Internet, the volume of text keeps growing, and extracting useful information from massive amounts of text has become increasingly important. This flood of text has also objectively driven the development of natural language processing, and deep learning has brought a new direction to the field. Sentiment analysis (also called opinion mining) is a basic yet important task in natural language processing. Enterprises can use customer reviews of their products to obtain timely feedback and inform decision-making. How to extract sentiment information from massive text data has therefore become an important research topic in natural language processing in recent years.

Current research on text sentiment analysis is mainly based on sentiment lexicons or on machine learning. Lexicon-based methods depend on a sentiment lexicon, which strongly influences the analysis; Yang Ding et al. processed and represented text with a sentiment lexicon to build a classifier based on naive Bayes theory. The other line of work is based on machine learning, which trains a sentiment classifier on manually labeled data; experiments have demonstrated the excellent classification performance of support vector machines. Both approaches require manually labeled data for lexicon construction and feature engineering, which is tedious and complex, whereas deep learning algorithms can handle this well. In recent years deep learning has achieved great success in natural language processing, for example in machine translation and question answering. It has also been applied to sentiment analysis: Socher et al. proposed a deep learning method based on semi-supervised recursive autoencoders (RAE) for text sentiment classification, and Jurgovsky et al. used convolutional neural networks (CNN) for the same task. Text sentiment analysis can be carried out at the document, sentence, or word level. This work focuses on aspect-based sentiment analysis, because the sentiment polarity of different aspects in the same sentence can differ: in the sentence "The voice quality of this phone is not good, but the battery life is long.", the evaluation is negative for voice quality but positive for battery life. Wang et al. proposed the AE-LSTM, AT-LSTM, and AEAT-LSTM recurrent network algorithms for aspect-level sentiment analysis, fusing aspect information into a long short-term memory (LSTM) network to improve classification accuracy. The SVM-dep algorithm separates features into aspect-dependent and aspect-independent features and extracts them respectively to perform attribute-level sentiment analysis; its accuracy exceeds that of a support vector machine classifier without aspect features.

The attention mechanism selectively concentrates on certain important pieces of information during processing while ignoring information that is only weakly related to the target. By emphasizing the essential aspects of the information and concentrating limited resources on what matters most, it has achieved great success in fields such as image recognition and machine translation. For the problem addressed here, attention allows aspect-based sentiment analysis to focus on aspect-related information and thereby improve classification accuracy.

Recurrent neural networks (RNNs) are widely used in natural language processing because their memory lets them process contextual information; typical variants include the long short-term memory network (LSTM), the gated recurrent unit (GRU), and MUT networks. This work proposes an algorithm for aspect-level sentiment analysis based on GRU networks and fuses aspect information into the model through an attention mechanism, so that the model attends to the influence of the aspect on sentiment classification and the classification accuracy improves.
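
For reference, the GRU cell referred to throughout updates its hidden state with the standard gating equations below (these are the textbook GRU equations, not restated in this filing; biases are omitted, σ is the logistic sigmoid, ⊙ denotes element-wise multiplication, and the roles of z_t and 1 - z_t in the last line vary between references):

z_t = σ(W_z x_t + U_z h_{t-1})

r_t = σ(W_r x_t + U_r h_{t-1})

h̃_t = tanh(W x_t + U(r_t ⊙ h_{t-1}))

h_t = z_t ⊙ h_{t-1} + (1 - z_t) ⊙ h̃_t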

Summary of the Invention

In view of this, the purpose of the present invention is to provide an attribute-level sentiment analysis model and method based on a GRU network. The invention uses an Att-CGRU attribute-level sentiment classification algorithm to improve sentiment classification accuracy.

To achieve the above purpose, the GRU-based attribute-level sentiment analysis model designed by the present invention is as follows:

In the Att-CGRU model, an attention mechanism is introduced to reflect the important influence of the aspect on the sentiment polarity of the whole sentence. In sequence processing, the encoder-decoder is a commonly used model: by assigning different weights to the hidden-state vectors of the encoder output according to the algorithm and the task objective, it extracts a vector representation that characterizes the input data as well as possible and thereby improves model performance; in essence, this concentrates limited resources on the information most relevant to the target task. The structure of the Att-CGRU model is shown in Figure 1 of the description. The model comprises five parts: the input layer, the embedding layer, the GRU layer, the attention layer, and the output layer. The input layer feeds short texts, i.e., sentences, into the model; the embedding layer maps each word in the sentence to a vector; the GRU layer extracts feature information from the word embeddings; the attention layer implements the attention mechanism, fusing word-level features into sentence-level features through weight computation to produce a sentence feature vector; finally, the sentence feature vector is classified.
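
Purely as an illustrative sketch of this five-part structure (the filing contains no code; the class name, parameter names, and layer choices below are assumptions, with dimensions taken from the embodiment described later), the model can be laid out in PyTorch as follows:

```python
import torch.nn as nn

class AttCGRU(nn.Module):
    """Sketch of the five-part Att-CGRU structure: input -> embedding ->
    left/right GRUs -> attention -> softmax output (Eqs. (1)-(7) below)."""

    def __init__(self, vocab_size, emb_dim=200, hidden_dim=100, n_classes=3):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)            # Eq. (1)
        # each GRU reads [word vector : aspect vector] inputs, Eq. (2)
        self.gru_left = nn.GRU(2 * emb_dim, hidden_dim, batch_first=True)
        self.gru_right = nn.GRU(2 * emb_dim, hidden_dim, batch_first=True)
        self.W_h = nn.Linear(hidden_dim, hidden_dim, bias=False)  # Eq. (3)
        self.W_v = nn.Linear(emb_dim, hidden_dim, bias=False)     # Eq. (3)
        self.w = nn.Linear(2 * hidden_dim, 1, bias=False)         # Eq. (4)
        self.W_p = nn.Linear(hidden_dim, hidden_dim, bias=False)  # Eq. (6)
        self.W_x = nn.Linear(hidden_dim, hidden_dim, bias=False)  # Eq. (6)
        self.out = nn.Linear(hidden_dim, n_classes)               # Eq. (7)
```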

1.1 Input layer

At the input layer, each sentence requiring sentiment polarity classification is input. Assuming the sentence length is T, the sentence is expressed as s = {x_1, x_2, ..., x_T}, where x_i denotes the i-th word of the sentence.

1.2 Embedding layer

For the sentence s = {x_1, x_2, ..., x_T} of T words received from the input layer, each word obtains its corresponding word vector e_i in the embedding layer.

First, the word vector of each word is obtained from the word embedding matrix W^wrd ∈ R^{d_w × |V|}, where |V| is the size of the vocabulary and d_w is the word-vector dimension, which can be specified; then

emb_i = W^wrd v_i    (1)

where v_i is a one-hot vector of length |V| that is 1 at position i and 0 elsewhere. The aspect word vector emb_asp is obtained in the same way; when the aspect consists of several words, the word vectors of those words are summed dimension-wise to obtain the aspect vector. Then emb_i and emb_asp are concatenated to give the final word vector e_i:

e_i = [emb_i : emb_asp]    (2)

Finally, e = {e_1, e_2, ..., e_T} is passed to the next layer.
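
A minimal sketch of this embedding step (assuming integer word ids, the AttCGRU sketch above, and the dimension-wise sum for multi-word aspects described above):

```python
import torch

def build_inputs(embed, word_ids, aspect_ids):
    """embed: an nn.Embedding; word_ids: LongTensor [T]; aspect_ids: LongTensor
    holding the aspect's word ids. Returns e with e_i = [emb_i : emb_asp]."""
    emb = embed(word_ids)                   # Eq. (1): one lookup per word, [T, emb_dim]
    emb_asp = embed(aspect_ids).sum(dim=0)  # dimension-wise sum for multi-word aspects
    e = torch.cat([emb, emb_asp.expand_as(emb)], dim=-1)  # Eq. (2): [T, 2*emb_dim]
    return e, emb_asp
```

Concatenating the aspect vector onto every position is what lets the recurrent layers condition on the aspect from the first time step onward.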

1.3 GRU layer

In the GRU layer, the sentence is split at the aspect into left and right parts so as to model the aspect's context, with the structure shown in Figure 1: {x_{l+1}, x_{l+2}, ..., x_{r-1}} denotes the aspect, {x_1, x_2, ..., x_l} the words before the aspect, and {x_r, ..., x_T} the words after it. After the left sequence {x_1, ..., x_{r-1}} and the right sequence {x_{l+1}, ..., x_T} are fed into the left and right networks, the hidden layers yield {h_1, h_2, ..., h_{r-1}} and {h_{l+1}, h_{l+2}, ..., h_T}, respectively.
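
Continuing the sketch (the text's 1-based indices are mapped to Python slices; the filing does not state in which direction the right-hand network reads its sequence, so it is read left-to-right here):

```python
def encode_context(gru_left, gru_right, e, l, r):
    """e: [T, 2*emb_dim] from the embedding layer; the aspect occupies the
    1-based positions l+1 .. r-1. Returns {h_1..h_{r-1}} and {h_{l+1}..h_T}."""
    h_left, _ = gru_left(e[:r - 1].unsqueeze(0))   # x_1 .. x_{r-1}
    h_right, _ = gru_right(e[l:].unsqueeze(0))     # x_{l+1} .. x_T
    return h_left.squeeze(0), h_right.squeeze(0)
```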

1.4 Attention layer

An attention mechanism is introduced into the model to obtain a better classification effect: different words in the two parts of the sentence relate to the aspect to different degrees, so the model should attend more to the information closely tied to the aspect. The attention mechanism is implemented as follows:

M = tanh([W_h H ; W_v (e_asp ⊗ e_N)])    (3)

a_t = softmax(w^T M)    (4)

r = H a_t    (5)

Here a_t is the vector of attention weights; e_asp ⊗ e_N denotes repeating e_asp N times so that its dimensions match those of H; H is the matrix formed by the hidden-layer outputs of the model; r is the weighted vector representing the meaning of the sentence; and W_h, W_v, and w are parameter matrices. The vector o that finally represents the sentence information is then obtained as

o = tanh(W_p r + W_x h)    (6)

where h is the sum of the vectors h_{r-1} and h_{l+1}.
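
Eqs. (3) to (7) fit together as in the hedged sketch below; it assumes that H simply stacks the hidden states of the two GRUs (how H is assembled from the two networks is not spelled out in the filing) and reuses the parameter names from the AttCGRU sketch above:

```python
import torch

def attention_classify(m, h_left, h_right, emb_asp):
    """m: an AttCGRU instance; h_left, h_right: hidden states of the two GRUs;
    emb_asp: the aspect vector. Returns the predicted polarity distribution."""
    H = torch.cat([h_left, h_right], dim=0)        # [N, hidden_dim]
    e_N = emb_asp.expand(H.size(0), -1)            # e_asp repeated N times
    M = torch.tanh(torch.cat([m.W_h(H), m.W_v(e_N)], dim=-1))  # Eq. (3)
    a = torch.softmax(m.w(M).squeeze(-1), dim=0)   # Eq. (4): attention weights a_t
    r = a @ H                                      # Eq. (5): r = H a_t
    h = h_left[-1] + h_right[0]                    # h = h_{r-1} + h_{l+1}
    o = torch.tanh(m.W_p(r) + m.W_x(h))            # Eq. (6)
    return torch.softmax(m.out(o), dim=-1)         # Eq. (7): predicted polarity
```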

1.5 Output layer

Finally, the output o of the attention layer is fed into the classifier

ŷ = softmax(W_o o + b_o)    (7)

to perform sentiment polarity classification, where W_o and b_o are parameter matrices learned in training.

The specific experimental steps of the method are as follows:

Step S1: the Twitter dataset used in the present invention is first fed into the input layer of the Att-CGRU model.

Step S2: the data from S1 are fed into the embedding layer to obtain the word vector of each word in the input sentence.

Step S3: after the word vector of each word has been obtained as in S2, the GRU layer takes the aspect words {x_{l+1}, x_{l+2}, ..., x_{r-1}} as the dividing point and feeds the word vectors of the preceding words {x_1, x_2, ..., x_l} together with the aspect into the left GRU network, and the aspect together with the following words into the right GRU network; the two networks model the aspect's context, and their hidden layers output {h_1, h_2, ..., h_{r-1}} and {h_{l+1}, h_{l+2}, ..., h_T}, respectively.

Step S4: from the output of S3, the vector o representing the sentence information is computed by the following formulas:

M = tanh([W_h H ; W_v (e_asp ⊗ e_N)])

a_t = softmax(w^T M)

r = H a_t

Here r is the weighted vector characterizing the meaning of the sentence; a_t is the vector of attention weights, obtained by feeding w^T M into the softmax function; M is derived from the matrix H formed by the hidden-layer outputs of the model's GRU layer; e_asp ⊗ e_N denotes repeating the aspect word vector e_asp N times so that it matches the dimensions of H; tanh denotes the tanh function; and W_h, W_v, and w are parameter matrices. Finally, the vector o that represents the sentence information is obtained:

o = tanh(W_p r + W_x h)

where h is the sum of the vectors h_{r-1} and h_{l+1}; h_{r-1} is the hidden-layer output for the (r-1)-th word in the left GRU network, h_{l+1} is the hidden-layer output for the (l+1)-th word in the right GRU network, and W_p and W_x are parameter matrices.

Step S5: the output layer feeds the sentence vector o into the softmax function to obtain the predicted sentiment polarity ŷ, computed as ŷ = softmax(W_o o + b_o), where W_o and b_o are both parameter matrices.

Step S6: the loss function value is computed from the output of S5 and the true class y of each sentence:

loss = −Σ_i Σ_j y_i^j log(ŷ_i^j) + λ||θ||²

where λ is the regularization coefficient; training iterates via the error back-propagation algorithm until the accuracy reaches its maximum, and the optimization algorithm within back-propagation is AdaGrad with an initialization coefficient of 0.01.
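
A sketch of the corresponding training step, under two interpretations that the filing does not state verbatim: the 0.01 initialization coefficient is taken to be AdaGrad's learning rate, and the λ||θ||² term is realized as weight decay:

```python
import torch
import torch.nn as nn

model = AttCGRU(vocab_size=10000)  # hypothetical vocabulary size
criterion = nn.CrossEntropyLoss()  # cross-entropy over the polarity classes
# weight_decay plays the role of the lambda * ||theta||^2 term (lambda = 0.001)
optimizer = torch.optim.Adagrad(model.parameters(), lr=0.01, weight_decay=0.001)

def train_step(logits, y_true):
    """logits: [batch, n_classes] pre-softmax scores; y_true: [batch] class ids."""
    loss = criterion(logits, y_true)  # CrossEntropyLoss applies softmax internally
    optimizer.zero_grad()
    loss.backward()                   # error back-propagation
    optimizer.step()                  # AdaGrad update
    return loss.item()
```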

Compared with the prior art, the present invention has the following technical effects.

The method was compared experimentally with traditional machine learning methods (the support vector machine and SVM-dep algorithms) and with deep learning methods (AdaRNN-w/E, AdaRNN-comb, TC-LSTM), with each model evaluated by accuracy; the results are shown in the table below:

Table 1. Experimental results

(The table of accuracy results is provided as an image in the original publication.)

Description of the Drawings

Figure 1 shows the structure of the Att-CGRU model.

To explain the technical solutions of the embodiments of the present invention more clearly, the drawing used in the description is briefly introduced here. The Att-CGRU model of Figure 1 comprises five parts: the input layer, the embedding layer, the GRU layer, the attention layer, and the output layer. The input layer feeds short texts, i.e., sentences, into the model; the embedding layer maps each word in the sentence to a vector; the GRU layer extracts feature information from the word embeddings; the attention layer implements the attention mechanism, fusing word-level features into sentence-level features through weight computation to produce a sentence feature vector; finally, the sentence feature vector is classified.

Detailed Description

The technical solutions in the embodiments of the present invention are described below clearly and completely with reference to the accompanying drawing. Obviously, the described embodiments are only some of the embodiments of the invention rather than all of them. Based on the embodiments of the present invention, all other embodiments obtained by those of ordinary skill in the art without creative effort fall within the protection scope of the present invention.

To implement the invention, a dataset must first be collected; the dataset used here is a basic dataset collected from Twitter.

The specific experimental steps of the algorithm are as follows:

Step S1: the dataset used here is a basic dataset collected from Twitter, in which every training and test example has been manually labeled. The training set, used to fit the model, contains 6248 sentences; the test set, used to measure model performance, contains 692 sentences. In both sets, positive, negative, and neutral examples account for 25%, 25%, and 50% of the data, respectively.

Step S2: the model comprises five parts: the input layer, the embedding layer, the GRU layer, the attention layer, and the output layer. The input layer feeds short texts, i.e., sentences, into the model; a sentence is expressed as s = {x_1, x_2, ..., x_T}, where x_i is the i-th word of the sentence and T is the sentence length. The embedding layer maps each word x_i to a word vector e_i = [emb_i : emb_asp] according to the word-vector dictionary, where emb_i is the word vector of the i-th word in the dictionary and emb_asp is the word vector of the aspect; when the aspect consists of several words, the mean of their word vectors is taken. On top of the semantic features obtained from the embedding layer, the GRU layer splits the sentence at the aspect into left and right parts to model the aspect's context, with the structure shown in Figure 1: {x_{l+1}, x_{l+2}, ..., x_{r-1}} denotes the aspect, {x_1, x_2, ..., x_l} the words before the aspect, and {x_r, ..., x_T} the words after it. After the left and right sequences are fed into the left and right GRU networks, the hidden layers yield {h_1, h_2, ..., h_{r-1}} and {h_{l+1}, h_{l+2}, ..., h_T}, respectively. The attention layer implements the attention mechanism: it fuses word-level features into sentence-level features through weight computation to produce a sentence feature vector, which is finally classified. Its implementation is as follows:

M = tanh([W_h H ; W_v (e_asp ⊗ e_N)])

a_t = softmax(w^T M)

r = H a_t

Here r is the weighted vector characterizing the meaning of the sentence; a_t is the vector of attention weights, obtained by feeding w^T M into the softmax function; M is derived from the matrix H formed by the hidden-layer outputs of the model's GRU layer; e_asp ⊗ e_N denotes repeating the aspect word vector e_asp N times so that it matches the dimensions of H; tanh denotes the tanh function; and W_h, W_v, and w are parameter matrices. Finally, the vector o that represents the sentence information is obtained:

o = tanh(W_p r + W_x h)

where h is the sum of the vectors h_{r-1} and h_{l+1}; h_{r-1} is the hidden-layer output for the (r-1)-th word in the left GRU network, h_{l+1} is the hidden-layer output for the (l+1)-th word in the right GRU network, and W_p and W_x are parameter matrices. The output layer feeds the sentence vector o into the softmax function to obtain the predicted sentiment polarity ŷ, computed as ŷ = softmax(W_o o + b_o), where W_o and b_o are both parameter matrices.

Step S3: cross-entropy is used as the loss function when training the model, with ŷ denoting the prediction. Training minimizes the cross-entropy between the true polarity y of all sentences and the prediction ŷ:

loss = −Σ_i Σ_j y_i^j log(ŷ_i^j) + λ||θ||²

Here j indexes the sentiment polarity classes (positive, negative, and neutral in this work), i is the index of the sentence, λ is the L2-norm regularization coefficient, and θ are the parameters to be learned; the dropout probability is set to 0.5 to prevent overfitting. Each word of a sentence is initialized with a 200-dimensional word vector, the hidden-layer dimension is 100, and the other parameter matrices are initialized by uniformly distributed sampling. The model is trained in batches of 20 sentences. The L2 regularization coefficient λ is 0.001, and the optimization algorithm is AdaGrad with an initialization coefficient of 0.01.
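
The stated hyperparameters, collected in one place as a sketch (the uniform-initialization bound and the placement of dropout are assumptions; the filing gives only the dropout probability and says "uniformly distributed sampling"):

```python
import torch

EMB_DIM, HIDDEN_DIM = 200, 100  # word-vector and hidden-layer dimensions
BATCH_SIZE = 20                 # sentences per training batch
DROPOUT_P = 0.5                 # dropout probability, against overfitting
L2_LAMBDA, ADAGRAD_LR = 0.001, 0.01

dropout = torch.nn.Dropout(p=DROPOUT_P)  # where dropout is applied is unspecified

def init_uniform(model, bound=0.1):
    """Initialize parameter matrices by uniform sampling (bound assumed)."""
    for p in model.parameters():
        if p.dim() > 1:
            torch.nn.init.uniform_(p, -bound, bound)
```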

Step S4: in the experiments, the method was compared with traditional machine learning methods (the support vector machine and SVM-dep algorithms) and with deep learning methods (AdaRNN-w/E, AdaRNN-comb, TC-LSTM), with each model evaluated by accuracy; the results are shown in the table below:

Table 1. Experimental results

(The table of accuracy results is provided as an image in the original publication.)

The experimental data show that modeling the sentence with the two left and right networks while introducing an attention mechanism based on the aspect words has a certain advantage in accuracy over the other models.

Claims (2)

1. An attribute-level sentiment analysis model based on a GRU network, characterized in that: the model comprises five parts, namely an input layer, an embedding layer, a GRU layer, an attention layer and an output layer; the input layer inputs short texts, namely sentences, into the model; the embedding layer maps each word in the sentence into a vector; the GRU layer obtains feature information from the word embeddings; the attention layer realizes an attention mechanism, which fuses the word-level feature information into sentence-level feature information through weight calculation to generate a sentence feature vector; finally, the sentence feature vector is classified;

1.1 Input layer

Each sentence requiring sentiment polarity classification is input at the input layer; assuming the sentence length is T, the sentence is expressed as s = {x_1, x_2, ..., x_T}, where x_i denotes the i-th word in the sentence;

1.2 Embedding layer

For the sentence s = {x_1, x_2, ..., x_T} of T words obtained from the input layer, the corresponding word vector e_i of each word is obtained in the embedding layer;

first, the word vector of each word is obtained from the word embedding matrix W^wrd ∈ R^{d_w × |V|}, where |V| is the length of the vocabulary and d_w is the word-vector dimension, which can be specified; then

emb_i = W^wrd v_i    (1)

where v_i is a vector of length |V| that is 1 at position i and 0 elsewhere; likewise, the word vector emb_asp of the aspect is obtained; when the aspect in the sentence consists of a plurality of words, the values of the same dimension of the word vector of each word are added to obtain the word vector of the aspect; then emb_i and emb_asp are concatenated to obtain the final word vector e_i:

e_i = [emb_i : emb_asp]    (2)

finally, e = {e_1, e_2, ..., e_T} is input to the next layer;

1.3 GRU layer

In the GRU layer, the sentence is divided into left and right parts with the aspect as the dividing point to model the context of the aspect, wherein {x_{l+1}, x_{l+2}, ..., x_{r-1}} denotes the aspect, {x_1, x_2, ..., x_l} denotes the words before the aspect in the sentence, and {x_r, ..., x_T} denotes the words after the aspect; after the left and right sequences are input into the left and right networks, the hidden layers respectively obtain {h_1, h_2, ..., h_{r-1}} and {h_{l+1}, h_{l+2}, ..., h_T};

1.4 Attention layer

An attention mechanism is introduced into the model to obtain a better classification effect, because the different words of the front and rear parts of the sentence bear different relations to the aspect, and more attention is paid to the information closely related to the aspect; the attention mechanism is implemented as follows:

M = tanh([W_h H ; W_v (e_asp ⊗ e_N)])    (3)

a_t = softmax(w^T M)    (4)

r = H a_t    (5)

where a_t denotes the attention weight coefficients, e_asp ⊗ e_N denotes repeating e_asp multiple times until its dimension is consistent with that of H, H is a matrix formed by the hidden-layer outputs in the model, r denotes the weighted vector representing the meaning of the sentence, and W_h, W_v, w are parameter matrices; the vector o that finally represents the sentence information is then obtained as

o = tanh(W_p r + W_x h)    (6)

where h denotes the sum of the vectors h_{r-1} and h_{l+1};

1.5 Output layer

Finally, the output o of the attention layer is input into the classifier

ŷ = softmax(W_o o + b_o)    (7)

to implement the polarity classification of the sentiment, where W_o and b_o are parameter matrices to be trained.
2. An attribute-level sentiment analysis method based on a GRU, characterized by comprising the following specific steps:

Step S1: first, the collected Twitter dataset is input to the input layer of the Att-CGRU model;

Step S2: the data obtained in step S1 are input into the embedding layer to obtain the word vector of each word in the input sentence;

Step S3: after the word vector of each word in the sentence is obtained in the GRU layer by means of S2, with the aspect words {x_{l+1}, x_{l+2}, ..., x_{r-1}} as the dividing point, the word vectors of the left part {x_1, x_2, ..., x_l} and of the right part are input into the left and right GRU networks to model the context of the aspect words respectively, and the outputs {h_1, h_2, ..., h_{r-1}} and {h_{l+1}, h_{l+2}, ..., h_T} are obtained from the hidden layers;

Step S4: according to the output of S3, the vector o capable of representing the sentence information is calculated by the following formulas:

M = tanh([W_h H ; W_v (e_asp ⊗ e_N)])

a_t = softmax(w^T M)

r = H a_t

where r denotes the weighted vector characterizing the meaning of the sentence; a_t denotes the attention weight coefficients, obtained by inputting w^T M into the softmax function; M denotes a matrix derived from the matrix H formed by the hidden-layer outputs of the GRU layer of the model; e_asp ⊗ e_N denotes repeating the aspect word vector e_asp multiple times until its dimension is consistent with that of H; tanh denotes the tanh function; and W_h, W_v, w are parameter matrices; finally, the vector o that represents the sentence information is obtained as

o = tanh(W_p r + W_x h)

where h denotes the sum of the vectors h_{r-1} and h_{l+1}, h_{r-1} denotes the hidden-layer output corresponding to the (r-1)-th word in the left GRU network, h_{l+1} denotes the hidden-layer output corresponding to the (l+1)-th word in the right GRU network, and W_p and W_x denote parameter matrices;

Step S5: the output layer inputs the vector o representing the sentence information into the softmax function to obtain the predicted sentiment polarity ŷ, specifically ŷ = softmax(W_o o + b_o), where W_o and b_o are both parameter matrices;

Step S6: the loss function value is calculated according to the output of S5 and the actual class y corresponding to each sentence:

loss = −Σ_i Σ_j y_i^j log(ŷ_i^j) + λ||θ||²

wherein λ is the regularization coefficient, training iterates through the error back-propagation algorithm until the accuracy reaches its maximum, and the optimization algorithm within back-propagation is the AdaGrad algorithm with an initialization coefficient of 0.01.
CN201910459539.6A 2019-05-29 2019-05-29 GRU-based attribute level emotion analysis method Pending CN111353040A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910459539.6A CN111353040A (en) 2019-05-29 2019-05-29 GRU-based attribute level emotion analysis method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910459539.6A CN111353040A (en) 2019-05-29 2019-05-29 GRU-based attribute level emotion analysis method

Publications (1)

Publication Number Publication Date
CN111353040A true CN111353040A (en) 2020-06-30

Family

ID=71196950

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910459539.6A Pending CN111353040A (en) 2019-05-29 2019-05-29 GRU-based attribute level emotion analysis method

Country Status (1)

Country Link
CN (1) CN111353040A (en)


Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20190005027A1 (en) * 2017-06-29 2019-01-03 Robert Bosch Gmbh System and Method For Domain-Independent Aspect Level Sentiment Detection
CN108595601A (en) * 2018-04-20 2018-09-28 福州大学 A kind of long text sentiment analysis method incorporating Attention mechanism
CN108984724A (en) * 2018-07-10 2018-12-11 凯尔博特信息科技(昆山)有限公司 It indicates to improve particular community emotional semantic classification accuracy rate method using higher-dimension
CN109145304A (en) * 2018-09-07 2019-01-04 中山大学 A kind of Chinese Opinion element sentiment analysis method based on word

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
MEISHAN ZHANG: "Gated Neural Networks for Targeted Sentiment Analysis", Proceedings of the Thirtieth AAAI Conference on Artificial Intelligence *
YEQUAN WANG: "Attention-based LSTM for Aspect-level Sentiment Classification", Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing *
ZHAI PENGHUA: "Bidirectional-GRU Based on Attention Mechanism for Aspect-level Sentiment Analysis", Proceedings of the 2019 11th International Conference on Machine Learning and Computing *

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112131886A (en) * 2020-08-05 2020-12-25 浙江工业大学 Method for analyzing aspect level emotion of text
CN111813895A (en) * 2020-08-07 2020-10-23 深圳职业技术学院 An attribute-level sentiment analysis method based on hierarchical attention mechanism and gate mechanism
CN111813895B (en) * 2020-08-07 2022-06-03 深圳职业技术学院 Attribute level emotion analysis method based on level attention mechanism and door mechanism
CN113849646A (en) * 2021-09-28 2021-12-28 西安邮电大学 Text emotion analysis method
CN114492521A (en) * 2022-01-21 2022-05-13 成都理工大学 Intelligent lithology while drilling identification method and system based on acoustic vibration signals
CN115098631A (en) * 2022-06-23 2022-09-23 浙江工商大学 Sentence-level emotion analysis method based on text capsule neural network
CN115098631B (en) * 2022-06-23 2024-08-02 浙江工商大学 Sentence-level emotion analysis method based on text capsule neural network


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication (application publication date: 20200630)