CN111353040A - GRU-based attribute level emotion analysis method - Google Patents
GRU-based attribute level emotion analysis method
- Publication number
- CN111353040A (application CN201910459539.6A)
- Authority
- CN
- China
- Prior art keywords
- sentence
- layer
- word
- vector
- model
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Links
- 238000004458 analytical method Methods 0.000 title claims abstract description 21
- 230000008451 emotion Effects 0.000 title claims 6
- 238000004422 calculation algorithm Methods 0.000 claims abstract description 23
- 230000000694 effects Effects 0.000 claims abstract description 3
- 239000013598 vector Substances 0.000 claims description 70
- 239000011159 matrix material Substances 0.000 claims description 17
- 238000000034 method Methods 0.000 claims description 16
- 238000012549 training Methods 0.000 claims description 10
- 238000005457 optimization Methods 0.000 claims description 3
- 238000004364 calculation method Methods 0.000 claims description 2
- 230000003252 repetitive effect Effects 0.000 claims 1
- 238000003058 natural language processing Methods 0.000 abstract description 7
- 230000000306 recurrent effect Effects 0.000 abstract description 6
- 238000002474 experimental method Methods 0.000 abstract description 4
- 230000006870 function Effects 0.000 description 8
- 238000013135 deep learning Methods 0.000 description 6
- 238000010801 machine learning Methods 0.000 description 5
- 238000012360 testing method Methods 0.000 description 5
- 230000002996 emotional effect Effects 0.000 description 3
- 238000012706 support-vector machine Methods 0.000 description 3
- 238000013459 approach Methods 0.000 description 2
- 238000013527 convolutional neural network Methods 0.000 description 2
- 238000011161 development Methods 0.000 description 2
- 238000010586 diagram Methods 0.000 description 2
- 230000010365 information processing Effects 0.000 description 2
- 230000007935 neutral effect Effects 0.000 description 2
- 238000012545 processing Methods 0.000 description 2
- 238000011160 research Methods 0.000 description 2
- 230000006403 short-term memory Effects 0.000 description 2
- 238000013519 translation Methods 0.000 description 2
- 238000007635 classification algorithm Methods 0.000 description 1
- 239000012141 concentrate Substances 0.000 description 1
- 238000010276 construction Methods 0.000 description 1
- 238000005516 engineering process Methods 0.000 description 1
- 238000011156 evaluation Methods 0.000 description 1
- 238000013210 evaluation model Methods 0.000 description 1
- 239000000284 extract Substances 0.000 description 1
- 238000002372 labelling Methods 0.000 description 1
- 230000007087 memory ability Effects 0.000 description 1
- 238000005065 mining Methods 0.000 description 1
- 238000005070 sampling Methods 0.000 description 1
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/35—Clustering; Classification
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
- G06F18/241—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
- G06F18/2411—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on the proximity to a decision surface, e.g. support vector machines
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Data Mining & Analysis (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- Artificial Intelligence (AREA)
- Evolutionary Biology (AREA)
- Evolutionary Computation (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Bioinformatics & Computational Biology (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Life Sciences & Earth Sciences (AREA)
- Databases & Information Systems (AREA)
- Machine Translation (AREA)
Abstract
The invention discloses an attribute-level sentiment analysis method. Sentiment analysis is a fundamental task in natural language processing, and attribute-level (aspect-level) sentiment analysis is an important topic within it. Different words in a sentence influence the sentiment polarity of an aspect differently, so modeling the relationship between the aspect and the words of the sentence, together with the meaning of the whole sentence, is the key to this problem. Here two recurrent networks model the sentence information while an attention mechanism is introduced to fuse the aspect information, aiming for better results. Experiments on a public dataset show that the proposed algorithm achieves better results without complicated feature engineering.
Description
Technical Field
The present invention relates to the field of the Internet, and more particularly to a GRU-based attribute-level sentiment analysis method.
Background
With the rapid development of the Internet, the volume of text keeps growing, and obtaining useful information from massive text has become increasingly important. This flood of text has also objectively driven the development of natural language processing, and deep learning has opened new directions for the field. Sentiment analysis (also called opinion mining) is a fundamental yet important task in natural language processing. Enterprises can use customers' product reviews to obtain timely feedback that informs decision-making. How to extract sentiment information from massive text data has therefore become an important research topic in natural language processing in recent years.
Current text sentiment analysis research is mainly based on sentiment dictionaries or on machine learning. Dictionary-based methods depend heavily on the sentiment dictionary, which has a large influence on the analysis; Yang Ding et al., for example, processed and represented text with a sentiment dictionary to build a classifier based on Naive Bayes theory. The other line of work is based on machine learning, which trains a sentiment classifier on manually labeled data; experiments have demonstrated the excellent classification performance of support vector machines. Both approaches require manually labeled data for dictionary construction and feature engineering, tasks that are tedious and complex, whereas deep learning algorithms handle this problem well. In recent years deep learning has achieved great success in natural language processing, for example in machine translation and question answering. It has also been applied to sentiment analysis: Socher et al. proposed a deep learning method based on semi-supervised recursive autoencoders (RAE) for text sentiment classification, and Jurgovsky et al. used convolutional neural networks (CNN) for the same task. Text sentiment analysis can be carried out at the document, sentence, or word level; this work focuses on aspect-based sentiment analysis, because different aspects in the same sentence may carry different sentiment polarities. In the sentence "The voice quality of this phone is not good, but the battery life is long.", the evaluation is negative with respect to voice quality but positive with respect to battery life. Wang et al. proposed the AE-LSTM, AT-LSTM, and ATAE-LSTM recurrent network algorithms for aspect-level sentiment analysis, fusing aspect information into a long short-term memory (LSTM) network to improve classification accuracy. The SVM-dep algorithm divides features into aspect-related and aspect-independent features and extracts them separately to perform aspect-level sentiment analysis; its accuracy exceeds that of a support vector machine classifier that does not use aspect features.
The attention mechanism is an information-processing mechanism that selectively concentrates on certain important information while ignoring information weakly related to the target, emphasizing the essential aspects of the input. By concentrating limited resources on the processing of important information, it has achieved great success in fields such as image recognition and machine translation. In the context of this work, aspect-based sentiment analysis can attend more closely to aspect-related information and thereby improve the accuracy of sentiment classification.
Recurrent neural networks (RNN) are widely used in natural language processing because their memory enables them to process contextual information; typical recurrent networks include the long short-term memory network (LSTM), the gated recurrent unit (GRU), and MUT networks. This work proposes a GRU-based algorithm for aspect-level sentiment analysis and fuses aspect information into the model through an attention mechanism, so that the model attends to the influence of the aspect on sentiment classification and classification accuracy improves.
Summary of the Invention
In view of this, the purpose of the present invention is to provide an attribute-level sentiment analysis model and method based on a GRU network. The invention is based on the Att-CGRU attribute-level sentiment classification algorithm and improves sentiment classification accuracy.
To achieve the above purpose, the GRU-based attribute-level sentiment analysis model designed by the present invention is as follows:
In the Att-CGRU model, an attention mechanism is introduced to reflect the strong influence of the aspect on the sentiment polarity of the whole sentence. For sequence problems, the encoder-decoder is a very common model: by assigning different weights to the encoder's hidden-state vectors according to the algorithm and the task objective, it extracts a vector representation that characterizes the input data as well as possible and thereby improves model performance; in essence, this process concentrates limited resources on the information most relevant to the target task. The specific structure of the Att-CGRU model is shown in Figure 1 of the description. The model consists of five parts: an input layer, an embedding layer, a GRU layer, an attention layer, and an output layer. The input layer feeds a short text (a sentence) into the model; the embedding layer maps each word of the sentence to a vector; the GRU layer extracts feature information from the word embeddings; the attention layer implements the attention mechanism, fusing word-level features into a sentence-level feature vector through weighted combination; finally, the sentence feature vector is classified.
1.1 Input layer
Each sentence to be classified for sentiment polarity is fed to the input layer. Assuming the sentence length is T, the sentence can be represented as $s = \{x_1, x_2, \ldots, x_T\}$, where $x_i$ denotes the i-th word of the sentence.
1.2 Embedding layer
For a sentence $s = \{x_1, x_2, \ldots, x_T\}$ of T words obtained from the input layer, each word receives its corresponding word vector $e_i$ in the embedding layer.

First, the word vector of each word is obtained from the word-embedding matrix $W^{wrd} \in \mathbb{R}^{d_w \times |V|}$, where $|V|$ is the vocabulary size and $d_w$ is the configurable word-vector dimension:

$$emb_i = W^{wrd} v_i \qquad (1)$$

where $v_i$ is a vector of length $|V|$ that is 1 at position i and 0 elsewhere. The aspect word vector $emb_{asp}$ is obtained in the same way; when the aspect in a sentence consists of several words, the values of each dimension of those words' vectors are added up to form the aspect vector. Then $emb_i$ and $emb_{asp}$ are concatenated to obtain the final word vector $e_i$:

$$e_i = [emb_i : emb_{asp}] \qquad (2)$$
Finally, $e = \{e_1, e_2, \ldots, e_T\}$ is passed to the next layer.
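As a minimal illustrative sketch only (not the patent's reference implementation), the embedding layer could be written in PyTorch as follows; the class and argument names are placeholders, and the aspect vector is built by summing the aspect words' vectors as described above (the detailed embodiment later takes their mean instead).

```python
import torch
import torch.nn as nn

class AspectEmbedding(nn.Module):
    """Embedding layer: e_i = [emb_i : emb_asp] (Eqs. (1)-(2))."""

    def __init__(self, vocab_size: int, emb_dim: int):
        super().__init__()
        # Lookup table equivalent to applying W^{wrd} to one-hot vectors v_i
        self.word_emb = nn.Embedding(vocab_size, emb_dim)

    def forward(self, sent_ids: torch.Tensor, aspect_ids: torch.Tensor) -> torch.Tensor:
        # sent_ids: (T,) word indices; aspect_ids: (A,) indices of the aspect words
        emb = self.word_emb(sent_ids)                    # (T, d_w) word vectors emb_i
        emb_asp = self.word_emb(aspect_ids).sum(dim=0)   # aspect vector (sum over aspect words)
        emb_asp = emb_asp.expand(emb.size(0), -1)        # same aspect vector at every position
        return torch.cat([emb, emb_asp], dim=-1)         # e_i = [emb_i : emb_asp], (T, 2*d_w)
```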
1.3 GRU layer
In the GRU layer, the aspect is taken as the dividing point and the sentence is split into left and right parts to model the aspect's context, as shown in Figure 1. Here $\{x_{l+1}, x_{l+2}, \ldots, x_{r-1}\}$ denotes the aspect, $\{x_1, x_2, \ldots, x_l\}$ the words before the aspect, and $\{x_r, \ldots, x_T\}$ the words after it. After the left and right sequences are fed into the left and right networks, the hidden layers yield $\{h_1, h_2, \ldots, h_{r-1}\}$ and $\{h_{l+1}, h_{l+2}, \ldots, h_T\}$, respectively.
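A hedged sketch of this left/right context modeling, assuming a batch of one sentence and 0-based slicing in which the aspect occupies positions l..r-1; the patent's index notation suggests the right network reads its sequence from the end of the sentence toward the aspect, which is assumed here.

```python
import torch
import torch.nn as nn

class LeftRightGRU(nn.Module):
    """GRU layer: two GRUs model the context to the left and right of the aspect."""

    def __init__(self, in_dim: int, hidden_dim: int):
        super().__init__()
        self.gru_left = nn.GRU(in_dim, hidden_dim, batch_first=True)
        self.gru_right = nn.GRU(in_dim, hidden_dim, batch_first=True)

    def forward(self, e: torch.Tensor, l: int, r: int):
        # e: (1, T, in_dim) embedded sentence; aspect spans positions l..r-1
        left_seq = e[:, :r, :]                  # pre-aspect words plus the aspect
        right_seq = e[:, l:, :].flip(1)         # aspect plus post-aspect words, read right-to-left
        h_left, _ = self.gru_left(left_seq)     # hidden states {h_1, ..., h_{r-1}}
        h_right, _ = self.gru_right(right_seq)  # hidden states {h_T, ..., h_{l+1}}
        return h_left, h_right.flip(1)          # restore positional order for the attention layer
```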
1.4 Attention layer
An attention mechanism is introduced into the model to obtain better classification results, because different words in the front and back parts of the sentence relate to the aspect to different degrees, and the model should attend more to information closely tied to the aspect. The attention mechanism is implemented as follows:

$$M = \tanh\left(\begin{bmatrix} W_h H \\ W_v \left(e_{asp} \otimes e_T\right) \end{bmatrix}\right) \qquad (3)$$

$$a_t = softmax(w^T M) \qquad (4)$$

$$r = H a_t \qquad (5)$$

Here $a_t$ denotes the attention weight coefficients; $e_{asp} \otimes e_T$ denotes repeating $e_{asp}$ enough times to match the dimension of $H$; $H$ is the matrix formed by the hidden-layer outputs of the model; $r$ is the weighted vector representing the meaning of the sentence; and $W_h$, $W_v$, $w$ are parameter matrices. The vector $o$ that finally characterizes the sentence information is then obtained as

$$o = \tanh(W_p r + W_x h) \qquad (6)$$

where $h$ denotes the sum of the vectors $h_{r-1}$ and $h_{l+1}$.
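A sketch of Eqs. (3)-(6), assuming $H$ stacks the hidden states of both GRUs row-wise (the patent describes $H$ only as the matrix of hidden-layer outputs); the linear layers stand in for the parameter matrices $W_h$, $W_v$, $w$, $W_p$, $W_x$, and the shapes are illustrative.

```python
import torch
import torch.nn as nn

class AspectAttention(nn.Module):
    """Attention layer: sentence vector o from hidden states H and aspect vector e_asp."""

    def __init__(self, hidden_dim: int, asp_dim: int):
        super().__init__()
        self.W_h = nn.Linear(hidden_dim, hidden_dim, bias=False)
        self.W_v = nn.Linear(asp_dim, asp_dim, bias=False)
        self.w = nn.Linear(hidden_dim + asp_dim, 1, bias=False)
        self.W_p = nn.Linear(hidden_dim, hidden_dim, bias=False)
        self.W_x = nn.Linear(hidden_dim, hidden_dim, bias=False)

    def forward(self, H: torch.Tensor, e_asp: torch.Tensor, h: torch.Tensor) -> torch.Tensor:
        # H: (N, hidden_dim) hidden states; e_asp: (asp_dim,); h = h_{r-1} + h_{l+1}
        e_rep = self.W_v(e_asp).expand(H.size(0), -1)        # repeat e_asp to match H's rows
        M = torch.tanh(torch.cat([self.W_h(H), e_rep], -1))  # Eq. (3)
        a_t = torch.softmax(self.w(M).squeeze(-1), dim=0)    # Eq. (4): attention weights
        r = a_t @ H                                          # Eq. (5): weighted sentence vector
        return torch.tanh(self.W_p(r) + self.W_x(h))         # Eq. (6): final sentence vector o
```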
1.5 Output layer
Finally, the output $o$ of the attention layer is fed to the classifier

$$\hat{y} = softmax(W_o o + b_o) \qquad (7)$$

which performs the sentiment polarity classification, where $W_o$ and $b_o$ are parameter matrices to be learned.
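A small self-contained sketch of the output layer, assuming three polarity classes (positive, negative, neutral, per the experiments) and the 100-dimensional hidden size stated later; the variable names are placeholders.

```python
import torch
import torch.nn as nn

hidden_dim = 100                               # hidden size used in the experiments
o = torch.randn(hidden_dim)                    # sentence vector from the attention layer
classifier = nn.Linear(hidden_dim, 3)          # W_o and b_o; 3 polarity classes
y_hat = torch.softmax(classifier(o), dim=-1)   # predicted sentiment polarity distribution
```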
The specific experimental steps of the method are as follows:
Step S1. The dataset used in the present invention, collected from Twitter, is first fed to the input layer of the Att-CGRU model.
Step S2. The data obtained in S1 are fed to the embedding layer to obtain the word vector of each word in the input sentence.
Step S3. After the word vector of each word in the sentence has been obtained in the GRU layer as in S2, the aspect words $\{x_{l+1}, x_{l+2}, \ldots, x_{r-1}\}$ are taken as the dividing point: the word vectors of the left words $\{x_1, x_2, \ldots, x_l\}$ and of the right words $\{x_r, \ldots, x_T\}$ are fed into the left and right GRU networks, which model the context of the aspect separately, and the hidden layers output $\{h_1, h_2, \ldots, h_{r-1}\}$ and $\{h_{l+1}, h_{l+2}, \ldots, h_T\}$, respectively.
Step S4. From the output of S3, the vector $o$ that represents the sentence information is computed by the following formulas:

$$a_t = softmax(w^T M)$$

$$r = H a_t$$

Here $r$ is the weighted vector that characterizes the meaning of the sentence; $a_t$ is the attention weight coefficient, obtained by feeding $w^T M$ into the softmax function; $M$ is the matrix derived, as in Eq. (3), from $H$, the matrix formed by the hidden-layer outputs of the GRU layer, where $e_{asp} \otimes e_T$ denotes repeating the aspect word vector $e_{asp}$ enough times to match the dimension of $H$; $\tanh$ denotes the tanh function; and $W_h$, $W_v$, $w$ are parameter matrices. Finally, the vector $o$ that characterizes the sentence information is obtained as

$$o = \tanh(W_p r + W_x h)$$

where $h$ is the sum of the vectors $h_{r-1}$ and $h_{l+1}$: $h_{r-1}$ is the hidden-layer output for the (r-1)-th word in the left GRU network, $h_{l+1}$ is the hidden-layer output for the (l+1)-th word in the right GRU network, and $W_p$, $W_x$ are parameter matrices.
Step S5. In the output layer, the vector $o$ representing the sentence information is fed into the softmax function to obtain the predicted sentiment polarity, specifically $\hat{y} = softmax(W_o o + b_o)$, where $W_o$ and $b_o$ are parameter matrices.
Step S6. From the output of S5 and the true class $y$ of each sentence, the loss function value is computed as

$$loss = -\sum_i \sum_j y_i^j \log \hat{y}_i^j + \lambda \lVert \theta \rVert^2$$

where $\lambda$ is the regularization coefficient. Training iterates with the error back-propagation algorithm until the accuracy reaches its maximum; the optimization algorithm within back-propagation is AdaGrad with an initialization coefficient of 0.01.
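A hedged sketch of one training iteration under these assumptions: `model` stands for the assembled Att-CGRU network returning class scores, and the $\lambda \lVert \theta \rVert^2$ term is realized through AdaGrad's `weight_decay` argument, which is one common way to apply L2 regularization (the patent does not state the mechanism).

```python
import torch
import torch.nn.functional as F

def train_step(model, optimizer, sentences, aspects, labels):
    """One back-propagation update with cross-entropy loss (Step S6)."""
    optimizer.zero_grad()
    logits = model(sentences, aspects)      # (batch, 3) unnormalized class scores
    loss = F.cross_entropy(logits, labels)  # cross entropy between y and y_hat
    loss.backward()                         # error back-propagation
    optimizer.step()
    return loss.item()

# AdaGrad with initialization coefficient 0.01; weight_decay stands in for lambda*||theta||^2
# optimizer = torch.optim.Adagrad(model.parameters(), lr=0.01, weight_decay=0.001)
```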
Compared with the prior art, the present invention has the following technical effects.
The method was compared experimentally with traditional machine learning methods (the support vector machine and SVM-dep algorithms) and with deep learning methods (AdaRNN-w/E, AdaRNN-comb, TC-LSTM). The models were evaluated by accuracy; the results are shown in the table below:
Table 1. Experimental results
Description of the Drawings
Figure 1 is a structural diagram of the Att-CGRU model.
To illustrate the technical solutions of the embodiments of the present invention more clearly, the drawing used in the description of the embodiments is briefly introduced below. The Att-CGRU model structure in Figure 1 comprises five parts: an input layer, an embedding layer, a GRU layer, an attention layer, and an output layer. The input layer feeds a short text (a sentence) into the model; the embedding layer maps each word of the sentence to a vector; the GRU layer extracts feature information from the word embeddings; the attention layer implements the attention mechanism, fusing word-level features into a sentence-level feature vector through weighted combination; finally, the sentence feature vector is classified.
Detailed Description of the Embodiments
The technical solutions in the embodiments of the present invention are described below clearly and completely with reference to the accompanying drawing. Obviously, the described embodiments are only some, not all, of the embodiments of the present invention. Based on these embodiments, all other embodiments obtained by those of ordinary skill in the art without creative effort fall within the protection scope of the present invention.
When implementing the present invention, a dataset must first be collected; the dataset used here is a basic dataset collected from Twitter.
The specific experimental steps of the algorithm are as follows:
Step S1. The dataset used here is a basic dataset collected from Twitter. All training and test data have been manually labeled. The training set, used to train the model, contains 6248 sentences; the test set, used to evaluate model performance, contains 692 sentences. In both sets, positive, negative, and neutral examples account for 25%, 25%, and 50% of the data, respectively.
Step S2. The model comprises five parts: an input layer, an embedding layer, a GRU layer, an attention layer, and an output layer. The input layer feeds a short text (a sentence) into the model; the sentence is represented as $s = \{x_1, x_2, \ldots, x_T\}$, where $x_i$ is the i-th word and T the sentence length. The embedding layer maps each word $x_i$ to a word vector $e_i = [emb_i : emb_{asp}]$ according to the word-vector dictionary, where $emb_i$ is the dictionary vector of the i-th word and $emb_{asp}$ is the aspect word vector; when the aspect consists of several words, the mean of their word vectors is taken. On the basis of the semantic features obtained from the embedding layer, the GRU layer takes the aspect as the dividing point and splits the sentence into left and right parts to model the aspect's context, as shown in Figure 1, where $\{x_{l+1}, x_{l+2}, \ldots, x_{r-1}\}$ denotes the aspect, $\{x_1, x_2, \ldots, x_l\}$ the words before the aspect, and $\{x_r, \ldots, x_T\}$ the words after it. After the left and right sequences are fed into the left and right GRU networks, the hidden layers yield $\{h_1, h_2, \ldots, h_{r-1}\}$ and $\{h_{l+1}, h_{l+2}, \ldots, h_T\}$, respectively. The attention layer implements the attention mechanism: through weighted combination it fuses word-level features into a sentence-level feature vector, which is finally classified. Its implementation is as follows:
$$a_t = softmax(w^T M)$$

$$r = H a_t$$

Here $r$ is the weighted vector that characterizes the meaning of the sentence; $a_t$ is the attention weight coefficient, obtained by feeding $w^T M$ into the softmax function; $M$ is the matrix derived, as in Eq. (3), from $H$, the matrix of hidden-layer outputs of the GRU layer, where $e_{asp} \otimes e_T$ denotes repeating the aspect word vector $e_{asp}$ enough times to match the dimension of $H$; $\tanh$ denotes the tanh function; and $W_h$, $W_v$, $w$ are parameter matrices. The vector $o$ that finally characterizes the sentence information is obtained as

$$o = \tanh(W_p r + W_x h)$$

where $h$ is the sum of the vectors $h_{r-1}$ and $h_{l+1}$: $h_{r-1}$ is the hidden-layer output for the (r-1)-th word in the left GRU network, $h_{l+1}$ is the hidden-layer output for the (l+1)-th word in the right GRU network, and $W_p$, $W_x$ are parameter matrices. The output layer feeds the sentence vector $o$ into the softmax function to obtain the predicted sentiment polarity $\hat{y} = softmax(W_o o + b_o)$, where $W_o$ and $b_o$ are parameter matrices.
Step S3. Cross entropy is used as the loss function during training, with $\hat{y}$ denoting the prediction. Training minimizes the cross-entropy loss between the true polarity $y$ of all sentences and the prediction $\hat{y}$:

$$loss = -\sum_i \sum_j y_i^j \log \hat{y}_i^j + \lambda \lVert \theta \rVert^2$$

where $j$ indexes the sentiment polarity classes (positive, negative, and neutral here), $i$ is the sentence index, $\lambda$ is the second-order norm (L2) regularization coefficient, and $\theta$ are the parameters to be learned. The dropout probability is set to 0.5 to prevent overfitting. Each word of a sentence is initialized with a 200-dimensional word vector, the hidden-layer dimension is 100, and the other parameter matrices are initialized by uniformly distributed sampling. The model is trained in batches of 20 sentences. The L2 regularization coefficient $\lambda$ is 0.001, and the optimization algorithm is AdaGrad with an initialization coefficient of 0.01.
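For reference, the hyperparameters stated above gathered into one illustrative configuration; the key names are placeholders, the values are from the text.

```python
config = {
    "word_dim": 200,         # word-vector dimension per word
    "hidden_dim": 100,       # GRU hidden-layer dimension
    "dropout": 0.5,          # dropout probability against overfitting
    "batch_size": 20,        # sentences per training batch
    "l2_lambda": 0.001,      # L2 regularization coefficient
    "optimizer": "AdaGrad",  # with initialization coefficient 0.01
    "learning_rate": 0.01,
    "num_classes": 3,        # positive / negative / neutral
}
```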
Step S4. In the experiments, the method was compared with traditional machine learning methods (the support vector machine and SVM-dep algorithms) and with deep learning methods (AdaRNN-w/E, AdaRNN-comb, TC-LSTM). The models were evaluated by accuracy; the results are shown in the table below:
Table 1. Experimental results
The experimental data show that modeling the sentence with two left and right networks while introducing an aspect-word-based attention mechanism gives a certain advantage in accuracy over the other models.
Claims (2)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910459539.6A CN111353040A (en) | 2019-05-29 | 2019-05-29 | GRU-based attribute level emotion analysis method |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910459539.6A CN111353040A (en) | 2019-05-29 | 2019-05-29 | GRU-based attribute level emotion analysis method |
Publications (1)
Publication Number | Publication Date |
---|---|
CN111353040A true CN111353040A (en) | 2020-06-30 |
Family
ID=71196950
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910459539.6A Pending CN111353040A (en) | 2019-05-29 | 2019-05-29 | GRU-based attribute level emotion analysis method |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN111353040A (en) |
Cited By (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111813895A (en) * | 2020-08-07 | 2020-10-23 | 深圳职业技术学院 | An attribute-level sentiment analysis method based on hierarchical attention mechanism and gate mechanism |
CN112131886A (en) * | 2020-08-05 | 2020-12-25 | 浙江工业大学 | Method for analyzing aspect level emotion of text |
CN113849646A (en) * | 2021-09-28 | 2021-12-28 | 西安邮电大学 | Text emotion analysis method |
CN114492521A (en) * | 2022-01-21 | 2022-05-13 | 成都理工大学 | Intelligent lithology while drilling identification method and system based on acoustic vibration signals |
CN115098631A (en) * | 2022-06-23 | 2022-09-23 | 浙江工商大学 | Sentence-level emotion analysis method based on text capsule neural network |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108595601A (en) * | 2018-04-20 | 2018-09-28 | 福州大学 | A kind of long text sentiment analysis method incorporating Attention mechanism |
CN108984724A (en) * | 2018-07-10 | 2018-12-11 | 凯尔博特信息科技(昆山)有限公司 | It indicates to improve particular community emotional semantic classification accuracy rate method using higher-dimension |
US20190005027A1 (en) * | 2017-06-29 | 2019-01-03 | Robert Bosch Gmbh | System and Method For Domain-Independent Aspect Level Sentiment Detection |
CN109145304A (en) * | 2018-09-07 | 2019-01-04 | 中山大学 | A kind of Chinese Opinion element sentiment analysis method based on word |
- 2019-05-29 CN CN201910459539.6A patent/CN111353040A/en active Pending
Patent Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20190005027A1 (en) * | 2017-06-29 | 2019-01-03 | Robert Bosch Gmbh | System and Method For Domain-Independent Aspect Level Sentiment Detection |
CN108595601A (en) * | 2018-04-20 | 2018-09-28 | 福州大学 | A kind of long text sentiment analysis method incorporating Attention mechanism |
CN108984724A (en) * | 2018-07-10 | 2018-12-11 | 凯尔博特信息科技(昆山)有限公司 | It indicates to improve particular community emotional semantic classification accuracy rate method using higher-dimension |
CN109145304A (en) * | 2018-09-07 | 2019-01-04 | 中山大学 | A kind of Chinese Opinion element sentiment analysis method based on word |
Non-Patent Citations (3)
Title |
---|
MEISHAN ZHANG: "Gated Neural Networks for Targeted Sentiment Analysis", 《PROCEEDINGS OF THE THIRTIETH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE》 * |
- YEQUAN WANG: "Attention-based LSTM for Aspect-level Sentiment Classification", 《PROCEEDINGS OF THE 2016 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING》 *
ZHAI PENGHUA: "Bidirectional-GRU Based on Attention Mechanism for Aspect-level Sentiment Analysis", 《PROCEEDINGS OF THE 2019 11TH INTERNATIONAL CONFERENCE ON MACHINE LEARNING AND COMPUTING》 * |
Cited By (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112131886A (en) * | 2020-08-05 | 2020-12-25 | 浙江工业大学 | Method for analyzing aspect level emotion of text |
CN111813895A (en) * | 2020-08-07 | 2020-10-23 | 深圳职业技术学院 | An attribute-level sentiment analysis method based on hierarchical attention mechanism and gate mechanism |
CN111813895B (en) * | 2020-08-07 | 2022-06-03 | 深圳职业技术学院 | Attribute level emotion analysis method based on level attention mechanism and door mechanism |
CN113849646A (en) * | 2021-09-28 | 2021-12-28 | 西安邮电大学 | Text emotion analysis method |
CN114492521A (en) * | 2022-01-21 | 2022-05-13 | 成都理工大学 | Intelligent lithology while drilling identification method and system based on acoustic vibration signals |
CN115098631A (en) * | 2022-06-23 | 2022-09-23 | 浙江工商大学 | Sentence-level emotion analysis method based on text capsule neural network |
CN115098631B (en) * | 2022-06-23 | 2024-08-02 | 浙江工商大学 | Sentence-level emotion analysis method based on text capsule neural network |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN109284506B (en) | User comment emotion analysis system and method based on attention convolution neural network | |
CN110929030B (en) | A joint training method for text summarization and sentiment classification | |
CN109657239B (en) | Chinese Named Entity Recognition Method Based on Attention Mechanism and Language Model Learning | |
CN109635109B (en) | Sentence classification method based on LSTM combined with part of speech and multi-attention mechanism | |
CN108763326B (en) | Emotion analysis model construction method of convolutional neural network based on feature diversification | |
CN109992780B (en) | Specific target emotion classification method based on deep neural network | |
CN111353040A (en) | GRU-based attribute level emotion analysis method | |
CN110096711B (en) | Natural language semantic matching method for sequence global attention and local dynamic attention | |
CN113435211B (en) | A Text Implicit Sentiment Analysis Method Combining External Knowledge | |
CN112001187A (en) | Emotion classification system based on Chinese syntax and graph convolution neural network | |
CN111310474A (en) | Online course comment sentiment analysis method based on activation-pooling enhanced BERT model | |
CN116579347A (en) | Comment text emotion analysis method, system, equipment and medium based on dynamic semantic feature fusion | |
CN112784041B (en) | Chinese short text sentiment orientation analysis method | |
CN108170848B (en) | A Dialogue Scene Classification Method for China Mobile Intelligent Customer Service | |
CN109145304B (en) | Chinese viewpoint element sentiment analysis method based on characters | |
CN111400494B (en) | A sentiment analysis method based on GCN-Attention | |
CN112749274A (en) | Chinese text classification method based on attention mechanism and interference word deletion | |
CN113806543B (en) | A Text Classification Method Based on Gated Recurrent Units with Residual Skip Connections | |
CN114722835A (en) | Text emotion recognition method based on LDA and BERT fusion improved model | |
CN114547299A (en) | Short text sentiment classification method and device based on composite network model | |
CN111309909A (en) | Text emotion classification method based on hybrid model | |
CN112199503B (en) | Feature-enhanced unbalanced Bi-LSTM-based Chinese text classification method | |
CN113204640A (en) | Text classification method based on attention mechanism | |
CN116467443A (en) | Topic identification-based online public opinion text classification method | |
Rauf et al. | Using bert for checking the polarity of movie reviews |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication |
Application publication date: 20200630 |
|
RJ01 | Rejection of invention patent application after publication |