CN116244429A - Microblog Sentiment Analysis Method Based on Multi-level Feature Interaction Fusion Guided by Social Relations - Google Patents
Microblog Sentiment Analysis Method Based on Multi-level Feature Interaction Fusion Guided by Social Relations Download PDFInfo
- Publication number
- CN116244429A CN116244429A CN202211602922.0A CN202211602922A CN116244429A CN 116244429 A CN116244429 A CN 116244429A CN 202211602922 A CN202211602922 A CN 202211602922A CN 116244429 A CN116244429 A CN 116244429A
- Authority
- CN
- China
- Prior art keywords
- features
- level
- microblog
- feature
- word
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Links
- 230000004927 fusion Effects 0.000 title claims abstract description 64
- 230000003993 interaction Effects 0.000 title claims abstract description 62
- 238000004458 analytical method Methods 0.000 title claims abstract description 28
- 230000002452 interceptive effect Effects 0.000 claims abstract description 27
- 238000000034 method Methods 0.000 claims abstract description 25
- 230000011273 social behavior Effects 0.000 claims abstract description 3
- 239000011159 matrix material Substances 0.000 claims description 42
- 239000013598 vector Substances 0.000 claims description 39
- 230000006870 function Effects 0.000 claims description 26
- 230000008569 process Effects 0.000 claims description 16
- 238000000605 extraction Methods 0.000 claims description 13
- 238000010276 construction Methods 0.000 claims description 10
- 238000012549 training Methods 0.000 claims description 10
- 241000282326 Felis catus Species 0.000 claims description 8
- 238000011176 pooling Methods 0.000 claims description 8
- 230000009977 dual effect Effects 0.000 claims description 4
- 230000007246 mechanism Effects 0.000 claims description 4
- 230000004913 activation Effects 0.000 claims description 3
- 238000012937 correction Methods 0.000 claims description 3
- 238000007781 pre-processing Methods 0.000 claims description 3
- 230000003997 social interaction Effects 0.000 claims description 3
- 238000013527 convolutional neural network Methods 0.000 abstract description 12
- 238000003058 natural language processing Methods 0.000 abstract description 2
- 230000008451 emotion Effects 0.000 abstract 3
- 238000010586 diagram Methods 0.000 description 3
- 238000011160 research Methods 0.000 description 3
- 101100012902 Saccharomyces cerevisiae (strain ATCC 204508 / S288c) FIG2 gene Proteins 0.000 description 2
- 230000006399 behavior Effects 0.000 description 2
- 230000000694 effects Effects 0.000 description 2
- 101001121408 Homo sapiens L-amino-acid oxidase Proteins 0.000 description 1
- 102100026388 L-amino-acid oxidase Human genes 0.000 description 1
- 230000009286 beneficial effect Effects 0.000 description 1
- 230000007547 defect Effects 0.000 description 1
- 238000013461 design Methods 0.000 description 1
- 230000002996 emotional effect Effects 0.000 description 1
- 238000005516 engineering process Methods 0.000 description 1
- 238000013507 mapping Methods 0.000 description 1
- 238000003062 neural network model Methods 0.000 description 1
- 230000011218 segmentation Effects 0.000 description 1
Images
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/35—Clustering; Classification
- G06F16/353—Clustering; Classification into predefined classes
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/31—Indexing; Data structures therefor; Storage structures
- G06F16/316—Indexing structures
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/33—Querying
- G06F16/3331—Query processing
- G06F16/3332—Query translation
- G06F16/3338—Query expansion
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/33—Querying
- G06F16/3331—Query processing
- G06F16/334—Query execution
- G06F16/3344—Query execution using natural language analysis
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/95—Retrieval from the web
- G06F16/953—Querying, e.g. by the use of web search engines
- G06F16/9536—Search customisation based on social or collaborative filtering
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/205—Parsing
- G06F40/211—Syntactic parsing, e.g. based on context-free grammar [CFG] or unification grammars
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/237—Lexical tools
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02D—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
- Y02D10/00—Energy efficient computing, e.g. low power processors, power management or thermal management
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- Databases & Information Systems (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Data Mining & Analysis (AREA)
- Computational Linguistics (AREA)
- Artificial Intelligence (AREA)
- Health & Medical Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- General Health & Medical Sciences (AREA)
- Software Systems (AREA)
- Machine Translation (AREA)
Abstract
Description
技术领域Technical Field
本发明属于自然语言处理领域,涉及社交关系引导的多级特征交互融合的微博情感分析方法。The present invention belongs to the field of natural language processing and relates to a microblog sentiment analysis method based on multi-level feature interaction fusion guided by social relationships.
背景技术Background Art
近年来,Twitter、新浪微博成为用户喜爱发布微博信息和传播的热门平台。分析大型社交平台上微博的情绪可以帮助政府或电商等感知公众对各种话题(如:政治事件、名人、日常生活等)的看法,其在学术和工业领域都有着广泛的应用。微博情感分析旨在快速地对海量的数据做出情感的判别,减少人为投入的同时给予决策者及时的反馈信息,以便快速决策。In recent years, Twitter and Sina Weibo have become popular platforms for users to post and spread information on Weibo. Analyzing the sentiment of Weibo on large social platforms can help governments or e-commerce companies perceive the public's views on various topics (such as political events, celebrities, daily life, etc.), and it has a wide range of applications in academic and industrial fields. Weibo sentiment analysis aims to quickly judge the sentiment of massive data, reduce human input, and provide decision makers with timely feedback information for quick decision-making.
微博情感分析方法的基本思想是在已经由人为标记情感的数据集上训练得到情感分类器,然后用该分类器作为情感分析模型识别未标注微博的情感。具体分两步,微博文本数据的预处理和情感分析模型的建立。在预处理方面,在移除特殊符号和停止词后通过分词工具将微博文本的每句话转换为相应的词语格式。在情感分析模型建立方面,构建神经网络模型求解词语格式的微博文本至情感特征的映射函数。然而由于微博中网络新词、特殊符号的使用导致微博中的词语匮乏,从而导致词汇稀疏问题。为了解决这个问题,研究者将注意力转移到了微博文本间的交互活动(即社交平台上的关注、点赞、转发等交互行为)。其通过交互行为找出关系微博从而对目标微博进行扩充,以解决词汇稀疏问题。The basic idea of the microblog sentiment analysis method is to train a sentiment classifier on a dataset that has been manually labeled with sentiment, and then use the classifier as a sentiment analysis model to identify the sentiment of unlabeled microblogs. It is divided into two steps: preprocessing of microblog text data and establishment of a sentiment analysis model. In terms of preprocessing, after removing special symbols and stop words, each sentence of the microblog text is converted into a corresponding word format through a word segmentation tool. In terms of establishing a sentiment analysis model, a neural network model is constructed to solve the mapping function from microblog text in word format to sentiment features. However, due to the use of new network words and special symbols in microblogs, there is a lack of words in microblogs, which leads to a sparse vocabulary problem. In order to solve this problem, researchers have shifted their attention to the interactive activities between microblog texts (i.e., interactive behaviors such as following, liking, and forwarding on social platforms). They find related microblogs through interactive behaviors and expand the target microblogs to solve the problem of sparse vocabulary.
无疑交互活动信息的引入在一定程度上可以解决微博由于数据噪声带来的情感识别率低的问题。早期引入交互活动到微博情感分析的方法是利用用户关注关系构建微博间关系网络,从而让关系微博对目标微博进行扩充。但是,随着用户发布微博数量的增加,带给目标微博过多的数据噪声。针对这个问题,目前的研究方法将主题特征、用户相似性、微博相似性等单个因素被考虑到微博关系网络的构建中以过滤数据噪声。然而,目前现有的研究方法有以下缺陷:(1)并没有考虑多个因素过滤微博关系网络提高关系网络情感一致性的概率;(2)仅将目标微博作为扩充特征来解决数据稀疏问题,并没有让微博关系网络参与到引导微博间的交互中。因此,现有技术对于处理结合社交关系的微博情感分析的研究显得不够充分。Undoubtedly, the introduction of interactive activity information can solve the problem of low sentiment recognition rate caused by data noise in Weibo to a certain extent. The early method of introducing interactive activities into Weibo sentiment analysis was to use user attention relationships to build a relationship network between Weibo, so that the relationship Weibo can expand the target Weibo. However, as the number of Weibo posts by users increases, too much data noise is brought to the target Weibo. To address this problem, current research methods take into account individual factors such as topic features, user similarity, and Weibo similarity in the construction of Weibo relationship networks to filter data noise. However, the current existing research methods have the following defects: (1) They do not consider multiple factors to filter the Weibo relationship network to increase the probability of emotional consistency of the relationship network; (2) They only use the target Weibo as an expansion feature to solve the data sparsity problem, and do not allow the Weibo relationship network to participate in guiding the interaction between Weibo. Therefore, the existing technology is not sufficient for the research on Weibo sentiment analysis combined with social relationships.
发明内容Summary of the invention
有鉴于此,本发明的目的在于提供社交关系引导的多级特征交互融合的微博情感分析方法。In view of this, the purpose of the present invention is to provide a microblog sentiment analysis method with multi-level feature interaction fusion guided by social relations.
为达到上述目的,本发明提供如下技术方案:In order to achieve the above object, the present invention provides the following technical solutions:
社交关系引导的多级特征交互融合的微博情感分析方法,该方法包括以下步骤:A social relationship-guided multi-level feature interactive fusion microblog sentiment analysis method, the method comprising the following steps:
步骤1、微博文本和用户交互特征的提取。对微博文本及用户的社交行为数据进行预处理,使用不同维度的BERT预训练模型提取微博的文本的词级、句级特征,使用LINE提取社交信息的关系特征,以将它们表示为计算机可以识别的张量格式。Step 1: Extraction of Weibo text and user interaction features. Preprocess Weibo text and user social behavior data, use BERT pre-trained models of different dimensions to extract word-level and sentence-level features of Weibo text, and use LINE to extract relational features of social information to represent them in a tensor format that can be recognized by computers.
进一步,所述步骤1具体过程包括:Furthermore, the specific process of step 1 includes:
首先,对于微博文本si,将BERT模型中最后一个隐藏层的状态的特征(l为句子的长度)作为词级的表示特征。将BERT模型输出的最后一个隐藏层的[CLS]特征作为句级的表示特征。则经过BERT编码器,输出的文本特征可以表示为:First, for the microblog text s i , the features of the state of the last hidden layer in the BERT model are (l is the length of the sentence) as the word-level representation feature. The [CLS] feature of the last hidden layer output by the BERT model As a sentence-level representation feature, after the BERT encoder, the output text feature can be expressed as:
其中,d0、d1为两个BERT预训练模型的维度,表示第1维度的BERT模型输出的词级特征,表示第1种维度BERT模型输出的句级特征,Linear(X,y)的意义在于给特征X乘上过度的可训练矩阵,使输出的特征维度映射为y维。最终可以得到所有微博文本的特征表示:Among them, d 0 and d 1 are the dimensions of the two BERT pre-training models. Represents the word-level features output by the BERT model in the first dimension, It represents the sentence-level features output by the BERT model in the first dimension. The meaning of Linear(X,y) is to multiply feature X by an excessive trainable matrix so that the output feature dimension is mapped to the y dimension. Finally, the feature representation of all Weibo texts can be obtained:
然后,用户交互特征的提取。用户交互特征提取分两步:微博关系权重网络的构建、微博关系权重网络的嵌入。微博关系权重网络的构建。将从用户间的关注关系用户发布微博的提及标签用户发布微博的主题标签三个方面构建微博间的关系权重网络。对于依靠主题标签(#)建立的微博关系的规则为:两个微博处于同一主题下,关系权重表示为两个微博共享相同主题的个数。对于依靠提及标签(@)建立的微微博关系的规则为:(1)微博A提及用户B后,用户B发布微博B,则微博A和微博B间有一条权重为1的边;(2)两个微博有相同提及的用户,权重为提及相同用户的个数。对于依靠用户关注建立的微博关系的规则为:(1)用户之间有关注关系且用户发布微博,则这些微博之间有一条权重为1的边;(2)同一用户发布的微博之间有一条权重为1的边。最终可以构建一个微博关系权重网络E(i,j,w)。Then, the user interaction features are extracted. The user interaction feature extraction is divided into two steps: the construction of the Weibo relationship weight network and the embedding of the Weibo relationship weight network. The Weibo relationship weight network is constructed. The mention tags of the user's microblog Theme tags for users to post on Weibo The relationship weight network between microblogs is constructed from three aspects. The rule for the microblog relationship established by the topic tag (#) is: if two microblogs are under the same topic, the relationship weight is expressed as the number of topics shared by the two microblogs. The rule for the microblog relationship established by the mention tag (@) is: (1) after microblog A mentions user B, user B posts microblog B, there is an edge with a weight of 1 between microblog A and microblog B; (2) if two microblogs have the same mentioned user, the weight is the number of times the same user is mentioned. The rule for the microblog relationship established by user following is: (1) if there is a following relationship between users and users post microblogs, there is an edge with a weight of 1 between these microblogs; (2) there is an edge with a weight of 1 between microblogs posted by the same user. Finally, a microblog relationship weight network E(i,j,w) can be constructed.
最后,微博关系权重网络的嵌入。采用LINE将一个权重网络中的节点嵌入表示为对应的低维特征向量。对于第一步建立的微博关系权重网络E(i,j,w),微博节点vi和vj,其可通过最小化嵌入损失函数O来获得权重网络中每个微博节点的嵌入表示。Finally, embedding of the Weibo relation weight network. LINE is used to embed the nodes in a weight network as corresponding low-dimensional feature vectors. For the Weibo relation weight network E(i,j,w) established in the first step, Weibo nodes v i and v j , the embedding representation of each Weibo node in the weight network can be obtained by minimizing the embedding loss function O.
其中,(i,j,wi,j)表示微博i和微博j之间有条权重为wi,j的边,表示微博i的低维嵌入向量,表示当视为上下文的向量。生成微博关系嵌入表示为微博数量为m,微博节点嵌入维度是dr。其中,没有参与到社交互动的微博节点填充为相同维度的噪声向量得到m个微博节点的嵌入矩阵表示为 Among them, (i,j,wi ,j ) indicates that there is an edge with weight wi ,j between Weibo i and Weibo j. represents the low-dimensional embedding vector of Weibo i, Indicates when The generated microblog relationship embedding is represented as The number of microblogs is m, and the embedding dimension of microblog nodes is d r . Among them, microblog nodes that do not participate in social interactions are filled with noise vectors of the same dimension The embedding matrix of m microblog nodes is expressed as
步骤2、词级特征的融合。拟使用双通道、多个不同大小卷积核的CNN对词级特征进行融合。Step 2: Fusion of word-level features: It is planned to use a dual-channel CNN with multiple convolution kernels of different sizes to fuse word-level features.
进一步,所述步骤2具体过程包括:Further, the specific process of step 2 includes:
首先,对于微博文本si提取到的词级特征向量采用CNN提取到的词级特征输出表示为:First, the word-level feature vector extracted from the microblog text si Output of word-level features extracted by CNN It is expressed as:
y={y1,y2,...,yL}y={y 1 ,y 2 ,...,y L }
其中yi表示第i个卷积核的输出,为两个BERT输出到卷积双通道的堆叠,Wl和bl分别表示第l个卷积核的权重矩阵和偏置,表示卷积操作;表示在一个包含h个词的窗口中,第l个卷积核的输出。接着,通过最大池化层找出文本中的重要部分p并在最后一维度连接所有特征。得到最终的词级特征表示为:Where yi represents the output of the i-th convolution kernel, is a stack of two BERT outputs to the convolution dual channel, W l and b l represent the weight matrix and bias of the lth convolution kernel respectively, Represents the convolution operation; represents the output of the lth convolution kernel in a window containing h words. Next, the important part p in the text is found through the maximum pooling layer and all features are connected in the last dimension. The final word-level feature is obtained It is expressed as:
p=cat([maxpool(y1),maxpool(y2),...,maxpool(yL)])p=cat([maxpool(y 1 ),maxpool(y 2 ),...,maxpool(y L )])
然后,为了维度的统一,将词级特征p映射为dr维,即表示为Then, in order to unify the dimensions, the word-level feature p is mapped to d r dimensions, that is, Expressed as
ps=Linear(p,dr) ps =Linear(p, dr )
最后,句子级特征的提取。将第一个BERT的句子级特征用作整体句子特征。即将句子特征也映射为dr维,即表示为:Finally, sentence-level features are extracted. The sentence-level features of the first BERT are used as overall sentence features. It is also mapped to d r dimension, that is, It is expressed as:
步骤3、词级、句级、关系特征的交互。拟使用步骤1中的关系特征引导步骤2中的词级特征、步骤1中的句级特征进行交互,得交互后词级特征、交互后句级特征。Step 3: Interaction of word-level, sentence-level, and relational features. It is planned to use the relational features in step 1 to guide the word-level features in step 2 and the sentence-level features in step 1 to interact, and obtain the word-level features and sentence-level features after interaction.
进一步,所述步骤3具体过程包括:Further, the specific process of step 3 includes:
首先是关系特征的引导交互。包含微博相似度矩阵的构建、词级和句级的引导交互。The first is the guided interaction of relational features, including the construction of microblog similarity matrix, and guided interaction at word level and sentence level.
然后,微博相似度矩阵的构建。使用关系特征构建立微博相似性矩阵,定义为两个节点向量相似度的归一化值。m个微博节点的嵌入矩阵表示为对于第i个微博节点嵌入得到微博相似性矩阵M为:Then, the microblog similarity matrix is constructed. The microblog similarity matrix is constructed using the relationship features and is defined as the normalized value of the similarity between two node vectors. The embedding matrix of m microblog nodes is expressed as For the i-th microblog node embedding The microblog similarity matrix M is obtained as:
h=Tanh(sumr(Mr-I(1)*vnoise)*sumr(Mr-I*vnoise))h=Tanh(sum r (M r -I (1) *v noise )*sum r (M r -I*v noise ))
Mcorr=ReLU(Tanh((h·hT-I)*10))M corr =ReLU(Tanh((h·h T -I)*10))
其中||vi||F为向量vi的Frobenius范数,·为矩阵乘法,*为元素乘法,sumr()表示按行对矩阵进行求和。I(1)为全1矩阵,I为单位矩阵。Mcorr为修正矩阵,主要是为了修正随机填充的微博节点的填充向量vnoise带来的微博文本相似性噪声。Tanh和ReLU为激活函数,如下:Where || vi || F is the Frobenius norm of vector vi , · is matrix multiplication, * is element multiplication, and sum r () represents summing the matrix by row. I (1) is a matrix of all 1s, and I is the identity matrix. M corr is a correction matrix, which is mainly used to correct the microblog text similarity noise caused by the filling vector v noise of the randomly filled microblog nodes. Tanh and ReLU are activation functions, as follows:
ReLU(x)=max(0,x)ReLU(x)=max(0,x)
最后,词级和句级的引导交互。对于m个微博,交互后词级特征交互后句级特征输出分别为:Finally, the guided interaction between word level and sentence level. For m microblogs, the word level features after interaction Post-interaction sentence-level features The outputs are:
步骤4、特征的第一次融合。以步骤1的关系特征为Q(查询向量)、步骤3的句级特征为K(键值向量)、步骤3的词级特征为V(值向量)为特征,构建注意力融合网络对3个特征进行第一次融合,得一次融合特征。具体地,以微博关系特征Mr为引导查询(Q),以交互后的句子级特征hcls为键值(K),以词级特征hs为值(V),输入到自注意机制中获取引导后的微博文本的一次融合特征:Step 4: First fusion of features. Take the relational features of step 1 as Q (query vector), the sentence-level features of step 3 as K (key-value vector), and the word-level features of step 3 as V (value vector) as features, and build an attention fusion network to fuse the three features for the first time to obtain a fusion feature. Specifically, take the microblog relational feature Mr as the guided query (Q), the sentence-level feature hcls after interaction as the key value (K), and the word-level feature hs as the value (V), and input them into the self-attention mechanism to obtain the guided microblog text’s first fusion feature:
hatt=SelfAtt(Q=Mr,K=hs,V=hcls)h att = SelfAtt (Q = M r , K = h s , V = h cls )
其中,dk是键值向量K的维度。Q、K、V为对应的查询、键值、值向量。Where d k is the dimension of the key-value vector K. Q, K, V are the corresponding query, key-value, and value vectors.
步骤5、特征的第二次融合。拟使用动态加权系数完成对步骤4的一次融合特征、步骤3的交互后词级特征、交互后句级特征进行权重分配。构建交互融合网络对加权后特征进行第二次融合。Step 5: Second fusion of features. Dynamic weighting coefficients are used to assign weights to the first fusion features of step 4, the word-level features after interaction of step 3, and the sentence-level features after interaction. An interactive fusion network is constructed to perform a second fusion of the weighted features.
进一步,所述步骤5具体过程包括:Further, the specific process of step 5 includes:
首先,动态加权。将一次融合特征hatt、交互后词级特征hcls、交互后句级特征hs进行堆叠动态加权。得加权后特征为:First, dynamic weighting is performed. The first fusion feature h att , the word-level feature after interaction h cls , and the sentence-level feature after interaction h s are stacked and dynamically weighted. The weighted feature is:
hstack=Stack([hs,hatt,hcls])h stack =Stack([h s ,h att ,h cls ])
hc=(hstack)T*[a1,a2,a3]h c =(h stack ) T *[a 1 ,a 2 ,a 3 ]
其中,为堆叠后的特征,[a1,a2,a3]为3个特征的动态加权系数。in, is the stacked feature, and [a 1 ,a 2 ,a 3 ] is the dynamic weighting coefficient of the three features.
接着,交互融合网络。类似于词级特征融合网络,使用CNN进行对加权特征进行第二次融合,得到融合后的特征h为:Next, the interactive fusion network. Similar to the word-level feature fusion network, CNN is used to perform a second fusion of the weighted features, and the fused feature h is:
h={h1,h2,...,hL}h={h 1 ,h 2 ,...,h L }
最后,通过平均池化获得交互后的特征并在最后一维度拼接。得到融合特征为Finally, the interactive features are obtained through average pooling and concatenated in the last dimension. The fused features are:
hf=cat([avgpool(h1),avgpool(h2),...,avgpool(hL)])h f =cat([avgpool(h 1 ),avgpool(h 2 ),...,avgpool(h L )])
步骤6、微博情感分类。构建Softmax情感分类器对微博进行情感分类。以交叉熵损失函数作为训练损失函数,使用反向传播算法训练模型,得到微博情感分析模型。Step 6: Weibo sentiment classification. Construct a Softmax sentiment classifier to classify Weibo sentiment. Use the cross entropy loss function as the training loss function and use the back propagation algorithm to train the model to obtain a Weibo sentiment analysis model.
进一步,所述步骤6具体过程包括:Further, the specific process of step 6 includes:
首先,构建Softmax分类器完成文本情感的分类:First, build a Softmax classifier to complete the classification of text sentiment:
其中,num_class为微博文本对应的情感类别。Among them, num_class is the sentiment category corresponding to the microblog text.
最后,采用反向传播算法训练模型,以交叉熵损失函数作为训练过程中的损失函数,通过优化损失函数来优化模型,表达式为:Finally, the back propagation algorithm is used to train the model, and the cross entropy loss function is used as the loss function in the training process. The model is optimized by optimizing the loss function. The expression is:
其中,J(w,b)为样本整体的损失,m为样本数量,y(i)、分别为样本的真实样本概率分布和预测样本概率分布。λ为L2正则化的系数。Among them, J(w,b) is the loss of the whole sample, m is the number of samples, y (i) , are the true sample probability distribution and predicted sample probability distribution of the sample respectively. λ is the coefficient of L2 regularization.
本发明的有益效果在于:The beneficial effects of the present invention are:
1)针对现有的结合社交关系的微博情感分析方法构建的微博关系网络情感一致性概率低下问题,考虑多个因素去过滤微博关系网络,提高关系网络情感一致性的概率。1) Aiming at the low probability of sentiment consistency in the Weibo relationship network constructed by the existing Weibo sentiment analysis method combined with social relationships, multiple factors are considered to filter the Weibo relationship network to improve the probability of sentiment consistency in the relationship network.
2)针对现有的方法仅是将目标微博作为扩充特征来解决数据稀疏问题,并没有让微博关系网络参与到引导微博间的交互中,因此提取一种使用社交关系引导微博文本交互的情感分析网络。2) The existing methods only use the target Weibo as an extended feature to solve the data sparsity problem, and do not allow the Weibo relationship network to participate in guiding the interaction between Weibo. Therefore, a sentiment analysis network is extracted that uses social relationships to guide Weibo text interaction.
3)针对社交平台上没有参与社交活动的用户在引入社交关系时候具有噪声,对其进行噪声消除。3) For users who do not participate in social activities on the social platform, noise is generated when introducing social relationships, and noise is eliminated.
4)使用双通道的BERT预训练模型来共同提取微博文本的词级和句级特征,丰富了文本特征。同时,设计加权融合网络对文本特征进行融合,避免了单通道的误判。4) Use the dual-channel BERT pre-trained model to jointly extract word-level and sentence-level features of microblog texts, enriching the text features. At the same time, design a weighted fusion network to fuse text features to avoid single-channel misjudgment.
本发明的其他优点、目标和特征在某种程度上将在随后的说明书中进行阐述,并且在某种程度上,基于对下文的考察研究对本领域技术人员而言将是显而易见的,或者可以从本发明的实践中得到教导。本发明的目标和其他优点可以通过下面的说明书来实现和获得。Other advantages, objectives and features of the present invention will be described in the following description to some extent, and to some extent, will be obvious to those skilled in the art based on the following examination and study, or can be taught from the practice of the present invention. The objectives and other advantages of the present invention can be realized and obtained through the following description.
附图说明BRIEF DESCRIPTION OF THE DRAWINGS
为了使本发明的目的、技术方案和优点更加清楚,下面将结合附图对本发明作优选的详细描述,其中:In order to make the purpose, technical solutions and advantages of the present invention more clear, the present invention will be described in detail below in conjunction with the accompanying drawings, wherein:
图1为本发明社交关系引导下基于BERT的微博情感分析网络方法流程图;FIG1 is a flow chart of a BERT-based microblog sentiment analysis network method guided by social relationships of the present invention;
图2为本发明社交关系引导下基于BERT的微博情感分析网络系统模型图。FIG2 is a model diagram of a microblog sentiment analysis network system based on BERT under the guidance of social relationships of the present invention.
具体实施方式DETAILED DESCRIPTION
以下通过特定的具体实例说明本发明的实施方式,本领域技术人员可由本说明书所揭露的内容轻易地了解本发明的其他优点与功效。本发明还可以通过另外不同的具体实施方式加以实施或应用,本说明书中的各项细节也可以基于不同观点与应用,在没有背离本发明的精神下进行各种修饰或改变。需要说明的是,以下实施例中所提供的图示仅以示意方式说明本发明的基本构想,在不冲突的情况下,以下实施例及实施例中的特征可以相互组合。The following describes the embodiments of the present invention by specific examples, and those skilled in the art can easily understand other advantages and effects of the present invention from the contents disclosed in this specification. The present invention can also be implemented or applied through other different specific embodiments, and the details in this specification can also be modified or changed in various ways based on different viewpoints and applications without departing from the spirit of the present invention. It should be noted that the illustrations provided in the following embodiments only illustrate the basic concept of the present invention in a schematic manner, and the following embodiments and the features in the embodiments can be combined with each other without conflict.
其中,附图仅用于示例性说明,表示的仅是示意图,而非实物图,不能理解为对本发明的限制;为了更好地说明本发明的实施例,附图某些部件会有省略、放大或缩小,并不代表实际产品的尺寸;对本领域技术人员来说,附图中某些公知结构及其说明可能省略是可以理解的。Among them, the drawings are only used for illustrative explanations, and they only represent schematic diagrams rather than actual pictures, and should not be understood as limitations on the present invention. In order to better illustrate the embodiments of the present invention, some parts of the drawings may be omitted, enlarged or reduced, and do not represent the size of actual products. For those skilled in the art, it is understandable that some well-known structures and their descriptions in the drawings may be omitted.
本发明实施例的附图中相同或相似的标号对应相同或相似的部件;在本发明的描述中,需要理解的是,若有术语“上”、“下”、“左”、“右”、“前”、“后”等指示的方位或位置关系为基于附图所示的方位或位置关系,仅是为了便于描述本发明和简化描述,而不是指示或暗示所指的装置或元件必须具有特定的方位、以特定的方位构造和操作,因此附图中描述位置关系的用语仅用于示例性说明,不能理解为对本发明的限制,对于本领域的普通技术人员而言,可以根据具体情况理解上述术语的具体含义。The same or similar numbers in the drawings of the embodiments of the present invention correspond to the same or similar parts; in the description of the present invention, it should be understood that if the terms "upper", "lower", "left", "right", "front", "back" and the like indicate directions or positional relationships, they are based on the directions or positional relationships shown in the drawings, which are only for the convenience of describing the present invention and simplifying the description, rather than indicating or implying that the device or element referred to must have a specific direction, be constructed and operated in a specific direction. Therefore, the terms describing the positional relationship in the drawings are only used for illustrative purposes and cannot be understood as limiting the present invention. For ordinary technicians in this field, the specific meanings of the above terms can be understood according to specific circumstances.
如图1所示,本发明提供一种社交关系引导的多级特征交互融合的微博情感分析方法。实施的场景为,对于一个包含微博文本及对应情感和社交信息的数据集,使用提到的方法训练一个用于情感分析的网络模型,使得模型完成对微博文本的情感分类。具体的实施步骤如下:As shown in Figure 1, the present invention provides a social relationship-guided multi-level feature interaction fusion microblog sentiment analysis method. The implementation scenario is that for a data set containing microblog texts and corresponding sentiment and social information, a network model for sentiment analysis is trained using the method mentioned above, so that the model completes the sentiment classification of microblog texts. The specific implementation steps are as follows:
步骤1、通过BERT预训练模型、LINE分别完成对微博文本、社交信息的预处理,即使用向量表示微博文本和社交信息。Step 1: Use the BERT pre-trained model and LINE to pre-process Weibo text and social information, that is, use vectors to represent Weibo text and social information.
首先,BERT预训练模型是谷歌在大量无标记数据集中训练得到的深度模型,可以将输入的微博文本表示为对应的向量格式,即微博文本的词级、句级特征。具体地,对于微博文本si,在经过BERT之后,可以获得si的词级特征(l为句子的长度)和句级特征而本文中使用d0、d1两种不同维度的BERT来共同提取特征,如下:First, the BERT pre-trained model is a deep model trained by Google on a large number of unlabeled datasets. It can represent the input microblog text in the corresponding vector format, that is, the word-level and sentence-level features of the microblog text. Specifically, for the microblog text s i , after BERT, the word-level features of s i can be obtained. (l is the length of the sentence) and sentence-level features In this paper, BERT with two different dimensions, d 0 and d 1, is used to extract features together, as follows:
接着,通过用户间的关注关系文本中的提及关系文本中的主题关系来构建微博关系网络。具体地,对于用户间的关注关系如果用户之间有关注关系且用户发布微博,则这些微博之间有一条权重为1的边;同一用户发布的微博之间有一条权重为1的边。对文本中的提及关系微博A提及用户B后,用户B发布微博B,则微博A和微博B间有一条权重为1的边;两个微博有相同提及的用户,权重为提及相同用户的个数。对于文本中的主题关系两个微博处于同一主题下,则两个微博之间有相同主题的个数权重的边。最终得到一个微博关系权重网络E(i,j,w)。Next, through the following relationship between users Mentions in text Thematic relationships in text To build a microblog relationship network. Specifically, for the attention relationship between users If there is a follow relationship between users and users post microblogs, there is an edge with a weight of 1 between these microblogs; there is an edge with a weight of 1 between microblogs posted by the same user. After Weibo A mentions user B, user B posts Weibo B, then there is an edge with a weight of 1 between Weibo A and Weibo B; if two Weibo posts have the same mentioned user, the weight is the number of times the same user is mentioned. If two microblogs are under the same topic, then there is an edge with the same number of topics between the two microblogs. Finally, a microblog relationship weight network E(i,j,w) is obtained.
最后,通过将微博关系权重网络E(i,j,w)输入到LINE网络中,获得每个微博的向量表示。具体地,通过优化以下函数来获得每个微博节点的向量表示,即关系特征:Finally, the microblog relationship weight network E(i,j,w) is input into the LINE network to obtain the vector representation of each microblog. Specifically, the vector representation of each microblog node, i.e., the relationship feature, is obtained by optimizing the following function:
步骤2、通过CNN融合词级特征、使用关系特征指引词级、句级特征并进行第一次融合。Step 2: Use CNN to fuse word-level features, use relational features to guide word-level and sentence-level features, and perform the first fusion.
首先是通过CNN融合词级特征。对于微博文本si提取到的词级特征向量采用CNN提取到的词级特征输出再通过最大池化层找出文本中的重要部分p并在最后一维度连接所有特征。得到最终的词级特征表示为:First, word-level features are fused through CNN. For the word-level feature vector extracted from the microblog text si Output of word-level features extracted by CNN Then use the maximum pooling layer to find the important part p in the text and connect all the features in the last dimension to get the final word-level feature It is expressed as:
y={y1,y2,...,yL}y={y 1 ,y 2 ,...,y L }
p=cat([maxpool(y1),maxpool(y2),...,maxpool(yL)])p=cat([maxpool(y 1 ),maxpool(y 2 ),...,maxpool(y L )])
将词级特征p映射为dr维,即表示为:Map the word-level feature p to d r dimensions, that is It is expressed as:
ps=Linear(p,dr) ps =Linear(p, dr )
句子特征也映射为dr维,即表示为:Sentence Features It is also mapped to d r dimension, that is, It is expressed as:
接着是使用关系特征指引词级、句级特征。对于第i个微博节点嵌入得到微博相似性矩阵M为:Next, we use relational features to guide word-level and sentence-level features. For the i-th microblog node embedding The microblog similarity matrix M is obtained as:
h=Tanh(sumr(Mr-I(1)*vnoise)*sumr(Mr-I*vnoise))h=Tanh(sum r (M r -I (1) *v noise )*sum r (M r -I*v noise ))
Mcorr=ReLU(Tanh((h·hT-I)*10))M corr =ReLU(Tanh((h·h T -I)*10))
对于m个微博,交互后词级特征交互后句级特征输出分别为:For m microblogs, word-level features after interaction Post-interaction sentence-level features The outputs are:
最后是第一次融合。以微博关系特征Mr为引导查询(Q),以交互后的句子级特征hcls为键值(K),以词级特征hs为值(V),输入到自注意机制中获取引导后的微博文本的一次融合特征:Finally, the first fusion is performed. The microblog relationship feature Mr is used as the guided query (Q), the sentence-level feature hcls after interaction is used as the key value (K), and the word-level feature hs is used as the value (V). These are input into the self-attention mechanism to obtain the guided microblog text’s first fusion feature:
hatt=SelfAtt(Q=Mr,K=hs,V=hcls)h att = SelfAtt (Q = M r , K = h s , V = h cls )
步骤3、构建特征融合网络对特征进行二次融合、对微博进行情感分类。Step 3: Build a feature fusion network to perform secondary fusion of features and perform sentiment classification on microblogs.
首先,将一次融合特征hatt、交互后词级特征hcls、交互后句级特征hs进行堆叠动态加权。得加权后特征为:First, the first fusion feature h att , the word-level feature after interaction h cls , and the sentence-level feature after interaction h s are stacked and dynamically weighted. The weighted feature is:
hstack=Stack([hs,hatt,hcls])h stack =Stack([h s ,h att ,h cls ])
hc=(hstack)T*[a1,a2,a3]h c =(h stack ) T *[a 1 ,a 2 ,a 3 ]
接着,类似于词级特征特征融合网络,使用CNN进行对加权特征进行第二次融合,得到融合后的特征h为:Next, similar to the word-level feature fusion network, CNN is used to fuse the weighted features for the second time, and the fused feature h is:
h={h1,h2,...,hL}h={h 1 ,h 2 ,...,h L }
通过平均池化获得交互后的特征并在最后一维度拼接。得到融合特征为:The interactive features are obtained through average pooling and concatenated in the last dimension. The fused features are:
hf=cat([avgpool(h1),avgpool(h2),...,avgpool(hL)])h f =cat([avgpool(h 1 ),avgpool(h 2 ),...,avgpool(h L )])
然后,构建Softmax分类器完成文本情感的分类:Then, build a Softmax classifier to complete the classification of text sentiment:
最后,采用反向传播算法训练模型,以交叉熵损失函数作为训练过程中的损失函数,通过优化损失函数来优化模型,表达式为:Finally, the back propagation algorithm is used to train the model, and the cross entropy loss function is used as the loss function in the training process. The model is optimized by optimizing the loss function. The expression is:
在以上步骤完成之后,最终会得到一个可用于微博情感分类的模型,将模型保存后即可使用。给定一个微博文本,输入到该模型后即可以得到该模型的情感倾向。After completing the above steps, you will eventually get a model that can be used for Weibo sentiment classification. After saving the model, you can use it. Given a Weibo text, you can get the sentiment tendency of the model by inputting it into the model.
图2为本发明的系统模型图,下面结合附图进行说明,包括以下几个模块:FIG2 is a system model diagram of the present invention, which is described below in conjunction with the accompanying drawings and includes the following modules:
模块一:微博特征提取模块。使用两种不同维度的预训练BERT编码器将微博文本转换为张量格式的词级、句级特征;使用社交信息构建微博关系网络,利用LINE将微博关系网络中的微博节点嵌入为关系特征。Module 1: Weibo feature extraction module. Use two different dimensions of pre-trained BERT encoders to convert Weibo text into word-level and sentence-level features in tensor format; use social information to build a Weibo relationship network, and use LINE to embed Weibo nodes in the Weibo relationship network as relationship features.
模块二:微博特征交互模块。构建双通道、多卷积核的CNN词级特征融合网络对词级特征进行交互融合;使用关系特征构建微博相似度矩阵,使用微博相似度矩阵指引微博文本词级、句级特征的交互;构建Attention网络对交互后的词级特征、交互后的句级特征、关系特征进行第一次融合,得一次融合特征。Module 2: Weibo feature interaction module. Construct a dual-channel, multi-convolution kernel CNN word-level feature fusion network to interactively fuse word-level features; use relationship features to construct a Weibo similarity matrix, and use the Weibo similarity matrix to guide the interaction of word-level and sentence-level features of Weibo texts; construct an Attention network to perform the first fusion of the interactive word-level features, sentence-level features, and relationship features to obtain a fusion feature.
模块三:微博特征融合与情感分类模块。对一次融合特征、交互后的词级特征、交互后的句级特征进行堆叠后进行权重分配;构建交互融合网络对进行权重分配后的特征进行二次融合;以交叉熵损失函数作为训练损失函数,使用反向传播算法训练模型,得到微博情感分析模型,对微博文本进行情感分类。Module 3: Weibo feature fusion and sentiment classification module. The first fusion feature, the word-level feature after interaction, and the sentence-level feature after interaction are stacked and weighted; an interactive fusion network is constructed to perform secondary fusion of the features after weight assignment; the cross entropy loss function is used as the training loss function, and the back propagation algorithm is used to train the model to obtain the Weibo sentiment analysis model and perform sentiment classification on the Weibo text.
可选的,模块一具体包括:Optionally, module one specifically includes:
(1)微博文本和关系特征的提取。包括微博文本特征的提取和微博关系特征的提取。(1) Extraction of microblog text and relation features. This includes the extraction of microblog text features and the extraction of microblog relation features.
设所有微博表示为C={c1,c2,c3,...,cm}(m为数据集中微博的数量)。对于任意微博ci,其包含微博文本si和微博交互特征xi(包括用户关注关系用户提及主题标签)。Suppose all microblogs are represented by C = {c 1 ,c 2 ,c 3 ,...,c m } (m is the number of microblogs in the dataset). For any microblog c i , it contains the microblog text s i and the microblog interaction features x i (including the user follow relationship User Mentions Hashtags ).
(1.1)微博文本特征的提取。包含词级、句级特征的提取。为了获取文本中的局部语义特征和全局句子语境特征,拟同时从文本中提取词级特征和句子级的特征以获得词汇和语境情感特征。采用2种不同维度的预训练BERT模型来完成这项工作。具体来讲,对于微博文本si,将BERT模型中最后一个隐藏层的状态的特征(l为句子的长度)作为词级的表示特征。将BERT模型输出的最后一个隐藏层的[CLS]特征作为句级的表示特征。则经过BERT编码器,输出的文本特征可以表示为:(1.1) Extraction of microblog text features. This includes the extraction of word-level and sentence-level features. In order to obtain local semantic features and global sentence context features in the text, it is planned to extract word-level features and sentence-level features from the text at the same time to obtain vocabulary and context sentiment features. Two pre-trained BERT models with different dimensions are used to complete this task. Specifically, for the microblog text s i , the features of the state of the last hidden layer in the BERT model are extracted. (l is the length of the sentence) as the word-level representation feature. The [CLS] feature of the last hidden layer output by the BERT model As a sentence-level representation feature, after the BERT encoder, the output text feature can be expressed as:
其中,d0、d1为两个BERT预训练模型的维度,表示第1维度的BERT模型输出的词级特征,表示第1种维度BERT模型输出的句级特征,Linear(X,y)的意义在于给特征X乘上过度的可训练矩阵,使输出的特征维度映射为y维。最终可以得到所有微博文本的特征表示:Among them, d 0 and d 1 are the dimensions of the two BERT pre-training models. Represents the word-level features output by the BERT model in the first dimension, It represents the sentence-level features output by the BERT model in the first dimension. The meaning of Linear(X,y) is to multiply feature X by an excessive trainable matrix so that the output feature dimension is mapped to the y dimension. Finally, the feature representation of all Weibo texts can be obtained:
(1.2)用户交互特征的提取。用户交互特征提取分两步:微博关系权重网络的构建、微博关系权重网络的嵌入。(1.2) Extraction of user interaction features. User interaction feature extraction is divided into two steps: construction of microblog relationship weight network and embedding of microblog relationship weight network.
微博关系权重网络的构建。将从用户间的关注关系用户发布微博的提及标签用户发布微博的主题标签三个方面构建微博间的关系权重网络。对于依靠主题标签(#)建立的微博关系的规则为:两个微博处于同一主题下,关系权重表示为两个微博共享相同主题的个数。对于依靠提及标签(@)建立的微微博关系的规则为:(1)微博A提及用户B后,用户B发布微博B,则微博A和微博B间有一条权重为1的边;(2)两个微博有相同提及的用户,权重为提及相同用户的个数。对于依靠用户关注建立的微博关系的规则为:(1)用户之间有关注关系且用户发布微博,则这些微博之间有一条权重为1的边;(2)同一用户发布的微博之间有一条权重为1的边。最终可以构建一个微博关系权重网络E(i,j,w)。The construction of Weibo relationship weight network. The mention tags of the user's microblog Theme tags for users to post on Weibo The relationship weight network between microblogs is constructed from three aspects. The rule for the microblog relationship established by the topic tag (#) is: if two microblogs are under the same topic, the relationship weight is expressed as the number of topics shared by the two microblogs. The rule for the microblog relationship established by the mention tag (@) is: (1) after microblog A mentions user B, user B posts microblog B, then there is an edge with a weight of 1 between microblog A and microblog B; (2) if two microblogs have the same mentioned user, the weight is the number of times the same user is mentioned. The rule for the microblog relationship established by user following is: (1) if there is a following relationship between users and users post microblogs, there is an edge with a weight of 1 between these microblogs; (2) there is an edge with a weight of 1 between microblogs posted by the same user. Finally, a microblog relationship weight network E(i,j,w) can be constructed.
微博关系权重网络的嵌入。采用LINE将一个权重网络中的节点嵌入表示为对应的低维特征向量。对于第一步建立的微博关系权重网络E(i,j,w),微博节点vi和vj,其可通过最小化嵌入损失函数O来获得权重网络中每个微博节点的嵌入表示。Embedding of Weibo relation weighted network. LINE is used to embed nodes in a weighted network as corresponding low-dimensional feature vectors. For the Weibo relation weighted network E(i,j,w) established in the first step, Weibo nodes v i and v j , the embedding representation of each Weibo node in the weighted network can be obtained by minimizing the embedding loss function O.
其中,(i,j,wi,j)表示微博i和微博j之间有条权重为wi,j的边,表示微博i的低维嵌入向量,表示当视为上下文的向量。生成微博关系嵌入表示为微博数量为m,微博节点嵌入维度是dr。其中,没有参与到社交互动的微博节点填充为相同维度的噪声向量得到m个微博节点的嵌入矩阵表示为 Among them, (i,j,wi ,j ) indicates that there is an edge with weight wi ,j between Weibo i and Weibo j. represents the low-dimensional embedding vector of Weibo i, Indicates when The generated microblog relationship embedding is represented as The number of microblogs is m, and the embedding dimension of microblog nodes is d r . Among them, microblog nodes that do not participate in social interactions are filled with noise vectors of the same dimension The embedding matrix of m microblog nodes is expressed as
可选的,模块二具体包括:Optional, Module 2 specifically includes:
(1)词级特征融合网络。使用多个不同大小的卷积核的双通道的CNN提取词级高阶特征。具体地,对于微博文本si提取到的词级特征向量采用CNN提取到的词级特征输出表示为:(1) Word-level feature fusion network. A dual-channel CNN with multiple convolution kernels of different sizes is used to extract word-level high-order features. Specifically, for the word-level feature vector extracted from the microblog text si Output of word-level features extracted by CNN It is expressed as:
y={y1,y2,...,yL}y={y 1 ,y 2 ,...,y L }
其中yi表示第i个卷积核的输出,为两个BERT输出到卷积双通道的堆叠,Wl和bl分别表示第l个卷积核的权重矩阵和偏置,表示卷积操作;表示在一个包含h个词的窗口中,第l个卷积核的输出。接着,通过最大池化层找出文本中的重要部分p并在最后一维度连接所有特征。得到最终的词级特征表示为:Where yi represents the output of the i-th convolution kernel, is a stack of two BERT outputs to the convolution dual channel, W l and b l represent the weight matrix and bias of the lth convolution kernel respectively, Represents the convolution operation; represents the output of the lth convolution kernel in a window containing h words. Next, the important part p in the text is found through the maximum pooling layer and all features are connected in the last dimension. The final word-level feature is obtained It is expressed as:
p=cat([maxpool(y1),maxpool(y2),...,maxpool(yL)])p=cat([maxpool(y 1 ),maxpool(y 2 ),...,maxpool(y L )])
为了维度的统一,将词级特征p映射为dr维,即表示为In order to unify the dimensions, the word-level feature p is mapped to d r dimensions, that is, Expressed as
ps=Linear(p,dr) ps =Linear(p, dr )
句子级特征的提取。将第一个BERT的句子级特征用作整体句子特征。即将句子特征也映射为dr维,即表示为:Extraction of sentence-level features. The sentence-level features of the first BERT are used as the overall sentence features. It is also mapped to d r dimension, that is, It is expressed as:
(2)关系特征的引导交互。包含微博相似度矩阵的构建、词级和句级的引导交互。(2) Guided interaction of relational features. This includes the construction of microblog similarity matrix, and guided interaction at word level and sentence level.
(2.1)微博相似度矩阵的构建。使用关系特征构建立微博相似性矩阵,定义为两个节点向量相似度的归一化值。m个微博节点的嵌入矩阵表示为对于第i个微博节点嵌入得到微博相似性矩阵M为:(2.1) Construction of microblog similarity matrix. The microblog similarity matrix is constructed using relationship features and is defined as the normalized value of the similarity between two node vectors. The embedding matrix of m microblog nodes is expressed as For the i-th microblog node embedding The microblog similarity matrix M is obtained as:
h=Tanh(sumr(Mr-I(1)*vnoise)*sumr(Mr-I*vnoise))h=Tanh(sum r (M r -I (1) *v noise )*sum r (M r -I*v noise ))
Mcorr=ReLU(Tanh((h·hT-I)*10))M corr =ReLU(Tanh((h·h T -I)*10))
其中||vi||F为向量vi的Frobenius范数,·为矩阵乘法,*为元素乘法,sumr()表示按行对矩阵进行求和。I(1)为全1矩阵,I为单位矩阵。Mcorr为修正矩阵,主要是为了修正随机填充的微博节点的填充向量vnoise带来的微博文本相似性噪声。Tanh和ReLU为激活函数,如下:Where || vi || F is the Frobenius norm of vector vi , · is matrix multiplication, * is element multiplication, and sum r () represents summing the matrix by row. I (1) is a matrix of all 1s, and I is the identity matrix. M corr is a correction matrix, which is mainly used to correct the microblog text similarity noise caused by the filling vector v noise of the randomly filled microblog nodes. Tanh and ReLU are activation functions, as follows:
ReLU(x)=max(0,x)ReLU(x)=max(0,x)
(2.2)词级和句级的引导交互。对于m个微博,交互后词级特征交互后句级特征输出分别为:(2.2) Guided interaction at word level and sentence level. For m microblogs, the word-level features after interaction Post-interaction sentence-level features The outputs are:
(3)特征的第一次融合。构建Attention网络的对提取到的交互特征、交互后词级特征、交互后句级特征进行第一次融合。具体地,以微博关系特征Mr为引导查询(Q),以交互后的句子级特征hcls为键值(K),以词级特征hs为值(V),输入到自注意机制中获取引导后的微博文本的一次融合特征:(3) The first fusion of features. The Attention network is constructed to perform the first fusion of the extracted interaction features, word-level features after interaction, and sentence-level features after interaction. Specifically, the microblog relationship feature Mr is used as the guided query (Q), the sentence-level feature hcls after interaction is used as the key value (K), and the word-level feature hs is used as the value (V). These are input into the self-attention mechanism to obtain the first fusion feature of the guided microblog text:
hatt=SelfAtt(Q=Mr,K=hs,V=hcls)h att = SelfAtt (Q = M r , K = h s , V = h cls )
其中,dk是键值向量K的维度。Q、K、V为对应的查询、键值、值向量。Where d k is the dimension of the key-value vector K. Q, K, V are the corresponding query, key-value, and value vectors.
可选的,模块三具体包括:Optional, Module 3 specifically includes:
(1)动态加权。将一次融合特征hatt、交互后词级特征hcls、交互后句级特征hs进行堆叠动态加权。得加权后特征为:(1) Dynamic weighting. The first fusion feature h att , the word-level feature after interaction h cls , and the sentence-level feature after interaction h s are stacked and dynamically weighted. The weighted feature is:
hstack=Stack([hs,hatt,hcls])h stack =Stack([h s ,h att ,h cls ])
hc=(hstack)T*[a1,a2,a3]h c =(h stack ) T *[a 1 ,a 2 ,a 3 ]
其中,为堆叠后的特征,[a1,a2,a3]为3个特征的动态加权系数。in, is the stacked feature, and [a 1 ,a 2 ,a 3 ] is the dynamic weighting coefficient of the three features.
(2)交互融合网络。类似于词级特征特征融合网络,使用CNN进行对加权特征进行第二次融合,得到融合后的特征h为:(2) Interactive fusion network. Similar to the word-level feature fusion network, CNN is used to perform a second fusion of weighted features, and the fused feature h is:
h={h1,h2,...,hL}h={h 1 ,h 2 ,...,h L }
接着,通过平均池化获得交互后的特征并在最后一维度拼接。得到融合特征为Next, the interactive features are obtained through average pooling and concatenated in the last dimension. The fused features are obtained as
hf=cat([avgpool(h1),avgpool(h2),...,avgpool(hL)])h f =cat([avgpool(h 1 ),avgpool(h 2 ),...,avgpool(h L )])
(3)微博情感分类。构建Softmax分类器完成文本情感的分类:(3) Microblog sentiment classification. Construct a Softmax classifier to complete the text sentiment classification:
其中,num_class为微博文本对应的情感类别。Among them, num_class is the sentiment category corresponding to the microblog text.
采用反向传播算法训练模型,以交叉熵损失函数作为训练过程中的损失函数,通过优化损失函数来优化模型,表达式为:The back propagation algorithm is used to train the model, and the cross entropy loss function is used as the loss function in the training process. The model is optimized by optimizing the loss function. The expression is:
其中,J(w,b)为样本整体的损失,m为样本数量,y(i)、分别为样本的真实样本概率分布和预测样本概率分布。λ为L2正则化的系数。Among them, J(w,b) is the loss of the whole sample, m is the number of samples, y (i) , are the true sample probability distribution and predicted sample probability distribution of the sample respectively. λ is the coefficient of L2 regularization.
最后说明的是,以上实施例仅用以说明本发明的技术方案而非限制,尽管参照较佳实施例对本发明进行了详细说明,本领域的普通技术人员应当理解,可以对本发明的技术方案进行修改或者等同替换,而不脱离本技术方案的宗旨和范围,其均应涵盖在本发明的权利要求范围当中。Finally, it should be noted that the above embodiments are only used to illustrate the technical solution of the present invention rather than to limit it. Although the present invention has been described in detail with reference to the preferred embodiments, those skilled in the art should understand that the technical solution of the present invention can be modified or replaced by equivalents without departing from the purpose and scope of the technical solution, which should be included in the scope of the claims of the present invention.
Claims (7)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202211602922.0A CN116244429A (en) | 2022-12-13 | 2022-12-13 | Microblog Sentiment Analysis Method Based on Multi-level Feature Interaction Fusion Guided by Social Relations |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202211602922.0A CN116244429A (en) | 2022-12-13 | 2022-12-13 | Microblog Sentiment Analysis Method Based on Multi-level Feature Interaction Fusion Guided by Social Relations |
Publications (1)
Publication Number | Publication Date |
---|---|
CN116244429A true CN116244429A (en) | 2023-06-09 |
Family
ID=86632081
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202211602922.0A Pending CN116244429A (en) | 2022-12-13 | 2022-12-13 | Microblog Sentiment Analysis Method Based on Multi-level Feature Interaction Fusion Guided by Social Relations |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN116244429A (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN118644869A (en) * | 2024-08-15 | 2024-09-13 | 贵州白山云科技股份有限公司 | Detection method and device for interactive recognition of intention based on dual-channel attention network |
-
2022
- 2022-12-13 CN CN202211602922.0A patent/CN116244429A/en active Pending
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN118644869A (en) * | 2024-08-15 | 2024-09-13 | 贵州白山云科技股份有限公司 | Detection method and device for interactive recognition of intention based on dual-channel attention network |
CN118644869B (en) * | 2024-08-15 | 2024-11-08 | 贵州白山云科技股份有限公司 | Method and device for detecting interaction recognition intention based on dual-channel attention network |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US11631007B2 (en) | Method and device for text-enhanced knowledge graph joint representation learning | |
Liu et al. | Bidirectional LSTM with attention mechanism and convolutional layer for text classification | |
CN111291181B (en) | Representation learning for input classification via topic sparse self-encoder and entity embedding | |
Vadicamo et al. | Cross-media learning for image sentiment analysis in the wild | |
CN109446331B (en) | Text emotion classification model establishing method and text emotion classification method | |
CN111382565B (en) | Method and system for extracting emotion-cause pairs based on multi-label | |
Zobeidi et al. | Opinion mining in Persian language using a hybrid feature extraction approach based on convolutional neural network | |
Bedi et al. | CitEnergy: A BERT based model to analyse Citizens’ Energy-Tweets | |
CN111325029A (en) | A text similarity calculation method based on deep learning ensemble model | |
CN114936277B (en) | Similar question matching method and user similar question matching system | |
CN112800225B (en) | Microblog comment emotion classification method and system | |
CN111814453A (en) | Fine-grained sentiment analysis method based on BiLSTM-TextCNN | |
CN114579741B (en) | GCN-RN aspect emotion analysis method and system for fusing syntax information | |
CN116883723A (en) | A compositional zero-shot image classification method based on parallel semantic embedding | |
CN116467443A (en) | Topic identification-based online public opinion text classification method | |
CN108876643A (en) | It is a kind of social activity plan exhibition network on acquire(Pin)Multimodal presentation method | |
CN116049393A (en) | A GCN-based aspect-level text sentiment classification method | |
CN116244429A (en) | Microblog Sentiment Analysis Method Based on Multi-level Feature Interaction Fusion Guided by Social Relations | |
Kusal et al. | Multimodal text-emoji fusion using deep neural networks for text-based emotion detection in online communication | |
Meng et al. | Regional bullying text recognition based on two-branch parallel neural networks | |
Hossain et al. | TransNet: deep attentional hybrid transformer for Arabic posts classification | |
CN114281999B (en) | Personalized implicit emotion analysis method and system based on user knowledge | |
Huang | Research on sentiment classification of tourist destinations based on convolutional neural network | |
Ibrahiem et al. | Convolutional neural network multi-emotion classifiers | |
Zhi | Mental health analysis for college students based on pattern recognition and reinforcement learning |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
SE01 | Entry into force of request for substantive examination |