Computer Science ›› 2022, Vol. 49 ›› Issue (3): 246-254. doi: 10.11896/jsjkx.201200073

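LI Hao, ZHANG Lan, YANG Bing, YANG Hai-xiao, KOU Yong-qi, WANG Fei, KANG Yan

  1. School of Software, Yunnan University, Kunming 650504
  • Received: 2020-12-07 Revised: 2021-06-08 Online: 2022-03-15 Published: 2022-03-15
  • Corresponding author: KANG Yan (562530855@qq.com)
  • About author: (lihao707@ynu.edu.cn)
  • Supported by: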

Fine-grained Sentiment Classification of Chinese Microblogs Combining Dual Weight Mechanism and Graph Convolutional Neural Network

LI Hao, ZHANG Lan, YANG Bing, YANG Hai-xiao, KOU Yong-qi, WANG Fei, KANG Yan   

  1. School of Software, Yunnan University, Kunming 650504, China
  • Received: 2020-12-07 Revised: 2021-06-08 Online: 2022-03-15 Published: 2022-03-15
  • About author: LI Hao, born in 1970, Ph.D., professor. His main research interests include distributed computing, grid computing and cloud computing.
    KANG Yan, born in 1972, Ph.D., associate professor. Her main research interests include software engineering, system optimization, big data processing and mining.
  • Supported by:
    National Natural Science Foundation of China (61762092), Open Fund Project of Key Laboratory of Software Engineering in Yunnan Province (2020SE303), Major Science and Technology Projects in Yunnan Province (202002AB080001), Material Genetic Engineering-Calculation Software Development of Integrated Calculation Function Module Based on Metcloud (2019CLJY06) and Gene Engineering of Rare and Precious Metal Materials in Yunnan Province-R & D and Demonstration Application of High-throughput Integrated Computing and Data Analysis Technology for Rare and Precious Metal Materials (2019ZE001-1, 202002AB080001).

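Abstract: Using deep learning models and attention mechanisms for fine-grained sentiment classification of microblog text has become a research hotspot. However, existing attention mechanisms only consider the influence of words on other words and lack an effective fusion of the multiple dimensional features of a word itself (such as word meaning, part of speech and semantics). To address this problem, this paper proposes a dual weight mechanism, WDWM (Word and Dimension Weight Mechanism), and combines it with a GCN model based on the dependency parse tree. By selecting the words that carry key information in each microblog, it extracts the important dimensional features of those words and fuses their multiple dimensional features effectively, thereby capturing richer feature information. In experiments on fine-grained sentiment classification of microblogs, the model combining the dual weight mechanism and the graph convolutional neural network (WDWM-GCN) achieves an F-measure of 84.02%, which is 1.7% higher than the latest algorithm proposed in 2020. This further shows that WDWM-GCN can effectively fuse the multi-dimensional features of words and capture rich feature information. In classification experiments on the Sogou news dataset, the classification performance of the BERT model improves further after WDWM is added, which shows that WDWM clearly improves the proposed classification model.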

Key words: Dual weight mechanism, Graph convolutional neural network, Fine-grained sentiment classification, Attention mechanism

Abstract: Using deep learning models and attention mechanisms for fine-grained sentiment classification of Chinese microblogs has become a research hotspot. However, existing attention mechanisms only consider the impact of words on other words, and lack an effective integration of the various dimensional characteristics of the words themselves (such as word meaning, part of speech and semantics). To solve this problem, the paper proposes a dual weight mechanism, WDWM (word and dimension weight mechanism), and combines it with a GCN model based on the dependency parse tree, so that it can not only select the words that contain key information in each microblog, but also extract the important dimensional characteristics of those words and effectively integrate their multiple dimensional characteristics, thereby capturing richer feature information. The F-measure of the fine-grained sentiment classification model for Chinese microblogs that combines the dual weight mechanism and the graph convolutional neural network (WDWM-GCN) reaches 84.02%, which is 1.7% higher than the latest algorithm proposed at WWW in 2020. This further shows that WDWM-GCN can effectively integrate the multi-dimensional characteristics of words and capture rich feature information. In the classification experiment on the Sogou news dataset, the classification performance of the BERT model is further improved after WDWM is added, which shows that WDWM brings a clear improvement to the text classification model.

Key words: Attention mechanism, Dual weight mechanism, Fine-grained emotion classification, Graph convolutional neural network
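
The abstract does not give the formulas behind WDWM-GCN, so the following is only a minimal NumPy sketch of the general idea it describes: weight each word, weight each feature dimension, then propagate the weighted features over a dependency tree with one graph-convolution layer. The function names (`dual_weight`, `gcn_layer`, `dependency_adjacency`), the softmax scoring, the order of the two steps, and the toy dependency heads are all assumptions made for illustration and are not taken from the paper.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def dual_weight(H, w_word, w_dim):
    """Hypothetical dual weighting of token features H, shape (n_tokens, d)."""
    # Word weights: one scalar per token, normalised over the sentence.
    alpha = softmax(H @ w_word, axis=0)              # shape (n_tokens,)
    # Dimension weights: one scalar per feature dimension, normalised over d.
    beta = softmax(H.mean(axis=0) * w_dim, axis=0)   # shape (d,)
    # Scale every feature by its word weight and its dimension weight.
    return H * alpha[:, None] * beta[None, :], alpha, beta

def dependency_adjacency(heads):
    """Symmetric adjacency matrix from head indices (-1 marks the root)."""
    n = len(heads)
    A = np.zeros((n, n))
    for child, head in enumerate(heads):
        if head >= 0:
            A[child, head] = A[head, child] = 1.0
    return A

def gcn_layer(A, H, W):
    """One graph-convolution layer: ReLU(D^-1/2 (A + I) D^-1/2 H W)."""
    A_hat = A + np.eye(A.shape[0])                   # add self-loops
    d_inv_sqrt = 1.0 / np.sqrt(A_hat.sum(axis=1))
    A_norm = A_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]
    return np.maximum(A_norm @ H @ W, 0.0)

# Toy example: 4 tokens with 6-dimensional features (random stand-ins
# for real word embeddings) and an assumed dependency structure.
rng = np.random.default_rng(0)
n_tokens, d, d_out = 4, 6, 3
H = rng.normal(size=(n_tokens, d))
heads = [1, -1, 1, 2]                                # token 1 is the root
A = dependency_adjacency(heads)

H_weighted, alpha, beta = dual_weight(H, rng.normal(size=d), rng.normal(size=d))
Z = gcn_layer(A, H_weighted, rng.normal(size=(d, d_out)))
sentence_vec = Z.mean(axis=0)                        # pooled input for a classifier
```

In this sketch `alpha` plays the role of the word-level weights and `beta` the dimension-level weights; in a trained model the parameter vectors would be learned rather than drawn at random, and the pooled `sentence_vec` would feed a softmax classifier over the sentiment classes.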


