Multi-dimensional sentiment classification of microblogs based on emoticons and short texts
Authors:

Zhao Xiaofang (赵晓芳), Jin Zhigang (金志刚)

Affiliation:

(School of Electrical and Information Engineering, Tianjin University, Tianjin 300072, China)

Author biographies:

Zhao Xiaofang (b. 1990), female, Ph.D. candidate; Jin Zhigang (b. 1972), male, professor, doctoral supervisor

Corresponding author:

Jin Zhigang, zgjin@tju.edu.cn

CLC number:

TP391

Fund project:

National Natural Science Foundation of China (71502125)

    Abstract:

    Emoticons have become an important part of online language and are one of the main features used in analyzing sentiment on social media. Current methods for analyzing emoticons in social media focus mostly on Emoji and do not analyze the sentiment tendency of kaomoji (character-based emoticons). To capture the multi-dimensional sentiment of Chinese social media and to analyze group sentiment trends on hot topics, this paper takes microblogs as an example and proposes a new multi-dimensional sentiment classification method that fuses emoticons with short texts. In this framework, deep learning models analyze the text-and-Emoji combination and the kaomoji of a microblog sentence separately, computing seven sentiment intensities for each part and mining the deep correlations between each part and the sentiment labels. A computational model is then designed to reflect the multi-dimensional sentiment attributes of a sentence, enabling detection of its multi-dimensional sentiment intensity. Experiments were carried out on the NLPCC2014 dataset and on a crawled dataset of microblogs containing kaomoji. The results show that sentiment classification performs best when the proportions of the text-and-Emoji combination and the kaomoji are 0.6 and 0.4, respectively, and that the classification performance metrics for sentences containing kaomoji are consistently higher than those for sentences without kaomoji. This indicates that fusing emoticons with short texts effectively improves the accuracy of sentiment detection. The method supports a finer-grained analysis of group sentiment trends and offers a new approach to sentiment analysis of Chinese social media.
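
The abstract reports only the proportions 0.6 and 0.4 and does not give the form of the computational model, so the following Python sketch is an illustration under stated assumptions rather than the authors' method. It assumes the two parts are combined by a plain weighted sum of their seven intensity values, that a sentence without character-based emoticons (kaomoji) falls back to its text-and-Emoji scores, and that the seven labels are the ones listed in `EMOTIONS`. The names `EMOTIONS`, `fuse_intensities`, and `dominant_emotion` are hypothetical.

```python
from typing import List, Optional, Sequence

# Assumed label set: the abstract says "seven sentiment intensities"
# but does not name the seven categories.
EMOTIONS = ("happiness", "like", "anger", "sadness", "fear", "disgust", "surprise")


def fuse_intensities(
    text_emoji: Sequence[float],
    kaomoji: Optional[Sequence[float]] = None,
    w_text: float = 0.6,
    w_kaomoji: float = 0.4,
) -> List[float]:
    """Combine two seven-dimensional intensity vectors by a weighted sum.

    The linear form is an assumption; only the 0.6 / 0.4 proportions
    come from the abstract.
    """
    if len(text_emoji) != len(EMOTIONS):
        raise ValueError("text_emoji must have one value per emotion")
    if kaomoji is None:
        # Assumed fallback for sentences with no character-based emoticon.
        return list(text_emoji)
    if len(kaomoji) != len(EMOTIONS):
        raise ValueError("kaomoji must have one value per emotion")
    return [w_text * t + w_kaomoji * k for t, k in zip(text_emoji, kaomoji)]


def dominant_emotion(intensities: Sequence[float]) -> str:
    """Return the label with the highest fused intensity."""
    best = max(range(len(intensities)), key=lambda i: intensities[i])
    return EMOTIONS[best]


if __name__ == "__main__":
    # Made-up scores for one sentence, for illustration only.
    text_scores = [0.50, 0.20, 0.05, 0.10, 0.05, 0.05, 0.05]
    kaomoji_scores = [0.10, 0.70, 0.00, 0.10, 0.00, 0.00, 0.10]
    fused = fuse_intensities(text_scores, kaomoji_scores)
    print(dominant_emotion(fused), fused)
```

Keeping the full fused vector, instead of only the top label, preserves the multi-dimensional view of a sentence that the abstract describes.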

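Cite this article

Zhao Xiaofang, Jin Zhigang. Multi-dimensional sentiment classification of microblog based on emoticons and short texts [J]. Journal of Harbin Institute of Technology, 2020, 52(5): 113. DOI:10.11918/201907004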

History
  • Received: 2019-07-01
  • Published online: 2020-02-08