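Abstract:
With the continued development of information technology and social media, user sentiment analysis plays an increasingly important role in public opinion monitoring, information prediction, and product evaluation. However, manual labels for supervised learning are hard to obtain, and unsupervised learning lacks label guidance. This paper therefore builds a semi-supervised sentiment analysis model based on sociological theory. The model consists of two parts: label addition and sentiment analysis. The label addition part first builds the UR-S model on two accepted sociological theories, sentiment consistency and emotional contagion, and then refines it with user relationship strength and text similarity to obtain the TRS-SAT model, which increases the number of labels. The sentiment analysis part combines the TRS-SAT model with a convolutional neural network, uses the network to mine the deep connections between the feature set and the sentiment labels, and builds a semi-supervised learning model that improves sentiment analysis performance. Experiments show that, compared with a semi-supervised support vector machine model, the proposed semi-supervised sentiment analysis model based on user relationship strength and deep learning improves accuracy, recall, and F value by 11.40%, 5.90%, and 8.65%, respectively; compared with a convolutional neural network model, the improvements are 4.12%, 4.17%, and 4.14%, respectively. These results indicate that the model can provide a good theoretical basis for public opinion analysis and user decision-making, and that it is innovative and practical.
Keywords: user relationship strength; semi-supervised learning; deep learning; convolutional neural network; sentiment analysis; text similarity
DOI: 10.11918/j.issn.0367-6234.201809214
Classification code: TP391
Document code: A
Funding: National Natural Science Foundation of China (71502125)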
|
A semi-supervised short text sentiment analysis model based on social relationship strength |
JIN Zhigang, YANG Yang
|
(School of Electrical and Information Engineering, Tianjin University, Tianjin 300072, China)
|
Abstract:
With the development of information technology and social media, user sentiment analysis plays an increasingly important role in public opinion monitoring, information prediction, and product evaluation. However, collecting sufficient manual sentiment labels for supervised learning remains difficult and costly, and unsupervised learning lacks label guidance. Therefore, this paper establishes a semi-supervised sentiment analysis model based on sociological theory, which is divided into two parts: label addition and sentiment analysis. First, a UR-S (User Relationship using Social relations) model is built, inspired by the sociological theories of sentiment consistency and emotional contagion. Then, building on UR-S, a TRS-SAT (Text Relationship Strength using Social relations, user Attribute and Text similarities) model is established to add labels. Finally, the TRS-SAT model and a CNN (convolutional neural network) are combined to construct the SA-SRS-CNN (Sentiment Analysis using Social Relationship Strength and Convolutional Neural Network) model, which uses the CNN to mine the deep connection between the feature set and the sentiment labels and thereby improve sentiment analysis performance. Experiments show that the accuracy, recall, and F value of the proposed model are 11.40%, 5.90%, and 8.65% higher, respectively, than those of a semi-supervised SVM, and 4.12%, 4.17%, and 4.14% higher, respectively, than those of a CNN, which suggests that the model is innovative and practical and can provide a good theoretical basis for public opinion analysis.
Key words: social relationship strength; semi-supervised learning; deep learning; convolutional neural network; sentiment analysis; text similarity
Received: 2018-09-30
About the author: JIN Zhigang (born 1972), male, professor, doctoral supervisor
Corresponding author: JIN Zhigang, zgjin@tju.edu.cn
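
The abstract does not give the formulas behind UR-S or TRS-SAT, so the following is only a minimal sketch of the general idea of the label-addition step, not the authors' method. It assumes that an unlabeled post can borrow a label from labeled posts, weighted by the relationship strength between the two users multiplied by the similarity of the two texts. Every name here (`cosine_similarity`, `add_labels`, `strength`, `threshold`) is hypothetical, bag-of-words cosine similarity stands in for whatever text similarity the paper uses, and a weight of 1.0 for posts by the same user is an assumed stand-in for sentiment consistency.

```python
from collections import Counter
from math import sqrt


def cosine_similarity(a: str, b: str) -> float:
    """Cosine similarity of bag-of-words vectors for whitespace-tokenized texts."""
    va, vb = Counter(a.split()), Counter(b.split())
    dot = sum(va[t] * vb[t] for t in va.keys() & vb.keys())
    norm = sqrt(sum(c * c for c in va.values())) * sqrt(sum(c * c for c in vb.values()))
    return dot / norm if norm else 0.0


def add_labels(labeled, unlabeled, strength, threshold=0.3):
    """Assign pseudo-labels to unlabeled posts (illustrative assumption only).

    labeled:   list of (user, text, label), label is +1 or -1
    unlabeled: list of (user, text)
    strength:  dict mapping frozenset({u, v}) to a relationship strength in [0, 1]
    threshold: hypothetical minimum share of the weighted vote needed to add a label
    """
    added = []
    for user, text in unlabeled:
        score = 0.0  # signed weighted vote
        total = 0.0  # sum of weights
        for l_user, l_text, label in labeled:
            # Same user: assumed full weight (sentiment consistency).
            # Different users: weight by relationship strength (emotional contagion).
            if user == l_user:
                w_user = 1.0
            else:
                w_user = strength.get(frozenset((user, l_user)), 0.0)
            w = w_user * cosine_similarity(text, l_text)
            score += w * label
            total += w
        # Skip posts with no evidence, a tied vote, or a vote below the cutoff.
        if total > 0 and score != 0 and abs(score) / total >= threshold:
            added.append((user, text, 1 if score > 0 else -1))
    return added


# Toy data, invented purely for illustration.
labeled = [("alice", "great phone love it", 1),
           ("bob", "terrible battery hate it", -1)]
unlabeled = [("carol", "love this great phone"),
             ("erin", "weather report today")]
strength = {frozenset(("carol", "alice")): 0.9}
print(add_labels(labeled, unlabeled, strength))
```

In a pipeline like the one the abstract describes, the enlarged labeled set would then be passed to the classifier (a CNN in the paper). Posts with no related labeled neighbor, such as the last toy post, simply stay unlabeled.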