
Journal of Shandong University (Natural Science) ›› 2014, Vol. 49 ›› Issue (11): 68-73. doi: 10.6040/j.issn.1671-9352.3.2014.025

• Article

Identification and extraction of key sentiment sentences in text based on SVM and RNN

LIU Ming, ZAN Hong-ying, YUAN Hui-bin

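  1. School of Information Engineering, Zhengzhou University, Zhengzhou 450001, Henan, China
  • Received: 2014-08-28 Revised: 2014-10-21 Online: 2014-11-20 Published: 2014-11-25
  • About the author: LIU Ming (1990- ), male, master's student; research interests: natural language processing and deep learning. E-mail: liuming@gs.zzu.edu.cn
  • Supported by:
    National Natural Science Foundation of China (61402419, 60970083, 61272221); National Social Science Foundation of China (14BYY096); National High Technology Research and Development Program of China ("863" Program) (2012AA011101); Science and Technology Research Program of the Henan Provincial Department of Science and Technology (132102210407); Basic Research Program of the Henan Provincial Department of Science and Technology (142300410231, 142300410308); Key Science and Technology Research Program of the Henan Provincial Department of Education (12B520055, 13B520381)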

Key sentiment sentence prediction using SVM and RNN

LIU Ming, ZAN Hong-ying, YUAN Hui-bin   

  1. School of Information Engineering, Zhengzhou University, Zhengzhou 450001, Henan, China
  • Received:2014-08-28 Revised:2014-10-21 Online:2014-11-20 Published:2014-11-25

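Abstract: The sentiment orientation of a text depends largely on the key sentences in it that carry strong sentiment, and correctly identifying these key sentiment sentences helps improve sentiment classification of the whole document. Traditional rule-based sentiment analysis has the advantage that its sentiment lexicons and rules are expressed precisely, but its coverage is poor; statistical methods show the opposite trade-off. A support vector machine (SVM) and a recursive neural network (RNN) are each used to build a classifier. The whole document and each individual sentence are then given a binary sentiment classification, and the classification results are compared and voted on to determine the key sentiment sentences in the document. Sentence-level sentiment features include not only traditional grammatical information such as sentiment words and negation words but also statistical information from word vectors in deep learning, while macro-level information such as sentence pattern and position is extracted as document-level features. Results from participation in Task 1 of the COAE 2014 evaluation show that the method reaches a micro-averaged F1 of 0.388, the highest level among comparable evaluation systems.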

Keywords: recursive neural network, machine learning, sentiment orientation, RNN, deep learning

Abstract: Key sentiment sentences play an important role in determining the sentiment of a text, so identifying them correctly improves document-level sentiment classification. Existing approaches are mainly rule-based or statistical: rule-based methods achieve high accuracy but low coverage, whereas statistical methods show the opposite trade-off. This paper introduces a framework that combines a recursive neural network (RNN) with a support vector machine (SVM) to predict sentiment distributions in texts. The sentence-level features include not only grammatical information such as sentiment words and negation words but also statistical information such as word vectors from deep learning. Document-level features such as sentence pattern and position are also used. By combining SVM and RNN, the method outperforms traditional approaches. The result from COAE2014 Task 1 shows that our method achieves a MicroF1 value of 0.388, higher than the average level.

Key words: recursive neural network, machine learning, RNN, deep learning, sentiment analysis
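
The abstract says that sentence-level and document-level binary classifications are compared and voted on to pick the key sentiment sentences, but it does not give the voting rule. The following is a minimal sketch of one possible reading, not the authors' procedure. It assumes that each sentence already has a polarity label (0 or 1) from two classifiers, that the document has a single polarity label, and that a sentence counts as key only when both classifiers agree with the document label. The function name `select_key_sentences` and the unanimity rule are hypothetical.

```python
from typing import List


def select_key_sentences(doc_label: int,
                         svm_labels: List[int],
                         rnn_labels: List[int]) -> List[int]:
    """Return indices of sentences whose two predicted labels both match the document label.

    Assumption: unanimous agreement is the voting rule. The abstract does not
    specify the actual rule used in the paper.
    """
    if len(svm_labels) != len(rnn_labels):
        raise ValueError("both classifiers must label the same sentences")

    key_indices = []
    for i, (svm_label, rnn_label) in enumerate(zip(svm_labels, rnn_labels)):
        # A sentence is kept only if both classifiers and the document agree.
        if svm_label == rnn_label == doc_label:
            key_indices.append(i)
    return key_indices
```

For example, with a positive document (`doc_label=1`) and per-sentence labels `[1, 0, 1, 1]` and `[1, 1, 0, 1]`, only sentences 0 and 3 are selected. A looser rule, such as requiring only one classifier to agree, would trade precision for recall.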

CLC Number:

  • TP391
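
The reported score is a micro-averaged F1. As a reminder of what that measure does in general, the sketch below pools true positives, false positives, and false negatives over all units (for example, documents) before computing precision and recall. The input layout, a list of `(tp, fp, fn)` tuples, and the function name `micro_f1` are assumptions for illustration; the exact scoring protocol of COAE 2014 Task 1 is not described on this page.

```python
from typing import Iterable, Tuple


def micro_f1(counts: Iterable[Tuple[int, int, int]]) -> float:
    """Micro-averaged F1 from per-unit (tp, fp, fn) counts.

    Counts are summed across all units first, so larger units weigh more
    than they would under macro-averaging.
    """
    tp = fp = fn = 0
    for unit_tp, unit_fp, unit_fn in counts:
        tp += unit_tp
        fp += unit_fp
        fn += unit_fn

    if tp == 0:
        # No correct predictions: precision and recall are both zero or undefined.
        return 0.0

    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)
```

With counts `[(2, 1, 1), (1, 0, 2)]` the pooled totals are tp=3, fp=1, fn=3, giving precision 0.75, recall 0.5, and a micro-averaged F1 of 0.6.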