Journal of Shandong University (Natural Science), 2014, Vol. 49, Issue (11): 14-21. doi: 10.6040/j.issn.1671-9352.3.2014.069

• Articles •

Sentiment classification method of Chinese Micro-blog based on semantic analysis

YANG Jia-neng1,3, YANG Ai-min2, ZHOU Yong-mei2

  1. School of Management, Guangdong University of Foreign Studies, Guangzhou 510006, Guangdong, China;
  2. Cisco School of Informatics, Guangdong University of Foreign Studies, Guangzhou 510006, Guangdong, China;
  3. Guangdong Planning and Designing Institute of Telecommunications Co., Ltd., Guangzhou 510630, Guangdong, China
  • Received: 2014-08-28 Revised: 2014-10-24 Online: 2014-11-20 Published: 2014-11-25
  • Corresponding author: YANG Ai-min (1970- ), male, professor, Ph.D.; research interests: computational linguistics and text sentiment analysis. E-mail: amyang18@163.com
  • About the author: YANG Jia-neng (1987- ), male, master's student; research interests: machine learning and text sentiment analysis. E-mail: tizzi@163.com
  • Supported by:
    National Social Science Foundation of China (12BYY045); Program for New Century Excellent Talents in University, Ministry of Education (NCET-12-0939); Science and Technology Innovation Project of the Department of Education of Guangdong Province (2013KJCX0067)

Abstract: Based on an analysis of the structural features of Chinese micro-blogs, a sentiment classification method using semantic analysis is proposed. First, an emoticon sentiment lexicon and a network-language sentiment lexicon are built. Then, using these lexicons together with dependency parsing of the micro-blog text, a sentiment expression binary tree is constructed. Finally, the sentiment strength of the text is calculated according to a set of established rules, and the sentiment polarity of the micro-blog is determined from this strength value. Experimental results show that the method is effective and that the two lexicons improve the performance of the sentiment classifier.

Key words: network language, sentiment analysis, emoticons, dependency parsing, Chinese micro-blog

CLC number: TP391
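
The pipeline outlined in the abstract (lexicon lookup, a binary sentiment expression tree, rule-based strength calculation, and classification by strength) can be illustrated with a minimal Python sketch. Everything specific in it is an assumption made for illustration and is not taken from the paper: the lexicon entries and scores, the negation and degree word lists, the three combination rules, and the names `Node`, `lookup`, `strength`, and `classify`. The tree is built by hand here; in the described method it would be derived from a dependency parse.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical lexicon entries and scores, for illustration only.
EMOTICON_LEXICON = {"[哈哈]": 1.0, "[泪]": -1.0}   # laughing / crying emoticons
NETWORK_LEXICON = {"给力": 1.0, "坑爹": -1.0}       # internet slang: "awesome" / "rip-off"
BASE_LEXICON = {"喜欢": 1.0, "失望": -1.0}          # "like" / "disappointed"
DEGREE_WORDS = {"非常": 2.0, "有点": 0.5}           # "very" / "a bit"
NEGATION_WORDS = {"不", "没"}                       # "not"


@dataclass
class Node:
    """Binary tree node: a leaf holds a word; an internal node joins two subtrees."""
    word: Optional[str] = None
    left: Optional["Node"] = None    # modifier (or first conjunct)
    right: Optional["Node"] = None   # modified expression (or second conjunct)


def lookup(word: str) -> float:
    """Return the sentiment score of a word, or 0.0 if no lexicon contains it."""
    for lexicon in (EMOTICON_LEXICON, NETWORK_LEXICON, BASE_LEXICON):
        if word in lexicon:
            return lexicon[word]
    return 0.0


def strength(node: Optional[Node]) -> float:
    """Compute sentiment strength bottom-up with three assumed rules."""
    if node is None:
        return 0.0
    if node.word is not None:                       # leaf: lexicon lookup
        return lookup(node.word)
    left, right = node.left, node.right
    if left is not None and left.word in NEGATION_WORDS:
        return -strength(right)                     # rule 1: negation flips the sign
    if left is not None and left.word in DEGREE_WORDS:
        return DEGREE_WORDS[left.word] * strength(right)  # rule 2: degree scales
    return strength(left) + strength(right)         # rule 3: otherwise add


def classify(root: Optional[Node], threshold: float = 0.0) -> str:
    """Map the strength value to a polarity label."""
    s = strength(root)
    if s > threshold:
        return "positive"
    if s < -threshold:
        return "negative"
    return "neutral"


# "不 非常 喜欢 [泪]" ("do not like it very much" plus a crying emoticon)
tree = Node(
    left=Node(left=Node(word="不"),
              right=Node(left=Node(word="非常"), right=Node(word="喜欢"))),
    right=Node(word="[泪]"),
)
print(strength(tree), classify(tree))  # -3.0 negative
```

In this sketch the emoticon contributes to the total in the same way as a lexicon word, which is one simple way to let the emoticon and network-language lexicons influence the final label; the paper's actual rules may weight them differently.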