Journal of Shandong University (Natural Science), 2015, Vol. 50, Issue 11: 67-73. doi: 10.6040/j.issn.1671-9352.0.2015.082

• Articles •

Product reviews sentiment classification in Micro-blog based on cascaded conditional random field

HE Yan-xiang1,2, LIU Jian-bo1,2, SUN Song-tao1,2, WEN Wei-dong1,2

  1. Computer School of Wuhan University, Wuhan 430072, Hubei, China;
  2. State Key Laboratory of Software Engineering, Wuhan University, Wuhan 430072, Hubei, China
  • Received: 2015-02-12 Revised: 2015-11-11 Online: 2015-11-20 Published: 2015-12-09
  • Corresponding author: LIU Jian-bo (born 1986), male, Ph.D. candidate; research interests: social computing, natural language processing, and machine learning. E-mail: ljb@whu.edu.cn
  • About the author: HE Yan-xiang (born 1952), male, Ph.D., professor; research interests: trusted software, natural language processing, distributed and parallel processing, and software engineering. E-mail: yxhe@whu.edu.cn
  • Supported by: National Natural Science Foundation of China (61303115, 61472290, 61472291); Wuhan Science and Technology Research Project (201210421135)

Abstract: Product reviews are subjective comments that consumers make about a particular product. Product reviews posted on Micro-blog are short and varied in structure. To handle these characteristics while using only the existing Micro-blog-level sentiment annotations, we propose a sentiment classification method based on a cascaded Conditional Random Field (CRF) model. Taking the theory of clausal pivot for Chinese as the theoretical basis, the sentences of a product review are divided into clauses. Features of the clause sequence within a Micro-blog post are used to train a coarse-grained CRF sentiment classification model, while features of the Chinese character sequence within each clause are used to train a fine-grained CRF sentiment classification model. Experimental results show that the proposed method outperforms traditional sentiment classification methods.

Key words: Micro-blog, sentiment analysis, theory of clausal pivot, conditional random field (CRF)
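
The two sequence levels described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: splitting clauses at punctuation is only an assumed stand-in for segmentation based on the theory of clausal pivot, and the function names and feature sets (`split_clauses`, `clause_features`, `char_features`, `build_sequences`) are hypothetical. The sketch only prepares the coarse-grained (clause-level) and fine-grained (character-level) feature sequences; training the CRF models themselves is left out.

```python
import re

# Assumption: clause boundaries are approximated by full-width and ASCII
# punctuation (comma, full stop, exclamation mark, question mark, semicolon).
CLAUSE_DELIMITERS = "[\uFF0C\u3002\uFF01\uFF1F\uFF1B,!?;]+"


def split_clauses(post):
    """Split a microblog post into non-empty clauses at punctuation marks."""
    return [c.strip() for c in re.split(CLAUSE_DELIMITERS, post) if c.strip()]


def clause_features(clauses, j):
    """Coarse-grained level: illustrative features for the j-th clause."""
    return {
        "length": len(clauses[j]),
        "position": j,
        "is_first": j == 0,
        "is_last": j == len(clauses) - 1,
    }


def char_features(clause, i):
    """Fine-grained level: illustrative features for the i-th character."""
    return {
        "char": clause[i],
        "prev": clause[i - 1] if i > 0 else "<BOS>",
        "next": clause[i + 1] if i < len(clause) - 1 else "<EOS>",
    }


def build_sequences(post):
    """Return the clauses plus one coarse and one fine feature sequence."""
    clauses = split_clauses(post)
    coarse = [clause_features(clauses, j) for j in range(len(clauses))]
    fine = [[char_features(c, i) for i in range(len(c))] for c in clauses]
    return clauses, coarse, fine
```

For a post such as `手机很好,电池不行!`, `build_sequences` yields two clauses, one coarse sequence with two feature dictionaries, and two fine sequences with four character-level dictionaries each. A list of feature dictionaries per sequence is a common input shape for CRF toolkits, which is why it is used here.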

CLC number: TP391