Journal of Shandong University (Natural Science), 2015, Vol. 50, Issue (07): 71-75. doi: 10.6040/j.issn.1671-9352.3.2014.108
桑乐园, 徐新峰, 张婧, 黄德根
SANG Le-yuan, XU Xin-feng, ZHANG Jing, HUANG De-gen
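Abstract: Determining the polarity of new sentiment words in microblogs is a basic task in sentiment analysis research; its goal is to classify new words by sentiment. To address polarity determination, a new algorithm for computing the similarity of feature vectors is proposed. The method first represents new sentiment words and existing sentiment words as feature vectors, with feature weights computed by pointwise mutual information (PMI). It then uses the generalized Jaccard coefficient to compute the similarity between a new sentiment word and each sentiment word in the existing word sets for the three polarities; the sum of the similarities within a set is taken as the relevance between the new word and that set. Finally, the polarity of the new word is determined from the differences between its relevance to the three polarity word sets. Experimental results show that the F-score obtained by the generalized-Jaccard-based polarity determination algorithm is two percentage points higher than the best result among the teams participating in COAE 2014.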
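The three steps in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract gives no formulas, so the code assumes the Tanimoto form of the generalized Jaccard coefficient, x·y / (|x|² + |y|² − x·y), assumes PMI is estimated from raw co-occurrence counts, and replaces the unspecified "relevance difference" rule with a simple margin between the two highest relevance scores. All function names, the polarity labels, and the `margin` and `fallback` parameters are hypothetical.

```python
import math


def pmi(joint, count_w, count_f, total):
    """PMI of a word and a context feature, estimated from raw counts (assumed)."""
    if joint == 0:
        return 0.0
    return math.log2(joint * total / (count_w * count_f))


def generalized_jaccard(x, y):
    """Generalized Jaccard (Tanimoto form, assumed) of two sparse dict vectors."""
    dot = sum(v * y.get(k, 0.0) for k, v in x.items())
    denom = sum(v * v for v in x.values()) + sum(v * v for v in y.values()) - dot
    return dot / denom if denom else 0.0


def relevance(new_vec, seed_vecs):
    """Relevance of a new word to one polarity set: sum of its similarities."""
    return sum(generalized_jaccard(new_vec, s) for s in seed_vecs)


def judge_polarity(new_vec, seed_sets, margin=0.0, fallback="neutral"):
    """Pick the polarity whose set is most relevant (hypothetical decision rule).

    If the top two relevance scores differ by no more than `margin`,
    return `fallback` instead of the top-scoring label.
    """
    scores = {label: relevance(new_vec, vecs) for label, vecs in seed_sets.items()}
    ranked = sorted(scores, key=scores.get, reverse=True)
    best, second = ranked[0], ranked[1]
    if scores[best] - scores[second] <= margin:
        return fallback
    return best


# Toy usage: feature weights here are made up, not PMI values from a real corpus.
seeds = {
    "positive": [{"good": 1.0}],
    "negative": [{"bad": 1.0}],
    "neutral": [{"ok": 1.0}],
}
label = judge_polarity({"good": 1.0, "ok": 0.1}, seeds)
```

In practice each vector would hold PMI weights between a word and its context features, and the seed sets would come from an existing sentiment lexicon; the margin rule is only a stand-in for whatever decision rule the paper defines.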