JOURNAL OF SHANDONG UNIVERSITY(NATURAL SCIENCE) ›› 2015, Vol. 50 ›› Issue (07): 71-75.doi: 10.6040/j.issn.1671-9352.3.2014.108


New microblog sentiment lexicon judgment based on generalized Jaccard coefficient

SANG Le-yuan, XU Xin-feng, ZHANG Jing, HUANG De-gen   

  1. School of Computer Science and Technology, Dalian University of Technology, Dalian 116024, Liaoning, China
  • Received: 2015-03-03 Online: 2015-07-20 Published: 2015-07-31

Abstract: Judging the polarity of new microblog sentiment lexicon is a basic task in sentiment analysis that aims to classify its emotion categories. This paper proposes a new approach for judging the polarity of new microblog sentiment lexicon. Feature vectors are used to represent both the new sentiment lexicon and the existing sentiment lexicon, with the weight values calculated by PMI. The similarity between the new sentiment lexicon and each candidate drawn from three sentiment lexicon sets of different polarities is computed with the generalized Jaccard coefficient, and the relativity between the new sentiment lexicon and each existing sentiment lexicon set is defined as the sum of these similarities. Finally, the relativity distance differences among the three sentiment lexicon sets are used to judge the polarity. Experimental results show that the F-score of the polarity judgment algorithm based on the generalized Jaccard coefficient is two points higher than that of the best team in COAE 2014.
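The pipeline described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes that the generalized Jaccard coefficient takes its common Tanimoto form, a·b / (|a|² + |b|² − a·b), and that feature vectors are sparse dictionaries of PMI weights. The abstract does not spell out the distance-difference rule, so the final step simply picks the set with the largest relativity as a stand-in. All function names and the toy data are hypothetical.

```python
import math


def pmi(count_xy, count_x, count_y, total):
    """Pointwise mutual information (base 2) from raw co-occurrence counts.

    Returns 0.0 when the pair never co-occurs (an assumption made here to
    keep the weights finite).
    """
    if count_xy == 0:
        return 0.0
    return math.log2(count_xy * total / (count_x * count_y))


def generalized_jaccard(a, b):
    """Tanimoto form of the generalized Jaccard coefficient for sparse vectors."""
    dot = sum(w * b.get(k, 0.0) for k, w in a.items())
    denom = sum(w * w for w in a.values()) + sum(w * w for w in b.values()) - dot
    return dot / denom if denom else 0.0


def relativity(new_vec, lexicon_vecs):
    """Sum of similarities between the new word and every word in one set."""
    return sum(generalized_jaccard(new_vec, v) for v in lexicon_vecs)


def judge_polarity(new_vec, lexicons):
    """Stand-in decision rule (assumption): choose the set with highest relativity."""
    scores = {label: relativity(new_vec, vecs) for label, vecs in lexicons.items()}
    return max(scores, key=scores.get), scores


# Toy data: each vector maps a context feature to its PMI weight.
lexicons = {
    "positive": [{"good": 1.0}],
    "negative": [{"bad": 1.0}],
    "neutral": [{"ok": 1.0}],
}
label, scores = judge_polarity({"good": 2.0}, lexicons)
```

In this toy run the new word shares a feature only with the positive set, so `label` is `"positive"`. A faithful reproduction would replace the stand-in rule with the relativity distance differences defined in the full paper.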

Key words: feature vector, PMI, distance difference, unsupervised

CLC Number: TP391