JOURNAL OF SHANDONG UNIVERSITY (NATURAL SCIENCE), 2016, Vol. 51, Issue (1): 52-57. doi: 10.6040/j.issn.1671-9352.1.2015.062


Research on classification for Chinese short film reviews

MA Li-fei, MO Qian*, DU Hui   

  1. School of Computer and Information Engineering, Beijing Technology and Business University, Beijing 100048, China
  • Received: 2015-09-18; Online: 2016-01-16; Published: 2016-11-29

Abstract: Film reviews are short, so their feature matrices are sparse. To address this problem, a method that uses an ontology to expand the feature matrix is proposed. Traditional and more recent text classification methods were compared and analyzed to find an approach suited to classifying Chinese short film reviews. In the experiments reported in this paper, the decision tree outperformed SVM, Bayes, and KNN, so the decision tree classifier was then applied to the ontology-expanded feature vectors. The results show that classification of Chinese short film reviews based on ontology expansion was 3% higher than with the traditional methods, and that classification accuracy reached 90.1%.

Key words: Chinese short film reviews, classification, decision tree, ontology

CLC Number: TP391
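
The idea of expanding a sparse feature vector with ontology concepts, as summarized in the abstract, can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration: the toy ontology, the English tokens, the vocabulary, and the helper names (`expand_tokens`, `to_vector`, `density`) are hypothetical and are not taken from the paper, which does not describe its ontology or features at this level of detail. Real Chinese reviews would also need word segmentation first.

```python
from collections import Counter

# Hypothetical toy ontology: maps a surface term to broader film-domain concepts.
# The ontology actually used in the paper is not given in the abstract.
TOY_ONTOLOGY = {
    "plot": ["story"],
    "script": ["story"],
    "actor": ["cast"],
    "actress": ["cast"],
    "soundtrack": ["music"],
}


def expand_tokens(tokens, ontology):
    """Return the original tokens plus the ontology concepts linked to each token."""
    expanded = list(tokens)
    for tok in tokens:
        expanded.extend(ontology.get(tok, []))
    return expanded


def to_vector(tokens, vocabulary):
    """Build a count-based feature vector over a fixed vocabulary."""
    counts = Counter(tokens)
    return [counts[term] for term in vocabulary]


def density(vector):
    """Fraction of non-zero entries in a feature vector."""
    return sum(1 for v in vector if v) / len(vector)


# A short, already tokenized review (assumed example).
review = ["actor", "plot", "great"]
vocabulary = ["actor", "cast", "great", "music", "plot", "story"]

plain = to_vector(review, vocabulary)
expanded = to_vector(expand_tokens(review, TOY_ONTOLOGY), vocabulary)

print(plain, density(plain))
print(expanded, density(expanded))
```

In this toy case the expanded vector has more non-zero entries than the plain one, which is the sparsity reduction the abstract refers to. The expanded vectors would then be passed to a classifier such as a decision tree; that step is omitted here because the abstract gives no details of the training setup.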