Journal of Shandong University (Natural Science), 2016, Vol. 51, Issue (11): 13-25. doi: 10.6040/j.issn.1671-9352.1.2015.E26

Weibo moods and propagation factors based stock prices prediction

ZHU Meng-jun, JIANG Hong-xun*, XU Wei

  1. School of Information, Renmin University of China, Beijing 100872, China
  • Received: 2015-11-14 Online: 2016-11-20 Published: 2016-11-22
  • Corresponding author: JIANG Hong-xun (b. 1974), male, Ph.D., associate professor; research interests: information systems engineering, online financial systems, social network data mining, service operations optimization, and computational intelligence. E-mail: jianghx@ruc.edu.cn
  • About the author: ZHU Meng-jun (b. 1988), female, master's student; research interests: social network data mining and service operations optimization. E-mail: vivianchu1015@gmail.com
  • Supported by:
    National Natural Science Foundation of China (71571183); Humanities and Social Sciences Foundation of the Ministry of Education of China (12YJA630046)

Abstract: Most studies of the relationship between Weibo sentiment and financial forecasting focus on text mining techniques such as pattern recognition, semantic analysis, and sentiment mining, and pay little attention to how sentiment propagates through reposts. Building on sentiment mining and semantic analysis of financial Weibo posts, we fit and predict the corresponding stock price curves with an integrated framework that covers both a model of Weibo information propagation and a sentiment-based prediction model. First, we analyze several factors in the reposting process, including the absorption of sentiment through reposts, the influence of the post content, the influence of the author, and the posting time, and use them to improve the fit of the model. Second, we analyze verified and unverified users separately and add the effect of repost counts; we find that users of different types and with different repost counts affect the stock curve with different lags. Third, we compare the fit of the model across different phases of the stock curve. For a given keyword related to the financial market, we collected more than 500,000 financial Weibo posts together with the associated user information. Experiments show that the integrated model outperforms a simple neural network model, and that verification status and repost count affect the lag differently. The fit is best when stock prices are rising, second best when they are falling, and worst when they oscillate in a stable range.

Key words: sentiment mining, stock, Weibo, prediction, propagation effect

CLC number:

  • TP391
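
The lag analysis described in the abstract can be illustrated with a small sketch. The Python below is not the authors' model: it assumes a hypothetical repost-weighted daily sentiment score (weight 1 + repost count, chosen only for illustration) and searches for the lag at which that series correlates most strongly with daily returns. All function names and the toy data are assumptions.

```python
import math


def weighted_daily_sentiment(posts):
    """Aggregate one day's posts into a single score.

    posts: list of (sentiment_score, repost_count) pairs.
    The weight 1 + repost_count is an illustrative assumption.
    """
    if not posts:
        return 0.0
    total_weight = sum(1 + reposts for _, reposts in posts)
    return sum(score * (1 + reposts) for score, reposts in posts) / total_weight


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        return 0.0
    return cov / (sd_x * sd_y)


def best_lag(sentiment, returns, max_lag):
    """Return (lag, r): the lag in days at which sentiment[t] correlates
    most strongly (by absolute value) with returns[t + lag]."""
    best = (0, pearson(sentiment, returns))
    for lag in range(1, max_lag + 1):
        r = pearson(sentiment[:-lag], returns[lag:])
        if abs(r) > abs(best[1]):
            best = (lag, r)
    return best


# Toy data: returns simply repeat the sentiment series two days later.
sentiment = [0.1, 0.5, -0.3, 0.8, -0.6, 0.2, 0.4, -0.1]
returns = [0.0, 0.0] + sentiment[:-2]
lag, r = best_lag(sentiment, returns, max_lag=3)
print(lag)  # 2
```

Running the same search separately on posts from verified and unverified users, or on posts grouped by repost count, would give one lag per group to compare, which is the kind of comparison the abstract reports.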