
Journal of Shandong University (Natural Science) ›› 2018, Vol. 53 ›› Issue (9): 23-34. doi: 10.6040/j.issn.1671-9352.1.2017.044


Dynamic discovery of authors' research interest based on the combined topic evolutional model

YU Chuan-ming1, ZUO Yu-heng1, GUO Ya-jing1, AN Lu2*

  1. School of Information and Safety Engineering, Zhongnan University of Economics and Law, Wuhan 430073, Hubei, China;
  2. School of Information Management, Wuhan University, Wuhan 430072, Hubei, China
  • Received: 2017-07-04 Online: 2018-09-20 Published: 2018-09-10
  • About the authors: YU Chuan-ming (born 1978), male, Ph.D., associate professor; research interests: data mining and business intelligence. E-mail: yucm@zuel.edu.cn *Corresponding author: AN Lu (born 1979), female, Ph.D., professor and doctoral supervisor; research interests: visual knowledge discovery. E-mail: anlu97@163.com
  • Supported by:
    National Natural Science Foundation of China (71373286); Major Project of Philosophy and Social Science Research, Ministry of Education of China (17JZD034); Young Scientists Fund of the National Natural Science Foundation of China (71603189)

Abstract: Using academic literature in the financial field as experimental data, we propose a new combined topic evolution model, author topic time-latent Dirichlet allocation (ATT-LDA) with author ranking (AR), for dynamically discovering authors' domain-specific research interests. The model yields each author's probability distribution over topics in different time slices, as well as each topic's probability distribution over words, and it takes into account how an author's position in the co-author list affects that author's research topics and their change over time. An empirical study in the financial field shows that the proposed model can effectively reveal the dynamic changes in authors' research interests.

Key words: combined topic evolution model, topic mining, topic evolution model
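
The abstract describes per-author topic distributions in each time slice, adjusted for the author's position in the co-author list. The following is only a minimal sketch of that idea and not the ATT-LDA model itself: it assumes per-paper topic distributions are already available from some topic model, treats each publication year as one time slice, and weights a paper by 1/rank for the author in byline position `rank`. The function name, the input layout, and the 1/rank weighting are all illustrative assumptions that do not come from the paper.

```python
from collections import defaultdict


def author_interest_by_slice(papers, num_topics):
    """Aggregate per-paper topic distributions into per-(author, slice)
    topic distributions, weighting each paper by byline position.

    papers: iterable of dicts with the (hypothetical) keys
        "year"    - time slice label, here one slice per year
        "authors" - author names in byline order
        "topics"  - list of num_topics probabilities for the paper
    """
    totals = defaultdict(lambda: [0.0] * num_topics)
    for paper in papers:
        for rank, author in enumerate(paper["authors"], start=1):
            weight = 1.0 / rank  # assumed weighting, not taken from the paper
            acc = totals[(author, paper["year"])]
            for k, prob in enumerate(paper["topics"]):
                acc[k] += weight * prob

    # Normalise each accumulated vector so it sums to 1.
    result = {}
    for key, acc in totals.items():
        total = sum(acc)
        result[key] = [v / total for v in acc] if total > 0 else list(acc)
    return result


# Toy data: two topics, two authors, two yearly slices.
papers = [
    {"year": 2015, "authors": ["A", "B"], "topics": [0.8, 0.2]},
    {"year": 2015, "authors": ["B", "A"], "topics": [0.2, 0.8]},
    {"year": 2016, "authors": ["A"], "topics": [0.5, 0.5]},
]
interest = author_interest_by_slice(papers, num_topics=2)
```

In this toy data, author A's 2015 distribution leans toward the first topic because A is first author on the paper dominated by that topic, while B's leans toward the second. Comparing one author's vectors across slices gives a rough picture of how that author's interests shift over time.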

CLC Number:

  • TP391