
Journal of Shandong University (Natural Science) ›› 2017, Vol. 52 ›› Issue (7): 97-103. doi: 10.6040/j.issn.1671-9352.1.2016.007


A community partitioning method for microblog social networks based on network distance and content similarity

ZHANG Zhong-jun1,2, ZHANG Wen-juan1, YU Lai-hang1,3, LI Run-chuan4,5

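  1. School of Computer Science and Technology, Zhoukou Normal University, Zhoukou 466001, Henan, China;
    2. Henan Provincial Engineering Laboratory of Agricultural Product Quality and Safety Traceability Technology, Zhoukou 466001, Henan, China;
    3. School of Computer Science and Technology, Dalian University of Technology, Dalian 116024, Liaoning, China;
    4. Henan Collaborative Innovation Center of Internet Medical and Health Services, Zhengzhou University, Zhengzhou 450000, Henan, China;
    5. Institute of Industrial Technology, Zhengzhou University, Zhengzhou 450000, Henan, China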
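  • Received: 2016-11-25 Publication date: 2017-07-20 Release date: 2017-07-07
  • About the author: ZHANG Zhong-jun (b. 1982), male, master's degree, lecturer; research interests: data mining, big data, and machine learning. E-mail: suedy521@163.com
  • Supported by:
    National Natural Science Foundation of China (U1504602); Science and Technology Research Project of the Education Department of Henan Province (14B520014); Science and Technology Program of the Science and Technology Department of Henan Province (162102310590); Key Scientific Research Project of Higher Education Institutions of Henan Province (16A520106); Education and Teaching Reform Project (J2016037)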

A community division method based on network distance and content similarity in micro-blog social networks

ZHANG Zhong-jun1,2, ZHANG Wen-juan1, YU Lai-hang1,3, LI Run-chuan4,5   

  1. School of Computer Science and Technology, Zhoukou Normal University, Zhoukou 466001, Henan, China;
    2. Henan Provincial Engineering Laboratory of Agricultural Product Quality and Safety Traceability Technology, Zhoukou 466001, Henan, China;
    3. School of Computer Science and Technology, Dalian University of Technology, Dalian 116024, Liaoning, China;
    4. Henan Collaborative Innovation Center of Internet Medical and Health Services, Zhengzhou University, Zhengzhou 450000, Henan, China;
    5. Institute of Industrial Technology, Zhengzhou University, Zhengzhou 450000, Henan, China
  • Received: 2016-11-25 Online: 2017-07-20 Published: 2017-07-07

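Abstract: Most existing community mining methods for microblog social networks rely on network structure alone and ignore the importance of the nodes' own behavior; they also cannot adapt to large-scale complex network structures and mine communities efficiently at the same time. To alleviate these problems, a community partitioning method for microblog social networks based on network distance and content similarity is proposed. The method considers the structure of the microblog social network together with the historical microblog content of its nodes, and improves the accuracy of community partitioning by analyzing historical microblog data. A modified form of the Louvain algorithm and its modularity is used so that the method can handle large-scale network data while keeping community mining efficient. Experiments show that the method mines the community structure of microblog networks efficiently, which is of great significance for academic research and commercial applications.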

Keywords: social network, modularity, microblog, community

Abstract: Most existing community mining methods for micro-blog social networks are based on network structure; they ignore the importance of node behavior and cannot guarantee both adaptability to large-scale complex network structures and efficient community mining. To alleviate these problems, a new method, ABDC, is proposed for community division in micro-blog networks based on network distance and content similarity. The method considers the structure of the micro-blog social network while also taking into account the historical micro-blog content of the nodes in the network, and improves the accuracy of community division by analyzing the historical micro-blog data. The Louvain algorithm and its modularity are modified and used so that the method can handle large-scale network data and achieve high efficiency in community mining. Experiments show that the method can efficiently mine the community structure of micro-blog networks, which has great significance for academic research and business applications.

Key words: micro-blog, social network, modularity, community

CLC number: TP311
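The abstract describes combining network structure with content similarity and then optimizing a modified modularity, but it gives no formulas. The Python sketch below is only an illustration of that general idea under stated assumptions, not the authors' ABDC method: the blending parameter `alpha`, the use of cosine similarity over bag-of-words post histories, the fixed structural term of 1.0 for directly linked users, and all function names are hypothetical. It computes the standard weighted modularity of a given partition, Q = Σ_c [L_c/m − (d_c/2m)²], which is the quantity a Louvain-style algorithm would try to maximize.

```python
import math
from collections import Counter


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine similarity between two term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def blended_weights(edges, posts, alpha=0.5):
    """Weight each undirected edge by structure plus content similarity.

    Assumption (not from the paper): directly linked users get a structural
    term of 1.0, and `alpha` trades structure off against content.
    """
    bags = {node: Counter(tokens) for node, tokens in posts.items()}
    weights = {}
    for u, v in edges:
        sim = cosine_similarity(bags.get(u, Counter()), bags.get(v, Counter()))
        weights[(u, v)] = alpha * 1.0 + (1.0 - alpha) * sim
    return weights


def modularity(weights, community):
    """Weighted modularity of a partition: sum_c [L_c/m - (d_c/(2m))^2]."""
    m = sum(weights.values())  # total edge weight
    if m == 0:
        return 0.0
    strength = Counter()  # weighted degree of each node
    internal = Counter()  # edge weight inside each community (L_c)
    for (u, v), w in weights.items():
        strength[u] += w
        strength[v] += w
        if community[u] == community[v]:
            internal[community[u]] += w
    total = Counter()  # summed node strength per community (d_c)
    for node, k in strength.items():
        total[community[node]] += k
    return sum(internal[c] / m - (total[c] / (2 * m)) ** 2 for c in total)


# Toy data: two triangles joined by one bridge, with distinct topics.
edges = [("a", "b"), ("b", "c"), ("a", "c"),
         ("d", "e"), ("e", "f"), ("d", "f"), ("c", "d")]
posts = {n: ["football", "match"] for n in "abc"}
posts.update({n: ["movie", "film"] for n in "def"})

weights = blended_weights(edges, posts, alpha=0.5)
split = {n: 0 for n in "abc"} | {n: 1 for n in "def"}
print(modularity(weights, split))
```

In this toy graph the bridge between the two triangles links users with unrelated post histories, so its blended weight is lower than that of the within-triangle edges, and the two-community split scores a positive modularity. A full pipeline would replace the fixed partition with a Louvain-style search over partitions.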