您的位置:山东大学 -> 科技期刊社 -> 《山东大学学报(理学版)》

山东大学学报 (理学版) ›› 2018, Vol. 53 ›› Issue (11): 67-77.doi: 10.6040/j.issn.1671-9352.1.2017.023

• • 上一篇    下一篇

基于重要结点的社区发现算法

王鑫1,2,左万利2,3,朱枫彤3,王英2,3*   

  1. 1.长春工程学院计算机技术与工程学院, 吉林 长春 130012;2.吉林大学符号计算与知识工程教育部重点实验室, 吉林 长春 130012;3.吉林大学计算机科学与技术学院, 吉林 长春 130012
  • 发布日期:2018-11-14
  • 作者简介:王鑫(1981— ),男,副教授,博士,研究方向为社会计算、机器学习、数据挖掘、推荐系统. E-mail:xinwangjlu@gmail.com*通信作者简介:王英(1981— ),女,副教授,博士,研究方向为社会计算、机器学习、数据挖掘. E-mail:wangying2010@jlu.edu.cn
  • 基金资助:
    国家自然科学基金资助项目(61602057);教育部重点实验室开放基金资助项目(93K172016K13);吉林省科技厅优秀青年人才基金项目(20170520059JH);广西可信软件重点实验室研究课题项目(kx201533);中国博士后基金面上项目(2017M611301)

Important-node-based community detection algorithm

WANG Xin1,2, ZUO Wan-li2,3, ZHU Feng-tong3, WANG Ying2,3*   

  1. 1. School of Computer Technology and Engineering, Changchun Institute of Technology, Changchun 130012, Jilin, China;
    2. Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, Jilin University, Changchun 130012, Jilin, China;
    3. College of Computer Science and Technology, Jilin University, Changchun 130012, Jilin, China
  • Published:2018-11-14
  • About author:国家自然科学基金资助项目(61602057);教育部重点实验室开放基金资助项目(93K172016K13);吉林省科技厅优秀青年人才基金项目(20170520059JH);广西可信软件重点实验室研究课题项目(kx201533);中国博士后基金面上项目(2017M611301)
  • Supported by:
    国家自然科学基金资助项目(61602057);教育部重点实验室开放基金资助项目(93K172016K13);吉林省科技厅优秀青年人才基金项目(20170520059JH);广西可信软件重点实验室研究课题项目(kx201533);中国博士后基金面上项目(2017M611301)

摘要: 复杂网络中内部的社区结构是复杂网络结构特征和属性特征的具体体现。首先依据模块度最大化理论计算网络的模块度矩阵的最大k特征向量矩阵;然后提出聚类中心方法,并用于求出k个社团的重要结点作为k聚类中心,利用欧几里得距离计算每一个结点到k个聚类中心的距离,将结点分配到距离聚类中心最近的社区中;最后对网络应用k-means方法进行迭代计算,得到k个社区的划分。分别在Karate Club Network和American College Football数据集上对算法进行了实验验证,实验结果表明该算法可以有效发现潜在社区,其纯度与模块度比已有的社区发现算法都有一定的提高,并且迭代次数较少,效率较高。

关键词: 复杂网络, 社区相似, k-means算法, 社区发现

Abstract: The internal community structure is the specific expression of structural features and attribute features in complex networks. First, the maximum k eigenvectors matrix of the modularity matrix of the network was calculated based on the modularity maximization theory. Then, the method of cluster centrality was proposed and used to calculate the important nodes of the k communities as k cluster centralities. The distance between each node and k cluster centralities was calculate by Euclidean distance and each node was assigned to the nearest cluster centrality of the community. Final, k-means iterative calculation method was applied into the network and the k communities divisions was eventually obtained. The algorithm was verified experimentally on both Karate Club Network dataset and American College Football dataset. The experimental results show that the algorithm can effectively identify potential community. The algorithm improves the purity and modularity to a certain extent compared with other community detection algorithms, and with fewer iterations and higher efficiency.

Key words: complex network, community similarity, k-means algorithm, community detection

中图分类号: 

  • TP391
[1] NEWMAN M E J. Fast algorithm for detecting community structure in networks[J]. Phys Rev E, 2004, 69(6):066133. Doi:10.1103/PhysRevE.69.060133.
[2] SHEN H W, CHENG X Q. Spectral methods for the detection of network community structure: a comparative analysis[J]. Journal of Statistical Mechanics Theory & Experiment, 2010, 2010(10):10020.
[3] WANG T S, LIN H T, WANG P. Weighted-spectral clustering algorithm for detecting community structures in complex networks[J]. Artificial Intelligence Review, 2017, 47(4):463-483.
[4] CLAUSET A, NEWMAN M E J, MOORE C. Finding community structure in very large networks[J]. Phys Rev E, 2004, 70(6):066111.
[5] WAKITA K, TSURUMI T. Finding community structure in mega-scale social networks[C] // Proceedings of the 16th International Conference on World Wide Web(WWW 2007). 2007:1275-1276. ArXiv: cs/0702048.
[6] KIRKPATRICK S, GELATT C D, VECCHI M P. Optimization by simulated annealing[J]. Science, 1983, 220(4598):671-680.
[7] GUIMERÀ R, SALES-PARDO M, AMARAL L A N. Modularity from fluctuations in random graphs and complex networks[J]. Phys Rev E, 2004, 70(2):025101. Doi:10.1103/PhysRevE.70.025/01.
[8] GUIMERÀ R, AMARAL L A N. Functional cartography of complex metabolic networks[J]. Nature, 2005, 433(7028):895-900.
[9] BOETTCHER S, PERCUS A G. Optimization with extremal dynamics[J]. Phys Rev Lett, 2001, 86(23):5211-5214.
[10] DUCH J, ARENAS A. Community detection in complex networks using extremal optimization[J]. Phys Rev E, 2005, 72(2):027104. Doi:10.1103/PhysRevE.72.027104.
[11] LI Z P, ZHANG S H, WANG R S, et al. Quantitative function for community detection[J]. Physical Review E, 2008, 77(3):036109. Doi:10.1103/PhysRevE.77.036109.
[12] WANG G X, SHEN Y, OUYANG M. A vector partitioning approach to detecting community structure in complex networks[J]. Computer & Mathematics with Applications, 2008, 55(12):2746-2752.
[13] WANG X, QIAN B, DAVIDSON I. On constrained spectral clustering and its applications[J]. Data Mining and Knowledge Discovery, 2014, 28(1):1-30.
[14] BARNES E R. An algorithm for partitioning the nodes of a graph[C] // Proceedings of the 20th IEEE Conference on Decision and Control Including the Symposium on Adaptive Processes. New York: IEEE, 1981: 303-304.
[15] GOLUB G H, LOAN C F V. Matrix computations[M]. Baltimore: John Hopkins University Press, 1989.
[16] MA X K, GAO L, XUERONG Y, et al. Semi-supervised clustering algorithm for community structure detection in complex networks[J]. Physica A, 2010, 389(1):187-197.
[17] FU L D, GAO L. An eigenvector-based kernel clustering approach to detecting communities in complex networks[C] // Proceeding of the 1st IRAST International Conference on Data Engineering and Internet Technology. Berlin: Springer-Verlag, 2013, 156:23-28.
[18] NEWMAN M E J. Finding community structure in networks using the eigenvectors of matrices[J]. Physical Review E, 2006, 74(3):036104. Doi:10.1103/PhysRevE.74.036104.
[19] KHORASGANI R R, CHEN J, ZAÏANE O R. Top leaders community detection approach in information networks[C] // Proceedings of the 2010 International Conference on Knowledge Discovery and Data Mining(KDD10). New York: ACM, 2010: 1-9.
[20] ADAMCSEK B, PALLA G, FARKAS I J, et al. CFinder: locating cliques and overlapping modules in biological networks[J]. Bioinformatics, 2006, 22(8):1021-1023.
[21] ZACHARY W W. An information flow model for conflict and fission in small groups[J]. Journal of Anthropological Research, 1997, 33(4):452-473.
[1] 张军,李竞飞,张瑞,阮兴茂,张烁. 基于网络有效阻抗的社区发现算法[J]. 山东大学学报(理学版), 2018, 53(3): 24-29.
[2] 魏思敏,张宪华,张祯,孟庆春,张夏然. 基于复杂网络的虚拟品牌社区意见领袖识别研究——以魅族Flyme社区为例[J]. 山东大学学报 (理学版), 2018, 53(11): 26-34.
[3] 王亚奇,王静. 考虑好奇心理机制的动态复杂网络谣言传播研究[J]. 山东大学学报(理学版), 2017, 52(6): 99-104.
[4] 吴平杰,周斌,吴泉源. COT:一种连续时间序列建模的社区发现算法[J]. 山东大学学报(理学版), 2016, 51(11): 41-49.
[5] 孙松涛, 何炎祥, 蔡瑞, 李飞, 贺飞艳. 面向微博情感评测任务的多方法对比研究[J]. 山东大学学报(理学版), 2014, 49(11): 43-50.
Viewed
Full text


Abstract

Cited

  Shared   
  Discussed   
[1] 邵勇. 半格序完全正则周期半群[J]. 山东大学学报(理学版), 2018, 53(10): 1 -5 .
[2] 巩增泰,高寒. n维模糊数值函数的预不变凸性[J]. 山东大学学报(理学版), 2018, 53(10): 72 -81 .
[3] 陈文倩,张孝金,昝立博. Gorenstein代数上的倾斜模的个数[J]. 山东大学学报(理学版), 2018, 53(10): 14 -16 .
[4] 郭寿桃,王占平. 正合零因子下模的Gorenstein同调维数[J]. 山东大学学报(理学版), 2018, 53(10): 17 -21 .
[5] 吴小英,王芳贵. 分次版本的Enochs定理[J]. 山东大学学报(理学版), 2018, 53(10): 22 -26 .
[6] 李美莲,邓青英. 平图的transition多项式的Maple计算[J]. 山东大学学报(理学版), 2018, 53(10): 27 -34 .
[7] 王丹,王正攀. 用禁止子半群刻画带簇的一个真子簇[J]. 山东大学学报(理学版), 2018, 53(10): 6 -8 .
[8] 梁星亮,吴苏朋,任军. C(P')系对幺半群的刻画[J]. 山东大学学报(理学版), 2018, 53(10): 9 -13 .
[9] 房启明,张莉. 无4-圈和5-圈的平面图的k-frugal列表染色[J]. 山东大学学报(理学版), 2018, 53(10): 35 -41 .
[10] 甄苇苇,曾剑,任建龙. 基于变分理论与时间相关的抛物型反源问题[J]. 山东大学学报(理学版), 2018, 53(10): 61 -71 .