JOURNAL OF SHANDONG UNIVERSITY(NATURAL SCIENCE) ›› 2021, Vol. 56 ›› Issue (1): 91-102.doi: 10.6040/j.issn.1671-9352.4.2020.145

•   • Previous Articles     Next Articles

Overlapping community detection based on density peaks and network embedding

Yi-ming ZHANG(),Guo-yin WANG*(),Jun HU,Shun FU   

  1. Chongqing Key Laboratory of Computational Intelligence, Chongqing University of Posts and Telecommunications, Chongqing 400065, China
  • Received:2020-06-15 Online:2021-01-01 Published:2021-01-05
  • Contact: Guo-yin WANG E-mail:s180201010@stu.cqupt.edu.cn;wanggy@cqupt.edu.cn

Abstract:

Density peaks is a density-based clustering algorithm. It assumes that the center points of the cluster have higher density and are surrounded by nodes with lower density. Due to the character of graph structure, density peaks cannot be directly applied to network structure, and most of density peaks based community detection algorithms based on density peaks are based on graph topology or adjacency matrix to measure node approximation, which often leads to great computational complexity. This paper proposes an overlapping community detection algorithm based on density and network embedding (OCDDNE). The proposed algorithm firstly embeds the network structure characteristics of nodes in network structure through network embedding, and then clusters the embedded node vectors based on the improved method of density peaks, so that the structural relationship between the encoded vectors can be better revealed and the overlapping communities which each node is located is determined. Experiments on synthetic networks and real networks data show that the proposed algorithm can efficiently find the overlapping community structure in networks, and it is superior to other algorithms especially in complex networks with high structural complexity.

Key words: overlapping community detection, network embedding, density peak, complex network, belonging coefficient

CLC Number: 

  • TP391

Fig.1

Example of selecting the center points from the decision graph on karate date (the red nodes in the graph is selected center points)"

Table 1

Parameter information of LFR benchmark"

参数 说明
N 网络节点数
k 节点平均度
max k 节点最大度
min c 社区最小节点数
max c 社区最大节点数
on 重叠社区节点数
om 重叠节点所属社区的数目
mut 网络结构混合参数
muw 边权重混合参数

Table 2

Parameter setting on synthetic network"

ID N k max k min c max c on om mut muw
R1 500 10 30 10 30 50 2 [0.2, 0.8] 0.1
R2 1 000 10 50 20 50 100 2 [0.2, 0.8] 0.1
R3 5 000 20 100 30 100 500 2 [0.2, 0.8] 0.1
R4 500 10 30 10 30 50 [2, 8] 0.3 0.3
R5 1 000 10 50 20 50 100 [2, 8] 0.3 0.3
R6 5 000 20 100 30 100 500 [2, 8] 0.6 0.3

Table 3

Parameter setting on real network"

数据集名称 节点数 边数 平均度 聚集系数
karate 34 78 4.590 0.571
football 115 613 10.660 0.403
email 1 005 25 571 9.622 0.220
Pol.Book 105 441 8.400 0.487
Jazz 198 2 742 27.700 0.618
Lesmis 77 256 6.600 0.573
facebook 4 039 88 234 43.780 0.605

Fig.2

Performance comparison under different parameter of mut"

Fig.3

Performance comparison under different parameter of om"

Fig.4

Performance comparison on real networks"

Fig.5

Center point selection on real network(the red nodes in the graph is selected center points)"

Fig.6

Experimental results with different λ and θ"

1 RODRIGUEZ A , LAIO A . Clustering by fast search and find of density peaks[J]. Science, 2014, 344 (6191): 1492- 1496.
doi: 10.1126/science.1242072
2 CUI P , WANG X , PEI J , et al. A survey on network embedding[J]. IEEE Transactions on Knowledge and Data Engineering, 2018, 31 (5): 833- 852.
3 PALLA G , DERÉNYI I , FARKAS I , et al. Uncovering the overlapping community structure of complex networks in nature and society[J]. Nature, 2005, 435 (7043): 814- 818.
doi: 10.1038/nature03607
4 RAGHAVAN U N , ALBERT R , KUMARA S . Near linear time algorithm to detect community structures in large-scale networks[J]. Physical Review E, 2007, 76 (3): 036106.
doi: 10.1103/PhysRevE.76.036106
5 GREGORY S . Finding overlapping communities in networks by label propagation[J]. New Journal of Physics, 2010, 12 (10): 103018.
doi: 10.1088/1367-2630/12/10/103018
6 LU M , ZHANG Z , QU Z , et al. LPANNI: overlapping community detection using label propagation in large-scale complex networks[J]. IEEE Transactions on Knowledge and Data Engineering, 2018, 31 (9): 1736- 1749.
7 AHN Y Y , BAGROW J P , LEHMANN S . Link communities reveal multiscale complexity in networks[J]. Nature, 2010, 466 (7307): 761- 764.
doi: 10.1038/nature09182
8 LANCICHINETTI A , FORTUNATO S , KERTÉSZ J . Detecting the overlapping and hierarchical community structure in complex networks[J]. New Journal of Physics, 2009, 11 (3): 033015.
doi: 10.1088/1367-2630/11/3/033015
9 LANCICHINETTI A , RADICCHI F , RAMASCO J J , et al. Finding statistically significant communities in networks[J]. PloS One, 2011, 6 (4): e18961.
doi: 10.1371/journal.pone.0018961
10 WHANG J J, GLEICH D F, DHILLON I S. Overlapping community detection using seed set expansion[C]//Proceedings of the 22nd ACM International Conference on Information & Knowledge Management. New York: ACM, 2013: 2099-2108.
11 BAI X , YANG P , SHI X . An overlapping community detection algorithm based on density peaks[J]. Neurocomputing, 2017, 226, 7- 15.
doi: 10.1016/j.neucom.2016.11.019
12 黄岚, 李玉, 王贵参, 等. 基于点距离和密度峰值聚类的社区发现方法[J]. 吉林大学学报(工学版), 2016, 46 (6): 2042- 2051.
HUANG Lan , LI Yu , WANG Guishen , et al. Community detection method based on vertex distance and clustering of density peaks[J]. Journal of Jilin University (Engineering and Technology Edition), 2016, 46 (6): 2042- 2051.
13 PEROZZI B, AL-RFOU R, SKIENA S. Deepwalk: online learning of social representations[C]//Proceedings of the 20th ACM SIGKDD International Conference on Knowledge Discovery and Data mining. New York: ACM, 2014: 701-710.
14 GROVER A, LESKOVEC J. node2vec: scalable feature learning for networks[C]//Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data mining. New York: ACM, 2016: 855-864.
15 WANG D, CUI P, ZHU W. Structural deep network embedding[C]//Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data mining. New York: ACM, 2016: 1225-1234.
16 TANG J, QU M, WANG M, et al. Line: large-scale information network embedding[C]//Proceedings of the 24th International Conference on World Wide Web. Switzerland: ACM, 2015: 1067-1077.
17 LI A Q, AHMED A, RAVI S, et al. Reducing the sampling complexity of topic models[C]//Proceedings of the 20th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. New York: ACM, 2014: 891-900.
18 MIKOLOV T, SUTSKEVER I, CHEN K, et al. Distributed representations of words and phrases and their compositionality[C]//Advances in Neural Information Processing Systems. Lake Tahoe: ACM, 2013: 3111-3119.
19 RECHT B, RE C, WRIGHT S, et al. Hogwild: a lock-free approach to parallelizing stochastic gradient descent[C]//Advances in Neural Information Processing Systems. Granada: ACM, 2011: 693-701.
20 MCDAID A F, GREENE D, HURLEY N. Normalized mutual information to evaluate overlapping community finding algorithms[J]. arXiv Preprint arXiv: 1110.2515, 2011.
21 COLLINS L M , DENT C W . Omega: a general formulation of the rand index of cluster recovery suitable for non-disjoint solutions[J]. Multivariate Behavioral Research, 1988, 23 (2): 231- 242.
doi: 10.1207/s15327906mbr2302_6
22 SHEN H , CHENG X , CAI K , et al. Detect overlapping and hierarchical community structure in networks[J]. Physica A: Statistical Mechanics and Its Applications, 2009, 388 (8): 1706- 1712.
doi: 10.1016/j.physa.2008.12.021
23 HUBERT L , ARABIE P . Comparing partitions[J]. Journal of Classification, 1985, 2 (1): 193- 218.
doi: 10.1007/BF01908075
24 XIE J, SZYMANSKI B K. Towards linear time overlapping community detection in social networks[C]//Pacific-Asia Conference on Knowledge Discovery and Data Mining. Berlin: Springer, 2012: 25-36.
25 COSCIA M, ROSSETTI G, GIANNOTTI F, et al. Demon: a local-first discovery method for overlapping communities[C]//Proceedings of the 18th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. New York: ACM, 2012: 615-623.
26 LANCICHINETTI A , FORTUNATO S , RADICCHI F . Benchmark graphs for testing community detection algorithms[J]. Physical Review E, 2008, 78 (4): 046110.
doi: 10.1103/PhysRevE.78.046110
27 ZACHARY W W . An information flow model for conflict and fission in small groups[J]. Journal of Anthropological Research, 1977, 33 (4): 452- 473.
doi: 10.1086/jar.33.4.3629752
28 GIRVAN M , NEWMAN M E J . Community structure in social and biological networks[J]. Proceedings of the National Academy of Sciences, 2002, 99 (12): 7821- 7826.
doi: 10.1073/pnas.122653799
29 YIN H, BENSON A R, LESKOVEC J, et al. Local higher-order graph clustering[C]//Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. New York: ACM, 2017: 555-564.
30 GLEISER P M , DANON L . Community structure in jazz[J]. Advances in Complex Systems, 2003, 6 (4): 565- 573.
doi: 10.1142/S0219525903001067
31 KNUTH D E . The stanford graphbase: a platform for combinatorial computing[M]. New York: ACM Press, 1993.
32 LESKOVEC J, MCAULEY J J. Learning to discover social circles in ego networks[C]//Advances in Neural Information Processing Systems. Lake Tahoe: ACM, 2012: 539-547.
[1] Kan XU,Rui-xin LIU,Hong-fei LIN,Hai-feng LIU,Jiao-jiao FENG,Jia-ping LI,Yuan LIN,Bo XU. Academic paper recommendation based on heterogeneous network embedding [J]. JOURNAL OF SHANDONG UNIVERSITY(NATURAL SCIENCE), 2020, 55(11): 35-45.
[2] YANG Ya-ru, WANG Yong-qing, ZHANG Zhi-bin, LIU Yue, CHENG Xue-qi. Social network user identity linkage model based on comprehensive information [J]. JOURNAL OF SHANDONG UNIVERSITY(NATURAL SCIENCE), 2019, 54(9): 105-113.
[3] WANG Xin, ZUO Wan-li, ZHU Feng-tong, WANG Ying. Important-node-based community detection algorithm [J]. JOURNAL OF SHANDONG UNIVERSITY(NATURAL SCIENCE), 2018, 53(11): 67-77.
[4] Si-min WEI,Xian-hua ZHANG,Zhen ZHANG,Qing-chun MENG,Xia-ran ZHANG. Virtual brand community opinion leader recognition based on complex network——example of the Meizu Flyme community [J]. JOURNAL OF SHANDONG UNIVERSITY(NATURAL SCIENCE), 2018, 53(11): 26-34.
[5] SUN Song-tao, HE Yan-xiang, CAI Rui, LI Fei, HE Fei-yan. Comparative study of methods for Micro-blog sentiment evaluation tasks [J]. JOURNAL OF SHANDONG UNIVERSITY(NATURAL SCIENCE), 2014, 49(11): 43-50.
Viewed
Full text


Abstract

Cited

  Shared   
  Discussed   
[1] SHI Chang-guang . Multi-soliton solution of the Faddeev model[J]. J4, 2007, 42(7): 38 -40 .
[2] DONG Jiong, CAO Xiao-hong. Weyls theorem for the cube of operator and compact perturbations[J]. JOURNAL OF SHANDONG UNIVERSITY(NATURAL SCIENCE), 2016, 51(8): 15 -21 .
[3] . Study on the determination method of total flavonoids in ginkgo biloba extract[J]. J4, 2009, 44(5): 40 -44 .
[4] LIU Ru-jun,CAO Yu-xia,ZHOU Ping . Anti-control for discrete chaos systems by small feedback[J]. J4, 2007, 42(7): 30 -32 .
[5] HU Mo-Yin. Multiplicity results for a class of quasi-linear Neumann problems[J]. J4, 2009, 44(10): 39 -42 .
[6] DU Rui-ying, YANG Yong, CHEN Jing, WANG Chi-heng. An efficient network traffic classification scheme based on similarity[J]. JOURNAL OF SHANDONG UNIVERSITY(NATURAL SCIENCE), 2014, 49(09): 109 -114 .
[7] GUO Fei-wang, ZHANG Xi-yong, HAN Wen-bao. A construction of perfect nonlinear functions[J]. J4, 2011, 46(3): 26 -30 .
[8] FAN Ming, LIU Jun, ZHENG Qing-hua, TIAN Zhen-zhou, ZHUANG Er-yue, LIU Ting. SODB:a novel method for software plagiarism detection based on stack operation dynamic birthmark[J]. JOURNAL OF SHANDONG UNIVERSITY(NATURAL SCIENCE), 2014, 49(09): 9 -16 .
[9] Ma Jia-sai,ZHANG Yong-jun . An optimized method for minimal cubing approach[J]. J4, 2006, 41(3): 104 -107 .
[10] ZHANG Fengxia1, LI Ying1,2, GUO Wenbin1, ZHAO Jianli1. Extremal ranks for block Hermitian and skew-Hermitian matrices[J]. J4, 2010, 45(4): 106 -110 .