《山东大学学报(理学版)》 ›› 2021, Vol. 56 ›› Issue (1): 91-102.doi: 10.6040/j.issn.1671-9352.4.2020.145
Yi-ming ZHANG(),Guo-yin WANG*(),Jun HU,Shun FU
摘要:
密度峰值是一种基于密度的聚类算法, 该算法假设类簇中心点具有较高的密度且被密度较小的节点包围。由于图结构的性质, 密度峰值无法直接适用于网络结构, 现有的基于密度峰值的社区发现算法大部分是基于图的拓扑结构或者邻接矩阵度量节点近似度, 这种方法往往引入较大的计算复杂度。文中结合网络嵌入方法通过低维向量表示网络中的节点信息, 提出了一种基于密度峰值和网络嵌入的重叠社区发现算法(overlapping community detection based on density network embedding, OCDDNE)。该算法首先通过网络嵌入获取节点的网络结构特征, 然后基于改进的密度峰值的方法对嵌入后的节点向量进行多标签聚类, 使编码后的向量之间的结构关系得到更好的揭示, 从而发现网络中的重叠社区结构。在人工网络和真实网络的验证实验表明, 该算法可以有效的挖掘网络中的重叠社区结构, 并在结构复杂度较高的网络中优于其他算法。
中图分类号:
1 |
RODRIGUEZ A , LAIO A . Clustering by fast search and find of density peaks[J]. Science, 2014, 344 (6191): 1492- 1496.
doi: 10.1126/science.1242072 |
2 | CUI P , WANG X , PEI J , et al. A survey on network embedding[J]. IEEE Transactions on Knowledge and Data Engineering, 2018, 31 (5): 833- 852. |
3 |
PALLA G , DERÉNYI I , FARKAS I , et al. Uncovering the overlapping community structure of complex networks in nature and society[J]. Nature, 2005, 435 (7043): 814- 818.
doi: 10.1038/nature03607 |
4 |
RAGHAVAN U N , ALBERT R , KUMARA S . Near linear time algorithm to detect community structures in large-scale networks[J]. Physical Review E, 2007, 76 (3): 036106.
doi: 10.1103/PhysRevE.76.036106 |
5 |
GREGORY S . Finding overlapping communities in networks by label propagation[J]. New Journal of Physics, 2010, 12 (10): 103018.
doi: 10.1088/1367-2630/12/10/103018 |
6 | LU M , ZHANG Z , QU Z , et al. LPANNI: overlapping community detection using label propagation in large-scale complex networks[J]. IEEE Transactions on Knowledge and Data Engineering, 2018, 31 (9): 1736- 1749. |
7 |
AHN Y Y , BAGROW J P , LEHMANN S . Link communities reveal multiscale complexity in networks[J]. Nature, 2010, 466 (7307): 761- 764.
doi: 10.1038/nature09182 |
8 |
LANCICHINETTI A , FORTUNATO S , KERTÉSZ J . Detecting the overlapping and hierarchical community structure in complex networks[J]. New Journal of Physics, 2009, 11 (3): 033015.
doi: 10.1088/1367-2630/11/3/033015 |
9 |
LANCICHINETTI A , RADICCHI F , RAMASCO J J , et al. Finding statistically significant communities in networks[J]. PloS One, 2011, 6 (4): e18961.
doi: 10.1371/journal.pone.0018961 |
10 | WHANG J J, GLEICH D F, DHILLON I S. Overlapping community detection using seed set expansion[C]//Proceedings of the 22nd ACM International Conference on Information & Knowledge Management. New York: ACM, 2013: 2099-2108. |
11 |
BAI X , YANG P , SHI X . An overlapping community detection algorithm based on density peaks[J]. Neurocomputing, 2017, 226, 7- 15.
doi: 10.1016/j.neucom.2016.11.019 |
12 | 黄岚, 李玉, 王贵参, 等. 基于点距离和密度峰值聚类的社区发现方法[J]. 吉林大学学报(工学版), 2016, 46 (6): 2042- 2051. |
HUANG Lan , LI Yu , WANG Guishen , et al. Community detection method based on vertex distance and clustering of density peaks[J]. Journal of Jilin University (Engineering and Technology Edition), 2016, 46 (6): 2042- 2051. | |
13 | PEROZZI B, AL-RFOU R, SKIENA S. Deepwalk: online learning of social representations[C]//Proceedings of the 20th ACM SIGKDD International Conference on Knowledge Discovery and Data mining. New York: ACM, 2014: 701-710. |
14 | GROVER A, LESKOVEC J. node2vec: scalable feature learning for networks[C]//Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data mining. New York: ACM, 2016: 855-864. |
15 | WANG D, CUI P, ZHU W. Structural deep network embedding[C]//Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data mining. New York: ACM, 2016: 1225-1234. |
16 | TANG J, QU M, WANG M, et al. Line: large-scale information network embedding[C]//Proceedings of the 24th International Conference on World Wide Web. Switzerland: ACM, 2015: 1067-1077. |
17 | LI A Q, AHMED A, RAVI S, et al. Reducing the sampling complexity of topic models[C]//Proceedings of the 20th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. New York: ACM, 2014: 891-900. |
18 | MIKOLOV T, SUTSKEVER I, CHEN K, et al. Distributed representations of words and phrases and their compositionality[C]//Advances in Neural Information Processing Systems. Lake Tahoe: ACM, 2013: 3111-3119. |
19 | RECHT B, RE C, WRIGHT S, et al. Hogwild: a lock-free approach to parallelizing stochastic gradient descent[C]//Advances in Neural Information Processing Systems. Granada: ACM, 2011: 693-701. |
20 | MCDAID A F, GREENE D, HURLEY N. Normalized mutual information to evaluate overlapping community finding algorithms[J]. arXiv Preprint arXiv: 1110.2515, 2011. |
21 |
COLLINS L M , DENT C W . Omega: a general formulation of the rand index of cluster recovery suitable for non-disjoint solutions[J]. Multivariate Behavioral Research, 1988, 23 (2): 231- 242.
doi: 10.1207/s15327906mbr2302_6 |
22 |
SHEN H , CHENG X , CAI K , et al. Detect overlapping and hierarchical community structure in networks[J]. Physica A: Statistical Mechanics and Its Applications, 2009, 388 (8): 1706- 1712.
doi: 10.1016/j.physa.2008.12.021 |
23 |
HUBERT L , ARABIE P . Comparing partitions[J]. Journal of Classification, 1985, 2 (1): 193- 218.
doi: 10.1007/BF01908075 |
24 | XIE J, SZYMANSKI B K. Towards linear time overlapping community detection in social networks[C]//Pacific-Asia Conference on Knowledge Discovery and Data Mining. Berlin: Springer, 2012: 25-36. |
25 | COSCIA M, ROSSETTI G, GIANNOTTI F, et al. Demon: a local-first discovery method for overlapping communities[C]//Proceedings of the 18th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. New York: ACM, 2012: 615-623. |
26 |
LANCICHINETTI A , FORTUNATO S , RADICCHI F . Benchmark graphs for testing community detection algorithms[J]. Physical Review E, 2008, 78 (4): 046110.
doi: 10.1103/PhysRevE.78.046110 |
27 |
ZACHARY W W . An information flow model for conflict and fission in small groups[J]. Journal of Anthropological Research, 1977, 33 (4): 452- 473.
doi: 10.1086/jar.33.4.3629752 |
28 |
GIRVAN M , NEWMAN M E J . Community structure in social and biological networks[J]. Proceedings of the National Academy of Sciences, 2002, 99 (12): 7821- 7826.
doi: 10.1073/pnas.122653799 |
29 | YIN H, BENSON A R, LESKOVEC J, et al. Local higher-order graph clustering[C]//Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. New York: ACM, 2017: 555-564. |
30 |
GLEISER P M , DANON L . Community structure in jazz[J]. Advances in Complex Systems, 2003, 6 (4): 565- 573.
doi: 10.1142/S0219525903001067 |
31 | KNUTH D E . The stanford graphbase: a platform for combinatorial computing[M]. New York: ACM Press, 1993. |
32 | LESKOVEC J, MCAULEY J J. Learning to discover social circles in ego networks[C]//Advances in Neural Information Processing Systems. Lake Tahoe: ACM, 2012: 539-547. |
[1] | 许侃,刘瑞鑫,林鸿飞,刘海峰,冯娇娇,李家平,林原,徐博. 基于异质网络嵌入的学术论文推荐方法[J]. 《山东大学学报(理学版)》, 2020, 55(11): 35-45. |
[2] | 郝秀梅,刘纪芹. 外P-模糊集的截集与扩展粗集模型[J]. 《山东大学学报(理学版)》, 2020, 55(10): 1-6. |
[3] | 王鑫,左万利,朱枫彤,王英. 基于重要结点的社区发现算法[J]. 山东大学学报 (理学版), 2018, 53(11): 67-77. |
[4] | 魏思敏,张宪华,张祯,孟庆春,张夏然. 基于复杂网络的虚拟品牌社区意见领袖识别研究——以魅族Flyme社区为例[J]. 《山东大学学报(理学版)》, 2018, 53(11): 26-34. |
[5] | 王亚奇,王静. 考虑好奇心理机制的动态复杂网络谣言传播研究[J]. 山东大学学报(理学版), 2017, 52(6): 99-104. |
[6] | 翟鹏,李登道. 基于高斯隶属度的包容性指标模糊聚类算法[J]. 山东大学学报(理学版), 2016, 51(5): 102-105. |
[7] | 赵斌,何泾沙,张伊璇. 基于信息熵隶属度的决策属性权重确定方法[J]. 山东大学学报(理学版), 2016, 51(3): 86-90. |
[8] | 孙松涛, 何炎祥, 蔡瑞, 李飞, 贺飞艳. 面向微博情感评测任务的多方法对比研究[J]. 山东大学学报(理学版), 2014, 49(11): 43-50. |
|