Journal of Shandong University (Natural Science), 2021, Vol. 56, Issue (3): 67-76. doi: 10.6040/j.issn.1671-9352.4.2020.218

Semi-supervised spectral clustering algorithm based on L2,1 norm and manifold regularization terms

YANG Ting1,2, ZHU Heng-dong1, MA Ying-cang1, WANG Yi-rui2, YANG Xiao-fei1*

  1. School of Science, Xi'an Polytechnic University, Xi'an 710600, Shaanxi, China
  2. School of Mathematics and Statistics, Ankang University, Ankang 725000, Shaanxi, China
  • Published: 2021-03-16
  • About the authors: YANG Ting (born 1996), female, master's student; research interest: machine learning. E-mail: 15929121393@126.com. *Corresponding author: YANG Xiao-fei (born 1982), male, Ph.D., associate professor; research interests: machine learning and rough sets. E-mail: yangxiaofei2002@163.com
  • Supported by:
    National Natural Science Foundation of China (11501435); Xi'an Polytechnic University Graduate Innovation Fund (chx2020031); Ankang University Special Fund (2019AYXNZX04)

Abstract: Spectral clustering depends heavily on the similarity matrix and makes no use of prior information, which limits the quality of its results. To address this problem, we propose a semi-supervised spectral clustering algorithm based on the L2,1 norm and a manifold regularization term. The robustness of the L2,1 norm is used to learn a reasonable similarity matrix. The supervisory information is exploited in two ways: it guides the construction of the initial similarity matrix, and it enters a manifold regularization term that adjusts the model, thereby improving the clustering. Experiments on artificial and real data sets show that, in most cases, the proposed algorithm gives better clustering results than the other clustering algorithms compared.

Key words: L2,1 norm, manifold regularization term, spectral clustering, semi-supervised learning

CLC number: TP311
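
The abstract names three ingredients: the L2,1 norm, supervisory information that shapes the initial similarity matrix, and a manifold regularization term. The sketch below illustrates each in its generic textbook form using NumPy only. It is not the algorithm of the paper: the function names, the toy data, the Gaussian affinity, the rule of overwriting constrained entries with 1 or 0, and the use of index pairs as supervision are all assumptions made for illustration.

```python
import numpy as np


def l21_norm(M):
    """L2,1 norm: sum of the Euclidean (L2) norms of the rows of M."""
    return float(np.sum(np.linalg.norm(M, axis=1)))


def constrained_affinity(X, must_link, cannot_link, sigma=1.0):
    """Gaussian affinity whose entries are overwritten by pairwise constraints.

    Assumption: supervision is given as index pairs. The overwrite rule is a
    common heuristic, not the construction used in the paper.
    """
    sq_dist = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=2)
    W = np.exp(-sq_dist / (2.0 * sigma ** 2))
    np.fill_diagonal(W, 0.0)
    for i, j in must_link:
        W[i, j] = W[j, i] = 1.0
    for i, j in cannot_link:
        W[i, j] = W[j, i] = 0.0
    return W


def manifold_penalty(F, W):
    """Generic graph smoothness term tr(F^T L F) with L = D - W."""
    L = np.diag(W.sum(axis=1)) - W
    return float(np.trace(F.T @ L @ F))


def spectral_embedding(W, k):
    """Eigenvectors of the k smallest eigenvalues of the normalized Laplacian."""
    d = W.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(np.maximum(d, 1e-12))
    L_sym = np.eye(W.shape[0]) - d_inv_sqrt[:, None] * W * d_inv_sqrt[None, :]
    _, vecs = np.linalg.eigh(L_sym)  # eigenvalues come back in ascending order
    return vecs[:, :k]


# Toy data: two tight groups of three points each (hypothetical example).
X = np.array([[0.0, 0.0], [0.1, 0.0], [0.0, 0.1],
              [3.0, 3.0], [3.1, 3.0], [3.0, 3.1]])
W = constrained_affinity(X, must_link=[(0, 1)], cannot_link=[(0, 3)])
F = spectral_embedding(W, k=2)
labels = (F[:, 1] > 0).astype(int)  # two clusters: split on the second eigenvector
print(labels, manifold_penalty(F, W))
```

Because the L2,1 norm sums unsquared row norms, a single outlying row contributes linearly rather than quadratically, which is the usual reason it is described as robust. For more than two clusters, the rows of the embedding would typically be passed to k-means instead of being split by sign.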