Journal of Shandong University (Natural Science) ›› 2024, Vol. 59 ›› Issue (5): 45-51. doi: 10.6040/j.issn.1671-9352.4.2023.137

Perturbation three-way clustering based on natural nearest neighbors

ZHU Jin1, FU Yu2*, GUAN Wenrui3, WANG Pingxin4

  1. School of Economics and Management, Jiangsu University of Science and Technology, Zhenjiang 212100, Jiangsu, China;
  2. Zhenjiang Hospital Affiliated to Nanjing University of Chinese Medicine (Zhenjiang Hospital of Traditional Chinese Medicine), Zhenjiang 212000, Jiangsu, China;
  3. School of Automation, Jiangsu University of Science and Technology, Zhenjiang 212100, Jiangsu, China;
  4. School of Science, Jiangsu University of Science and Technology, Zhenjiang 212100, Jiangsu, China
  • Published: 2024-05-09
  • Corresponding author: FU Yu (born 1978), female, associate chief physician of traditional Chinese medicine; research interests: medical three-way decisions and granular computing. E-mail: fumima0511@hotmail.com
  • Funding: National Natural Science Foundation of China (62076111, 61773012); Natural Science Foundation of the Jiangsu Higher Education Institutions (15KJB110004)

Abstract: Using the natural nearest neighbor information of data samples, a three-way clustering algorithm based on sample perturbation theory is proposed. The algorithm combines natural nearest neighbor information with sample perturbation to generate two perturbed datasets. Feature subsets are then selected at random, and the K-means clustering algorithm is applied to obtain different clustering results. The stability of each sample is computed from a co-occurrence probability matrix and a determinacy function, and a stability threshold divides the samples into a stable region and an unstable region. Different strategies are then applied to the samples of the two regions to obtain the core region and the fringe region of each cluster. Experiments on five public datasets, in comparison with two traditional clustering algorithms, verify the effectiveness of the proposed algorithm.

Key words: three-way decision, three-way clustering, sample perturbation, natural nearest neighbor

CLC number: TP181
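
The stability step summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The natural nearest neighbor perturbation and the K-means runs are not reproduced, so the base clusterings are passed in as plain label lists with hypothetical toy values. The piecewise-linear form of `determinacy`, its pivot `t`, and the fixed stability threshold are assumptions made for illustration, since the abstract does not state the exact definitions.

```python
def co_occurrence(labelings):
    """Fraction of base clusterings in which each pair of samples shares a cluster."""
    n = len(labelings[0])
    m = len(labelings)
    counts = [[0] * n for _ in range(n)]
    for labels in labelings:
        for i in range(n):
            for j in range(n):
                if labels[i] == labels[j]:
                    counts[i][j] += 1
    return [[c / m for c in row] for row in counts]


def determinacy(p, t=0.5):
    """Assumed piecewise-linear certainty: 0 at p == t, 1 at p == 0 or p == 1."""
    return (t - p) / t if p < t else (p - t) / (1.0 - t)


def stability(labelings, t=0.5):
    """Average determinacy of each sample's relations to all other samples."""
    p = co_occurrence(labelings)
    n = len(p)
    return [
        sum(determinacy(p[i][j], t) for j in range(n) if j != i) / (n - 1)
        for i in range(n)
    ]


def split_by_stability(scores, threshold):
    """Indices of samples at or above the threshold (stable) and below it (unstable)."""
    stable = [i for i, s in enumerate(scores) if s >= threshold]
    unstable = [i for i, s in enumerate(scores) if s < threshold]
    return stable, unstable


# Hypothetical toy input: four base clusterings of five samples.
# Sample 2 joins the first group in two runs and the second group in the other two.
labelings = [
    [0, 0, 0, 1, 1],
    [0, 0, 1, 1, 1],
    [1, 1, 1, 0, 0],  # same partition as the first run, with labels permuted
    [0, 0, 1, 1, 1],
]
scores = stability(labelings)
print(scores)                           # [0.75, 0.75, 0.0, 0.75, 0.75]
print(split_by_stability(scores, 0.5))  # ([0, 1, 3, 4], [2])
```

The co-occurrence matrix compares only whether two samples share a label within a run, so the permuted labels in the third run do not affect the result. In this toy example sample 2 is the only one whose pairwise relations are all ambiguous, so it alone falls into the unstable region.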