JOURNAL OF SHANDONG UNIVERSITY (NATURAL SCIENCE) ›› 2022, Vol. 57 ›› Issue (12): 13-24. doi: 10.6040/j.issn.1671-9352.7.2021.168


Feature selection using adaptive neighborhood mutual information and spectral clustering

SUN Lin1,2, LIANG Na1, XU Jiu-cheng1,2   

  1. College of Computer and Information Engineering, Henan Normal University, Xinxiang 453007, Henan, China;
  2. Henan Engineering Laboratory of Smart Business and Internet of Things Technology, Xinxiang 453007, Henan, China
  • Published: 2022-12-05

Abstract: To address the problem that traditional spectral clustering algorithms require parameters to be set manually, this paper proposes a feature selection algorithm based on adaptive neighborhood mutual information and spectral clustering, which exploits the ability of neighborhood rough sets to handle continuous data. First, the standard deviation set and the adaptive neighborhood set of each object with respect to an attribute are defined. Several uncertainty measures are then given, namely adaptive neighborhood entropy, average neighborhood entropy, joint entropy, neighborhood conditional entropy, and neighborhood mutual information, and the adaptive neighborhood mutual information is used to rank features by their correlation with the labels. Second, the shared nearest neighbor spectral clustering algorithm is incorporated to group strongly related features into the same feature cluster, so that features in different clusters differ strongly from one another. Finally, the feature selection algorithm is designed using the minimum redundancy and maximum correlation technique. Experimental results on ten datasets, in terms of the number of selected features and classification accuracy, verify the effectiveness of the proposed algorithm.
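
The ranking step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not state the formulas, so the sketch assumes the commonly used neighborhood mutual information form NMI(f; D) = -(1/n) Σ log(|δ_f(x_i)| · |δ_D(x_i)| / (n · |δ_f(x_i) ∩ δ_D(x_i)|)), and it assumes a hypothetical adaptive rule in which the neighborhood radius of a feature is its standard deviation divided by a `scale` parameter. The function names and `scale` are illustrative only.

```python
import numpy as np


def neighborhood_mask(column, radius):
    """Boolean n x n matrix: entry (i, j) is True when |x_i - x_j| <= radius."""
    diff = np.abs(column[:, None] - column[None, :])
    return diff <= radius


def neighborhood_mutual_information(column, labels, scale=2.0):
    """Neighborhood mutual information between one continuous feature and labels.

    Assumption (not from the paper): radius = std(column) / scale.
    """
    column = np.asarray(column, dtype=float)
    labels = np.asarray(labels)
    n = column.shape[0]
    radius = column.std() / scale              # hypothetical adaptive radius
    feat = neighborhood_mask(column, radius)   # neighborhood under the feature
    same = labels[:, None] == labels[None, :]  # equivalence class under the labels
    joint = feat & same                        # neighborhood under both
    size_f = feat.sum(axis=1)
    size_d = same.sum(axis=1)
    size_j = joint.sum(axis=1)                 # always >= 1 (each object neighbors itself)
    return float(-np.mean(np.log(size_f * size_d / (n * size_j))))


def rank_features(X, y, scale=2.0):
    """Return feature indices sorted by decreasing score, plus the scores."""
    X = np.asarray(X, dtype=float)
    scores = np.array(
        [neighborhood_mutual_information(X[:, j], y, scale) for j in range(X.shape[1])]
    )
    order = np.argsort(-scores)
    return order, scores


if __name__ == "__main__":
    # Toy data: feature 0 separates the two classes, feature 1 is constant.
    X = np.array([[0.0, 1.0], [0.1, 1.0], [0.2, 1.0],
                  [10.0, 1.0], [10.1, 1.0], [10.2, 1.0]])
    y = np.array([0, 0, 0, 1, 1, 1])
    order, scores = rank_features(X, y)
    print(order, scores)
```

In this toy case the separating feature receives a positive score and the constant feature receives zero, so the separating feature is ranked first. The clustering and redundancy-filtering stages named in the abstract are not covered by this sketch.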

Key words: feature selection, neighborhood rough set, adaptive neighborhood mutual information, spectral clustering, minimum redundancy and maximum correlation

CLC Number: 

  • TP181