Journal of Shandong University (Natural Science), 2022, Vol. 57, Issue 12: 13-24. doi: 10.6040/j.issn.1671-9352.7.2021.168
SUN Lin (孙林)1,2, LIANG Na (梁娜)1, XU Jiu-cheng (徐久成)1,2
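Abstract: Drawing on the strength of neighborhood rough sets in handling continuous data, and to remove the need for manual parameter selection in traditional spectral clustering, a feature selection algorithm based on adaptive neighborhood mutual information and spectral clustering is proposed. First, the standard deviation set of each object under an attribute and the adaptive neighborhood set are defined, and uncertainty measures including adaptive neighborhood entropy, average neighborhood entropy, joint entropy, neighborhood conditional entropy, and neighborhood mutual information are given; adaptive neighborhood mutual information is then used to rank features by their relevance to the labels. Next, a shared-nearest-neighbor adaptive spectral clustering algorithm groups strongly correlated features into the same feature cluster, so that features in different clusters are strongly dissimilar. Finally, a feature selection algorithm is designed using the minimum-redundancy maximum-relevance (mRMR) technique. Experimental results on the number of selected features and the classification accuracy across 10 datasets verify the effectiveness of the proposed algorithm.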
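The relevance ranking and the final mRMR step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a fixed neighborhood radius `delta` in place of the paper's adaptive, standard-deviation-based neighborhoods (which the abstract does not specify), uses the commonly cited form of neighborhood mutual information, NMI(A;B) = -(1/n) Σ log(|A_i|·|B_i| / (n·|A_i ∩ B_i|)), scores candidates as relevance minus mean redundancy, and omits the spectral clustering stage entirely. All function names are hypothetical.

```python
import math

def neighborhoods(column, delta):
    """For each sample, the set of sample indices within `delta` on one feature.
    Assumption: a fixed radius, not the paper's adaptive neighborhood."""
    return [{j for j, v in enumerate(column) if abs(v - u) <= delta} for u in column]

def label_classes(labels):
    """For each sample, the set of sample indices that share its label."""
    return [{j for j, m in enumerate(labels) if m == l} for l in labels]

def neighborhood_mi(nbr_a, nbr_b):
    """NMI(A;B) = -(1/n) * sum_i log(|A_i| * |B_i| / (n * |A_i & B_i|)).
    Sample i always lies in both A_i and B_i, so the intersection is never empty."""
    n = len(nbr_a)
    total = sum(math.log(len(a) * len(b) / (n * len(a & b)))
                for a, b in zip(nbr_a, nbr_b))
    return -total / n

def mrmr_select(features, labels, k, delta=0.15):
    """Greedy mRMR: repeatedly pick the feature maximising
    relevance(f) - mean redundancy(f, already selected features)."""
    nbrs = [neighborhoods(col, delta) for col in features]
    classes = label_classes(labels)
    relevance = [neighborhood_mi(nb, classes) for nb in nbrs]

    def score(f, selected):
        if not selected:
            return relevance[f]
        redundancy = sum(neighborhood_mi(nbrs[f], nbrs[s]) for s in selected)
        return relevance[f] - redundancy / len(selected)

    selected, remaining = [], list(range(len(features)))
    while remaining and len(selected) < k:
        best = max(remaining, key=lambda f: score(f, selected))
        selected.append(best)
        remaining.remove(best)
    return selected

# Toy data: each feature is a column of 6 values scaled to [0, 1].
labels = [0, 0, 0, 1, 1, 1]
f0 = [0.0, 0.1, 0.4, 0.6, 0.9, 1.0]   # separates the two classes
f1 = list(f0)                          # exact duplicate of f0 (fully redundant)
f2 = [0.0, 1.0, 0.0, 1.0, 0.0, 1.0]   # weakly relevant, little overlap with f0
picked = mrmr_select([f0, f1, f2], labels, k=2)
print(picked)  # [0, 2]
```

On this toy data the duplicate feature is skipped: after `f0` is chosen, the redundancy penalty for `f1` outweighs its relevance, so the weaker but less redundant `f2` is selected second.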