Journal of Shandong University (Natural Science), 2024, Vol. 59, Issue 5: 70-81. doi: 10.6040/j.issn.1671-9352.7.2023.4523
CHENG Yuxuan1,2, MAO Yu1,2*, ZHANG Xiaoqing1,2, ZENG Yixiang1,2, LIN Yaojin1,2
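Abstract: To fully mine features that are ignored by algorithms relying on a single metric but that still benefit classification, an online multi-label feature selection algorithm based on sub-relevant features and neighborhood mutual information is proposed. The importance and relevance of each newly arrived feature are computed, and the differences in significance are analyzed to divide features into significant features and sub-relevant features. Neighborhood interaction information is used to analyze the redundancy between a newly arrived feature and the set of already selected features, and features with low dependency are removed, which gradually improves the quality of the feature subset. A metric based on global linear and nonlinear relationships is constructed and used to compute the local relevance of features, so that sub-relevant features are mined effectively. To address the problems that sub-relevant features face in the feature space, they are separated from the feature set and stored on their own, so that during redundancy analysis they are not removed merely because significant features are highly sensitive to the metric. A feature selection criterion is established, and an iterative strategy selects features according to this criterion. Experimental results show that the algorithm is effective and stable.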
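The abstract names neighborhood mutual information as the core measure but does not give its formula. The sketch below illustrates the commonly used definition, NMI(R; S) = -(1/n) Σ_i log(|δ_R(x_i)| · |δ_S(x_i)| / (n · |δ_{R∪S}(x_i)|)), where δ(x_i) is the set of samples within a radius of x_i. The function names, the radius `delta = 0.15`, and the use of Chebyshev distance (which makes the joint neighborhood the intersection of the two neighborhoods) are assumptions made for illustration; the paper's own relevance, significance, and redundancy criteria may differ.

```python
import numpy as np

def neighborhood_mask(x, delta):
    """Return an n-by-n boolean matrix; entry (i, j) is True when
    sample j lies within Chebyshev distance delta of sample i."""
    x = np.asarray(x, dtype=float)
    x = x.reshape(x.shape[0], -1)  # treat a 1-D vector as one column
    dist = np.abs(x[:, None, :] - x[None, :, :]).max(axis=2)
    return dist <= delta

def neighborhood_mutual_information(a, b, delta=0.15):
    """Neighborhood mutual information between two variables
    (each given as n values or as an n-by-d array)."""
    na = neighborhood_mask(a, delta)
    nb = neighborhood_mask(b, delta)
    n = na.shape[0]
    joint = na & nb  # joint neighborhood: intersection (assumption, see above)
    ratio = na.sum(axis=1) * nb.sum(axis=1) / (n * joint.sum(axis=1))
    return float(-np.mean(np.log(ratio)))

# Illustrative usage with made-up values: a feature compared with a label column.
feature = np.array([0.0, 0.1, 0.5, 0.9, 1.0])
label = np.array([0.0, 0.0, 1.0, 1.0, 1.0])
score = neighborhood_mutual_information(feature, label)
```

Every sample belongs to its own neighborhood, so the joint neighborhood is never empty and the logarithm is always defined. A constant variable carries no information under this definition, which gives a quick sanity check for the function.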