JOURNAL OF SHANDONG UNIVERSITY(NATURAL SCIENCE), 2024, Vol. 59, Issue (5): 70-81. doi: 10.6040/j.issn.1671-9352.7.2023.4523


Online multi-label feature selection based on sub-correlation features and neighborhood mutual information

CHENG Yuxuan1,2, MAO Yu1,2*, ZHANG Xiaoqing1,2, ZENG Yixiang1,2, LIN Yaojin1,2   

    1. School of Computer Science, Minnan Normal University, Zhangzhou 363000, Fujian, China;
    2. Key Laboratory of Data Science and Intelligence Application, Minnan Normal University, Zhangzhou 363000, Fujian, China
  • Published:2024-05-09

Abstract: To fully exploit features that single-metric algorithms neglect but that benefit the classifier, this paper proposes an online multi-label feature selection algorithm based on sub-correlation features and neighborhood mutual information. The importance and correlation of each newly arrived feature are calculated, the differences in significance among the new features are analyzed, and the features are divided into salient features and sub-correlation features. Redundancy analysis is then performed between the newly arrived features and the selected feature set using neighborhood interaction information, and features with low dependency are eliminated, which gradually improves the quality of the feature subset. The paper also constructs a measurement index based on global linear and nonlinear relationships and uses it to calculate the local correlation of features, which effectively mines the sub-correlation features. The sub-correlation features are stripped from the feature set and stored separately, so that they are not eliminated during the redundancy analysis stage merely because the salient features are highly sensitive to the measurement index. Features are then selected according to the established feature selection indicators and an iterative strategy. Experimental results show that the proposed algorithm is effective and stable.
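
The neighborhood measures named in the abstract can be illustrated with a minimal sketch. It assumes the commonly used definitions of neighborhood entropy, NH(A) = -(1/n) Σ log(|δ_A(x_i)| / n), and neighborhood mutual information, NMI(A; B) = NH(A) + NH(B) - NH(A, B), with Chebyshev-distance neighborhoods so that the joint neighborhood is the intersection of the two individual ones. The function names, the radius `delta=0.15`, and the toy data are illustrative assumptions and are not taken from the paper, whose exact index may differ.

```python
import numpy as np


def neighborhood_mask(X, delta):
    """Boolean n x n matrix: entry (i, j) is True when sample j lies within
    Chebyshev distance delta of sample i on the feature columns X."""
    X = np.asarray(X, dtype=float)
    if X.ndim == 1:
        X = X[:, None]  # treat a single feature as one column
    dist = np.abs(X[:, None, :] - X[None, :, :]).max(axis=2)
    return dist <= delta


def neighborhood_entropy(mask):
    """NH = -(1/n) * sum_i log2(|neighborhood of x_i| / n)."""
    n = mask.shape[0]
    # Every sample is its own neighbor, so the row sums are at least 1.
    return float(-np.mean(np.log2(mask.sum(axis=1) / n)))


def neighborhood_mutual_information(A, B, delta=0.15):
    """NMI(A; B) = NH(A) + NH(B) - NH(A, B) (assumed definition)."""
    mask_a = neighborhood_mask(A, delta)
    mask_b = neighborhood_mask(B, delta)
    mask_ab = mask_a & mask_b  # joint neighborhood under Chebyshev distance
    return (neighborhood_entropy(mask_a)
            + neighborhood_entropy(mask_b)
            - neighborhood_entropy(mask_ab))


# Toy usage: one feature column and one binary label column (hypothetical data).
feature = np.array([0.0, 0.1, 1.0, 1.1])
label = np.array([0.0, 0.0, 1.0, 1.0])
relevance = neighborhood_mutual_information(feature, label)
```

In a streaming setting, a score of this kind could be evaluated between each newly arrived feature and the label columns to estimate relevance, and between the new feature and already selected features to estimate redundancy; how the paper combines these quantities is not specified here.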

Key words: online feature selection, multi-label learning, neighborhood entropy, neighborhood mutual information, sub-correlation feature

CLC Number: TP391