JOURNAL OF SHANDONG UNIVERSITY(NATURAL SCIENCE) ›› 2022, Vol. 57 ›› Issue (8): 39-52. doi: 10.6040/j.issn.1671-9352.7.2021.141


Multi-label feature selection with streaming and missing labels

ZHANG Zhi-hao1,2, LIN Yao-jin1,2*, LU Shun1,2, WU Yi-lin1,2, WANG Chen-xi1,2   

  1. School of Computer Science, Minnan Normal University, Zhangzhou 363000, Fujian, China;
  2. Key Laboratory of Data Science and Intelligence Application, Zhangzhou 363000, Fujian, China
  • Online: 2022-08-20 Published: 2022-06-29

Abstract: In practical supervised learning tasks, the high dimensionality of the feature space and the dynamic arrival and incompleteness of labels pose severe challenges. To address these problems, a multi-label feature selection algorithm for streaming and missing labels is proposed. First, to reduce the impact of missing labels, the incomplete label matrix is completed by learning label correlations. Second, sparse learning is used to select label-specific features for each newly arrived label. Third, a representative feature subset is selected by scoring the label-specific features of each label. Finally, experiments on 11 benchmark data sets demonstrate that the proposed algorithm effectively selects a representative feature subset with better classification performance.

Key words: multi-label learning, feature selection, label-specific feature, missing label, streaming label

CLC Number: TP181
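
The three steps named in the abstract (label completion from label correlations, sparse learning of label-specific features for each arriving label, and score-based selection) can be illustrated with a minimal sketch. This is not the authors' formulation, which the abstract does not give: the cosine-similarity completion, the plain ISTA lasso solver, the absolute-weight score, and all function and parameter names (`complete_labels`, `lasso_ista`, `select_features`, `lam`, `k`) are assumptions chosen only to make the pipeline concrete.

```python
import numpy as np


def complete_labels(Y, observed):
    """Fill unobserved entries from the other labels (assumed cosine weighting)."""
    Y0 = np.where(observed, Y, 0.0)                  # treat missing entries as 0
    norms = np.linalg.norm(Y0, axis=0) + 1e-12
    C = (Y0.T @ Y0) / np.outer(norms, norms)         # label-label cosine similarity
    np.fill_diagonal(C, 0.0)                         # a label does not predict itself
    W = C / (C.sum(axis=0, keepdims=True) + 1e-12)   # normalise each column of weights
    return np.where(observed, Y, Y0 @ W)             # keep observed entries unchanged


def soft_threshold(v, t):
    """Proximal operator of the L1 norm."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)


def lasso_ista(X, y, lam, n_iter=300):
    """Minimise 0.5 * ||Xw - y||^2 + lam * ||w||_1 with plain ISTA."""
    step = 1.0 / (np.linalg.norm(X, 2) ** 2)         # 1 / Lipschitz constant of the gradient
    w = np.zeros(X.shape[1])
    for _ in range(n_iter):
        grad = X.T @ (X @ w - y)
        w = soft_threshold(w - step * grad, step * lam)
    return w


def select_features(X, label_stream, lam, k):
    """Process labels one at a time and accumulate a per-feature score."""
    scores = np.zeros(X.shape[1])
    for y in label_stream:                           # each y is one newly arrived label column
        w = lasso_ista(X, y - y.mean(), lam)         # nonzero weights mark label-specific features
        scores += np.abs(w)                          # assumed score: summed absolute weight
    top_k = np.argsort(-scores)[:k]                  # indices of the k highest-scoring features
    return top_k, scores
```

A caller would pass the completed label matrix column by column, for example `select_features(X, complete_labels(Y, observed).T, lam=20.0, k=2)`. Because each label is handled independently inside the loop, labels can be fed in as they arrive without revisiting earlier ones.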