Journal of Shandong University (Natural Science), 2022, Vol. 57, Issue 8: 39-52. doi: 10.6040/j.issn.1671-9352.7.2021.141
张志浩1,2,林耀进1,2*,卢舜1,2,吴镒潾1,2,王晨曦1,2
ZHANG Zhi-hao1,2, LIN Yao-jin1,2*, LU Shun1,2, WU Yi-lin1,2, WANG Chen-xi1,2
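Abstract: In practical supervised learning tasks, the high dimensionality of features, together with labels that arrive dynamically and are partly missing, poses serious challenges. To address these problems, a multi-label feature selection algorithm for streaming labels with missing values is proposed. First, to reduce the effect of missing labels, the incomplete label matrix is completed by learning label correlations. Second, a sparse learning method selects label-specific features for each newly arrived label. Then, based on the label-specific features of the labels that have already arrived, a score is computed for each feature and a representative feature subset is selected. Finally, a series of experiments on 11 benchmark datasets shows that the proposed algorithm selects a representative feature subset and achieves good classification performance.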
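The three steps named in the abstract (label completion, sparse label-specific selection, score-based subset selection) can be illustrated with a minimal sketch. The abstract does not give the paper's objective function or scoring rule, so everything below is an assumption made for illustration: missing entries are filled with a cosine-similarity-weighted vote of earlier labels, the sparse step is a plain ISTA solver for an L1-regularized least-squares problem, and a feature's score is the sum of its absolute weights across arrived labels. The function names `soft_threshold`, `lasso_ista`, `fill_missing`, and `select_features` are hypothetical.

```python
import numpy as np

def soft_threshold(v, t):
    """Element-wise soft-thresholding, the proximal operator of t * ||.||_1."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def lasso_ista(X, y, lam=0.1, n_iter=500):
    """Minimize 0.5 * ||X w - y||^2 + lam * ||w||_1 with plain ISTA (assumed solver)."""
    step = 1.0 / (np.linalg.norm(X, 2) ** 2 + 1e-12)  # 1 / Lipschitz constant of the gradient
    w = np.zeros(X.shape[1])
    for _ in range(n_iter):
        grad = X.T @ (X @ w - y)
        w = soft_threshold(w - step * grad, step * lam)
    return w

def fill_missing(Y_seen, y_new, observed):
    """Fill unobserved entries of y_new from earlier labels (assumed rule).

    Y_seen   : (n, q) labels that have already arrived (already completed)
    y_new    : (n,) new label vector; entries where observed is False are ignored
    observed : (n,) boolean mask, True where y_new is known
    """
    y = np.where(observed, y_new, 0.0).astype(float)
    if Y_seen.shape[1] == 0 or observed.all():
        return y
    a = y[observed]
    B = Y_seen[observed]
    # Cosine similarity between the new label and each earlier label on observed rows.
    sim = (B.T @ a) / (np.linalg.norm(a) * np.linalg.norm(B, axis=0) + 1e-12)
    total = np.abs(sim).sum()
    if total > 0:
        # Similarity-weighted vote of the earlier labels on the unobserved rows.
        y[~observed] = Y_seen[~observed] @ sim / total
    return y

def select_features(X, label_stream, k, lam=0.1):
    """Process labels one at a time and return the indices of the k top-scoring features."""
    n, d = X.shape
    scores = np.zeros(d)
    Y_seen = np.empty((n, 0))
    for y_new, observed in label_stream:
        y = fill_missing(Y_seen, y_new, observed)
        w = lasso_ista(X, y, lam)       # sparse, label-specific weights
        scores += np.abs(w)             # assumed score: accumulated |weight|
        Y_seen = np.column_stack([Y_seen, y])
    return np.argsort(-scores)[:k], scores
```

Here `label_stream` is any iterable of `(label_vector, observed_mask)` pairs, which mimics labels arriving one at a time. The sketch only shows how the three stages fit together. It does not reproduce the published algorithm.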