Journal of Shandong University (Natural Science) ›› 2021, Vol. 56 ›› Issue (7): 91-102. doi: 10.6040/j.issn.1671-9352.0.2020.588
张要,马盈仓*,杨小飞,朱恒东,杨婷
ZHANG Yao, MA Ying-cang*, YANG Xiao-fei, ZHU Heng-dong, YANG Ting
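Abstract: A linear regression model is combined with manifold structure to form a joint framework for weak linear multi-label feature selection. First, a least-squares loss function is used to learn the regression coefficient matrix. Second, the label manifold structure is used to learn the weight matrix of the data features. Third, the L2,1-norm is used to constrain both the regression coefficient matrix and the feature weight matrix, which induces sparsity and also facilitates feature selection. In addition, an iterative update algorithm is designed to solve the resulting problem, and its convergence is proved. Finally, the proposed method is evaluated on several classic multi-label datasets, and the experimental results demonstrate the effectiveness of the proposed algorithm.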
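The role of the L2,1-norm in the abstract can be illustrated with a minimal sketch. It assumes a simplified objective with only the least-squares loss and an L2,1 penalty on a single coefficient matrix, min_W ||XW - Y||_F^2 + lam * ||W||_{2,1}; the label-manifold term and the separate feature weight matrix are left out because the abstract does not give their exact form. The update is the standard iteratively reweighted scheme for L2,1-regularized regression, not the authors' algorithm, and the function names and the parameter `lam` are hypothetical.

```python
import numpy as np


def l21_norm(W):
    """L2,1-norm: the sum of the Euclidean norms of the rows of W."""
    return float(np.sum(np.linalg.norm(W, axis=1)))


def l21_least_squares(X, Y, lam=1.0, n_iter=50, eps=1e-8):
    """Iteratively reweighted solver (illustrative, simplified objective).

    Approximately minimises ||XW - Y||_F^2 + lam * ||W||_{2,1}.
    X: (n_samples, n_features) data matrix.
    Y: (n_samples, n_labels) label matrix.
    """
    n_features = X.shape[1]
    D = np.eye(n_features)          # with D = I the first step is a ridge solve
    XtX, XtY = X.T @ X, X.T @ Y
    for _ in range(n_iter):
        # Closed-form update of W for the current reweighting matrix D.
        W = np.linalg.solve(XtX + lam * D, XtY)
        # D_ii = 1 / (2 * ||w_i||); eps avoids division by zero for null rows.
        row_norms = np.linalg.norm(W, axis=1)
        D = np.diag(1.0 / (2.0 * row_norms + eps))
    return W


def rank_features(W):
    """Rank features by the row norms of W, largest first."""
    return np.argsort(-np.linalg.norm(W, axis=1))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.standard_normal((100, 6))
    W_true = np.zeros((6, 3))
    W_true[0] = [1.0, -2.0, 0.5]    # only features 0 and 1 carry signal
    W_true[1] = [0.5, 1.5, -1.0]
    Y = X @ W_true + 0.01 * rng.standard_normal((100, 3))
    W = l21_least_squares(X, Y, lam=1.0)
    print("feature ranking:", rank_features(W))
```

Because the penalty acts on whole rows of W, an uninformative feature has its entire row pushed toward zero across all labels at once, which is why row norms can serve as feature scores.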