JOURNAL OF SHANDONG UNIVERSITY (NATURAL SCIENCE) ›› 2022, Vol. 57 ›› Issue (4): 1-11. doi: 10.6040/j.issn.1671-9352.7.2021.167

Multilabel feature selection algorithm based on improved ReliefF

SUN Lin1,2, CHEN Yu-sheng1, XU Jiu-cheng1,2   

  1. College of Computer and Information Engineering, Henan Normal University, Xinxiang 453007, Henan, China;
    2. Henan Engineering Laboratory of Smart Business and Internet of Things Technology, Xinxiang 453007, Henan, China
  • Published: 2022-03-29

Abstract: To address the problems that the traditional ReliefF algorithm can only handle single-label data and that its improved variants do not make full use of the correlations between samples, a multilabel feature selection algorithm based on an improved ReliefF is proposed. First, the cosine similarity function is used to measure the similarity between the feature vectors of samples, and the Jaccard distance is employed to measure the correlation between the label sets of samples; a similarity function between samples is then defined from these two measures to characterize the similarity relationship between samples over the entire sample space. Second, a discrimination formula for homogeneous and heterogeneous samples is defined to determine the nearest homogeneous and heterogeneous samples of each randomly selected sample. Finally, a new iterative formula for the feature weights is proposed to improve the ReliefF algorithm, and a multilabel feature selection algorithm is designed on this basis. Five evaluation metrics, namely Average Precision, Coverage, One-error, Ranking Loss and Hamming Loss, are employed to analyze and test the classification performance of the proposed algorithm on seven public multilabel datasets. The experimental results show that the proposed algorithm is effective.
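The procedure outlined in the abstract can be illustrated with a short Python sketch. It combines cosine similarity over feature vectors with the Jaccard distance over label sets, splits the remaining samples into homogeneous and heterogeneous neighbours of a randomly drawn sample by a label-similarity threshold, and accumulates ReliefF-style feature weights. The combination rule in sample_similarity, the threshold tau, and the weight-update step are illustrative assumptions standing in for the paper's exact definitions, which are not reproduced here.

# Minimal sketch of a ReliefF-style multilabel feature selection procedure.
# The combination rule, the threshold tau and the weight update below are
# illustrative assumptions, not the paper's exact formulas.
import numpy as np


def cosine_similarity(a, b):
    """Cosine similarity between two feature vectors."""
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return 0.0 if denom == 0 else float(a @ b) / denom


def jaccard_distance(la, lb):
    """Jaccard distance between two binary label vectors."""
    union = np.logical_or(la, lb).sum()
    if union == 0:
        return 0.0
    return 1.0 - np.logical_and(la, lb).sum() / union


def sample_similarity(xa, xb, la, lb):
    """Combined sample similarity: feature similarity damped by label distance
    (an assumed combination; the paper defines its own)."""
    return cosine_similarity(xa, xb) * (1.0 - jaccard_distance(la, lb))


def multilabel_relieff(X, Y, n_iter=100, k=5, tau=0.5, seed=None):
    """Return one weight per feature; larger means more discriminative.

    X : (n_samples, n_features) feature matrix
    Y : (n_samples, n_labels) binary label matrix
    tau : label-similarity threshold separating homogeneous from heterogeneous
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    w = np.zeros(d)
    scale = X.max(axis=0) - X.min(axis=0)
    scale[scale == 0] = 1.0  # avoid division by zero for constant features

    for _ in range(n_iter):
        i = rng.integers(n)
        # Label similarity of every other sample to the drawn sample.
        lab_sim = np.array([1.0 - jaccard_distance(Y[i], Y[j]) for j in range(n)])
        lab_sim[i] = -np.inf
        homo = np.where(lab_sim >= tau)[0]
        hetero = np.where((lab_sim < tau) & np.isfinite(lab_sim))[0]
        if homo.size == 0 or hetero.size == 0:
            continue

        # Nearest homogeneous/heterogeneous samples under the combined similarity.
        sim = np.array([sample_similarity(X[i], X[j], Y[i], Y[j]) for j in range(n)])
        hits = homo[np.argsort(-sim[homo])[:k]]
        misses = hetero[np.argsort(-sim[hetero])[:k]]

        # ReliefF-style update: reward features that separate the sample from
        # its near misses, penalize features that separate it from its near hits.
        diff_hit = np.abs(X[i] - X[hits]).mean(axis=0) / scale
        diff_miss = np.abs(X[i] - X[misses]).mean(axis=0) / scale
        w += (diff_miss - diff_hit) / n_iter
    return w


# Usage: rank features by weight and keep, e.g., the top 50.
# weights = multilabel_relieff(X, Y, n_iter=200, k=10, tau=0.5, seed=0)
# selected = np.argsort(-weights)[:50]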

Key words: multilabel, feature selection, correlation of labels, ReliefF

CLC Number: TP181