JOURNAL OF SHANDONG UNIVERSITY(NATURAL SCIENCE) ›› 2024, Vol. 59 ›› Issue (5): 90-99. doi: 10.6040/j.issn.1671-9352.7.2023.148

Multi-label online stream feature selection based on high-dimensional correlation

ZHU Liquan1,2, LIN Yaojin1,2, MAO Yu1,2, CHENG Yuxuan1,2   

  1. School of Computer Science, Minnan Normal University, Zhangzhou 363000, Fujian, China;
  2. Key Laboratory of Data Science and Intelligence Application, Minnan Normal University, Zhangzhou 363000, Fujian, China
  • Published: 2024-05-09

Abstract: This paper proposes a multi-label online stream feature selection algorithm based on high-dimensional correlation. The algorithm applies an equivalent mapping to the label space and constructs a weighted undirected graph over the resulting high-dimensional label space. Graph information and the Jaccard index are used to measure the high-dimensional weights between labels. The significance of each newly arrived feature is computed from the high-dimensional correlation of the labels, and its significance level is determined through an iterative mean significance. In addition, an online feature selection algorithm that balances global and local information is designed to dynamically optimize the selected feature subset: irrelevant features are filtered out by considering the global correlation between the selected features and the label space, and redundant features are eliminated by analyzing the local correlation among the selected features. Comparative experiments against six other multi-label feature selection methods validate the effectiveness of the proposed algorithm.
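
As a rough illustration of two ideas named in the abstract, the sketch below computes pairwise Jaccard indices between label columns and uses them as edge weights of an undirected label graph, then applies a running-mean threshold to a stream of feature scores. It is not the authors' algorithm: the function names, the toy label matrix `Y`, the example scores, and the reading of "iterative mean significance" as a running mean are all assumptions made for illustration.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard index of two sets; defined here as 0.0 when both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def label_graph(Y):
    """Build a weighted undirected label graph from a binary label matrix.

    Y is a list of rows (one per instance), each a list of 0/1 label flags.
    Returns {(i, j): weight} for i < j, where the weight is the Jaccard
    index of the instance sets that carry label i and label j.
    """
    n_labels = len(Y[0])
    # Set of instance indices for each label column.
    members = [{r for r, row in enumerate(Y) if row[j]} for j in range(n_labels)]
    return {
        (i, j): jaccard(members[i], members[j])
        for i, j in combinations(range(n_labels), 2)
    }


def running_mean_filter(scores):
    """Hypothetical streaming rule (an assumption, not the paper's criterion):
    keep feature t if its score is at least the mean of all scores seen so
    far, its own score included."""
    kept, total = [], 0.0
    for t, s in enumerate(scores, start=1):
        total += s
        if s >= total / t:
            kept.append(t - 1)
    return kept


# Toy data: 4 instances, 3 labels (illustrative only).
Y = [
    [1, 1, 0],
    [1, 0, 0],
    [1, 1, 1],
    [0, 0, 1],
]
graph = label_graph(Y)
kept = running_mean_filter([0.5, 0.2, 0.6, 0.4])
```

Here `graph` maps each label pair to its co-occurrence weight and `kept` lists the indices of the features that pass the threshold as they arrive. The paper's actual significance measure and its global and local refinement steps are not specified in the abstract, so they are not modelled here.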

Key words: multi-label feature selection, online streaming feature, high-dimensional correlation, label weight

CLC Number: TP391