Journal of Shandong University (Natural Science), 2020, Vol. 55, Issue 3: 81-88. doi: 10.6040/j.issn.1671-9352.1.2019.162

A multi-label three-way classification algorithm based on label correlation

Ying YU*, Xin-nian WU, Le-wei WANG, Ying-long ZHANG

  1. College of Software, East China Jiaotong University, Nanchang 330013, Jiangxi, China
  • Received: 2019-05-21 Online: 2020-03-20 Published: 2020-03-27
  • Corresponding author: Ying YU E-mail: yuyingjx@163.com
  • About the author: Ying YU (born 1979), female, Ph.D., associate professor; research interests: machine learning and computer vision. E-mail: yuyingjx@163.com
  • Supported by:
    National Natural Science Foundation of China (61563016, 61762036); Natural Science Foundation of Jiangxi Province (20181BAB202023, 20171BAB202012)

Abstract:

A multi-label three-way classification algorithm based on label correlation, TML_LC, is proposed. The algorithm uses a three-way decision model to handle the uncertainty of the samples: the SVM outputs for the multi-label sample space are partitioned into an acceptance region, a rejection region, and a boundary (uncertain) region. A probabilistic graphical model is then used to encode the correlations between labels, and these correlations are applied to the delayed decisions in the boundary region. This reduces the time complexity of the classification model and improves its accuracy.

Key words: multi-label learning, three-way decision, label correlation, delayed decision
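
The three-way partition described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the thresholds `alpha` and `beta`, the function name `three_way_partition`, and the example probabilities are all assumptions introduced here. The abstract does not state how the thresholds are chosen or how SVM outputs are turned into probabilities.

```python
from typing import List, Tuple


def three_way_partition(
    probs: List[float], alpha: float = 0.7, beta: float = 0.3
) -> Tuple[List[int], List[int], List[int]]:
    """Split label indices into accept, reject, and boundary regions.

    probs[i] is an estimated probability that label i is relevant
    (a hypothetical input, e.g. a calibrated classifier score).
    alpha and beta are illustrative thresholds with beta < alpha.
    """
    if not 0.0 <= beta < alpha <= 1.0:
        raise ValueError("thresholds must satisfy 0 <= beta < alpha <= 1")
    accept: List[int] = []
    reject: List[int] = []
    boundary: List[int] = []
    for i, p in enumerate(probs):
        if p >= alpha:
            accept.append(i)    # confident positive: accept the label
        elif p <= beta:
            reject.append(i)    # confident negative: reject the label
        else:
            boundary.append(i)  # uncertain: defer the decision
    return accept, reject, boundary


accept, reject, boundary = three_way_partition([0.92, 0.10, 0.55, 0.71])
print(accept, reject, boundary)  # [0, 3] [1] [2]
```

In this sketch, only the labels that end up in `boundary` are left for a later, correlation-based decision.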

CLC number (Chinese Library Classification):

  • TP391

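Fig. 1 Multi-label image

Fig. 2 SVM classification with boundary thresholds

Fig. 3 Multi-label three-way decision based on a single evaluation function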

Table 1 Example label correlation matrix (DAG)

      l1   l2   l3   l4
l1    0    1    0    1
l2    1    0    1    0
l3    0    0    0    0
l4    0    0    1    0

Table 2 Description of the experimental datasets

Dataset     Samples   Features   Labels   Label cardinality   Label density
yeast       2417      103        14       4.237               0.303
scene       2407      294        6        1.074               0.179
emotions    593       72         6        1.869               0.311
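
The last two columns of the dataset table above are related in the usual way: label density is the average number of labels per sample (label cardinality) divided by the number of labels. The short check below recomputes the density column from the other two; the variable and function names are chosen here for illustration only.

```python
# (number of labels, label cardinality, reported label density) from the table
datasets = {
    "yeast": (14, 4.237, 0.303),
    "scene": (6, 1.074, 0.179),
    "emotions": (6, 1.869, 0.311),
}


def label_density(cardinality: float, n_labels: int) -> float:
    """Label density = average labels per sample / total number of labels."""
    return cardinality / n_labels


for name, (n_labels, cardinality, reported) in datasets.items():
    density = label_density(cardinality, n_labels)
    # The reported values agree with the computed ones to within 0.001.
    print(f"{name}: computed {density:.4f}, reported {reported}")
```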

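Fig. 4 Average precision on the Yeast dataset as the boundary region varies

Fig. 5 Average precision on the Scene dataset as the boundary region varies

Fig. 6 Average precision on the Emotions dataset as the boundary region varies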

Table 3 Hamming loss of the five multi-label algorithms on the three datasets (mean±std)

Dataset     TML_LC        BSVM          ML-KNN        BPMLL         ECC
yeast       0.198±0.013   0.199±0.010   0.195±0.011   0.205±0.010   0.208±0.010
emotions    0.195±0.011   0.199±0.022   0.194±0.013   0.219±0.021   0.192±0.021
scene       0.101±0.006   0.104±0.006   0.084±0.008   0.282±0.014   0.096±0.010
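
For reference, Hamming loss is commonly defined as the fraction of sample-label pairs that are predicted incorrectly, so lower values are better. The sketch below computes it for small 0/1 label matrices. The function name and the toy data are illustrative assumptions and are unrelated to the experiments reported here.

```python
from typing import Sequence


def hamming_loss(
    y_true: Sequence[Sequence[int]], y_pred: Sequence[Sequence[int]]
) -> float:
    """Fraction of (sample, label) entries where the prediction differs from the truth."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and have the same number of samples")
    total = 0
    wrong = 0
    for t_row, p_row in zip(y_true, y_pred):
        if len(t_row) != len(p_row):
            raise ValueError("each sample must have the same number of labels")
        total += len(t_row)
        wrong += sum(t != p for t, p in zip(t_row, p_row))
    if total == 0:
        raise ValueError("label rows must not be empty")
    return wrong / total


# Toy example: 2 samples, 3 labels, one wrong entry per sample -> 2 of 6 wrong.
y_true = [[1, 0, 1], [0, 1, 0]]
y_pred = [[1, 1, 1], [0, 1, 1]]
print(hamming_loss(y_true, y_pred))
```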

Table 4 One-error of the five multi-label algorithms on the three datasets (mean±std)

Dataset     TML_LC        BSVM          ML-KNN        BPMLL         ECC
yeast       0.222±0.019   0.230±0.023   0.228±0.029   0.235±0.030   0.176±0.022
emotions    0.244±0.049   0.253±0.070   0.263±0.067   0.318±0.057   0.216±0.085
scene       0.179±0.080   0.250±0.027   0.219±0.029   0.821±0.031   0.226±0.034

Table 5 Coverage of the five multi-label algorithms on the three datasets (mean±std)

Dataset     TML_LC        BSVM          ML-KNN        BPMLL         ECC
yeast       0.458±0.022   0.514±0.018   0.447±0.014   0.456±0.019   0.516±0.015
emotions    0.284±0.033   0.295±0.027   0.300±0.019   0.300±0.022   0.322±0.022
scene       0.086±0.010   0.089±0.009   0.078±0.010   0.374±0.024   0.091±0.008

Table 6 Ranking loss of the five multi-label algorithms on the three datasets (mean±std)

Dataset     TML_LC        BSVM          ML-KNN        BPMLL         ECC
yeast       0.171±0.014   0.200±0.013   0.166±0.015   0.171±0.015   0.285±0.022
emotions    0.147±0.030   0.156±0.034   0.163±0.022   0.173±0.020   0.233±0.040
scene       0.086±0.012   0.089±0.011   0.076±0.012   0.434±0.026   0.135±0.013

Table 7 Average precision of the five multi-label algorithms on the three datasets (mean±std)

Dataset     TML_LC        BSVM          ML-KNN        BPMLL         ECC
yeast       0.761±0.016   0.794±0.019   0.765±0.021   0.754±0.020   0.728±0.019
emotions    0.817±0.031   0.807±0.037   0.799±0.031   0.779±0.027   0.796±0.042
scene       0.871±0.017   0.849±0.016   0.869±0.017   0.445±0.018   0.852±0.016