JOURNAL OF SHANDONG UNIVERSITY(NATURAL SCIENCE) ›› 2023, Vol. 58 ›› Issue (9): 105-113, 126. doi: 10.6040/j.issn.1671-9352.4.2022.5119


Feasible region localization and fast causal instance selection for multi-instance learning

Mei YANG1,2,3,*, Wenjing KE1, Dandong WANG1

    1. School of Computer Science, Southwest Petroleum University, Chengdu 610500, Sichuan, China
    2. Institute for Artificial Intelligence, Southwest Petroleum University, Chengdu 610500, Sichuan, China
    3. Lab of Machine Learning, Southwest Petroleum University, Chengdu 610500, Sichuan, China
  • Received: 2022-08-02 Online: 2023-09-20 Published: 2023-09-08
  • Contact: Mei YANG E-mail: yangmei@swpu.edu.cn

Abstract:

This paper proposes a feasible region localization and fast causal instance selection (FFCM) algorithm for multi-instance learning (MIL) that combines three techniques. First, to minimize the feasible region of the data, the fast feasible region localization technique selects representative instances from the positive bags as candidate instances based on distance measurement and reduces the set of negative referee bags through probability analysis. Second, the fast causal instance selection technique uses the causal relationship between candidate instances and negative referee bags to construct fusion bags; prior knowledge is then employed to select causal instances from the candidate instances according to the designed causal instance criteria. Third, the bag mapping technique maps each bag into a single, highly distinguishable vector using the causal instances and a difference-based mapping function. FFCM is compared with 6 state-of-the-art MIL algorithms on 27 commonly used datasets. The experimental results show that FFCM achieves classification performance comparable to that of these algorithms.
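The bag mapping step can be illustrated with a small sketch. The abstract does not give the actual mapping function, so everything below is an assumption made for illustration: the hypothetical `map_bag` function pairs each causal instance with its nearest bag instance (Euclidean distance) and concatenates the differences, turning a bag of any size into one fixed-length vector.

```python
import math

def map_bag(bag, causal_instances):
    """Map one bag (list of d-dim instances) to a single vector of length c * d.

    Hypothetical difference-based mapping (an assumption, not the paper's
    definition): for each causal instance, find the closest instance in the
    bag and record the element-wise difference between the two.
    """
    vector = []
    for r in causal_instances:
        # Bag instance with the smallest Euclidean distance to r.
        nearest = min(bag, key=lambda x: math.dist(x, r))
        vector.extend(xi - ri for xi, ri in zip(nearest, r))
    return vector

# Toy data: two bags in a 2-dimensional instance space, c = 2 causal instances.
bags = [
    [[0.0, 0.0], [1.0, 1.0]],
    [[2.0, 2.0]],
]
causal = [[0.0, 0.5], [2.0, 2.0]]

vectors = [map_bag(b, causal) for b in bags]
print(vectors)
```

Each bag becomes a vector of length c × d regardless of how many instances it holds, so any single-instance classifier can be trained on the mapped vectors.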

Key words: causal instance, feasible region, mapping, multi-instance learning, probability analysis

CLC Number: 

  • TP181

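Table 1

Notations

| Symbol | Meaning | Symbol | Meaning |
| --- | --- | --- | --- |
| $\mathscr{C}=\mathbf{R}^{d}$ | Instance space | $\mathit{A}$ | Prior classifier |
| $\mathscr{Y}=\{+1,-1\}$ | Label space | $\mathit{A}(\cdot)$ | Probability that $\cdot$ is positive |
| $\mathscr{B}=\left\{\left(\boldsymbol{B}_{i},y_{i}\right)\right\}_{i=1}^{n}$ | Dataset | $\boldsymbol{B}_{i}^{x}=\{\boldsymbol{x}\} \cup \boldsymbol{B}_{i}^{-}$ | Fusion bag |
| $\boldsymbol{B}_{i}$ | Bag | C | Candidate instance set |
| $y_{i} \in \mathscr{Y}$ | Label of the bag | $\mathscr{B}^{r}$ | Set of negative referee bags |
| $m_{i}$ | Number of instances in the bag | $\boldsymbol{s}_{i j}$ | Causality evaluation index |
| $\boldsymbol{x}_{i j}$ | Instance in the bag | R | Causal instance set |
| $\mathscr{B}^{+}\left(\mathscr{B}^{-}\right)$ | Set of positive (negative) bags | $c$ | Number of causal instances |
| $N^{+}\left(N^{-}\right)$ | Number of positive (negative) bags | $\mathscr{V}$ | Set of mapped vectors |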

Fig.1

Schematic diagram of the causal instance selection process

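Table 2

Characteristics of the datasets

| Dataset | Sub-datasets | Dimensions | Bags | Positive bags | Negative bags | Instances | Max instances per bag | Min instances per bag | Avg instances per bag |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Musk1 | 1 | 166 | 92 | 47 | 45 | 476 | 40 | 2 | 5.17 |
| Musk2 | 1 | 166 | 102 | 39 | 63 | 6598 | 1044 | 1 | 64.69 |
| Elephant | 1 | 230 | 200 | 100 | 100 | 1391 | 13 | 2 | 6.96 |
| Fox | 1 | 230 | 200 | 100 | 100 | 1320 | 13 | 2 | 6.60 |
| Tiger | 1 | 230 | 200 | 100 | 100 | 1220 | 13 | 1 | 6.10 |
| Mutagenesis1 | 1 | 7 | 188 | 125 | 63 | 10486 | 88 | 28 | 55.78 |
| Mutagenesis2 | 1 | 7 | 42 | 13 | 29 | 2132 | 86 | 26 | 50.76 |
| Messidor | 1 | 687 | 1200 | 654 | 546 | 12352 | 12 | 8 | 10.29 |
| Newsgroups | 10 | 200 | 100 | 47~50 | 50~53 | 1982~5443 | 54~84 | 8~29 | 19.8~54.4 |
| Web | 9 | 5863~6519 | 113 | 21~88 | 25~92 | 3423 | 200 | 4 | 30.29 |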

Fig.2

Accuracy of FFCM under different c

Fig.3

Time consumption of 10 runs of 10-fold cross-validation (10CV) of FFCM under different c

Table 3

Accuracy of FFCM under different pr (%)

| Dataset | n×d | pr=0.05 | pr=0.10 | pr=0.15 | pr=0.20 |
| --- | --- | --- | --- | --- | --- |
| Musk1 | 476×166 | 79.11±4.27 | 79.11±2.47 | 80.11±3.20 | 78.89±3.51 |
| Musk2 | 6598×166 | 78.20±4.09 | 77.00±2.68 | 75.60±3.38 | 78.10±2.80 |
| Mutagenesis1 | 2132×7 | 81.33±2.07 | 82.78±3.03 | 81.56±1.94 | 82.44±2.32 |
| Elephant | 1391×230 | 84.15±2.01 | 84.05±1.56 | 84.80±1.55 | 84.45±1.17 |
| Alt.atheism | 5443×200 | 87.62±0.80 | 88.10±1.54 | 87.20±1.22 | 88.10±1.45 |
| Web4 | 3423×6059 | 85.00±1.59 | 85.73±1.03 | 84.91±1.77 | 84.82±1.59 |

Table 4

Accuracy of FFCM and comparison algorithms (%)

| Dataset | miVLAD | miFV | MILDM | StableMIL | PL | ELDB | FFCM |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Musk1 | 83.11±2.21 | 91.11±1.41 | 79.11±2.15 | 90.67±2.29 | 80.89±1.78 | 89.60±1.90 | 80.33±2.63 |
| Musk2 | 78.20±2.64 | 84.80±1.72 | 79.11±2.33 | 84.80±3.06 | 76.40±4.27 | 85.04±2.25 | 76.30±3.38 |
| Elephant | 84.00±0.84 | 84.80±0.81 | 77.10±1.74 | 66.10±2.60 | 71.80±4.15 | 76.00±3.10 | 85.00±1.24 |
| Fox | 61.70±1.66 | 60.50±1.73 | 54.80±4.23 | 58.00±2.21 | 52.00±2.39 | 55.90±2.40 | 61.10±1.87 |
| Tiger | 84.70±0.75 | 78.40±1.02 | 69.80±1.17 | 67.50±2.66 | 69.70±2.56 | 71.10±2.20 | 80.50±0.67 |
| Mutagenesis1 | 81.50±1.96 | 80.78±1.09 | 80.00±2.11 | 83.56±2.47 | 79.35±1.65 | 84.67±0.84 | 80.78±2.51 |
| Mutagenesis2 | 78.50±2.29 | 80.50±1.00 | 81.00±2.55 | 86.00±3.00 | 74.00±10.56 | 59.93±6.40 | 71.00±5.61 |
| Messidor | 67.43±0.39 | 70.57±0.47 | 63.92±0.64 | 62.73±0.91 | 54.52±0.38 | 57.40±2.41 | 63.84±1.25 |
| News.aa | 84.70±1.42 | 82.60±1.74 | 54.60±4.03 | 51.80±7.14 | 80.60±1.50 | 84.67±1.35 | 87.60±1.02 |
| News.cg | 79.00±1.48 | 80.40±1.02 | 52.20±6.37 | 48.60±2.73 | 78.40±1.50 | 77.72±2.98 | 81.00±1.90 |
| News.co | 68.30±1.55 | 72.00±1.41 | 48.00±4.15 | 48.40±5.08 | 63.60±2.06 | 65.94±2.81 | 71.20±1.33 |
| News.csm | 78.70±1.73 | 77.80±1.60 | 49.20±2.79 | 53.40±2.87 | 77.60±0.80 | 76.13±4.58 | 79.00±2.14 |
| News.mf | 71.90±1.64 | 73.80±1.94 | 43.40±2.87 | 50.40±2.87 | 63.20±4.53 | 64.60±3.29 | 68.00±1.90 |
| News.rsb | 82.90±0.94 | 84.00±1.10 | 46.20±3.12 | 51.00±6.10 | 81.20±0.98 | 77.80±2.40 | 81.20±1.17 |
| News.rsh | 88.60±0.92 | 88.60±1.85 | 51.20±1.83 | 51.00±3.41 | 81.20±0.98 | 81.35±1.42 | 89.10±1.14 |
| News.se | 91.90±0.30 | 93.00±1.10 | 55.80±5.60 | 51.40±3.20 | 92.00±1.26 | 88.20±2.10 | 93.50±0.81 |
| News.sm | 81.90±1.70 | 83.00±1.67 | 53.80±3.43 | 49.20±2.71 | 81.60±1.50 | 80.40±1.56 | 82.40±1.43 |
| News.ss | 84.70±1.10 | 86.60±2.33 | 48.20±2.99 | 55.40±3.07 | 76.80±2.32 | 77.60±2.30 | 86.70±1.49 |
| Web1 | 82.00±0.89 | 84.55±0.57 | 82.91±0.89 | 82.00±1.34 | 81.82±3.25 | 81.27±1.38 | 81.45±0.73 |
| Web2 | 81.00±2.35 | 82.36±0.45 | 83.45±0.68 | 81.64±0.89 | 78.91±1.67 | 72.18±1.65 | 80.73±0.68 |
| Web3 | 82.00±1.56 | 82.73±1.72 | 82.73±1.29 | 81.82±1.15 | 80.73±0.68 | 80.72±2.91 | 81.27±0.60 |
| Web4 | 84.45±1.43 | 80.73±1.34 | 79.09±1.00 | 74.00±2.48 | 77.45±0.36 | 77.27±3.02 | 84.82±1.99 |
| Web5 | 83.00±1.35 | 77.45±0.68 | 80.00±1.72 | 87.05±0.79 | 77.27±0.81 | 74.18±1.29 | 83.27±1.59 |
| Web6 | 84.36±2.14 | 80.36±1.69 | 82.73±1.15 | 77.64±0.93 | 77.82±0.73 | 81.82±2.64 | 86.36±0.91 |
| Web7 | 74.00±2.67 | 67.82±2.34 | 61.09±3.65 | 62.00±2.10 | 52.73±1.29 | 51.73±4.52 | 71.82±2.87 |
| Web8 | 73.27±2.04 | 69.82±1.76 | 56.91±1.59 | 59.27±0.68 | 54.18±6.18 | 51.27±4.39 | 76.00±3.38 |
| Web9 | 76.55±2.87 | 75.82±2.20 | 56.73±2.55 | 54.00±4.69 | 53.27±6.02 | 48.07±2.60 | 75.55±2.17 |
| Mean Rank | 2.63 | 2.48 | 4.85 | 4.93 | 5.37 | 5.00 | 2.74 |
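The Mean Rank row can be read as an average of per-dataset ranks. The sketch below shows one common way to compute it, assuming that rank 1 goes to the highest mean accuracy on each dataset and that tied algorithms share the average of the tied positions; this convention is an assumption, since the ranking rule is not stated here. The function names `rank_row` and `mean_ranks` are illustrative, and only three rows of Table 4 are used.

```python
def rank_row(scores):
    """Rank one dataset's accuracies: 1 = highest, ties share the average rank."""
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    ranks = [0.0] * len(scores)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the group while the next score equals the current one.
        while end + 1 < len(order) and scores[order[end + 1]] == scores[order[pos]]:
            end += 1
        avg = (pos + end) / 2 + 1  # average of the 1-based positions pos+1..end+1
        for j in range(pos, end + 1):
            ranks[order[j]] = avg
        pos = end + 1
    return ranks

def mean_ranks(table):
    """Average each algorithm's rank over all datasets (rows of the table)."""
    per_row = [rank_row(row) for row in table]
    return [sum(r[j] for r in per_row) / len(table) for j in range(len(table[0]))]

algorithms = ["miVLAD", "miFV", "MILDM", "StableMIL", "PL", "ELDB", "FFCM"]
# Mean accuracies copied from three rows of Table 4.
subset = [
    [83.11, 91.11, 79.11, 90.67, 80.89, 89.60, 80.33],  # Musk1
    [84.00, 84.80, 77.10, 66.10, 71.80, 76.00, 85.00],  # Elephant
    [88.60, 88.60, 51.20, 51.00, 81.20, 81.35, 89.10],  # News.rsh
]

print(rank_row(subset[0]))  # [4.0, 1.0, 7.0, 2.0, 5.0, 3.0, 6.0]
print(dict(zip(algorithms, mean_ranks(subset))))
```

Under this convention the mean ranks of 7 algorithms always sum to 7 × 8 / 2 = 28, and the reported Mean Rank row does sum to 28.00.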

Fig.4

Critical difference (CD) diagram comparing FFCM with the 6 comparison algorithms using the Bonferroni-Dunn test
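The critical difference behind such a diagram can be sketched from the Mean Rank row of Table 4. The code below uses the usual Bonferroni-Dunn form CD = q_α · sqrt(k(k+1) / (6N)) with k = 7 algorithms and N = 27 datasets, where q_α is the two-tailed normal quantile at level α / (k − 1). The significance level α = 0.05 is an assumption, since the level used for Fig. 4 is not given here, and `bonferroni_dunn_cd` is an illustrative name.

```python
import math
from statistics import NormalDist

def bonferroni_dunn_cd(k, n, alpha=0.05):
    """Critical difference in mean rank for k algorithms compared on n datasets."""
    # Two-tailed normal critical value with alpha split over k - 1 comparisons
    # against the control algorithm.
    q_alpha = NormalDist().inv_cdf(1 - alpha / (2 * (k - 1)))
    return q_alpha * math.sqrt(k * (k + 1) / (6 * n))

# Mean ranks copied from the last row of Table 4.
mean_rank = {
    "miVLAD": 2.63, "miFV": 2.48, "MILDM": 4.85, "StableMIL": 4.93,
    "PL": 5.37, "ELDB": 5.00, "FFCM": 2.74,
}

cd = bonferroni_dunn_cd(k=7, n=27)  # alpha = 0.05 is assumed
# Algorithms whose mean rank is worse than the control (FFCM) by more than CD.
worse = [name for name, r in mean_rank.items() if r - mean_rank["FFCM"] > cd]
print(round(cd, 3), worse)
```

Under these assumptions CD is about 1.55, so MILDM, StableMIL, PL, and ELDB fall outside the critical difference from FFCM, while miVLAD and miFV do not differ significantly from it.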
