Semi-weakly supervised object detection using bi-attention-guided feature fusion

doi:10.6040/j.issn.1671-9352.7.2023.3979

Abstract

Abstract: To reduce annotation cost and to address inaccurate object localization and the loss of detail information, a semi-weakly supervised object detection algorithm with bi-attention-guided feature fusion is proposed. The algorithm uses both fully labelled and weakly labelled data to balance detection performance against annotation cost. Spatial attention is used to fuse low-level feature maps with high-level feature maps through pixel-level weighting, so that the high-level feature maps carry rich low-level information. A channel-weighting operation is then applied to the fused feature maps to obtain high-level feature maps rich in detail and location information. To obtain more accurate pseudo-labelled boxes, a more robust candidate box selection strategy is proposed. Experiments show that the proposed algorithm achieves better detection performance while reducing the amount of fully labelled image data and the additional image-level annotation required.

Key words: weakly supervised object detection, feature fusion, attention mechanism, semi-supervised learning
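The two attention steps named in the abstract (pixel-level weighted fusion guided by spatial attention, followed by channel weighting) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names are hypothetical, the attention maps are computed by simple pooling and a sigmoid with no learned convolution or MLP, the additive fusion rule `high + spatial * low` is an assumption, and both feature maps are assumed to have already been resized to the same `(C, H, W)` shape.

```python
import numpy as np


def sigmoid(x):
    """Element-wise logistic function."""
    return 1.0 / (1.0 + np.exp(-x))


def spatial_attention(feat):
    """Return a (1, H, W) attention map in (0, 1) for a (C, H, W) feature map.

    Assumption: channel-wise average and max pooling are summed and squashed
    with a sigmoid; a learned convolution would normally sit in between.
    """
    avg_pool = feat.mean(axis=0, keepdims=True)
    max_pool = feat.max(axis=0, keepdims=True)
    return sigmoid(avg_pool + max_pool)


def channel_attention(feat):
    """Return (C, 1, 1) channel weights in (0, 1) for a (C, H, W) feature map.

    Assumption: global average pooling followed by a sigmoid, with no
    learned bottleneck layers.
    """
    pooled = feat.mean(axis=(1, 2), keepdims=True)
    return sigmoid(pooled)


def bi_attention_fuse(low, high):
    """Fuse a low-level map into a high-level map, then reweight channels."""
    if low.shape != high.shape:
        raise ValueError("low and high feature maps must share one shape")
    # Pixel-level weighting: each spatial position of the low-level map is
    # scaled by its attention score before being added to the high-level map.
    fused = high + spatial_attention(low) * low
    # Channel weighting: each channel of the fused map is scaled as a whole.
    return channel_attention(fused) * fused


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    low = rng.standard_normal((8, 16, 16))
    high = rng.standard_normal((8, 16, 16))
    print(bi_attention_fuse(low, high).shape)  # (8, 16, 16)
```

Broadcasting handles both weightings: the `(1, H, W)` spatial map scales every channel at a given pixel, and the `(C, 1, 1)` channel weights scale every pixel in a given channel.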

CLC number: TP391

CHEN Junfen, LI Nana, XIE Bojun, ZHANG Jie. Semi-weakly supervised object detection using bi-attention-guided feature fusion[J]. Journal of Shandong University (Natural Science), 2025, 60(1): 1-13.
