
Journal of Shandong University (Natural Science), 2025, Vol. 60, Issue (1): 1-13. doi: 10.6040/j.issn.1671-9352.7.2023.3979


Semi-weakly supervised object detection using bi-attention-guided feature fusion

CHEN Junfen, LI Nana, XIE Bojun*, ZHANG Jie

  1. College of Mathematics and Information Science, Hebei University, Baoding 071002, Hebei, China
  • Published: 2025-01-10
  • Corresponding author: XIE Bojun (born 1981), male, associate professor, master's supervisor, Ph.D.; research interests: machine learning and computer vision. E-mail: xiebojun@126.com
  • About the author: CHEN Junfen (born 1976), female, associate professor, Ph.D.; research interests: machine learning and computer vision. E-mail: chenjunfen2010@126.com
  • Funding:
    Hebei Province Funding Project for Introduced Overseas Scholars (C20200302); Hebei Province Education and Teaching Reform Research and Practice Project (2020GJJG007)

Abstract: To reduce annotation cost and to address inaccurate object localization and the loss of detail information, a semi-weakly supervised object detection algorithm with bi-attention-guided feature fusion is proposed. The algorithm uses both fully labelled and weakly labelled data to balance detection performance against annotation cost. Spatial attention is used to fuse low-level and high-level feature maps by pixel-level weighting, so that the high-level feature maps carry rich low-level information. A channel-weighting operation is then applied to the fused feature maps to obtain high-level feature maps that are rich in detail and location information. To obtain more accurate pseudo-labelled boxes, a more robust candidate box selection strategy is proposed. Experiments show that the proposed algorithm achieves relatively good detection performance while reducing the amount of fully labelled image data and the need for additional image-level annotation.

Key words: weakly supervised object detection, feature fusion, attention mechanism, semi-supervised learning

CLC number:

  • TP391
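
The two attention steps named in the abstract (pixel-level weighted fusion guided by spatial attention, followed by channel weighting) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names, the parameter-free attention (pooling followed by a sigmoid, with no learned convolution or MLP), the additive fusion rule, and the assumption that both feature maps already share the shape (C, H, W) are all illustrative choices made here.

```python
import numpy as np


def sigmoid(x):
    """Element-wise logistic function."""
    return 1.0 / (1.0 + np.exp(-x))


def spatial_attention(feat):
    """Per-pixel weights in (0, 1) for a (C, H, W) map.

    Assumption: weights come from channel-wise mean and max pooling
    passed through a sigmoid; a real model would add a learned layer.
    """
    pooled = feat.mean(axis=0) + feat.max(axis=0)  # shape (H, W)
    return sigmoid(pooled)


def channel_attention(feat):
    """Per-channel weights in (0, 1) for a (C, H, W) map.

    Assumption: weights come from global average pooling and a sigmoid.
    """
    return sigmoid(feat.mean(axis=(1, 2)))  # shape (C,)


def bi_attention_fuse(low, high):
    """Hypothetical fusion of a low-level and a high-level feature map.

    Step 1: weight the low-level map pixel by pixel and add it to the
            high-level map (assumed additive fusion).
    Step 2: reweight each channel of the fused map.
    """
    if low.shape != high.shape:
        raise ValueError("this sketch assumes equal (C, H, W) shapes")
    weights_hw = spatial_attention(low)            # (H, W), broadcast over C
    fused = high + weights_hw[None, :, :] * low    # pixel-level weighted fusion
    weights_c = channel_attention(fused)           # (C,)
    return weights_c[:, None, None] * fused        # channel weighting


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    low = rng.normal(size=(4, 8, 8))
    high = rng.normal(size=(4, 8, 8))
    print(bi_attention_fuse(low, high).shape)
```

In a trained detector the two feature maps usually differ in resolution and channel count, so a resizing or projection step would precede the fusion; that step is left out here because the abstract does not specify it.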