JOURNAL OF SHANDONG UNIVERSITY (NATURAL SCIENCE) ›› 2026, Vol. 61 ›› Issue (1): 15-25. doi: 10.6040/j.issn.1671-9352.8.2024.013


Lightweight water surface small object detection model with multi-scale attention mechanism and improved feature fusion

ZHONG Shang1, MA Li1,2*, LIU Wenzhe1, LI Yuhao1   

  1. College of Information Engineering, Hebei GEO University, Shijiazhuang 052161, Hebei, China;
  2. Intelligent Sensor Network Engineering Research Center of Hebei Province, Hebei GEO University, Shijiazhuang 052161, Hebei, China
  • Published: 2026-01-15

Abstract: Small object detection in complex water surface scenes is hampered by low detection accuracy, high missed-detection rates, and limited computational resources. To address these problems, this paper proposes a lightweight water surface small object detection model with a multi-scale attention mechanism and improved feature fusion. Based on centerness theory, a new backbone network is designed that uses the multi-scale attention mechanism to enhance the model's feature extraction capability. Partial convolution is used to improve the neck network by reducing feature map redundancy, which lowers the model's computational load. A large separable kernel attention module is used to improve the spatial pyramid pooling module, enhancing the model's feature fusion ability. Experimental results show that, compared with other models, the proposed model achieves higher detection accuracy, a lower missed-detection rate, and fewer parameters.
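To give a rough sense of why partial convolution lowers the computational load, the sketch below compares the multiply-accumulate count of a regular convolution with that of a partial convolution, which convolves only a fraction of the channels and passes the rest through unchanged. The abstract does not state the channel ratio or the feature map sizes, so the ratio of 1/4, the 80 x 80 map, and the 128 channels are illustrative assumptions, and the function names are hypothetical.

```python
def conv_macs(h, w, k, c_in, c_out):
    """Multiply-accumulates of a k x k convolution producing an h x w output map."""
    return h * w * k * k * c_in * c_out


def partial_conv_macs(h, w, k, channels, ratio=0.25):
    """Multiply-accumulates when only `ratio` of the channels are convolved.

    The remaining channels are copied through untouched, so they add no
    multiply-accumulates. The default ratio of 1/4 is an assumption, not a
    value taken from the paper.
    """
    c_p = int(channels * ratio)  # number of channels that are actually convolved
    return conv_macs(h, w, k, c_p, c_p)


# Illustrative sizes only: an 80 x 80 feature map, 3 x 3 kernel, 128 channels.
full = conv_macs(80, 80, 3, 128, 128)
partial = partial_conv_macs(80, 80, 3, 128)
print(full // partial)  # prints 16
```

Under these assumptions the convolved part costs one sixteenth of the full convolution, because the cost scales with the square of the number of convolved channels.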

Key words: small target detection, feature fusion, multi-scale attention mechanism, feature map redundancy

CLC Number: TP391