Journal of Shandong University (Natural Science), 2023, Vol. 58, Issue (9): 59-70. doi: 10.6040/j.issn.1671-9352.0.2022.349
Cheng LI1,2, Wengang CHE1,2,*, Shengxiang GAO1,2
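Abstract:
This paper proposes DSB-YOLO (depthwise separable convolutional backbone and YOLO), an object detection algorithm for aerial images built on YOLOv5s. First, starting from the receptive field of the feature maps extracted by the backbone, the sampling interval of the convolution kernels is changed to reduce the receptive field so that information about small objects is extracted more effectively. Second, the feature fusion paths of the feature pyramid network (FPN) and the path aggregation network (PAN) in the Neck are improved, so that the abundant positional information in the feature maps sampled by the shallow layers combines well with the feature maps extracted by the deep layers, which effectively raises the accurate detection rate for small objects. Third, a C3Transformer module is added to the backbone to integrate whole-image information. Finally, the network is made lightweight: some of the backbone convolutions are replaced with depthwise separable convolutions and the SE attention mechanism is integrated, with the aim of focusing on and selecting the information useful for the detection task, thereby improving the detection efficiency of the model. Comparative experiments on the VisDrone dataset show that, at an input resolution of 1280×1280 pixels, DSB-YOLO improves the average precision metrics mAP50 and mAP0.5:0.95 by 11% and 17.5%, respectively, over the original model. Deployed on the Jetson TX2 embedded platform, it runs at 21 FPS, so the model meets the requirements for practical use.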
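As a rough illustration of two ideas named in the abstract, the sketch below compares the weight count of a standard convolution with that of a depthwise separable one, and computes the spatial extent covered by a kernel at a given sampling interval (dilation). The function names and the layer sizes (128 input channels, 256 output channels, 3×3 kernel) are assumptions chosen for illustration; they are not taken from the paper, and the paper's actual layer configuration may differ.

```python
def conv_params(c_in: int, c_out: int, k: int) -> int:
    """Weight count of a standard k x k convolution (bias ignored)."""
    return k * k * c_in * c_out


def depthwise_separable_params(c_in: int, c_out: int, k: int) -> int:
    """Weight count of a k x k depthwise conv followed by a 1 x 1 pointwise conv."""
    depthwise = k * k * c_in   # one k x k filter per input channel
    pointwise = c_in * c_out   # 1 x 1 conv that mixes channels
    return depthwise + pointwise


def effective_kernel_size(k: int, dilation: int) -> int:
    """Spatial extent covered by a k x k kernel sampled with the given dilation."""
    return k + (k - 1) * (dilation - 1)


if __name__ == "__main__":
    # Hypothetical layer sizes, for illustration only.
    c_in, c_out, k = 128, 256, 3
    std = conv_params(c_in, c_out, k)
    dws = depthwise_separable_params(c_in, c_out, k)
    print(std, dws, round(dws / std, 3))  # 294912 33920 0.115
    # A smaller sampling interval covers a smaller area per kernel.
    print(effective_kernel_size(3, 2), effective_kernel_size(3, 1))  # 5 3
```

With these illustrative sizes the depthwise separable layer needs roughly 11.5% of the weights of the standard layer, which is the kind of saving a lightweight backbone relies on. Lowering the dilation from 2 to 1 shrinks the area a 3×3 kernel covers from 5×5 to 3×3, consistent with the abstract's point that a smaller receptive field suits small objects.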