Improvement of the YOLOv7-tiny visual-assisted lightweight algorithm based on high-channel convolution

doi:10.6040/j.issn.1671-9352.0.2024.077

Abstract

Abstract: Most existing visual assistance algorithms have large parameter counts, low detection performance, and are difficult to deploy on mobile phones. To address these problems, a lightweight visual assistance algorithm is designed based on YOLOv7-tiny. A receptive field block (RFB) is used in the network to fuse feature information at different scales, which improves detection accuracy for objects of different resolutions. The non-linearity of the sigmoid linear unit (SiLU) activation function is used to strengthen the fitting ability of the model and to improve its learning speed and detection accuracy. Depthwise convolution (DWConv), selected through comparison experiments as the better-performing option, is used to make the model lightweight. Experimental results show that the improved lightweight model has 52.1% fewer parameters than the original model and achieves the best detection performance. Compared with other mainstream object detection algorithms, the algorithm achieves more accurate real-time detection of indoor objects with 2.90 M parameters.

Key words: high-channel convolution, receptive field block, activation function SiLU, depthwise convolution (DWConv)
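The two building blocks named in the abstract can be illustrated with a small sketch. It is not the authors' implementation: the function names are hypothetical, and the 256-channel, 3x3 layer is an assumed example rather than a layer taken from the paper. It shows the SiLU activation, x * sigmoid(x), and why replacing a standard convolution on a high-channel layer with a depthwise convolution (one filter per channel) cuts the parameter count so sharply.

```python
import math


def silu(x: float) -> float:
    """SiLU(x) = x * sigmoid(x), written to avoid overflow for large |x|."""
    if x >= 0:
        return x / (1.0 + math.exp(-x))
    e = math.exp(x)  # safe: x < 0, so e lies in [0, 1)
    return x * e / (1.0 + e)


def conv_params(c_in: int, c_out: int, k: int) -> int:
    """Weights in a standard k x k convolution (bias omitted)."""
    return c_in * c_out * k * k


def depthwise_conv_params(channels: int, k: int) -> int:
    """Weights in a depthwise k x k convolution: one filter per channel."""
    return channels * k * k


# Assumed example layer (not from the paper): 256 channels in and out, 3x3 kernel.
channels, kernel = 256, 3
standard = conv_params(channels, channels, kernel)
depthwise = depthwise_conv_params(channels, kernel)
ratio = standard // depthwise

print(f"standard conv weights:  {standard}")
print(f"depthwise conv weights: {depthwise}")
print(f"reduction factor:       {ratio}")
```

For a layer with equal input and output channels, the reduction factor equals the channel count, so the saving grows with channel width. A depthwise convolution does not mix information across channels, which is presumably why the abstract reports choosing it through comparison experiments rather than applying it everywhere.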

CLC number: TP391.41

OUYANG Yuxuan, PENG Yaopan, ZHANG Rongfen, LIU Yuhong. Improvement of the YOLOv7-tiny visual-assisted lightweight algorithm based on high-channel convolution[J]. Journal of Shandong University (Natural Science), 2025, 60(9): 62-70.

