Journal of Shandong University (Natural Science), 2025, Vol. 60, Issue 9: 62-70. doi: 10.6040/j.issn.1671-9352.0.2024.077
OUYANG Yuxuan, PENG Yaopan, ZHANG Rongfen*, LIU Yuhong
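Abstract: Most visual assistance algorithms on the market suffer from large parameter counts, low detection performance, and difficulty of deployment on mobile phones. To address these problems, a lightweight visual assistance algorithm is designed based on YOLOv7-tiny. A receptive field block (RFB) is used in the network to fuse feature information at different scales, improving detection accuracy for objects of different resolutions. The nonlinearity of the SiLU (sigmoid linear unit) activation function is used to strengthen the model's fitting ability and to improve its learning speed and detection accuracy. Depthwise convolution (DWConv), selected through comparative experiments for its better performance, is used to make the model lightweight. Experimental results show that, compared with the original model, the improved lightweight model reduces the parameter count by 52.1% and achieves the best detection performance. Compared with other mainstream object detection algorithms, the algorithm achieves more accurate real-time detection of indoor objects with 2.90 M parameters.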
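As a rough illustration of two ingredients named in the abstract, the sketch below defines SiLU as x · sigmoid(x) and counts the weights of a standard convolution versus a depthwise one. It is not the authors' implementation: the helper names `silu` and `conv_params` and the 64-channel, 3×3 layer size are assumptions chosen only for the example.

```python
import math


def silu(x: float) -> float:
    """SiLU (sigmoid linear unit): x * sigmoid(x), written to avoid overflow."""
    if x >= 0:
        return x / (1.0 + math.exp(-x))
    e = math.exp(x)  # safe for negative x
    return x * e / (1.0 + e)


def conv_params(c_in: int, c_out: int, k: int, groups: int = 1) -> int:
    """Weight count of a 2-D convolution with a k x k kernel and no bias."""
    if c_in % groups or c_out % groups:
        raise ValueError("channel counts must be divisible by groups")
    return (c_in // groups) * c_out * k * k


# Hypothetical layer size, not taken from the paper.
channels, kernel = 64, 3

# Standard convolution: every output channel sees every input channel.
standard = conv_params(channels, channels, kernel)

# Depthwise convolution: groups == channels, so each filter sees one channel.
depthwise = conv_params(channels, channels, kernel, groups=channels)

print(f"standard conv weights:  {standard}")
print(f"depthwise conv weights: {depthwise}")
print(f"silu(1.0) = {silu(1.0):.4f}")
```

For this assumed layer the depthwise variant needs 1/64 of the weights of the standard one, which is the kind of saving that motivates DWConv in lightweight detectors. The overall 52.1% reduction reported in the abstract depends on the full network design and cannot be derived from this single-layer count.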