Journal of Shandong University (Natural Science) (《山东大学学报(理学版)》), 2022, Vol. 57, Issue (3): 20-30. doi: 10.6040/j.issn.1671-9352.4.2021.034
YAN Chen-xu (严晨旭)1, SHAO Hai-jian (邵海见)1,2*, DENG Xing (邓星)1,2
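Abstract: Object detection is an important branch of computer vision. Deep-learning-based detectors now outperform traditional detection algorithms in both accuracy and detection time, but they still struggle to achieve high detection speed and high detection accuracy at the same time. To address this problem, this paper proposes Mul-YOLO, an object detection network that improves on YOLOv3. Mul-YOLO uses the Haar wavelet for data preprocessing: the low-frequency features of the image are decomposed level by level at different resolutions to obtain high-frequency features in the horizontal, vertical, and diagonal directions. The information recorded in these high-frequency features is then used to reduce the loss of detection accuracy caused by changes in the target's geometric state, illumination, and background. High-order computation is incorporated into the upsampling, convolution, and concatenation of the feature layers. This strengthens feature representation within a limited receptive field, makes the network attend more closely to the salient information in the mapped features, enhances image resolution, and effectively reduces the information loss caused by consecutive convolution and pooling during training. Experiments on the PASCAL VOC dataset show that Mul-YOLO clearly improves on conventional object detection models. For example, compared with Faster R-CNN using ResNet for feature extraction, mAP increases by 8.97% and the detection time per image is reduced by 172 ms. Compared with YOLOv3 feature extraction, mAP increases by 33.48%. The model therefore improves detection accuracy and detection time together, and these results, along with the other comparisons, support the effectiveness of the proposed method.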
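The level-by-level Haar decomposition described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the abstract does not specify the normalisation, the number of levels, or how the sub-bands are fed to the network, so the orthonormal scaling, the function names `haar_dwt2` and `haar_pyramid`, and the naming of the three detail bands are assumptions made here for illustration only.

```python
import numpy as np


def haar_dwt2(image):
    """One level of a 2-D Haar transform (orthonormal scaling, assumed).

    Returns four half-resolution sub-bands:
      ll: low-frequency approximation
      lh: top-minus-bottom differences (responds to horizontal edges)
      hl: left-minus-right differences (responds to vertical edges)
      hh: diagonal differences
    """
    x = np.asarray(image, dtype=np.float64)
    if x.ndim != 2 or x.shape[0] % 2 or x.shape[1] % 2:
        raise ValueError("expected a 2-D array with even height and width")

    # The four pixels of every non-overlapping 2x2 block.
    a = x[0::2, 0::2]  # top-left
    b = x[0::2, 1::2]  # top-right
    c = x[1::2, 0::2]  # bottom-left
    d = x[1::2, 1::2]  # bottom-right

    ll = (a + b + c + d) / 2.0
    lh = (a + b - c - d) / 2.0
    hl = (a - b + c - d) / 2.0
    hh = (a - b - c + d) / 2.0
    return ll, lh, hl, hh


def haar_pyramid(image, levels):
    """Repeatedly decompose the low-frequency band.

    Returns the coarsest approximation and a list with one
    (lh, hl, hh) tuple per level, finest level first.
    """
    ll = np.asarray(image, dtype=np.float64)
    details = []
    for _ in range(levels):
        ll, lh, hl, hh = haar_dwt2(ll)
        details.append((lh, hl, hh))
    return ll, details
```

For a grayscale image whose height and width are divisible by `2**levels`, `haar_pyramid(img, 2)` returns a quarter-resolution approximation plus two sets of directional detail bands. With the scaling assumed here the transform is orthonormal, so the total energy of the sub-bands equals that of the input.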