Journal of Shandong University (Natural Science), 2022, Vol. 57, Issue (3): 20-30. doi: 10.6040/j.issn.1671-9352.4.2021.034


Multi-high-order target detection method based on the YOLOv3 model

YAN Chen-xu1, SHAO Hai-jian1,2*, DENG Xing1,2   

  1. School of Computer Science, Jiangsu University of Science and Technology, Zhenjiang 212003, Jiangsu, China;
  2. College of Automation Key Laboratory of Ministry of Education of Complex Engineering System Measurement and Control, Southeast University, Nanjing 210009, Jiangsu, China
  • Published: 2022-03-15

Abstract: Target detection is an important branch of computer vision. Although current deep-learning-based detection approaches address the accuracy and detection-time problems of traditional methods, it remains difficult to achieve both high detection speed and high detection accuracy. This paper therefore proposes Mul-YOLO, a target detection network based on an improved YOLOv3. Mul-YOLO uses the Haar wavelet for data preprocessing: it decomposes the low-frequency features of the image layer by layer at different resolutions and obtains high-frequency features in the horizontal, vertical, and diagonal directions. The information recorded in these high-frequency features reduces the negative effects on detection accuracy that changes in geometric state, illumination, and background usually cause. Convolution and concatenation on the feature layer are combined with a third-order calculation, which strengthens feature extraction within the limited receptive field and makes the training network pay more attention to the significant information in the mapped features. This enhances image resolution and compensates for the information loss caused by successive convolution and pooling during training. Experimental results on the PASCAL VOC data sets show that Mul-YOLO clearly improves on the previous generation of target detection models. For example, compared with the Faster R-CNN ResNet feature extraction method, mAP improves by 8.97% and the detection time for a single image decreases by 172 ms; compared with the YOLOv3 feature extraction method, mAP increases by 30.48%. The approach thus achieves a balance in which detection accuracy and detection time complement each other.
Detection accuracy is therefore improved while detection time remains unchanged, which supports the effectiveness of the proposed approach.
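
The layer-by-layer Haar decomposition described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `haar_level` and `haar_pyramid`, the averaging normalization (dividing by 4), and the use of plain nested lists for a grayscale image are all assumptions made for illustration. Each level splits the image into a low-frequency approximation and horizontal, vertical, and diagonal high-frequency sub-bands, and the next level is applied to the low-frequency band only.

```python
def haar_level(img):
    """One level of a 2D Haar decomposition on an even-sized grayscale image.

    Returns (ll, lh, hl, hh): the low-frequency approximation followed by the
    horizontal, vertical, and diagonal detail sub-bands. The /4 averaging
    normalization is an illustrative choice, not taken from the paper.
    """
    rows, cols = len(img), len(img[0])
    if rows % 2 or cols % 2:
        raise ValueError("image height and width must be even")
    ll, lh, hl, hh = [], [], [], []
    for r in range(0, rows, 2):
        ll_row, lh_row, hl_row, hh_row = [], [], [], []
        for c in range(0, cols, 2):
            # The four pixels of one 2x2 block.
            a, b = img[r][c], img[r][c + 1]
            d, e = img[r + 1][c], img[r + 1][c + 1]
            ll_row.append((a + b + d + e) / 4.0)  # block average (low frequency)
            lh_row.append((a + b - d - e) / 4.0)  # top vs. bottom: horizontal edges
            hl_row.append((a - b + d - e) / 4.0)  # left vs. right: vertical edges
            hh_row.append((a - b - d + e) / 4.0)  # diagonal differences
        ll.append(ll_row)
        lh.append(lh_row)
        hl.append(hl_row)
        hh.append(hh_row)
    return ll, lh, hl, hh


def haar_pyramid(img, levels):
    """Repeatedly decompose the low-frequency band, one level at a time.

    Returns the final low-frequency band and a list with one
    (horizontal, vertical, diagonal) tuple of detail bands per level.
    """
    details = []
    ll = img
    for _ in range(levels):
        ll, lh, hl, hh = haar_level(ll)
        details.append((lh, hl, hh))
    return ll, details


if __name__ == "__main__":
    # A 4x4 image made of alternating bright and dark rows (horizontal edges only).
    image = [
        [4, 4, 4, 4],
        [0, 0, 0, 0],
        [4, 4, 4, 4],
        [0, 0, 0, 0],
    ]
    approx, detail_bands = haar_pyramid(image, levels=2)
    print("low-frequency band:", approx)
    print("level-1 horizontal detail:", detail_bands[0][0])
```

In this toy image only the horizontal detail band is non-zero at the first level, which shows how the sub-bands separate edge orientation. How the paper feeds these sub-bands into the network is not specified in the abstract and is not modeled here.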

Key words: target detection, Mul-YOLO, Haar wavelet, high distinguishability feature, high order sampling

CLC Number: 

  • TP30