Road scene pedestrian detection based on detection-enhanced YOLOv3-tiny
(1. Hubei Key Laboratory of Advanced Technology for Automotive Components, Wuhan University of Technology, Wuhan, Hubei 430070, China; 2. Hubei Collaborative Innovation Center for Automotive Components Technology, Wuhan University of Technology, Wuhan, Hubei 430070, China)
Abstract: To provide drivers with real-time and accurate pedestrian information and reduce traffic accidents, a detection-enhanced YOLOv3-tiny (DOEYT) pedestrian detection algorithm was proposed. A robust feature extraction network was established, in which asymmetric max-pooling was used for downsampling to prevent the loss of lateral pedestrian features caused by the enlarged receptive field. Hardswish was employed as the activation function of the convolutional layers to optimize network performance, and a global context (GC) self-attention mechanism was used to capture holistic feature information. In the classification and regression network, a three-scale detection strategy was adopted to improve the detection accuracy for small-scale pedestrian targets. The k-means++ algorithm was used to regenerate the anchor boxes from the dataset to speed up network convergence. A pedestrian detection dataset was constructed and divided into training and testing sets to evaluate the performance of DOEYT. The results show that asymmetric max-pooling, the Hardswish function and the GC self-attention mechanism increase the average precision (AP) by 14.4%, 7.9% and 10.8%, respectively. On the testing set, DOEYT achieves an AP of 91.2% and a detection speed of 103 frames per second, which demonstrates that the proposed algorithm can detect pedestrians quickly and accurately and thus help reduce the risk of traffic accidents.
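The anchor regeneration step mentioned in the abstract can be illustrated with a minimal sketch. It assumes, as is common for YOLO-style detectors but not stated in the abstract, that boxes are clustered by width and height with the distance 1 - IoU. The function names `iou_wh` and `kmeanspp_anchors` and the sample box sizes are hypothetical and chosen for illustration only.

```python
import random


def iou_wh(box, anchor):
    """IoU of two (width, height) pairs aligned at a common corner."""
    inter = min(box[0], anchor[0]) * min(box[1], anchor[1])
    return inter / (box[0] * box[1] + anchor[0] * anchor[1] - inter)


def kmeanspp_anchors(boxes, k, iters=100, seed=0):
    """Cluster (width, height) pairs into k anchors (assumed distance: 1 - IoU)."""
    rng = random.Random(seed)
    boxes = [tuple(map(float, b)) for b in boxes]

    # k-means++ seeding: the first center is drawn uniformly; each later center
    # is drawn with probability proportional to the squared distance to the
    # nearest center chosen so far.
    centers = [rng.choice(boxes)]
    while len(centers) < k:
        d2 = [min(1.0 - iou_wh(b, c) for c in centers) ** 2 for b in boxes]
        if sum(d2) > 0:
            centers.append(rng.choices(boxes, weights=d2, k=1)[0])
        else:  # every box coincides with a center; fall back to a uniform draw
            centers.append(rng.choice(boxes))

    # Lloyd iterations: assign each box to the center with the highest IoU,
    # then move each center to the mean width and height of its group.
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for b in boxes:
            best = max(range(k), key=lambda j: iou_wh(b, centers[j]))
            groups[best].append(b)
        updated = [
            (sum(w for w, _ in g) / len(g), sum(h for _, h in g) / len(g))
            if g else centers[j]  # keep the old center if its group is empty
            for j, g in enumerate(groups)
        ]
        if updated == centers:
            break
        centers = updated

    # Sort by area so that small anchors can be given to the finest scale.
    return sorted(centers, key=lambda a: a[0] * a[1])


if __name__ == "__main__":
    # Hypothetical pedestrian box sizes in pixels (width, height).
    sample = [(10, 20), (12, 22), (11, 19), (40, 80), (42, 78),
              (39, 82), (100, 200), (98, 205), (102, 198)]
    for w, h in kmeanspp_anchors(sample, k=3):
        print(f"{w:.1f} x {h:.1f}")
```

With a three-scale detection head, one plausible configuration (an assumption, not a detail given in the abstract) is `k=9`, with the three smallest anchors assigned to the finest scale.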