Research progress of YOLO detection technology for traffic object
Hongzhao DONG, Shaoxuan LIN, Yini SHE
ITS Joint Research Institute, Zhejiang University of Technology, Hangzhou 310023, China
Abstract: The development and research status of the YOLO (You Only Look Once) algorithm in traffic object detection were systematically summarized from the perspective of the three core elements 'people-vehicle-road', in order to comprehensively analyze the important role of the algorithm in improving traffic safety and efficiency. The commonly used evaluation metrics of the YOLO algorithm were outlined, and their practical significance in traffic scenarios was explained in detail. The core architecture of the YOLO algorithm was reviewed, its development was traced, and the optimizations and improvements introduced in each version were analyzed. The research status and application scenarios of the YOLO algorithm in traffic object detection were organized and discussed for the three types of traffic objects 'people-vehicle-road'. The limitations and challenges of the YOLO algorithm in traffic object detection were analyzed, and corresponding improvement methods were proposed. Future research priorities were anticipated, providing a research reference for the intelligent development of road traffic.
Received: 06 February 2024
Published: 11 February 2025
Fund: Zhejiang Provincial Natural Science Foundation of China (LMS25F030007); "Pioneer" and "Leading Goose" R&D Program of Zhejiang Province (2024C01180).
Keywords: YOLO algorithm, object detection, computer vision, traffic object, traffic safety