Journal of Zhejiang University (Engineering Science)  2026, Vol. 60 Issue (7): 1599-1610    DOI: 10.3785/j.issn.1008-973X.2026.07.021
Traffic Engineering
Real-time vehicle detection algorithm based on UAV aerial images
Yuyu MENG, Yinbao MA, Jiuyuan HUO*
School of Electronics and Information Engineering, Lanzhou Jiaotong University, Lanzhou 730070, China
Abstract:

Multi-scale targets in unmanned aerial vehicle (UAV) aerial images, especially small targets, are detected with low accuracy in complex scenes involving dense objects, occlusion, and low illumination. To address this, a convolution-wavelet dual-domain downsampler (RDWTConv) was proposed to preserve the details of small targets, a three-layer cross-scale residual fusion module (RCDFM) was designed to strengthen multi-scale feature interaction, and a scale-shape loss (TSSIoU) was proposed to improve bounding-box localization accuracy with respect to target scale and shape under aerial viewpoints. On this basis, the CF-YOLOn, CF-YOLOs, and CF-YOLOm models were built on YOLOv8 to suit different computational budgets. On the VisDrone dataset, CF-YOLOn improved mAP@0.5 and mAP@0.5:0.95 by 5.5 and 4.0 percentage points, respectively, over the baseline YOLOv8n, with 23.7% fewer parameters and only a 22.5% increase in computation, while maintaining a frame rate of 169.1 frames per second; the s and m versions achieved the highest accuracy among methods in the same frame-rate range. After retraining on the Drone-Vehicle dataset, CF-YOLOn improved mAP@0.5:0.95 by 3.0 percentage points over the baseline YOLOv8n. Through these complementary improvements, the proposed method maintains real-time detection at a lightweight computational cost and effectively improves multi-scale target detection in complex scenes, reaching an advanced level among comparable methods.

Key words: vehicle detection; multi-scale target; complex scenario; YOLOv8; downsampling
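
The abstract motivates wavelet-domain downsampling as a way to keep small-target detail. The following is a minimal NumPy sketch of that general idea only, using a one-level Haar transform. It is not the authors' RDWTConv, whose internals are not given on this page, and the names `haar_dwt2`, `haar_idwt2`, and `image` are illustrative assumptions. It contrasts naive stride-2 subsampling, which can skip a one-pixel target entirely, with a Haar decomposition that halves the resolution while moving all of the information into four subbands.

```python
import numpy as np


def haar_dwt2(x):
    """One-level 2D Haar transform of an array with even height and width.

    Returns four half-resolution subbands: the low-pass band and three
    detail bands (column, row, and diagonal differences).
    """
    a = x[0::2, 0::2]  # top-left pixel of every 2x2 block
    b = x[0::2, 1::2]  # top-right
    c = x[1::2, 0::2]  # bottom-left
    d = x[1::2, 1::2]  # bottom-right
    low = (a + b + c + d) / 2.0       # scaled local average
    col_diff = (a - b + c - d) / 2.0  # left column minus right column
    row_diff = (a + b - c - d) / 2.0  # top row minus bottom row
    diag = (a - b - c + d) / 2.0      # diagonal difference
    return low, col_diff, row_diff, diag


def haar_idwt2(low, col_diff, row_diff, diag):
    """Invert haar_dwt2 exactly, rebuilding the full-resolution array."""
    h, w = low.shape
    x = np.empty((2 * h, 2 * w), dtype=float)
    x[0::2, 0::2] = (low + col_diff + row_diff + diag) / 2.0
    x[0::2, 1::2] = (low - col_diff + row_diff - diag) / 2.0
    x[1::2, 0::2] = (low + col_diff - row_diff - diag) / 2.0
    x[1::2, 1::2] = (low - col_diff - row_diff + diag) / 2.0
    return x


# A 4x4 "image" whose only content is a one-pixel target (hypothetical data).
image = np.zeros((4, 4))
image[1, 1] = 1.0

strided = image[0::2, 0::2]           # naive stride-2 subsampling
subbands = haar_dwt2(image)           # four 2x2 subbands
stacked = np.stack(subbands, axis=0)  # shape (4, 2, 2): channels x H/2 x W/2
restored = haar_idwt2(*subbands)

print("stride-2 sum:", strided.sum())
print("subband energy:", float((stacked ** 2).sum()))
print("lossless:", np.allclose(restored, image))
```

In this toy case the strided output contains no trace of the target, while the stacked subbands retain its full energy and can be inverted exactly. A learned downsampler would typically follow such a decomposition with convolutions over the stacked channels, but how RDWTConv combines the two domains is not specified here.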
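Received: 2025-05-29    Published: 2026-05-23
CLC:  TP 391
Funding: National Natural Science Foundation of China (62262038); Gansu Province Technology Innovation Guidance Program, Science and Technology Specialist Project (25CXGA030); Gansu Province Key Research and Development Program, Industrial Project (25YFGA045).
Corresponding author: Jiuyuan HUO    E-mail: mengyuyu@mail.lzjtu.cn; huojy@mail.lzjtu.cn
About the author: Yuyu MENG (1975-), female, associate professor, master's supervisor, engaged in data mining research. orcid.org/0009-0003-1310-7755. E-mail: mengyuyu@mail.lzjtu.cn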

Cite this article:

Yuyu MENG, Yinbao MA, Jiuyuan HUO. Real-time vehicle detection algorithm based on UAV aerial images. Journal of Zhejiang University (Engineering Science), 2026, 60(7): 1599-1610.

Link to this article:

https://www.zjujournals.com/eng/CN/10.3785/j.issn.1008-973X.2026.07.021
https://www.zjujournals.com/eng/CN/Y2026/V60/I7/1599

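Fig. 1  Network structure of YOLOv8n
Fig. 2  Network structure of CF-YOLOn
Fig. 3  Structure of the wavelet pooling layer
Fig. 4  Structure of RDWTConv
Fig. 5  Structure of Bidirectional Concatenate
Fig. 6  Structure of Sandwich-fusion
Fig. 7  Structure of RCDFM
Fig. 8  Label count and size distribution in the VisDrone and Drone-Vehicle datasets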
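| Methods | P/% | R/% | mAP@0.5/% | mAP@0.5:0.95/% | Params/10^6 | GFLOPs/10^9 |
|---|---|---|---|---|---|---|
| CBS | 44.5 | 32.3 | 32.4 | 18.7 | 3.01 | 8.1 |
| DWT | 43.4 | 32.3 | 32.3 | 18.7 | 2.79 | 7.6 |
| RDWTConv | 45.1 | 33.2 | 33.7 | 19.6 | 3.18 | 8.5 |
| ADown[23] | 43.4 | 30.8 | 31.1 | 17.8 | 2.72 | 7.4 |
| SCDown[24] | 44.6 | 33.1 | 33.0 | 19.0 | 2.66 | 7.6 |
| DWConv[25] | 43.1 | 31.2 | 31.2 | 17.8 | 2.62 | 7.2 |
| RepVGGBlock[18] | 43.7 | 32.6 | 32.7 | 18.9 | 3.05 | 8.2 |

Table 1  Performance comparison of different downsampling modules (VisDrone dataset)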
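| Methods | mAP@0.5/% | mAP@0.5:0.95/% | Params/10^6 | GFLOPs/10^9 |
|---|---|---|---|---|
| Concat | 32.4 | 18.7 | 3.01 | 8.1 |
| BiC[17] | 33.1 | 19.2 | 3.05 | 8.4 |
| SF[18] | 32.8 | 19.0 | 3.02 | 8.3 |
| RCDFM | 34.3 | 19.9 | 3.07 | 8.4 |

Table 2  Performance comparison of RCDFM (VisDrone dataset)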
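| Methods | P/% | R/% | mAP@0.5/% | mAP@0.5:0.95/% |
|---|---|---|---|---|
| CIoU | 44.5 | 32.3 | 32.4 | 18.7 |
| DIoU | 42.8 | 32.4 | 31.9 | 18.5 |
| SIoU | 42.5 | 32.8 | 32.2 | 18.5 |
| GIoU | 44.3 | 32.2 | 32.8 | 18.9 |
| EIoU | 43.2 | 32.8 | 32.5 | 18.8 |
| TSSIoU | 44.7 | 32.9 | 33.2 | 19.2 |

Table 3  Performance comparison of different bounding-box loss functions (VisDrone dataset)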
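
The losses compared in Table 3 all build on the intersection over union (IoU) of two boxes. As background only, the sketch below computes plain IoU and GIoU for axis-aligned boxes in pure Python. It does not implement TSSIoU, whose formula is not given on this page, and the function name `iou_and_giou` and the example boxes are illustrative assumptions. The example also hints at why scale matters for aerial targets: the same one-pixel shift costs a small box far more IoU than a large one.

```python
def iou_and_giou(box_a, box_b):
    """Return (IoU, GIoU) for two boxes given as (x1, y1, x2, y2).

    Both boxes are assumed to have positive width and height.
    """
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b

    # Overlap rectangle; width or height clamps to 0 when boxes are disjoint.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h

    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    union = area_a + area_b - inter
    iou = inter / union

    # Smallest axis-aligned box enclosing both inputs.
    enclose_w = max(ax2, bx2) - min(ax1, bx1)
    enclose_h = max(ay2, by2) - min(ay1, by1)
    enclose = enclose_w * enclose_h

    # GIoU subtracts the fraction of the enclosing box not covered by the union.
    giou = iou - (enclose - union) / enclose
    return iou, giou


# The same 1-pixel horizontal shift applied to a small and a large box.
small_iou, small_giou = iou_and_giou((0, 0, 4, 4), (1, 0, 5, 4))
large_iou, large_giou = iou_and_giou((0, 0, 40, 40), (1, 0, 41, 40))

# Disjoint boxes: IoU is 0, but GIoU is negative and still varies with distance.
far_iou, far_giou = iou_and_giou((0, 0, 1, 1), (2, 0, 3, 1))

print("small box IoU:", small_iou)
print("large box IoU:", large_iou)
print("disjoint IoU, GIoU:", far_iou, far_giou)
```

A regression loss is then commonly taken as `1 - IoU` or `1 - GIoU`. The variants in Table 3 add further penalty terms on top of this base quantity.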
Fig. 9  Analysis of hyperparameter $ \gamma $
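| Models | RDWTConv | RCDFM | TSSIoU | P/% | R/% | mAP@0.5/% | mAP@0.5:0.95/% | Params/10^6 | GFLOPs/10^9 | FPS/(frame·s^-1) |
|---|---|---|---|---|---|---|---|---|---|---|
| YOLOv8n | | | | 44.5 | 32.3 | 32.4 | 18.7 | 3.01 | 8.1 | 209.9 |
| M1 | | | | 45.1 | 33.2 | 33.7 | 19.6 | 3.18 | 8.5 | 181.3 |
| M2 | | | | 46.9 | 35.7 | 36.4 | 21.6 | 3.18 | 11.1 | 178.4 |
| M3 | | | | 44.7 | 32.9 | 33.2 | 19.2 | 3.01 | 8.1 | 209.9 |
| M4 | | | | 48.8 | 36.9 | 37.6 | 22.4 | 3.35 | 11.5 | 168.5 |
| M5 | | | | 49.2 | 37.5 | 38.3 | 22.9 | 3.35 | 11.5 | 168.5 |
| CF-YOLOn | | | | 49.3 | 36.9 | 37.9 | 22.7 | 2.30 | 9.9 | 169.1 |

Table 4  Model ablation study (VisDrone dataset)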
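
The headline figures in the abstract can be cross-checked against the YOLOv8n and CF-YOLOn rows of Table 4. The short calculation below is a sketch of that check, with the dictionary keys and helper names (`point_gain`, `relative_change`) chosen here for illustration. Accuracy gains are differences in percentage points, while parameter, computation, and frame-rate changes are relative to the baseline.

```python
# Values copied from the YOLOv8n and CF-YOLOn rows of Table 4.
baseline = {"map50": 32.4, "map50_95": 18.7, "params_m": 3.01, "gflops": 8.1, "fps": 209.9}
cf_yolon = {"map50": 37.9, "map50_95": 22.7, "params_m": 2.30, "gflops": 9.9, "fps": 169.1}


def point_gain(key):
    """Absolute difference in percentage points (for metrics already in %)."""
    return round(cf_yolon[key] - baseline[key], 1)


def relative_change(key):
    """Change relative to the baseline, in percent (negative means a reduction)."""
    return round(100.0 * (cf_yolon[key] - baseline[key]) / baseline[key], 1)


print("mAP@0.5 gain (points):", point_gain("map50"))
print("mAP@0.5:0.95 gain (points):", point_gain("map50_95"))
print("parameter change (%):", relative_change("params_m"))
print("computation change (%):", relative_change("gflops"))
print("frame-rate change (%):", relative_change("fps"))
```

The accuracy gains reproduce the 5.5 and 4.0 points stated in the abstract. The rounded table entries give about a 23.6% parameter reduction and a 22.2% computation increase, slightly different from the abstract's 23.7% and 22.5%, presumably because the abstract was computed from unrounded values.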
Fig. 10  Model comparison experiment (VisDrone dataset)
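| Models | mAP@0.5/% | mAP@0.5:0.95/% | Params/10^6 | FPS/(frame·s^-1) |
|---|---|---|---|---|
| Drone-YOLO | 74.5 | 50.1 | 2.97 | 172.1 |
| FBRT-YOLOn[26] | 74.6 | 50.2 | 0.90 | 165.3 |
| IV-YOLO[27] | 74.9 | 49.6 | 4.31 | 184.7 |
| YOLOv8n | 75.7 | 50.5 | 3.01 | 205.4 |
| YOLOv9t | 77.2 | 52.2 | 1.97 | 103.2 |
| YOLOv10n | 76.2 | 50.6 | 2.70 | 136.8 |
| YOLO11n | 75.4 | 50.3 | 2.58 | 187.5 |
| YOLO12n | 76.3 | 51.4 | 2.51 | 75.9 |
| CF-YOLOn | 77.8 | 53.5 | 2.30 | 164.7 |

Table 5  Model generalization experiment (Drone-Vehicle dataset)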
Fig. 11  Visualization of detection results (VisDrone dataset)
1 HEARST M A, DUMAIS S T, OSUNA E, et al. Support vector machines[J]. IEEE Intelligent Systems and Their Applications, 1998, 13(4): 18-28. doi: 10.1109/5254.708428
2 BEJA-BATTAIS P. Overview of AdaBoost: reconciling its views to better understand its dynamics[EB/OL]. (2023-10-06)[2025-04-18]. https://arxiv.org/abs/2310.18323
3 REN S, HE K, GIRSHICK R, et al. Faster R-CNN: towards real-time object detection with region proposal networks[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2017, 39(6): 1137-1149. doi: 10.1109/TPAMI.2016.2577031
4 REDMON J, DIVVALA S, GIRSHICK R, et al. You only look once: unified, real-time object detection[C]// IEEE Conference on Computer Vision and Pattern Recognition. Las Vegas: IEEE, 2016: 779-788.
5 LIU W, ANGUELOV D, ERHAN D, et al. SSD: single shot MultiBox detector[C]// European Conference on Computer Vision (ECCV) 2016. Cham: Springer International Publishing, 2016: 21-37.
6 GUPTA P, PAREEK B, SINGAL G, et al. Edge device based military vehicle detection and classification from UAV[J]. Multimedia Tools and Applications, 2022, 81(14): 19813-19834. doi: 10.1007/s11042-021-11242-y
7 SHI Tao, CUI Jie, LI Song. Algorithm for real-time vehicle detection from UAVs based on optimizing and improving YOLOv8[J]. Computer Engineering and Applications, 2024, 60(9): 79-89. doi: 10.3778/j.issn.1002-8331.2312-0291
8 SUN Y, SHAO Z, CHENG G, et al. Road and car extraction using UAV images via efficient dual contextual parsing network[J]. IEEE Transactions on Geoscience and Remote Sensing, 2022, 60: 5632113.
9 HAMZENEJADI M H, MOHSENI H. Fine-tuned YOLOv5 for real-time vehicle detection in UAV imagery: architectural improvements and performance boost[J]. Expert Systems with Applications, 2023, 231: 120845. doi: 10.1016/j.eswa.2023.120845
10 YING Z, ZHOU J, ZHAI Y, et al. Large-scale high-altitude UAV-based vehicle detection via pyramid dual pooling attention path aggregation network[J]. IEEE Transactions on Intelligent Transportation Systems, 2024, 25(10): 14426-14444. doi: 10.1109/TITS.2024.3396915
11 HUI Y, WANG J, LI B. STF-YOLO: a small target detection algorithm for UAV remote sensing images based on improved SwinTransformer and class weighted classification decoupling head[J]. Measurement, 2024, 224: 113936. doi: 10.1016/j.measurement.2023.113936
12 JIANG Maoxiang, SI Zhanjun, WANG Xiaozhe. Improved target detection algorithm for UAV images with RT-DETR[J]. Computer Engineering and Applications, 2025, 61(1): 98-108. doi: 10.3778/j.issn.1002-8331.2405-0331
13 LI Bin, LI Shenglin. Improved YOLOv11n small object detection algorithm in UAV view[J]. Computer Engineering and Applications, 2025, 61(7): 96-104. doi: 10.3778/j.issn.1002-8331.2411-0072
14 LIANG Yan, HE Xiaowu, SHAO Kai, et al. Target detection algorithm for UAV images based on improved YOLOv8[J]. Computer Engineering and Applications, 2025, 61(1): 121-130. doi: 10.3778/j.issn.1002-8331.2405-0459
15 JOCHER G, CHAURASIA A, QIU J. Ultralytics YOLOv8[EB/OL]. (2023-01-28)[2025-04-18]. https://github.com/ultralytics/ultralytics
16 XUE Y, JIN G, SHEN T, et al. SmallTrack: wavelet pooling and graph enhanced classification for UAV small object tracking[J]. IEEE Transactions on Geoscience and Remote Sensing, 2023, 61: 5618815.
17 LI C, LI L, GENG Y, et al. YOLOv6 v3.0: a full-scale reloading[EB/OL]. (2023-01-13)[2025-04-18]. https://arxiv.org/abs/2301.05586
18 ZHANG Z. Drone-YOLO: an efficient neural network method for target detection in drone images[J]. Drones, 2023, 7(8): 526. doi: 10.3390/drones7080526
19 ZHANG Y F, REN W, ZHANG Z, et al. Focal and efficient IOU loss for accurate bounding box regression[J]. Neurocomputing, 2022, 506: 146-157. doi: 10.1016/j.neucom.2022.07.042
20 YANG X, YAN J, MING Q, et al. Rethinking rotated object detection with Gaussian Wasserstein distance loss[C]// International Conference on Machine Learning (ICML). Virtual Event: PMLR, 2021: 11830-11841.
21 DU D, ZHU P, WEN L, et al. VisDrone-DET2019: the Vision Meets Drone Object Detection in Image Challenge Results[C]// 2019 IEEE/CVF International Conference on Computer Vision Workshop (ICCVW). Seoul: IEEE, 2019: 213-226.
22 SUN Y, CAO B, ZHU P, et al. Drone-based RGB-infrared cross-modality vehicle detection via uncertainty-aware learning[J]. IEEE Transactions on Circuits and Systems for Video Technology, 2022, 32(10): 6700-6713. doi: 10.1109/TCSVT.2022.3168279
23 WANG C-Y, YEH I-H, LIAO H. YOLOv9: learning what you want to learn using programmable gradient information[EB/OL]. (2024-02-21)[2025-04-18]. https://arxiv.org/abs/2402.13616
24 WANG A, CHEN H, LIU L, et al. YOLOv10: real-time end-to-end object detection[EB/OL]. (2024-05-23)[2025-04-18]. https://arxiv.org/abs/2405.14458
25 CHOLLET F. Xception: deep learning with depthwise separable convolutions[C]// IEEE Conference on Computer Vision and Pattern Recognition. Honolulu: IEEE, 2017: 1800-1807.
26 XIAO Y, XU T, XIN Y, et al. FBRT-YOLO: faster and better for real-time aerial image detection[EB/OL]. (2025-04-29)[2025-04-18]. https://arxiv.org/abs/2504.20670