Journal of Zhejiang University (Engineering Science)  2024, Vol. 58 Issue (3): 437-448    DOI: 10.3785/j.issn.1008-973X.2024.03.001
Computer Technology
Small target detection algorithm for aerial images based on feature reuse mechanism
Tianmin DENG, Xinxin CHENG, Jinfeng LIU, Xiyue ZHANG
1. School of Traffic and Transportation, Chongqing Jiaotong University, Chongqing 400074, China
Abstract:

A lightweight and efficient aerial image detection algorithm, Functional ShuffleNet YOLO (FS-YOLO), was proposed on the basis of YOLOv8s to address the low detection accuracy for small targets and the large number of model parameters in unmanned aerial vehicle (UAV) aerial image detection. A lightweight feature extraction network was designed by reducing the channel dimensions and improving the network architecture. The network efficiently reuses redundant feature information and generates more feature maps with fewer parameters, which strengthens the model's ability to extract and express feature information while significantly reducing the model size. A content-aware reassembly of features (CARAFE) module was introduced in the feature fusion stage to strengthen the attention paid to the salient semantic information of small targets and to improve the detection performance of the network on aerial images. Experiments on the UAV aerial dataset VisDrone showed that the proposed algorithm achieved a detection accuracy of mAP0.5 = 47.0% with only 5.48 M parameters. Compared with the baseline YOLOv8s, the parameter count was reduced by 50.7% and the accuracy was improved by 6.1 percentage points. Experiments on the DIOR dataset showed that FS-YOLO generalized well and was more competitive than other state-of-the-art algorithms.

Key words: unmanned aerial vehicle (UAV) image; object detection; YOLOv8; lightweight backbone; CARAFE
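The headline figures in the abstract can be checked with a few lines of arithmetic. The sketch below is illustrative and not part of the paper. It assumes the YOLOv8s baseline values listed in Table 4 (11.13 M parameters, 40.9% mAP0.5), and the helper name `relative_reduction` is hypothetical.

```python
def relative_reduction(baseline, new):
    """Relative reduction of `new` with respect to `baseline`, in percent."""
    return (baseline - new) / baseline * 100.0

# Values taken from the abstract and from Table 4 (assumed baseline: YOLOv8s).
params_yolov8s = 11.13   # parameters, in millions
params_fs_yolo = 5.48
map50_yolov8s = 40.9     # mAP0.5, in percent
map50_fs_yolo = 47.0

param_cut = relative_reduction(params_yolov8s, params_fs_yolo)
map_gain_points = map50_fs_yolo - map50_yolov8s

print(f"parameter reduction: {param_cut:.2f}%")
print(f"mAP0.5 gain: {map_gain_points:.1f} percentage points")
```

Computed this way, the parameter reduction comes out at roughly 50.8%, which the abstract reports as 50.7%, and the accuracy gain is an absolute difference in mAP0.5 rather than a relative one.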
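Received: 2023-05-20    Published: 2024-03-05
CLC:  V 279
Fund: National Key Research and Development Program of China (2022YFC3800502); Chongqing Technology Innovation and Application Development Special Key Project (cstc2021jscx-gksbX0058, CSTB2022TIAD-KPX0113, CSTB2022TIAD-KPX0118).
About the author: DENG Tianmin (b. 1979), male, professor, doctoral supervisor, engaged in research on traffic environment perception. orcid.org/0000-0003-0511-0519. E-mail: dtianmin@cqjtu.edu.cn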

Cite this article:

Tianmin DENG, Xinxin CHENG, Jinfeng LIU, Xiyue ZHANG. Small target detection algorithm for aerial images based on feature reuse mechanism. Journal of Zhejiang University (Engineering Science), 2024, 58(3): 437-448.

Link to this article:

https://www.zjujournals.com/eng/CN/10.3785/j.issn.1008-973X.2024.03.001
https://www.zjujournals.com/eng/CN/Y2024/V58/I3/437

Fig. 1  Overall structure of the lightweight aerial image detection algorithm based on FS-YOLO
No.    c1    Module       c2    Size       P/KB
1      3     Conv         32    320×320    928
2      32    FS Block_2   64    160×160    25 408
3      64    FS Block_1   64    160×160    23 968
4      64    FS Block_2   128   80×80      94 800
5      128   FS Block_1   128   80×80      179 744
6      128   FS Block_2   256   40×40      371 872
7      256   FS Block_1   256   40×40      353 824
8      256   SPPF         256   40×40      164 608
9      256   CARAFE       256   80×80      103 012
10     256   Concat       384   80×80      0
11     384   C2f          256   80×80      493 056
12     256   CARAFE       256   160×160    103 012
13     256   Concat       320   160×160    0
14     320   C2f          128   160×160    140 032
15     128   Conv         128   80×80      147 712
16     128   Concat       384   80×80      0
17     384   C2f          256   80×80      493 056
18     256   Conv         256   40×40      590 336
19     256   Concat       512   40×40      0
20     512   C2f          256   40×40      525 824
21           Detect       3                1 677 550
Total                                      5 488 726
Table 1  Parameter details of the FS-YOLO algorithm
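Some rows of Table 1 can be reproduced by hand, which helps when reading the parameter column. The sketch below is a consistency check under two assumptions that the table itself does not state: each plain Conv row is a bias-free 3×3 convolution followed by batch normalization (as in the YOLOv8 Conv module), and each Concat row joins the two feature maps listed in the comments. The function name `conv_bn_params` is hypothetical.

```python
def conv_bn_params(c1, c2, k=3):
    """Parameters of a bias-free k x k convolution plus BatchNorm scale and shift."""
    return c1 * c2 * k * k + 2 * c2

# row number -> (c1, c2, parameter count listed in Table 1) for the plain Conv rows
conv_rows = {1: (3, 32, 928), 15: (128, 128, 147_712), 18: (256, 256, 590_336)}
conv_check = {n: conv_bn_params(c1, c2) == listed
              for n, (c1, c2, listed) in conv_rows.items()}

# row number -> (channels of input A, channels of input B, listed output channels).
# The pairing of inputs is an assumption based on the usual YOLOv8 neck wiring:
# 10: rows 9 + 5, 13: rows 12 + 3, 16: rows 15 + 11, 19: rows 18 + 8.
concat_rows = {10: (256, 128, 384), 13: (256, 64, 320),
               16: (128, 256, 384), 19: (256, 256, 512)}
concat_check = {n: a + b == out for n, (a, b, out) in concat_rows.items()}

print(conv_check)
print(concat_check)
```

Under these assumptions the three Conv rows and the four Concat rows agree with the values listed in the table.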
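Fig. 2  Two structures of the FS bottleneck layer
Fig. 3  T-shaped feature perception convolution module
Fig. 4  Structure of the coordinate attention module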
Fig. 5  Structure of the ghost convolution module
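The parameter saving behind a ghost convolution can be illustrated with a simple count. This is a minimal sketch following the general GhostNet idea [14], in which a primary convolution produces a fraction of the output channels and cheap depthwise operations generate the rest. The channel sizes, kernel sizes and ratio `s` below are illustrative assumptions, not the configuration used in FS-YOLO, and both function names are hypothetical.

```python
def standard_conv_params(c1, c2, k=3):
    """Weights of an ordinary k x k convolution (bias ignored)."""
    return c1 * c2 * k * k

def ghost_conv_params(c1, c2, k=3, d=3, s=2):
    """Weights of a ghost convolution (bias ignored).

    A primary k x k convolution produces c2 // s intrinsic feature maps, and
    depthwise d x d convolutions derive the remaining (s - 1) maps per intrinsic map.
    """
    intrinsic = c2 // s
    primary = c1 * intrinsic * k * k
    cheap = intrinsic * (s - 1) * d * d
    return primary + cheap

std = standard_conv_params(128, 128)
ghost = ghost_conv_params(128, 128)
print(std, ghost, round(std / ghost, 2))
```

With `s = 2` the ghost variant needs roughly half the weights of the ordinary convolution for the same number of output channels.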
Fig. 6  Content-aware reassembly of features module
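The reassembly step of CARAFE [9] can be sketched in plain Python. In the original operator a small sub-network predicts one kernel per output location from the input features. The sketch below skips that prediction and takes the kernel logits as an input, so it shows only the softmax normalization and the weighted reassembly. The function names, the nested-list layout and the zero padding at the borders are assumptions made for illustration, not details taken from this paper.

```python
import math

def softmax(values):
    """Numerically stable softmax, so each reassembly kernel sums to 1."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

def carafe_reassemble(x, logits, k_up=5, scale=2):
    """Upsample x ([C][H][W]) with per-location kernels.

    logits has shape [scale*H][scale*W][k_up*k_up]; in CARAFE these would come
    from the kernel prediction module, here they are supplied by the caller.
    """
    channels, height, width = len(x), len(x[0]), len(x[0][0])
    r = k_up // 2
    out = [[[0.0] * (scale * width) for _ in range(scale * height)]
           for _ in range(channels)]
    for i in range(scale * height):
        for j in range(scale * width):
            kernel = softmax(logits[i][j])
            ci, cj = i // scale, j // scale  # source location of this output pixel
            for c in range(channels):
                acc = 0.0
                for n in range(-r, r + 1):
                    for m in range(-r, r + 1):
                        y, z = ci + n, cj + m
                        if 0 <= y < height and 0 <= z < width:  # zero padding
                            acc += kernel[(n + r) * k_up + (m + r)] * x[c][y][z]
                out[c][i][j] = acc
    return out

# Kernels sharply peaked at the centre reduce to nearest-neighbour upsampling.
x = [[[1.0, 2.0], [3.0, 4.0]]]
peaked = [0.0] * 25
peaked[12] = 50.0  # centre of the 5 x 5 kernel
logits = [[peaked[:] for _ in range(4)] for _ in range(4)]
up = carafe_reassemble(x, logits)
for row in up[0]:
    print([round(v, 3) for v in row])
```

The point of the operator is that the kernels depend on content: when the logits differ across locations, each output pixel draws on its neighbourhood with different weights, unlike fixed nearest-neighbour or bilinear upsampling.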
Model       ESNet   TFPC   CA   Ghost   mAP0.5/%       mAP0.5:0.95/%   R/%    P/MB    M/MB   F/(frame·s⁻¹)
YOLOv8 1)                               40.9           24.6            39.3   11.13   21.4   144.9
(a)                                     45.2 (+4.3)    27.5            43.3   4.67    8.5    83.3
(b)                                     45.9 (+5.0)    28.1            43.4   5.09    10.1   82.6
(c)                                     45.4 (+4.5)    27.5            43.7   4.67    9.3    79.3
(d)                                     45.1 (+4.2)    27.4            43.5   4.87    9.6    81.3
(e)                                     46.1 (+5.2)    28.1            44.2   5.08    10.2   81.9
(f)                                     45.8 (+4.9)    27.8            43.1   5.28    10.3   82.0
(g)                                     45.7 (+4.8)    27.7            43.0   4.86    10.4   82.6
(h)                                     46.2 (+5.3)    28.3            44.5   5.27    10.8   80.0
1) Note: "√" indicates that the corresponding module is added and "-" indicates that it is not.
Table 2  Comparison of detection performance in the backbone ablation experiment
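The gains shown in parentheses in Table 2 are absolute differences in mAP0.5 relative to the YOLOv8 baseline row (40.9%). The short sketch below recomputes them from the listed values. It is only a reading aid, and the variable names are hypothetical.

```python
baseline_map50 = 40.9  # mAP0.5 of the baseline row in Table 2, in percent

# variant -> (listed mAP0.5, listed gain in percentage points)
variants = {
    "a": (45.2, 4.3), "b": (45.9, 5.0), "c": (45.4, 4.5), "d": (45.1, 4.2),
    "e": (46.1, 5.2), "f": (45.8, 4.9), "g": (45.7, 4.8), "h": (46.2, 5.3),
}

# Recompute each gain and compare it with the value printed in the table.
recomputed = {name: round(map50 - baseline_map50, 1)
              for name, (map50, _) in variants.items()}
consistent = all(recomputed[name] == listed
                 for name, (_, listed) in variants.items())

print(recomputed)
print("all listed gains reproduced:", consistent)
```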
Model                     mAP0.5/%   mAP0.5:0.95/%   R/%    P/MB    F/(frame·s⁻¹)
YOLOv8s                   40.9       24.6            39.3   11.12   144.9
YOLOv8s+FS                46.2       28.3            44.5   5.27    80.0
YOLOv8s+FS+CARAFE(a)      46.1       28.2            44.4   5.47    78.1
YOLOv8s+FS+CARAFE(b)      46.5       28.3            44.5   5.47    71.9
YOLOv8s+FS+CARAFE(c)      47.0       28.8            44.7   5.48    68.0
Table 3  Comparison of detection performance in the neck ablation experiment
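Columns Pedestrian to Motor give the per-class AP/%.
Model               Pedestrian   People   Bicycle   Car    Van    Truck   Tricycle   Awning-tricycle   Bus    Motor   mAP0.5/%   P/MB
Faster R-CNN[18]    20.9         14.8     7.3       51.0   29.7   19.5    14.0       8.8               30.5   21.2    21.8
Cascade R-CNN[18]   22.2         14.8     7.6       54.6   31.5   21.6    14.8       8.6               34.9   21.4    23.2
YOLOv4[19]          24.8         12.6     8.6       64.3   22.4   22.7    11.4       7.6               44.3   21.7    30.7
YOLOv5s             39.0         31.3     11.2      73.5   35.4   29.5    20.5       11.1              43.1   37.0    33.2       7.03
YOLOv5l             47.8         37.7     17.8      78.2   42.6   40.3    26.8       13.1              54.7   45.3    40.4       46.15
Tph-YOLOv5[3]       53.3 1)      42.1     21.1      83.7   45.2   42.5    33.0       16.3              61.1   51.0    44.9       60.42
YOLOX-l[20]         34.8         24.5     16.9      72.4   34.4   40.5    23.1       17.8              53.1   36.0    35.3       54.16
YOLOv7-tiny[21]     37.9         34.6     9.4       76.1   36.3   29.8    20.1       10.6              43.2   41.8    34.0       6.03
YOLOv8s             44.3         34.5     15.6      80.4   46.1   37.9    29.5       15.7              58.5   46.6    40.9       11.13
YOLOv8l             51.1         39.8     21.9      82.9   49.3   45.4    38.1       20.4              67.0   52.3    46.8       43.69
FS-YOLO             53.1         43.6     18.9      85.0   51.4   41.3    35.3       21.1              66.0   54.9    47.0       5.48
1) Note: bold type marks the best value in each column, including experimental results taken from the cited references.
Table 4  Comparison of average precision and parameter count of different algorithms on the VisDrone dataset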
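The mAP0.5 column of Table 4 can be read as the mean of the ten per-class AP values in the same row. The sketch below checks this for three rows. The tolerance of 0.1 is an assumption that allows for rounding in the published figures, and the function name `mean_ap` is hypothetical.

```python
def mean_ap(per_class_ap):
    """Arithmetic mean of per-class AP values, in percent."""
    return sum(per_class_ap) / len(per_class_ap)

# model -> (ten per-class AP values from Table 4, listed mAP0.5)
rows = {
    "YOLOv8s": ([44.3, 34.5, 15.6, 80.4, 46.1, 37.9, 29.5, 15.7, 58.5, 46.6], 40.9),
    "YOLOv8l": ([51.1, 39.8, 21.9, 82.9, 49.3, 45.4, 38.1, 20.4, 67.0, 52.3], 46.8),
    "FS-YOLO": ([53.1, 43.6, 18.9, 85.0, 51.4, 41.3, 35.3, 21.1, 66.0, 54.9], 47.0),
}

# Absolute gap between the recomputed mean and the listed mAP0.5.
deviation = {model: abs(mean_ap(aps) - listed) for model, (aps, listed) in rows.items()}

for model, (aps, listed) in rows.items():
    print(f"{model}: mean of class APs = {mean_ap(aps):.2f}, listed mAP0.5 = {listed}")
```

The recomputed means agree with the listed mAP0.5 values to within rounding, which confirms that the per-class columns and the summary column belong to the same evaluation.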
Fig. 7  Comparison of object detection results in complex scenes on the VisDrone dataset
Model                    Backbone         mAP0.5/%
Faster R-CNN[22]         VGG16            54.1
SSD[23]                  VGG16            60.8
YOLOv3[23]               Darknet53        59.9
RetinaNet[22]            ResNet50         65.7
PANet[22]                ResNet50         63.8
CornerNet[22]            Hourglass104     64.9
FsoD-Net[23]             MSE-Net          71.8
SCRDet++(FPN*)[24]       ResNet-101       77.8 1)
CAT-Net[25]              ResNet50         76.3
YOLOv5s                  Darknet53(C3)    71.9
YOLOv5l[26]              Darknet53(C3)    73.9
YOLOv8s                  Darknet53(C2f)   72.2
FS-YOLO                  FS               74.3
1) Note: bold type marks the best value in the column.
Table 5  Comparison of different algorithms on the DIOR dataset
Fig. 8  Comparison of detection results on the DIOR dataset
1 ZHANG X, WANG C, JIN J, et al. Object detection of VisDrone by stronger feature extraction FasterRCNN [J]. Journal of Electronic Imaging, 2023, 32(1): 13-18.
2 HUANG H, LI L, MA H. An improved Cascade R-CNN based target detection algorithm for UAV aerial images [C]// 2022 7th International Conference on Image, Vision and Computing. Xi’an: IEEE, 2022: 232-237.
3 ZHU X, LYU S, WANG X, et al. TPH-YOLOv5: improved YOLOv5 based on transformer prediction head for object detection on drone-captured scenarios [C]// Proceedings of the IEEE/CVF International Conference on Computer Vision Workshops. Montreal: IEEE, 2021: 2778-2788.
4 LIU C, YANG D, TANG L, et al. A lightweight object detector based on spatial-coordinate self-attention for UAV aerial images [J]. Remote Sensing, 2023, 15(1): 83.
5 HOWARD A, ZHU M, CHEN B, et al. Mobilenets: efficient convolutional neural networks for mobile vision applications [EB/OL]. (2017-04-17) [2023-05-04]. https://arxiv.org/abs/1704.04861.
6 ZHANG X, ZHOU X, LIN M, et al. Shufflenet: an extremely efficient convolutional neural network for mobile devices [C]// 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Salt Lake City: IEEE, 2018: 6848-6856.
7 MA N, ZHANG X, ZHENG H T, et al. Shufflenet v2: practical guidelines for efficient CNN architecture design [C]// Proceedings of the European Conference on Computer Vision. Munich: Springer, 2018: 116-131.
8 YU G, CHANG Q, LV W, et al. PP-PicoDet: a better real-time object detector on mobile devices [EB/OL]. (2021-12-01) [2023-05-04]. https://arxiv.org/abs/2111.00902.
9 WANG J, CHEN K, XU R, et al. CARAFE: content-aware reassembly of features [C]// 2019 IEEE/CVF International Conference on Computer Vision. Seoul: IEEE, 2019: 3007-3016.
10 JOCHER G, NISHIMURA K, MINEEVA T, et al. YOLOv8 [EB/OL]. (2023-01-13) [2023-05-04]. https://github.com/ultralytics/ultralytics/tags.
11 JOCHER G, NISHIMURA K, MINEEVA T, et al. YOLOv5 (minor version 6.2) [EB/OL]. (2022-04-17) [2023-05-04]. https://github.com/ultralytics/yolov5/releases/tag/v6.2.
12 ZHANG Rong, ZHANG Wei. Fire detection algorithm based on improved GhostNet-FCOS [J]. Journal of Zhejiang University: Engineering Science, 2022, 56(10): 1891-1899.
13 HU J, SHEN L, SUN G. Squeeze-and-excitation networks [C]// Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Salt Lake City: IEEE, 2018: 7132-7141.
14 HAN K, WANG Y, TIAN Q, et al. Ghostnet: more features from cheap operations [C]// Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Seattle: IEEE, 2020: 1580-1589.
15 HOU Q, ZHOU D, FENG J. Coordinate attention for efficient mobile network design [C]// Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. [s.l.]: IEEE, 2021: 13713-13722.
16 CHEN J, KAO S, HE H, et al. Run, don't walk: chasing higher FLOPS for faster neural networks [EB/OL]. (2022-04-04) [2023-05-04]. https://arxiv.org/abs/2303.03667.
17 DU D, ZHU P, WEN L, et al. VisDrone-DET2019: the vision meets drone object detection in image challenge results [C]// Proceedings of the IEEE/CVF International Conference on Computer Vision. Seoul: IEEE, 2019: 213-226.
18 YU W, YANG T, CHEN C. Towards resolving the challenge of long-tail distribution in UAV images for object detection [C]// Proceedings of the 2021 IEEE Winter Conference on Applications of Computer Vision. Waikoloa: IEEE, 2021: 3257-3266.
19 ALI S, SIDDIQUE A, ATES H, et al. Improved YOLOv4 for aerial object detection [C]// Proceedings of the 2021 29th Signal Processing and Communications Applications Conference. Istanbul: IEEE, 2021: 1-4.
20 YANG Y, GAO X, WANG Y, et al. VAMYOLOX: an accurate and efficient object detection algorithm based on visual attention mechanism for UAV optical sensors [J]. IEEE Sensors Journal, 2023, 23(11): 11139-11155. doi: 10.1109/JSEN.2022.3219199
21 WANG C, BOCHKOVSKIY A, LIAO H M. YOLOv7: trainable bag-of-freebies sets new state-of-the-art for real-time object detectors [C]// Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Vancouver: IEEE, 2023: 7464-7475.
22 LI K, WAN G, CHENG G, et al. Object detection in optical remote sensing images: a survey and a new benchmark [J]. ISPRS Journal of Photogrammetry and Remote Sensing, 2020, 159: 296-307. doi: 10.1016/j.isprsjprs.2019.11.023
23 WANG G, ZHUANG Y, CHEN, et al. FSoD-Net: full-scale object detection from optical remote sensing imagery [J]. IEEE Transactions on Geoscience and Remote Sensing, 2021, 60: 1-18.
24 YANG X, YAN J, LIAO W, et al. Scrdet++: detecting small, cluttered and rotated objects via instance-level feature denoising and rotation loss smoothing [J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2022, 45(2): 2384-2399.
25 LIU Y, LI H, HU C, et al. Catnet: context aggregate on network for instance segmentation in remote sensing images [EB/OL]. (2022-05-27) [2023-05-04]. https://arxiv.org/abs/2111.11057.