Journal of ZheJiang University (Engineering Science)  2024, Vol. 58 Issue (3): 437-448    DOI: 10.3785/j.issn.1008-973X.2024.03.001
    
Small target detection algorithm for aerial images based on feature reuse mechanism
Tianmin DENG,Xinxin CHENG,Jinfeng LIU,Xiyue ZHANG
1. School of Traffic and Transportation, Chongqing Jiaotong University, Chongqing 400074, China

Abstract  

A lightweight and efficient aerial image detection algorithm called Functional ShuffleNet YOLO (FS-YOLO) was proposed based on YOLOv8s, to address the low small-target detection accuracy and large parameter counts of current unmanned aerial vehicle (UAV) aerial image detection models. A lightweight feature extraction network was introduced by reducing channel dimensions and improving the network architecture. This facilitated efficient reuse of redundant feature information, generating more feature maps with fewer parameters and enhancing the model's ability to extract and express feature information while significantly reducing the model size. Additionally, a content-aware feature recombination module was introduced in the feature fusion stage to strengthen attention on the salient semantic information of small targets, thereby improving the network's detection performance on aerial images. Experimental validation on the VisDrone dataset showed that the proposed algorithm achieved a detection accuracy of 47.0% mAP0.5 with only 5.48 million parameters, a 50.7% reduction in parameter count and a 6.1 percentage point improvement in accuracy compared with the YOLOv8s baseline. Experimental results on the DIOR dataset showed that FS-YOLO generalized well and was more competitive than other state-of-the-art algorithms.
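The abstract's headline figures can be cross-checked with a few lines of arithmetic; a minimal sketch, using the parameter counts (in millions) and mAP0.5 values reported in the results:

```python
# Sanity check of the reported improvement over the YOLOv8s baseline.
yolov8s_params = 11.12  # baseline parameters, millions (ablation results)
fs_yolo_params = 5.48   # FS-YOLO parameters, millions

reduction = (1 - fs_yolo_params / yolov8s_params) * 100
print(f"parameter reduction: {reduction:.1f}%")  # 50.7%

yolov8s_map, fs_yolo_map = 40.9, 47.0  # mAP0.5, percent
print(f"accuracy gain: {fs_yolo_map - yolov8s_map:.1f} points")  # 6.1
```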



Key words: unmanned aerial vehicle (UAV) image; object detection; YOLOv8; lightweight backbone; CARAFE
Received: 20 May 2023      Published: 05 March 2024
CLC:  V 279  
  TN 911.73  
  TP 391.41  
Fund: National Key Research and Development Program of China (2022YFC3800502); Chongqing Technology Innovation and Application Development Special Key Projects (cstc2021jscx-gksbX0058, CSTB2022TIAD-KPX0113, CSTB2022TIAD-KPX0118).
Cite this article:

Tianmin DENG,Xinxin CHENG,Jinfeng LIU,Xiyue ZHANG. Small target detection algorithm for aerial images based on feature reuse mechanism. Journal of ZheJiang University (Engineering Science), 2024, 58(3): 437-448.

URL:

https://www.zjujournals.com/eng/10.3785/j.issn.1008-973X.2024.03.001     OR     https://www.zjujournals.com/eng/Y2024/V58/I3/437


Fig.1 Overall structure of aerial image lightweight detection algorithm based on FS-YOLO
No. | c1 | Module | c2 | Size | P
1 | 3 | Conv | 32 | 320×320 | 928
2 | 32 | FS Block_2 | 64 | 160×160 | 25 408
3 | 64 | FS Block_1 | 64 | 160×160 | 23 968
4 | 64 | FS Block_2 | 128 | 80×80 | 94 800
5 | 128 | FS Block_1 | 128 | 80×80 | 179 744
6 | 128 | FS Block_2 | 256 | 40×40 | 371 872
7 | 256 | FS Block_1 | 256 | 40×40 | 353 824
8 | 256 | SPPF | 256 | 40×40 | 164 608
9 | 256 | CARAFE | 256 | 80×80 | 103 012
10 | 256 | Concat | 384 | 80×80 | 0
11 | 384 | C2f | 256 | 80×80 | 493 056
12 | 256 | CARAFE | 256 | 160×160 | 103 012
13 | 256 | Concat | 320 | 160×160 | 0
14 | 320 | C2f | 128 | 160×160 | 140 032
15 | 128 | Conv | 128 | 80×80 | 147 712
16 | 128 | Concat | 384 | 80×80 | 0
17 | 384 | C2f | 256 | 80×80 | 493 056
18 | 256 | Conv | 256 | 40×40 | 590 336
19 | 256 | Concat | 512 | 40×40 | 0
20 | 512 | C2f | 256 | 40×40 | 525 824
21 | — | Detect | 3 | — | 1 677 550
Total: 5 488 726
Tab.1 Parameter details of FS-YOLO algorithm
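Several per-layer counts in Tab.1 can be reproduced from the standard YOLOv8 Conv block (a bias-free Conv2d followed by BatchNorm2d); a quick sketch, assuming 3×3 kernels and counting only the two learnable BatchNorm vectors:

```python
def conv_block_params(c_in: int, c_out: int, k: int = 3) -> int:
    """Parameters of a Conv2d (no bias) + BatchNorm2d block."""
    conv = c_out * c_in * k * k  # convolution weights
    bn = 2 * c_out               # BatchNorm scale (gamma) and shift (beta)
    return conv + bn

# Rows 1, 15 and 18 of Tab.1:
print(conv_block_params(3, 32))     # 928
print(conv_block_params(128, 128))  # 147712
print(conv_block_params(256, 256))  # 590336
```

The three values match the Conv rows of the table exactly, which supports reading the P column as raw parameter counts.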
Fig.2 Two structures of FS bottleneck layer
Fig.3 T-shaped feature perception convolution module
Fig.4 Coordinate attention module structure
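Coordinate attention (Fig.4, [15]) factorizes global pooling into two directional pools so positional information along each spatial axis is preserved. A simplified numpy sketch of the pooling-and-reweighting idea; the shared 1×1 convolution transform of the original module is omitted, so this shows only the directional structure:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def coordinate_attention(x):
    """Simplified coordinate attention on a (C, H, W) feature map.

    Pools along each spatial axis separately, derives per-direction
    attention weights, and reweights the input by broadcasting.
    """
    pool_h = x.mean(axis=2, keepdims=True)  # (C, H, 1): pool along width
    pool_w = x.mean(axis=1, keepdims=True)  # (C, 1, W): pool along height
    a_h = sigmoid(pool_h)                   # attention over rows
    a_w = sigmoid(pool_w)                   # attention over columns
    return x * a_h * a_w                    # directional reweighting

x = np.random.rand(8, 4, 4)
y = coordinate_attention(x)
assert y.shape == x.shape
```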
Fig.5 Ghost convolution module structure
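Ghost convolution (Fig.5, [14]) underlies the feature-reuse idea: a primary convolution produces a small set of intrinsic maps, and cheap depthwise operations derive the remaining "ghost" maps. A parameter-count sketch, assuming a ghost ratio of 2 and 3×3 kernels (the exact splits used in FS-YOLO may differ):

```python
def standard_conv_params(c_in, c_out, k=3):
    return c_out * c_in * k * k

def ghost_conv_params(c_in, c_out, k=3, dw_k=3, ratio=2):
    """Ghost module: a primary conv yields c_out/ratio intrinsic maps,
    then cheap depthwise ops generate the remaining ghost maps."""
    intrinsic = c_out // ratio
    primary = intrinsic * c_in * k * k         # ordinary convolution
    cheap = (c_out - intrinsic) * dw_k * dw_k  # one depthwise filter per map
    return primary + cheap

std = standard_conv_params(128, 256)  # 294912
ghost = ghost_conv_params(128, 256)   # 147456 + 1152 = 148608
print(std, ghost, f"saving {(1 - ghost / std) * 100:.1f}%")  # ~49.6%
```

With ratio 2 the module nearly halves the convolution parameters while emitting the same number of feature maps, which is the "more feature maps with fewer parameters" effect described in the abstract.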
Fig.6 Module of content-aware feature reassembly
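CARAFE (Fig.6, [9]) replaces fixed bilinear upsampling with content-aware reassembly: for every output location a small softmax-normalized kernel is predicted from the feature content and used to reassemble a k×k input neighborhood. A single-location numpy sketch, with the kernel-prediction branch replaced by a given set of logits:

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def reassemble(neighborhood, kernel_logits):
    """Reassemble one output value from a (k, k) input neighborhood.

    The predicted kernel is softmax-normalized, so the output is a
    convex combination of the neighborhood values.
    """
    w = softmax(kernel_logits)  # (k, k), sums to 1
    return float((neighborhood * w).sum())

patch = np.array([[1.0, 2.0], [3.0, 4.0]])  # 2x2 neighborhood
logits = np.zeros((2, 2))                   # uniform kernel -> plain average
print(reassemble(patch, logits))            # 2.5
```

In the full module the logits come from a lightweight convolution branch, so regions with salient small-target semantics can get sharper, content-dependent kernels than a fixed interpolation would.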
Model | ESNet | TFPC | CA | Ghost | mAP0.5/% | mAP0.5:0.95/% | R/% | P/MB | M/MB | F/(frame·s⁻¹)
YOLOv8 1) | — | — | — | — | 40.9 | 24.6 | 39.3 | 11.13 | 21.4 | 144.9
(a) |  |  |  |  | 45.2 (+4.3) | 27.5 | 43.3 | 4.67 | 8.5 | 83.3
(b) |  |  |  |  | 45.9 (+5.0) | 28.1 | 43.4 | 5.09 | 10.1 | 82.6
(c) |  |  |  |  | 45.4 (+4.5) | 27.5 | 43.7 | 4.67 | 9.3 | 79.3
(d) |  |  |  |  | 45.1 (+4.2) | 27.4 | 43.5 | 4.87 | 9.6 | 81.3
(e) |  |  |  |  | 46.1 (+5.2) | 28.1 | 44.2 | 5.08 | 10.2 | 81.9
(f) |  |  |  |  | 45.8 (+4.9) | 27.8 | 43.1 | 5.28 | 10.3 | 82.0
(g) |  |  |  |  | 45.7 (+4.8) | 27.7 | 43.0 | 4.86 | 10.4 | 82.6
(h) |  |  |  |  | 46.2 (+5.3) | 28.3 | 44.5 | 5.27 | 10.8 | 80.0
1) Note: "√" marks a module that is included; "—" marks one that is not.
Tab.2 Comparison of detection performance results of Backbone ablation experiment
Model | mAP0.5/% | mAP0.5:0.95/% | R/% | P/MB | F/(frame·s⁻¹)
YOLOv8s | 40.9 | 24.6 | 39.3 | 11.12 | 144.9
YOLOv8s+FS | 46.2 | 28.3 | 44.5 | 5.27 | 80.0
YOLOv8s+FS+CARAFE(a) | 46.1 | 28.2 | 44.4 | 5.47 | 78.1
YOLOv8s+FS+CARAFE(b) | 46.5 | 28.3 | 44.5 | 5.47 | 71.9
YOLOv8s+FS+CARAFE(c) | 47.0 | 28.8 | 44.7 | 5.48 | 68.0
Tab.3 Comparison of detection performance results of neck ablation experiment
Model | Pedestrian | People | Bicycle | Car | Van | Truck | Tricycle | Awning-tricycle | Bus | Motor | mAP0.5/% | P/MB
Faster R-CNN[18] | 20.9 | 14.8 | 7.3 | 51.0 | 29.7 | 19.5 | 14.0 | 8.8 | 30.5 | 21.2 | 21.8 | —
Cascade R-CNN[18] | 22.2 | 14.8 | 7.6 | 54.6 | 31.5 | 21.6 | 14.8 | 8.6 | 34.9 | 21.4 | 23.2 | —
YOLOv4[19] | 24.8 | 12.6 | 8.6 | 64.3 | 22.4 | 22.7 | 11.4 | 7.6 | 44.3 | 21.7 | 30.7 | —
YOLOv5s | 39.0 | 31.3 | 11.2 | 73.5 | 35.4 | 29.5 | 20.5 | 11.1 | 43.1 | 37.0 | 33.2 | 7.03
YOLOv5l | 47.8 | 37.7 | 17.8 | 78.2 | 42.6 | 40.3 | 26.8 | 13.1 | 54.7 | 45.3 | 40.4 | 46.15
TPH-YOLOv5[3] | 53.3 | 42.1 | 21.1 | 83.7 | 45.2 | 42.5 | 33.0 | 16.3 | 61.1 | 51.0 | 44.9 | 60.42
YOLOX-l[20] | 34.8 | 24.5 | 16.9 | 72.4 | 34.4 | 40.5 | 23.1 | 17.8 | 53.1 | 36.0 | 35.3 | 54.16
YOLOv7-tiny[21] | 37.9 | 34.6 | 9.4 | 76.1 | 36.3 | 29.8 | 20.1 | 10.6 | 43.2 | 41.8 | 34.0 | 6.03
YOLOv8s | 44.3 | 34.5 | 15.6 | 80.4 | 46.1 | 37.9 | 29.5 | 15.7 | 58.5 | 46.6 | 40.9 | 11.13
YOLOv8l | 51.1 | 39.8 | 21.9 | 82.9 | 49.3 | 45.4 | 38.1 | 20.4 | 67.0 | 52.3 | 46.8 | 43.69
FS-YOLO | 53.1 | 43.6 | 18.9 | 85.0 | 51.4 | 41.3 | 35.3 | 21.1 | 66.0 | 54.9 | 47.0 | 5.48
1) Note: bold font marks the best value in each column, including results quoted from the cited literature. The first eleven columns are per-class AP/%.
Tab.4 Comparative results of different algorithms in average precision and parameters on VisDrone dataset
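Tab.4's central claim, that FS-YOLO reaches the highest mAP0.5 while using the fewest parameters among the compared detectors that report a parameter count, can be checked mechanically; a sketch over the (mAP0.5, P) pairs from the table:

```python
# (model, mAP0.5 in %, parameters in MB) rows of Tab.4 that report both.
rows = [
    ("YOLOv5s", 33.2, 7.03), ("YOLOv5l", 40.4, 46.15),
    ("TPH-YOLOv5", 44.9, 60.42), ("YOLOX-l", 35.3, 54.16),
    ("YOLOv7-tiny", 34.0, 6.03), ("YOLOv8s", 40.9, 11.13),
    ("YOLOv8l", 46.8, 43.69), ("FS-YOLO", 47.0, 5.48),
]
best_map = max(rows, key=lambda r: r[1])        # highest accuracy
fewest_params = min(rows, key=lambda r: r[2])   # smallest model
print(best_map[0], fewest_params[0])            # FS-YOLO FS-YOLO
```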
Fig.7 Comparison of target detection effectiveness in complex scenes on VisDrone Dataset
Model | Backbone | mAP0.5/%
Faster R-CNN[22] | VGG16 | 54.1
SSD[23] | VGG16 | 60.8
YOLOv3[23] | Darknet53 | 59.9
RetinaNet[22] | ResNet50 | 65.7
PANet[22] | ResNet50 | 63.8
CornerNet[22] | Hourglass104 | 64.9
FsoD-Net[23] | MSE-Net | 71.8
SCRDet++(FPN*)[24] | ResNet-101 | 77.8
CAT-Net[25] | ResNet50 | 76.3
YOLOv5s | Darknet53(C3) | 71.9
YOLOv5l[26] | Darknet53(C3) | 73.9
YOLOv8s | Darknet53(C2f) | 72.2
FS-YOLO | FS | 74.3
1) Note: bold font marks the best value in the column.
Tab.5 Comparison results of different algorithms on DIOR dataset
Fig.8 Comparison of detection effects on DIOR data set
[1]   ZHANG X, WANG C, JIN J, et al. Object detection of VisDrone by stronger feature extraction FasterRCNN [J]. Journal of Electronic Imaging, 2023, 32(1): 13-18.
[2]   HUANG H, LI L, MA H. An improved Cascade R-CNN based target detection algorithm for UAV aerial images [C]// 2022 7th International Conference on Image, Vision and Computing . Xi’an: IEEE, 2022: 232−237.
[3]   ZHU X, LYU S, WANG X, et al. TPH-YOLOv5: improved YOLOv5 based on transformer prediction head for object detection on drone-captured scenarios [C]// Proceedings of the IEEE/CVF International Conference on Computer Vision Workshops . Montreal: IEEE, 2021: 2778−2788.
[4]   LIU C, YANG D, TANG L, et al. A lightweight object detector based on spatial-coordinate self-attention for UAV aerial images [J]. Remote Sensing, 2023, 15(1): 83.
[5]   HOWARD A, ZHU M, CHEN B, et al. Mobilenets: efficient convolutional neural networks for mobile vision applications [EB/OL]. (2017-04-17) [2023-05-04]. https://arxiv.org/abs/1704.04861.
[6]   ZHANG X, ZHOU X, LIN M, et al. Shufflenet: an extremely efficient convolutional neural network for mobile devices [C]// 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition . Salt Lake City: IEEE, 2018: 6848–6856.
[7]   MA N, ZHANG X, ZHENG H T, et al. ShuffleNet V2: practical guidelines for efficient CNN architecture design [C]// Proceedings of the European Conference on Computer Vision. Munich: Springer, 2018: 116–131.
[8]   YU G, CHANG Q, LV W, et al. PP-PicoDet: a better real-time object detector on mobile devices [EB/OL]. (2021-12-01) [2023-05-04]. https://arxiv.org/abs/2111.00902.
[9]   WANG J, CHEN K, XU R, et al. CARAFE: content-aware reassembly of features [C]// 2019 IEEE/CVF International Conference on Computer Vision . Seoul: IEEE, 2019: 3007−3016.
[10]   JOCHER G, NISHIMURA K, MINEEVA T, et al. YOLOv8 [EB/OL]. (2023-01-13) [2023-05-04]. https://github.com/ultralytics/ultralytics/tags.
[11]   JOCHER G, NISHIMURA K, MINEEVA T, et al. YOLOv5 (minor version 6.2) [EB/OL]. (2022-04-17) [2023-05-04]. https://github.com/ultralytics/yolov5/releases/tag/v6.2.
[12]   ZHANG Rong, ZHANG Wei. Fire detection algorithm based on improved GhostNet-FCOS [J]. Journal of Zhejiang University: Engineering Science, 2022, 56(10): 1891-1899. (in Chinese)
[13]   HU J, SHEN L, SUN G. Squeeze-and-excitation networks [C]// Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition . Salt Lake City: IEEE, 2018: 7132-7141.
[14]   HAN K, WANG Y, TIAN Q, et al. Ghostnet: more features from cheap operations [C]// Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition . Seattle: IEEE, 2020: 1580−1589.
[15]   HOU Q, ZHOU D, FENG J. Coordinate attention for efficient mobile network design [C]// Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition . [s.l.]: IEEE, 2021: 13713−13722.
[16]   CHEN J, KAO S, HE H, et al. Run, don't walk: chasing higher FLOPS for faster neural networks [EB/OL]. (2022-04-04) [2023-05-04]. https://arxiv.org/abs/2303.03667.
[17]   DU D, ZHU P, WEN L, et al. VisDrone-DET2019: the vision meets drone object detection in image challenge results [C]// Proceedings of the IEEE/CVF International Conference on Computer Vision . Seoul: IEEE, 2019: 213−226.
[18]   YU W, YANG T, CHEN C. Towards resolving the challenge of long-tail distribution in UAV images for object detection [C]// Proceedings of the 2021 IEEE Winter Conference on Applications of Computer Vision . Waikoloa: IEEE, 2021: 3257−3266.
[19]   ALI S, SIDDIQUE A, ATES H, et al. Improved YOLOv4 for aerial object detection [C]// Proceedings of the 2021 29th Signal Processing and Communications Applications Conference . Istanbul: IEEE, 2021, 1−4.
[20]   YANG Y, GAO X, WANG Y, et al. VAMYOLOX: an accurate and efficient object detection algorithm based on visual attention mechanism for UAV optical sensors [J]. IEEE Sensors Journal, 2023, 23(11): 11139-11155. doi: 10.1109/JSEN.2022.3219199.
[21]   WANG C, BOCHKOVSKIY A, LIAO H M. YOLOv7: trainable bag-of-freebies sets new state-of-the-art for real-time object detectors [C]// Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition . Vancouver: IEEE, 2023: 7464−7475.
[22]   LI K, WAN G, CHENG G, et al. Object detection in optical remote sensing images: a survey and a new benchmark [J]. ISPRS Journal of Photogrammetry and Remote Sensing, 2020, 159: 296-307. doi: 10.1016/j.isprsjprs.2019.11.023.
[23]   WANG G, ZHUANG Y, CHEN, et al. FSoD-Net: full-scale object detection from optical remote sensing imagery [J]. IEEE Transactions on Geoscience and Remote Sensing, 2021, 60: 1-18.
[24]   YANG X, YAN J, LIAO W, et al. SCRDet++: detecting small, cluttered and rotated objects via instance-level feature denoising and rotation loss smoothing [J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2022, 45(2): 2384-2399.
[25]   LIU Y, LI H, HU C, et al. Catnet: context aggregate on network for instance segmentation in remote sensing images [EB/OL]. (2022-05-27) [2023-05-04]. https://arxiv.org/abs/2111.11057.