Aerial small target detection algorithm based on multi-scale feature enhancement
Jian XIAO1, Xinze HE1, Hongliang CHENG1, Xiaoyuan YANG1, Xin HU2,*
1. School of Electronics and Control Engineering, Chang’an University, Xi’an 710064, China
2. School of Energy and Electrical Engineering, Chang’an University, Xi’an 710064, China
Abstract An aerial small target detection algorithm that balances performance and resource consumption was proposed to address the low detection accuracy and large parameter counts of small target detection in aerial images. Built on YOLOv8s, an adaptive detail-enhanced module (ADEM) was proposed that reduces the channel dimension and strengthens the focus on high-frequency features, capturing the fine-grained features of small targets while discarding redundant information. The feature fusion network was adjusted on the basis of the PAN-FPN architecture to increase the attention paid to shallow features, and multi-scale convolutional kernels were introduced to strengthen the focus on target contextual information, adapting the network to small object detection. A parameter-adjustable Nin-IoU was constructed to overcome the limited flexibility and generalization of the traditional IoU; its adjustable parameters allow the IoU to be tailored to different detection tasks. A lightweight detection head was proposed to enhance the fusion of multi-scale feature information while reducing the transmission of redundant information. Experimental results on the VisDrone2019 dataset showed that the proposed algorithm achieved an mAP0.5 of 50.3% with only 8.08×10⁶ parameters, a 27.4% reduction in parameters and an improvement of 11.5 percentage points in accuracy compared with the YOLOv8s baseline. Experimental results on the DOTA and DIOR datasets further demonstrated the strong generalization capability of the proposed algorithm.
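The reported deltas can be turned back into the baseline figures they imply. The short sketch below does only that arithmetic; the variable and function names are illustrative, and the resulting baseline values (roughly 11.13×10⁶ parameters and 38.8% mAP0.5 for YOLOv8s) are derived from the numbers in the abstract rather than quoted from the paper.

```python
# Figures quoted in the abstract (VisDrone2019, proposed model vs. YOLOv8s).
proposed_params = 8.08e6   # parameter count of the proposed model
param_reduction = 0.274    # relative parameter reduction vs. YOLOv8s
proposed_map50 = 50.3      # mAP0.5 of the proposed model, in percent
map50_gain_pp = 11.5       # accuracy gain, in percentage points


def implied_baseline(params, reduction, map50, gain_pp):
    """Back out the baseline figures implied by the reported deltas.

    A relative reduction r means params = baseline * (1 - r), while a gain
    in percentage points is a plain difference between the two mAP values.
    """
    baseline_params = params / (1.0 - reduction)
    baseline_map50 = map50 - gain_pp
    return baseline_params, baseline_map50


baseline_params, baseline_map50 = implied_baseline(
    proposed_params, param_reduction, proposed_map50, map50_gain_pp
)
print(f"implied YOLOv8s parameters: {baseline_params / 1e6:.2f}e6")
print(f"implied YOLOv8s mAP0.5: {baseline_map50:.1f}%")
```

The parameter change is a relative reduction (a ratio), whereas the accuracy change is an absolute difference in percentage points, so the two are inverted differently.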
Received: 25 November 2024
Published: 15 December 2025
Fund: Shaanxi Province Qinchuangyuan “Scientist + Engineer” Team Building Project (2024QCY-KXJ-161); Xi’an Artificial Intelligence Key Industry Chain Project (23ZDCYJSGG0013-2023).
Corresponding Authors:
Xin HU
E-mail: xiaojian@chd.edu.cn; huxin@chd.edu.cn
Keywords: object detection, YOLOv8, UAV imagery, feature fusion, loss function