Journal of Zhejiang University (Engineering Science)  2025, Vol. 59 Issue (11): 2379-2388    DOI: 10.3785/j.issn.1008-973X.2025.11.017
Computer Technology
Ship target detection algorithm based on improved YOLOv8
Lin DUO, Yu YIN, Wei DUAN, Yun ZHANG, Yong REN
School of Information Engineering and Automation, Kunming University of Science and Technology, Kunming 650500, China
Abstract:

To address the small size of some ship targets, the large variation in target scale and the complex background noise in synthetic aperture radar (SAR) images, a ship target detection algorithm based on YOLOv8, DD-YOLO, was proposed. In the backbone network, an improved C2f module was used to strengthen multi-scale feature extraction and fusion, combined with a newly designed SPA module that optimizes the propagation of gradient flow information, improving multi-scale target detection. A lighter dynamic upsampling operator was adopted in the neck, which reduced computational overhead and model complexity while improving the recognition of small ships in complex backgrounds. A multi-dimensional attention mechanism was integrated into the detection head, which was also made lightweight, to increase the model's sensitivity to key features in complex backgrounds and improve detection accuracy. In experiments on two public datasets, HRSID and SSDD, DD-YOLO achieved mAP50 scores of 92.2% and 98.5%, respectively, which are 2% and 2.2% higher than the baseline. With 13.94×10^6 parameters and 9.8×10^9 FLOPs, the model is less complex than the mainstream algorithms it was compared with, achieving a balance between accuracy and efficiency.

Key words: synthetic aperture radar (SAR); ship detection; deep learning; YOLOv8; multi-scale feature fusion
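
The mAP50 scores quoted in the abstract count a detection as correct when its box overlaps a ground-truth box with an intersection over union (IoU) of at least 0.5. The following is a minimal single-class sketch of that metric, not the authors' evaluation code: the function names `iou` and `ap50`, the (x1, y1, x2, y2) box format and the all-point interpolation of the precision-recall curve are assumptions made for illustration.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2), assumed format


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def ap50(preds: List[Tuple[float, Box]], gts: List[Box], thr: float = 0.5) -> float:
    """Average precision at IoU >= thr for one class (hypothetical helper)."""
    if not gts:
        return 0.0
    matched = [False] * len(gts)
    hits = []
    # Visit predictions from highest to lowest confidence score.
    for _, box in sorted(preds, key=lambda p: p[0], reverse=True):
        best_iou, best_j = 0.0, -1
        for j, gt in enumerate(gts):
            overlap = iou(box, gt)
            if not matched[j] and overlap > best_iou:
                best_iou, best_j = overlap, j
        if best_j >= 0 and best_iou >= thr:
            matched[best_j] = True  # each ground truth is matched at most once
            hits.append(1)
        else:
            hits.append(0)
    precisions, recalls, tp = [], [], 0
    for k, hit in enumerate(hits, start=1):
        tp += hit
        precisions.append(tp / k)
        recalls.append(tp / len(gts))
    # Make precision non-increasing in recall, then integrate the curve.
    for i in range(len(precisions) - 2, -1, -1):
        precisions[i] = max(precisions[i], precisions[i + 1])
    ap, prev_recall = 0.0, 0.0
    for p, r in zip(precisions, recalls):
        ap += p * (r - prev_recall)
        prev_recall = r
    return ap
```

mAP50 is then the mean of this value over classes; with a single ship class it reduces to the AP of that class.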
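Received: 2024-11-01    Published: 2025-10-30
CLC number: TP 391
Funding: Major Science and Technology Special Program of the Yunnan Provincial Department of Science and Technology (202302AD080006); Yunnan Provincial Key Laboratory of Media Convergence (220245201).
About the author: Lin DUO (1974-), female, associate professor, working on mobile communications and artificial intelligence. orcid.org/0000-0001-9221-5209. E-mail: duolin2003@126.com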

Cite this article:

Lin DUO, Yu YIN, Wei DUAN, Yun ZHANG, Yong REN. Ship target detection algorithm based on improved YOLOv8. Journal of Zhejiang University (Engineering Science), 2025, 59(11): 2379-2388.

Link to this article:

https://www.zjujournals.com/eng/CN/10.3785/j.issn.1008-973X.2025.11.017
https://www.zjujournals.com/eng/CN/Y2025/V59/I11/2379

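Fig. 1  Structure of the DD-YOLO network
Fig. 2  C2f-DD module
Fig. 3  Dilated re-parameterization block
Fig. 4  DWR_DRB module
Fig. 5  Deformable attention module
Fig. 6  Dynamic upsampling
Fig. 7  Sampling point generator
Fig. 8  Structure of DyHead-Prune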
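| No. | p/% | r/% | mAP50/% | mAP50-95/% | Np/10^6 | FLOPs/10^9 |
|-----|------|------|---------|------------|---------|------------|
| A | 90.5 | 82.5 | 90.2 | 64.0 | 11.46 | 8.1 |
| B | 91.1 | 83.6 | 91.0 | 64.8 | 12.27 | 8.0 |
| C | 91.1 | 83.9 | 90.7 | 64.7 | 12.49 | 8.3 |
| D | 90.4 | 83.8 | 90.6 | 64.1 | 11.42 | 8.1 |
| E | 91.0 | 83.9 | 91.4 | 66.1 | 13.30 | 9.6 |
| F | 91.1 | 83.9 | 91.6 | 66.4 | 13.24 | 9.6 |
| G | 91.9 | 84.0 | 92.0 | 66.7 | 13.42 | 9.6 |
| H | 92.0 | 85.0 | 92.2 | 67.4 | 13.94 | 9.8 |

Table 1  Ablation experiments of DD-YOLO on HRSID (the selection marks in the module columns YOLOv8, C2f-DD, SPA, Dysample and Detect are missing in the source)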
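
The cost of the accuracy gain can be read directly from the first and last rows of Table 1. The sketch below computes the differences, assuming that row A is the YOLOv8 baseline and row H the full DD-YOLO, which is consistent with the 2% mAP50 gain stated in the abstract. The dictionary keys and the helper `delta` are illustrative names, not taken from the paper.

```python
# Rows A and H of Table 1. Parameters are in millions (Np/10^6) and
# computation in GFLOPs (FLOPs/10^9). The A = baseline, H = DD-YOLO
# reading is an assumption.
row_a = {"p": 90.5, "r": 82.5, "map50": 90.2, "map50_95": 64.0,
         "params_m": 11.46, "gflops": 8.1}
row_h = {"p": 92.0, "r": 85.0, "map50": 92.2, "map50_95": 67.4,
         "params_m": 13.94, "gflops": 9.8}


def delta(base: dict, new: dict) -> dict:
    """Absolute change of every metric from `base` to `new`."""
    return {key: round(new[key] - base[key], 2) for key in base}


gain = delta(row_a, row_h)
# Relative growth in model size and computation, in percent.
params_growth = round(100 * gain["params_m"] / row_a["params_m"], 1)
flops_growth = round(100 * gain["gflops"] / row_a["gflops"], 1)

print(gain)
print(params_growth, flops_growth)
```

Under this reading, the 2.0-point mAP50 gain and the 3.4-point mAP50-95 gain come with about 21.6% more parameters and about 21.0% more FLOPs than row A.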
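| Number of C2f-DD modules | p/% | r/% | mAP50/% | mAP50-95/% | Np/10^6 | FLOPs/10^9 |
|--------------------------|------|------|---------|------------|---------|------------|
| 1 | 92.0 | 85.0 | 92.2 | 67.4 | 13.94 | 9.8 |
| 2 | 91.2 | 84.5 | 91.1 | 66.6 | 20.36 | 11.7 |
| 3 | 89.4 | 84.6 | 88.6 | 53.4 | 25.80 | 13.0 |
| 4 | 91.9 | 84.9 | 92.4 | 67.8 | 30.10 | 28.4 |

Table 2  Ablation experiments of C2f-DD on HRSID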
Fig. 9  Visual comparison of the YOLOv8 and DD-YOLO models
Fig. 10  Heat map comparison of the YOLOv8 and DD-YOLO models
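| Method | p/% | r/% | mAP50/% | Np/10^6 | FLOPs/10^9 |
|--------|------|------|---------|---------|------------|
| FBR-Net[15] | 92.4 | 93.1 | 94.2 | 31.30 | 29.4 |
| TWC-Net[16] | 91.2 | 95.1 | 94.1 | 26.36 | 20.8 |
| DCMSNN[17] | 90.3 | 83.5 | 90.3 | 20.70 | 21.6 |
| TOOD[18] | 83.1 | 93.1 | 97.1 | 72.23 | 38.8 |
| Key-Point Estimation[19] | 94.8 | 95.1 | 97.7 | 73.30 | 49.6 |
| ADERLNet-CW[20] | 98.1 | 95.4 | 98.3 | 38.20 | 105.2 |
| DD-YOLO | 98.5 | 93.3 | 98.5 | 13.94 | 9.8 |

Table 3  Comparison of different object detection models on the SSDD dataset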
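
One way to read Table 3 is as a trade-off between accuracy and cost. The sketch below checks which models are Pareto-optimal when mAP50 is to be maximized and parameter count and FLOPs are to be minimized. It uses only the mAP50, Np and FLOPs columns, so precision and recall (where some methods score higher than DD-YOLO) are left out. The names `dominates` and `pareto_front` are illustrative, and this analysis is not part of the paper.

```python
from typing import Dict, List, Tuple

# (mAP50 in %, parameters in millions, GFLOPs), copied from Table 3.
Metrics = Tuple[float, float, float]
ssdd: Dict[str, Metrics] = {
    "FBR-Net": (94.2, 31.30, 29.4),
    "TWC-Net": (94.1, 26.36, 20.8),
    "DCMSNN": (90.3, 20.70, 21.6),
    "TOOD": (97.1, 72.23, 38.8),
    "Key-Point Estimation": (97.7, 73.30, 49.6),
    "ADERLNet-CW": (98.3, 38.20, 105.2),
    "DD-YOLO": (98.5, 13.94, 9.8),
}


def dominates(a: Metrics, b: Metrics) -> bool:
    """True if `a` is no worse than `b` on every axis and better on at least one."""
    no_worse = a[0] >= b[0] and a[1] <= b[1] and a[2] <= b[2]
    better = a[0] > b[0] or a[1] < b[1] or a[2] < b[2]
    return no_worse and better


def pareto_front(models: Dict[str, Metrics]) -> List[str]:
    """Names of the models that no other model dominates."""
    return sorted(
        name
        for name, metrics in models.items()
        if not any(dominates(other, metrics)
                   for other_name, other in models.items() if other_name != name)
    )


print(pareto_front(ssdd))
```

On these three columns DD-YOLO is the only non-dominated entry, since it has the highest mAP50 together with the fewest parameters and FLOPs in the table.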
Fig. 11  Visual comparison on the SSDD dataset
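| Method | p/% | r/% | mAP50/% | Np/10^6 | FLOPs/10^9 |
|--------|------|------|---------|---------|------------|
| RetinaNet[21] | 88.8 | 77.5 | 78.2 | 60.1 | 181.9 |
| CenterNet[22] | 84.6 | 93.1 | 84.5 | 55.1 | 175.4 |
| DAPN[23] | 88.9 | 77.1 | 86.1 | 63.8 | 266.1 |
| CoAM+RFIM[24] | 91.9 | 84.5 | 92.0 | 27.7 | 123.6 |
| PPA-Net[25] | 93.4 | 89.1 | 92.0 | 72.2 | 38.8 |
| Context-aware network[26] | 93.8 | 88.8 | 92.1 | 70.4 | 144.5 |
| DD-YOLO | 92.0 | 85.0 | 92.2 | 13.9 | 9.8 |

Table 4  Comparison of different object detection models on the HRSID dataset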
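
The abstract's statement that model complexity is lower than that of mainstream algorithms can be quantified from Table 4. The sketch below expresses each compared model's parameter count and FLOPs as a multiple of DD-YOLO's. The values are read from the table with one decimal place for Np, and the helper name `complexity_ratios` is illustrative.

```python
from typing import Dict, Tuple

# (mAP50 in %, parameters in millions, GFLOPs), as read from Table 4.
hrsid: Dict[str, Tuple[float, float, float]] = {
    "RetinaNet": (78.2, 60.1, 181.9),
    "CenterNet": (84.5, 55.1, 175.4),
    "DAPN": (86.1, 63.8, 266.1),
    "CoAM+RFIM": (92.0, 27.7, 123.6),
    "PPA-Net": (92.0, 72.2, 38.8),
    "Context-aware network": (92.1, 70.4, 144.5),
    "DD-YOLO": (92.2, 13.9, 9.8),
}


def complexity_ratios(models: Dict[str, Tuple[float, float, float]],
                      ref: str = "DD-YOLO") -> Dict[str, Tuple[float, float]]:
    """(parameter ratio, FLOPs ratio) of every other model relative to `ref`."""
    _, ref_params, ref_flops = models[ref]
    return {
        name: (round(params / ref_params, 2), round(flops / ref_flops, 2))
        for name, (_, params, flops) in models.items()
        if name != ref
    }


ratios = complexity_ratios(hrsid)
for name, (param_ratio, flops_ratio) in ratios.items():
    print(f"{name}: {param_ratio}x parameters, {flops_ratio}x FLOPs")
```

With these values every compared model has roughly 2 to 5 times the parameters and roughly 4 to 27 times the FLOPs of DD-YOLO, while none reaches a higher mAP50 on HRSID.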
Fig. 12  Visual comparison on the HRSID dataset
1 YANG Z, LAI Y, ZHOU H, et al. Improving ship detection based on decision tree classification for high frequency surface wave radar [J]. Journal of Marine Science and Engineering, 2023, 11(3): 493-499. doi: 10.3390/jmse11030493
2 ZHANG X, FENG S, ZHAO C, et al. MGSFA-Net: multiscale global scattering feature association network for SAR ship target recognition [J]. IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, 2024, 17(1): 4611-4625.
3 GIRSHICK R, DONAHUE J, DARRELL T, et al. Rich feature hierarchies for accurate object detection and semantic segmentation [C]// Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Columbus: IEEE, 2014: 580-587.
4 REDMON J, DIVVALA S, GIRSHICK R, et al. You only look once: unified, real-time object detection [C]// Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Las Vegas: IEEE, 2016: 779-788.
5 LI N, YE X D, WANG H, et al. SAR image ship detection in complex scenarios using modified YOLOv5 [J]. Journal of Signal Processing, 2022, 38(5): 1009-1018.
6 TANG H, GAO S, LI S, et al. A lightweight SAR image ship detection method based on improved convolution and YOLOv7 [J]. Remote Sensing, 2024, 16(3): 486. doi: 10.3390/rs16030486
7 ZHANG L, FANG N, YANG X, et al. MSFA-YOLO: a multi-scale SAR ship detection algorithm based on fused attention [J]. IEEE Access, 2024, 12(1): 24554-24568.
8 ZHAO S, TAO R, JIA F. DML-YOLOv8-SAR image object detection algorithm [J]. Signal, Image and Video Processing, 2024, 18(10): 6911-6923. doi: 10.1007/s11760-024-03361-4
9 WEI H, LIU X, XU S, et al. DWRSeg: rethinking efficient acquisition of multi-scale contextual information for real-time semantic segmentation [EB/OL]. [2024-10-15]. https://arxiv.org/abs/2212.01173.
10 XU W, LONG C, WANG R, et al. DRB-GAN: a dynamic Resblock generative adversarial network for artistic style transfer [C]// Proceedings of the IEEE/CVF International Conference on Computer Vision. Montreal: IEEE, 2021: 6383-6392.
11 XIA Z, PAN X, SONG S, et al. Vision Transformer with deformable attention [C]// Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. New Orleans: IEEE, 2022: 4794-4803.
12 LIU W, LU H, FU H, et al. Learning to upsample by learning to sample [C]// Proceedings of the IEEE/CVF International Conference on Computer Vision. Paris: IEEE, 2023: 6027-6037.
13 ZHANG T, ZHANG X, LI J, et al. SAR ship detection dataset (SSDD): official release and comprehensive data analysis [J]. Remote Sensing, 2021, 13(18): 3690-3710. doi: 10.3390/rs13183690
14 WEI S, ZENG X, QU Q, et al. HRSID: a high-resolution SAR images dataset for ship detection and instance segmentation [J]. IEEE Access, 2020, 8(1): 120234-120254.
15 FU J, SUN X, WANG Z, et al. An anchor-free method based on feature balancing and refinement network for multiscale ship detection in SAR images [J]. IEEE Transactions on Geoscience and Remote Sensing, 2021, 59(2): 1331-1344. doi: 10.1109/TGRS.2020.3005151
16 YU L, WU H, ZHONG Z, et al. TWC-Net: a SAR ship detection using two-way convolution and multiscale feature mapping [J]. Remote Sensing, 2021, 13(13): 2558. doi: 10.3390/rs13132558
17 JIAO J, ZHANG Y, SUN H, et al. A densely connected end-to-end neural network for multiscale and multiscene SAR ship detection [J]. IEEE Access, 2018, 6: 20881-20892.
18 FENG C, ZHONG Y, GAO Y, et al. TOOD: task-aligned one-stage object detection [C]// IEEE/CVF International Conference on Computer Vision. Montreal: IEEE, 2021: 3490-3499.
19 MA X, HOU S, WANG Y, et al. Multiscale and dense ship detection in SAR images based on key-point estimation and attention mechanism [J]. IEEE Transactions on Geoscience and Remote Sensing, 2022, 60: 1-11.
20 YANG Y, MA C, LI G, et al. ADERLNet: adaptive denoising enhancement representation learning for low-latency and high-accurate target detection on SAR sensors [J]. IEEE Sensors Journal, 2024, 24(5): 6430-6450.
21 LIN T, GOYAL P, GIRSHICK R, et al. Focal loss for dense object detection [C]// IEEE/CVF International Conference on Computer Vision. Venice: IEEE, 2017: 2980-2988.
22 ZHOU X, WANG D, KRÄHENBÜHL P. Objects as points [J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2021, 43(12): 4234-4242.
23 CUI Z, LI Q, CAO Z, et al. Dense attention pyramid networks for multi-scale ship detection in SAR images [J]. IEEE Transactions on Geoscience and Remote Sensing, 2019, 57(11): 8983-8997. doi: 10.1109/TGRS.2019.2923988
24 YANG X, ZHANG X, WANG N, et al. A robust one-stage detector for multiscale ship detection with complex background in massive SAR images [J]. IEEE Transactions on Geoscience and Remote Sensing, 2021, 60: 1-12.
25 TANG G, ZHAO H, CLARAMUNT C, et al. PPA-Net: pyramid pooling attention network for multi-scale ship detection in SAR images [J]. Remote Sensing, 2023, 15(11): 2855. doi: 10.3390/rs15112855