Journal of ZheJiang University (Engineering Science)  2025, Vol. 59 Issue (11): 2379-2388    DOI: 10.3785/j.issn.1008-973X.2025.11.017
    
Ship target detection algorithm based on improved YOLOv8
Lin DUO, Yu YIN, Wei DUAN, Yun ZHANG, Yong REN
School of Information Engineering and Automation, Kunming University of Science and Technology, Kunming 650500, China

Abstract  

DD-YOLO, an improved ship target detection algorithm based on YOLOv8, is proposed to address small target size, large scale variation and complex background noise in ship detection from synthetic aperture radar (SAR) images. In the backbone network, an enhanced C2f module strengthens multi-scale feature extraction and fusion, and a newly designed SPA module improves the propagation of gradient-flow information; together they significantly improve multi-scale target detection. In the neck network, a more lightweight dynamic upsampling approach reduces computational overhead and model complexity while improving the recognition of small ships in complex backgrounds. In the detection head, a multi-dimensional attention mechanism is integrated and then made lightweight, which increases the model's sensitivity to key features in complex backgrounds and thereby raises detection accuracy. Experiments on two public datasets, HRSID and SSDD, show that DD-YOLO achieves mAP50 scores of 92.2% and 98.5%, respectively, improvements of 2.0 and 2.2 percentage points over the baseline model. Its model complexity is significantly lower than that of mainstream algorithms, giving a good balance between accuracy and efficiency.
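The mAP50 figures quoted in the abstract count a detection as correct when its box overlaps a ground-truth box with an intersection over union (IoU) of at least 0.5. For a single-class task such as ship detection, mAP50 reduces to the average precision (AP) of that one class. The following is a minimal sketch of that computation for one image, assuming greedy matching in descending score order and all-point interpolation of the precision-recall curve; the function names `iou` and `ap50` are illustrative and are not taken from the paper, whose exact evaluation protocol may differ.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2), hypothetical layout


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def ap50(dets: List[Tuple[float, Box]], gts: List[Box]) -> float:
    """AP at IoU >= 0.5 for one class; dets are (score, box) pairs."""
    if not gts:
        return 0.0
    matched = set()
    tp_flags = []
    # Visit detections from most to least confident.
    for _, box in sorted(dets, key=lambda d: d[0], reverse=True):
        best_j, best_iou = -1, 0.0
        for j, gt in enumerate(gts):
            if j in matched:
                continue  # each ground-truth box can be matched only once
            v = iou(box, gt)
            if v > best_iou:
                best_j, best_iou = j, v
        is_tp = best_iou >= 0.5
        if is_tp:
            matched.add(best_j)
        tp_flags.append(is_tp)
    # Running precision and recall after each detection.
    precisions, recalls, tp = [], [], 0
    for k, flag in enumerate(tp_flags, start=1):
        tp += int(flag)
        precisions.append(tp / k)
        recalls.append(tp / len(gts))
    # Make precision non-increasing in recall (precision envelope).
    for k in range(len(precisions) - 2, -1, -1):
        precisions[k] = max(precisions[k], precisions[k + 1])
    # Area under the precision-recall curve.
    ap, prev_r = 0.0, 0.0
    for p, r in zip(precisions, recalls):
        ap += (r - prev_r) * p
        prev_r = r
    return ap
```

In a dataset-level evaluation the detections from all images would be pooled before the precision-recall curve is built; the single-image version here only illustrates the matching rule and the integration step.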



Key words: synthetic aperture radar (SAR)      ship detection      deep learning      YOLOv8      multiscale feature fusion
Received: 01 November 2024      Published: 30 October 2025
CLC:  TP 391  
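Fund:  Major Science and Technology Special Program of the Yunnan Provincial Department of Science and Technology (202302AD080006); Yunnan Key Laboratory of Media Convergence (220245201).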
Cite this article:

Lin DUO,Yu YIN,Wei DUAN,Yun ZHANG,Yong REN. Ship target detection algorithm based on improved YOLOv8. Journal of ZheJiang University (Engineering Science), 2025, 59(11): 2379-2388.

URL:

https://www.zjujournals.com/eng/10.3785/j.issn.1008-973X.2025.11.017     OR     https://www.zjujournals.com/eng/Y2025/V59/I11/2379


Fig.1 Structure of DD-YOLO network
Fig.2 C2f-DD module
Fig.3 Dilated reparam block module
Fig.4 DWR_DRB module
Fig.5 Deformable attention module
Fig.6 Dynamic upsampling
Fig.7 Sampling point generator
Fig.8 Structure of DyHead-Prune
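The dynamic upsampling named in Fig.6 and Fig.7 presumably follows the point-sampling idea of [12]: each output pixel reads the input feature map at a regular grid position shifted by an offset, and a sampling point generator predicts those offsets from the features. The sketch below is a pure-Python illustration of the sampling step only, under stated assumptions: the offsets are passed in as an argument rather than predicted by a learned layer, the feature map has a single channel, and the names `bilinear` and `sample_upsample` are hypothetical. It is not the authors' implementation.

```python
import math
from typing import List, Tuple

Grid = List[List[float]]


def bilinear(feat: Grid, y: float, x: float) -> float:
    """Bilinearly interpolate feat at continuous (y, x), clamping to the border."""
    h, w = len(feat), len(feat[0])
    y = min(max(y, 0.0), h - 1.0)
    x = min(max(x, 0.0), w - 1.0)
    y0, x0 = int(math.floor(y)), int(math.floor(x))
    y1, x1 = min(y0 + 1, h - 1), min(x0 + 1, w - 1)
    dy, dx = y - y0, x - x0
    top = feat[y0][x0] * (1 - dx) + feat[y0][x1] * dx
    bot = feat[y1][x0] * (1 - dx) + feat[y1][x1] * dx
    return top * (1 - dy) + bot * dy


def sample_upsample(feat: Grid,
                    offsets: List[List[Tuple[float, float]]],
                    scale: int = 2) -> Grid:
    """Upsample feat by `scale`, sampling at grid positions shifted by offsets.

    offsets[i][j] is a (dy, dx) shift in input-pixel units for output pixel
    (i, j). In a learned upsampler these shifts would come from a small
    network; here they are plain inputs (an assumption of this sketch).
    """
    h, w = len(feat), len(feat[0])
    out = []
    for i in range(h * scale):
        row = []
        for j in range(w * scale):
            # Centre of the output pixel expressed in input coordinates.
            base_y = (i + 0.5) / scale - 0.5
            base_x = (j + 0.5) / scale - 0.5
            dy, dx = offsets[i][j]
            row.append(bilinear(feat, base_y + dy, base_x + dx))
        out.append(row)
    return out
```

With all offsets set to zero the function behaves as ordinary bilinear upsampling, so the offsets can be read as a content-dependent correction to a fixed interpolation grid.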
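| No. | YOLOv8 | C2f-DD | SPA | DySample | Detect | p/% | r/% | mAP50/% | mAP50-95/% | Np/10^6 | FLOPs/10^9 |
|---|---|---|---|---|---|---|---|---|---|---|---|
| A | | | | | | 90.5 | 82.5 | 90.2 | 64.0 | 11.46 | 8.1 |
| B | | | | | | 91.1 | 83.6 | 91.0 | 64.8 | 12.27 | 8.0 |
| C | | | | | | 91.1 | 83.9 | 90.7 | 64.7 | 12.49 | 8.3 |
| D | | | | | | 90.4 | 83.8 | 90.6 | 64.1 | 11.42 | 8.1 |
| E | | | | | | 91.0 | 83.9 | 91.4 | 66.1 | 13.30 | 9.6 |
| F | | | | | | 91.1 | 83.9 | 91.6 | 66.4 | 13.24 | 9.6 |
| G | | | | | | 91.9 | 84.0 | 92.0 | 66.7 | 13.42 | 9.6 |
| H | | | | | | 92.0 | 85.0 | 92.2 | 67.4 | 13.94 | 9.8 |
Tab.1 DD-YOLO ablation experiment on HRSID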
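The gain of each ablation configuration over the first row of Tab.1 can be read off with a few subtractions. The sketch below transcribes the six metric columns of Tab.1 and reports differences in percentage points (accuracy columns) and in absolute units (parameters, FLOPs). It assumes that row A is the YOLOv8 baseline and row H the full DD-YOLO model; row H matches the DD-YOLO row of Tab.4, but the table as reproduced here does not state which components each row enables. The helper name `gain` is illustrative.

```python
# Columns: p/%, r/%, mAP50/%, mAP50-95/%, Np/10^6, FLOPs/10^9 (from Tab.1).
COLUMNS = ("p", "r", "mAP50", "mAP50-95", "params_M", "FLOPs_G")

ROWS = {
    "A": (90.5, 82.5, 90.2, 64.0, 11.46, 8.1),
    "B": (91.1, 83.6, 91.0, 64.8, 12.27, 8.0),
    "C": (91.1, 83.9, 90.7, 64.7, 12.49, 8.3),
    "D": (90.4, 83.8, 90.6, 64.1, 11.42, 8.1),
    "E": (91.0, 83.9, 91.4, 66.1, 13.30, 9.6),
    "F": (91.1, 83.9, 91.6, 66.4, 13.24, 9.6),
    "G": (91.9, 84.0, 92.0, 66.7, 13.42, 9.6),
    "H": (92.0, 85.0, 92.2, 67.4, 13.94, 9.8),
}


def gain(config: str, baseline: str = "A") -> dict:
    """Difference of every column between `config` and `baseline`."""
    return {
        name: round(value - base, 2)
        for name, value, base in zip(COLUMNS, ROWS[config], ROWS[baseline])
    }


if __name__ == "__main__":
    for key in ROWS:
        print(key, gain(key))
```

Under that assumption, row H gains 2.0 points of mAP50 over row A, which agrees with the HRSID improvement quoted in the abstract, at a cost of about 2.48 million extra parameters and 1.7 billion extra FLOPs.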
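| Number of C2f-DD modules | p/% | r/% | mAP50/% | mAP50-95/% | Np/10^6 | FLOPs/10^9 |
|---|---|---|---|---|---|---|
| 1 | 92.0 | 85.0 | 92.2 | 67.4 | 13.94 | 9.8 |
| 2 | 91.2 | 84.5 | 91.1 | 66.6 | 20.36 | 11.7 |
| 3 | 89.4 | 84.6 | 88.6 | 53.4 | 25.80 | 13.0 |
| 4 | 91.9 | 84.9 | 92.4 | 67.8 | 30.10 | 28.4 |
Tab.2 C2f-DD ablation experiment on HRSID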
Fig.9 Visualization comparison between YOLOv8 and DD-YOLO models
Fig.10 Comparison of heatmaps of YOLOv8 and DD-YOLO models
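| Method | p/% | r/% | mAP50/% | Np/10^6 | FLOPs/10^9 |
|---|---|---|---|---|---|
| FBR-Net[15] | 92.4 | 93.1 | 94.2 | 31.30 | 29.4 |
| TWC-Net[16] | 91.2 | 95.1 | 94.1 | 26.36 | 20.8 |
| DCMSNN[17] | 90.3 | 83.5 | 90.3 | 20.70 | 21.6 |
| TOOD[18] | 83.1 | 93.1 | 97.1 | 72.23 | 38.8 |
| Key-Point Estimation[19] | 94.8 | 95.1 | 97.7 | 73.30 | 49.6 |
| ADERLNet-CW[20] | 98.1 | 95.4 | 98.3 | 38.20 | 105.2 |
| DD-YOLO | 98.5 | 93.3 | 98.5 | 13.94 | 9.8 |
Tab.3 Comparison of different object detection models on the SSDD dataset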
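The claim that DD-YOLO is less complex than the compared methods can be checked directly from the last two columns of Tab.3. The sketch below expresses each method's parameter count and FLOPs as a multiple of the DD-YOLO values. The numbers are transcribed from Tab.3; the dictionary and the helper `relative_cost` are illustrative names, not part of the paper.

```python
# (parameters in millions, FLOPs in billions), transcribed from Tab.3.
MODELS = {
    "FBR-Net": (31.30, 29.4),
    "TWC-Net": (26.36, 20.8),
    "DCMSNN": (20.70, 21.6),
    "TOOD": (72.23, 38.8),
    "Key-Point Estimation": (73.30, 49.6),
    "ADERLNet-CW": (38.20, 105.2),
    "DD-YOLO": (13.94, 9.8),
}


def relative_cost(name: str, reference: str = "DD-YOLO") -> tuple:
    """Return (parameter ratio, FLOPs ratio) of `name` relative to `reference`."""
    params, flops = MODELS[name]
    ref_params, ref_flops = MODELS[reference]
    return round(params / ref_params, 2), round(flops / ref_flops, 2)


if __name__ == "__main__":
    for model in MODELS:
        p_ratio, f_ratio = relative_cost(model)
        print(f"{model}: {p_ratio}x parameters, {f_ratio}x FLOPs")
```

Every compared method in Tab.3 has a ratio above 1 on both axes, so on SSDD DD-YOLO is the smallest and cheapest model in the table while also reporting the highest mAP50.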
Fig.11 Visualization comparison of SSDD dataset
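| Method | p/% | r/% | mAP50/% | Np/10^6 | FLOPs/10^9 |
|---|---|---|---|---|---|
| RetinaNet[21] | 88.8 | 77.5 | 78.2 | 60.1 | 181.9 |
| CenterNet[22] | 84.6 | 93.1 | 84.5 | 55.1 | 175.4 |
| DAPN[23] | 88.9 | 77.1 | 86.1 | 63.8 | 266.1 |
| CoAM+RFIM[24] | 91.9 | 84.5 | 92.0 | 27.7 | 123.6 |
| PPA-Net[25] | 93.4 | 89.1 | 92.0 | 72.2 | 38.8 |
| Context-aware network[26] | 93.8 | 88.8 | 92.1 | 70.4 | 144.5 |
| DD-YOLO | 92.0 | 85.0 | 92.2 | 13.9 | 9.8 |
Tab.4 Comparison of different object detection models on the HRSID dataset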
Fig.12 Visualization comparison of HRSID dataset
[1]   YANG Z, LAI Y, ZHOU H, et al. Improving ship detection based on decision tree classification for high frequency surface wave radar [J]. Journal of Marine Science and Engineering, 2023, 11(3): 493-499. doi: 10.3390/jmse11030493
[2]   ZHANG X, FENG S, ZHAO C, et al. MGSFA-Net: multiscale global scattering feature association network for SAR ship target recognition [J]. IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, 2024, 17(1): 4611-4625.
[3]   GIRSHICK R, DONAHUE J, DARRELL T, et al. Rich feature hierarchies for accurate object detection and semantic segmentation [C]// Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Columbus: IEEE, 2014: 580-587.
[4]   REDMON J, DIVVALA S, GIRSHICK R, et al. You only look once: unified, real-time object detection [C]// Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Las Vegas: IEEE, 2016: 779-788.
[5]   LI N, YE X D, WANG H, et al. SAR image ship detection in complex scenarios using modified YOLOv5 [J]. Journal of Signal Processing, 2022, 38(5): 1009-1018.
[6]   TANG H, GAO S, LI S, et al. A lightweight SAR image ship detection method based on improved convolution and YOLOv7 [J]. Remote Sensing, 2024, 16(3): 486. doi: 10.3390/rs16030486
[7]   ZHANG L, FANG N, YANG X, et al. MSFA-YOLO: a multi-scale SAR ship detection algorithm based on fused attention [J]. IEEE Access, 2024, 12(1): 24554-24568.
[8]   ZHAO S, TAO R, JIA F. DML-YOLOv8-SAR image object detection algorithm [J]. Signal, Image and Video Processing, 2024, 18(10): 6911-6923. doi: 10.1007/s11760-024-03361-4
[9]   WEI H, LIU X, XU S, et al. DWRSeg: rethinking efficient acquisition of multi-scale contextual information for real-time semantic segmentation [EB/OL]. [2024-10-15]. https://arxiv.org/abs/2212.01173.
[10]   XU W, LONG C, WANG R, et al. DRB-GAN: a dynamic Resblock generative adversarial network for artistic style transfer [C]// Proceedings of the IEEE/CVF International Conference on Computer Vision. Montreal: IEEE, 2021: 6383-6392.
[11]   XIA Z, PAN X, SONG S, et al. Vision Transformer with deformable attention [C]// Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. New Orleans: IEEE, 2022: 4794-4803.
[12]   LIU W, LU H, FU H, et al. Learning to upsample by learning to sample [C]// Proceedings of the IEEE/CVF International Conference on Computer Vision. Paris: IEEE, 2023: 6027-6037.
[13]   ZHANG T, ZHANG X, LI J, et al. SAR ship detection dataset (SSDD): official release and comprehensive data analysis [J]. Remote Sensing, 2021, 13(18): 3690-3710. doi: 10.3390/rs13183690
[14]   WEI S, ZENG X, QU Q, et al. HRSID: a high-resolution SAR images dataset for ship detection and instance segmentation [J]. IEEE Access, 2020, 8(1): 120234-120254.
[15]   FU J, SUN X, WANG Z, et al. An anchor-free method based on feature balancing and refinement network for multiscale ship detection in SAR images [J]. IEEE Transactions on Geoscience and Remote Sensing, 2021, 59(2): 1331-1344. doi: 10.1109/TGRS.2020.3005151
[16]   YU L, WU H, ZHONG Z, et al. TWC-Net: a SAR ship detection using two-way convolution and multiscale feature mapping [J]. Remote Sensing, 2021, 13(13): 2558. doi: 10.3390/rs13132558
[17]   JIAO J, ZHANG Y, SUN H, et al. A densely connected end-to-end neural network for multiscale and multiscene SAR ship detection [J]. IEEE Access, 2018, 6: 20881-20892.
[18]   FENG C, ZHONG Y, GAO Y, et al. TOOD: task-aligned one-stage object detection [C]// IEEE/CVF International Conference on Computer Vision. Montreal: IEEE, 2021: 3490-3499.
[19]   MA X, HOU S, WANG Y, et al. Multiscale and dense ship detection in SAR images based on key-point estimation and attention mechanism [J]. IEEE Transactions on Geoscience and Remote Sensing, 2022, 60: 1-11.
[20]   YANG Y, MA C, LI G, et al. ADERLNet: adaptive denoising enhancement representation learning for low-latency and high-accurate target detection on SAR sensors [J]. IEEE Sensors Journal, 2024, 24(5): 6430-6450.
[21]   LIN T, GOYAL P, GIRSHICK R, et al. Focal loss for dense object detection [C]// IEEE/CVF International Conference on Computer Vision. Venice: IEEE, 2017: 2980-2988.
[22]   ZHOU X, WANG D, KRÄHENBÜHL P. Objects as points [J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2021, 43(12): 4234-4242.
[23]   CUI Z, LI Q, CAO Z, et al. Dense attention pyramid networks for multi-scale ship detection in SAR images [J]. IEEE Transactions on Geoscience and Remote Sensing, 2019, 57(11): 8983-8997. doi: 10.1109/TGRS.2019.2923988
[24]   YANG X, ZHANG X, WANG N, et al. A robust one-stage detector for multiscale ship detection with complex background in massive SAR images [J]. IEEE Transactions on Geoscience and Remote Sensing, 2021, 60: 1-12.
[25]   TANG G, ZHAO H, CLARAMUNT C, et al. PPA-Net: pyramid pooling attention network for multi-scale ship detection in SAR images [J]. Remote Sensing, 2023, 15(11): 2855. doi: 10.3390/rs15112855