1. School of Shipping and Naval Architecture, Chongqing Jiaotong University, Chongqing 400074, China 2. College of Traffic and Transportation, Chongqing Jiaotong University, Chongqing 400074, China
A ship object detection algorithm based on the multi-head self-attention (MHSA) mechanism and the YOLO network (MHSA-YOLO) was proposed to address the complex backgrounds, large inter-class scale differences and many small objects found in inland rivers and ports. In the feature extraction process, a parallel self-attention residual module (PARM) based on MHSA was designed to weaken the interference of complex background information and strengthen the feature information of ship objects. In the feature fusion process, a simplified two-way feature pyramid was developed to strengthen feature fusion and representation ability. Experimental results on the SeaShips dataset showed that MHSA-YOLO had better learning ability, achieved a mean average precision of 97.59% in object detection, and was more effective than the state-of-the-art object detection methods it was compared with. Experimental results on a self-made dataset showed that MHSA-YOLO generalized well, reaching a mean average precision of 92.4% against 86.1% for YOLOv5s.
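The multi-head self-attention operation that PARM builds on can be illustrated with a minimal sketch. This is not the authors' implementation: the toy feature map, the identity projection matrices, the omission of the output projection and positional encoding, and the helper names (`softmax`, `matmul`, `multi_head_self_attention`, `identity`) are all assumptions made for illustration. Each spatial position of a feature map is treated as a token, the channels are split across heads, and every head computes scaled dot-product attention over all positions; the result is added back to the input as a residual connection.

```python
import math

def softmax(row):
    # Numerically stable softmax over one list of scores.
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]

def matmul(a, b):
    # Plain (n x k) @ (k x m) matrix product on nested lists.
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def multi_head_self_attention(tokens, w_q, w_k, w_v, num_heads):
    """tokens: N vectors of length d (flattened H*W feature-map positions)."""
    d = len(tokens[0])
    assert d % num_heads == 0, "channels must divide evenly across heads"
    d_head = d // num_heads
    q, k, v = matmul(tokens, w_q), matmul(tokens, w_k), matmul(tokens, w_v)
    out = [[] for _ in tokens]
    for h in range(num_heads):
        lo, hi = h * d_head, (h + 1) * d_head
        qh = [row[lo:hi] for row in q]
        kh = [row[lo:hi] for row in k]
        vh = [row[lo:hi] for row in v]
        for i, qi in enumerate(qh):
            # Scaled dot-product scores of position i against every position.
            scores = [sum(a * b for a, b in zip(qi, kj)) / math.sqrt(d_head) for kj in kh]
            weights = softmax(scores)
            # Weighted sum of value vectors; heads are concatenated channel-wise.
            out[i].extend(sum(w * vj[c] for w, vj in zip(weights, vh)) for c in range(d_head))
    return out

def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

# Hypothetical 2x2 feature map with 4 channels, flattened to 4 tokens.
feature_map = [
    [1.0, 0.0, 0.5, 0.2],
    [0.9, 0.1, 0.4, 0.3],
    [0.0, 1.0, 0.1, 0.8],
    [0.1, 0.9, 0.2, 0.7],
]
eye = identity(4)  # assumption: identity projections stand in for learned weights
attended = multi_head_self_attention(feature_map, eye, eye, eye, num_heads=2)

# Residual connection: attention output is added to the input features.
fused = [[x + a for x, a in zip(row_in, row_att)]
         for row_in, row_att in zip(feature_map, attended)]
print(fused)
```

Because every output position is a weighted average over all positions, attention lets distant regions of the image influence each feature, which is the property the abstract relies on for suppressing background interference.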
Fig.1 Overall structure of ship object detection algorithm based on multi-head self-attention mechanism and YOLO network
Fig.2 Structure of self-attention layer
Fig.3 Structure of parallel self-attention residual module
Fig.4 Structure of path aggregation network
Fig.5 Structure of simplified two-way feature pyramid
Fig.6 Total loss curves of two algorithms
Algorithm                  pma/%   F1     Fps/(frames·s^-1)   M/MB   FLOPS/10^9   Par
YOLOv5                     96.30   0.86   26                  13.7   16.4         7 277 027
YOLOv5 + simplified FPN    96.73   0.91   25                  15.7   17.8         8 145 079
YOLOv5 + PARM              96.51   0.93   27                  13.1   16.1         6 760 099
MHSA-YOLO                  97.59   0.94   25                  15.1   17.5         7 828 151
Tab.1 Comparison of ablation experimental results of algorithm performance
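The ablation results are easier to read as changes relative to the YOLOv5 baseline. The sketch below recomputes those differences from the values in Tab.1; the dictionary layout and the function name `delta_vs_baseline` are illustrative assumptions, and the column M/MB is assumed to denote model size. For the full MHSA-YOLO model the mean average precision rises by 1.29 points and F1 by 0.08, at the cost of 1 frame per second and about 7.6% more parameters, while PARM alone reduces the parameter count by about 7.1%.

```python
# Values copied from Tab.1:
# (pma %, F1, frames per second, model size MB (assumed), FLOPS/10^9, parameters)
ablation = {
    "YOLOv5": (96.30, 0.86, 26, 13.7, 16.4, 7_277_027),
    "YOLOv5 + simplified FPN": (96.73, 0.91, 25, 15.7, 17.8, 8_145_079),
    "YOLOv5 + PARM": (96.51, 0.93, 27, 13.1, 16.1, 6_760_099),
    "MHSA-YOLO": (97.59, 0.94, 25, 15.1, 17.5, 7_828_151),
}

def delta_vs_baseline(name, baseline="YOLOv5"):
    """Differences of one ablation variant against the baseline row."""
    pma, f1, fps, _size, _flops, par = ablation[name]
    b_pma, b_f1, b_fps, _b_size, _b_flops, b_par = ablation[baseline]
    return {
        "pma_gain_points": round(pma - b_pma, 2),
        "f1_gain": round(f1 - b_f1, 2),
        "fps_change": fps - b_fps,
        "param_change_pct": round(100.0 * (par - b_par) / b_par, 2),
    }

for name in ablation:
    if name != "YOLOv5":
        print(name, delta_vs_baseline(name))
```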
Fig.7 Feature visualization comparison before and after introducing parallel self-attention residual module (PARM)
Fig.8 Results of ship detection on SeaShips with different algorithms
                                   pa/%
Algorithm               pma/%   OC      BCC     GCS     CS      FB      PS
Faster(VGG16)[21]       90.12   89.44   90.34   90.73   90.87   88.76   90.57
Faster(ResNet18)[21]    90.63   90.37   89.78   90.45   90.91   87.17   88.93
Faster(ResNet50)[21]    91.65   92.38   90.88   92.46   92.91   89.27   90.93
Faster(ResNet101)[21]   92.40   93.68   90.22   93.87   93.41   89.96   91.78
SSD300(MobileNet)[21]   77.66   64.77   76.69   87.43   90.77   71.00   75.32
SSD300(VGG16)[21]       79.37   75.03   76.66   87.66   90.71   71.79   74.35
SSD512(VGG16)[21]       86.73   83.99   83.00   87.08   90.81   85.85   89.65
YOLOv2 random=0[21]     77.51   83.01   79.36   80.60   88.90   62.70   70.48
YOLOv2 random=1[21]     79.06   83.16   82.07   83.21   88.31   64.74   72.89
YOLOv3[23]              87.00   86.00   86.20   87.10   87.10   88.00   90.00
YOLOv4[23]              90.70   90.80   90.70   90.80   90.90   90.60   90.50
MHSA-YOLO               97.59   98.73   98.42   96.41   96.53   98.51   96.94
Tab.2 Comparison of ship detection results of different convolutional neural networks
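The relation between the two accuracy columns can be checked arithmetically. Assuming that pa denotes the per-class average precision and pma its unweighted mean over the six ship classes, the sketch below recomputes pma for three rows of Tab.2; the variable and function names are illustrative. For these rows the recomputed mean agrees with the reported value to two decimal places.

```python
# Per-class average precision (OC, BCC, GCS, CS, FB, PS) and reported pma, from Tab.2.
rows = {
    "Faster(VGG16)": ([89.44, 90.34, 90.73, 90.87, 88.76, 90.57], 90.12),
    "SSD300(MobileNet)": ([64.77, 76.69, 87.43, 90.77, 71.00, 75.32], 77.66),
    "MHSA-YOLO": ([98.73, 98.42, 96.41, 96.53, 98.51, 96.94], 97.59),
}

def mean_average_precision(per_class_ap):
    """Unweighted mean of the per-class average precisions."""
    return sum(per_class_ap) / len(per_class_ap)

for name, (per_class_ap, reported) in rows.items():
    computed = mean_average_precision(per_class_ap)
    print(f"{name}: computed {computed:.2f}, reported {reported:.2f}")
```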
                       pa/%
Algorithm   pma/%   OC     BCC    GCS    CS     FB     PS
YOLOv5s     86.1    89.8   90.2   82.0   97.9   89.5   67.4
MHSA-YOLO   92.4    94.5   96.6   84.6   99.5   97.5   81.8
Tab.3 Comparison of detection accuracy on self-made dataset
Fig.9 Results of ship detection with different algorithms on self-made dataset based on Singapore maritime dataset
[1]
ZHANG T W, ZHANG X L, SHI J, et al. Depthwise separable convolution neural network for high-speed SAR ship detection[J]. Remote Sensing, 2019, 11 (21): 2483
doi: 10.3390/rs11212483
[2]
ZHANG T W, ZHANG X L. Injection of traditional hand-crafted features into modern CNN-based models for SAR ship classification: what, why, where, and how[J]. Remote Sensing, 2021, 13 (11): 2091
doi: 10.3390/rs13112091
[3]
XU Cheng-ji, WANG Xiao-feng, YANG Ya-dong. Attention-YOLO: YOLO detection algorithm that introduces attention mechanism[J]. Computer Engineering and Applications, 2019, 55 (6): 13-23
[4]
OKSUZ K, CAM B C, KALKAN S, et al. Imbalance problems in object detection: a review[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2021, 43 (10): 3388-3415
doi: 10.1109/TPAMI.2020.2981890
[5]
QI Liang, LI Bang-yu, CHEN Lian-kai. Ship target detection algorithm based on improved Faster R-CNN[J]. Shipbuilding of China, 2020, 61 (Suppl.1): 40-51
doi: 10.3969/j.issn.1000-4882.2020.z1.006
[6]
TANG Li-dan. Research on object detection of USV based on images [D]. Harbin: Harbin Institute of Technology, 2018.
[7]
SHAO Z, WANG L, WANG Z, et al. Saliency-aware convolution neural network for ship detection in surveillance video[J]. IEEE Transactions on Circuits and Systems for Video Technology, 2020, 30 (3): 781-794
doi: 10.1109/TCSVT.2019.2897980
[8]
LI H, DENG L B, YANG C, et al. Enhanced YOLO v3 tiny network for real-time ship detection from visual image[J]. IEEE Access, 2021, 9: 16692-16706
doi: 10.1109/ACCESS.2021.3053956
[9]
GAN Xing-wang, WEI Han-di, XIAO Long-fei, et al. Research on vision-based data fusion algorithm for environment perception of ships[J]. Shipbuilding of China, 2021, 62 (2): 201-210
doi: 10.3969/j.issn.1000-4882.2021.02.018
[10]
FENG Y C, DIAO W H, SUN X, et al. Towards automated ship detection and category recognition from high-resolution aerial images[J]. Remote Sensing, 2019, 11 (16): 1901
doi: 10.3390/rs11161901
[11]
KIM M, JEONG J, KIM S. ECAP-YOLO: efficient channel attention pyramid YOLO for small object detection in aerial image[J]. Remote Sensing, 2021, 13 (23): 4851
doi: 10.3390/rs13234851
[12]
CHEN L Q, SHI W X, DENG D X. Improved YOLOv3 based on attention mechanism for fast and accurate ship detection in optical remote sensing images[J]. Remote Sensing, 2021, 13 (4): 660
doi: 10.3390/rs13040660
[13]
YU J M, ZHOU G Y, ZHOU S B, et al. A fast and lightweight detection network for multi-scale SAR ship detection under complex backgrounds[J]. Remote Sensing, 2021, 14 (1): 31
doi: 10.3390/rs14010031
[14]
LIN Z H, FENG M W, NOGUEIRA C, et al. A structured self-attentive sentence embedding [EB/OL]. [2021-09-17]. https://arxiv.org/abs/1703.03130.pdf.
[15]
VASWANI A, SHAZEER N, PARMAR N, et al. Attention is all you need [C]// Proceedings of the 31st International Conference on Neural Information Processing Systems. Long Beach: NIPS, 2017: 5998-6008.
[16]
SRINIVAS A, LIN T Y, PARMAR N, et al. Bottleneck transformers for visual recognition [C]// 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Nashville: IEEE, 2021: 16514-16524.
[17]
LIN T Y, DOLLÁR P, GIRSHICK R, et al. Feature pyramid networks for object detection [C]// 2017 IEEE Conference on Computer Vision and Pattern Recognition. Honolulu: IEEE, 2017: 937-944.
[18]
LIU S, QI L, QIN H, et al. Path aggregation network for instance segmentation [C]// 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Salt Lake City: IEEE, 2018: 8759-8768.
[19]
ZHANG Y, SHENG W, JIANG J, et al. Priority branches for ship detection in optical remote sensing images[J]. Remote Sensing, 2020, 12 (7): 1196
[20]
TAN M, PANG R, LE Q V. EfficientDet: scalable and efficient object detection [C]// 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Seattle: IEEE, 2020: 10781-10790.
[21]
SHAO Z F, WU W J, WANG Z Y, et al. SeaShips: a large-scale precisely annotated dataset for ship detection[J]. IEEE Transactions on Multimedia, 2018, 20 (10): 2593-2604
doi: 10.1109/TMM.2018.2865686
[22]
ZHENG Z H, WANG P, LIU W, et al. Distance-IoU loss: faster and better learning for bounding box regression [C]// Proceedings of the 34th AAAI Conference on Artificial Intelligence. New York: AAAI, 2020: 12993-13000.
[23]
ZHAO Yu-rong, GUO Hui-ming, JIAO Han, et al. Application of YOLOv4 with mixed-domain attention in ship detection[J]. Computer and Modernization, 2021, (9): 75-82
doi: 10.3969/j.issn.1006-2475.2021.09.012
[24]
LEI S L, LU D D, QIU X L, et al. SRSDD-v1.0: a high-resolution SAR rotation ship detection dataset[J]. Remote Sensing, 2021, 13 (24): 5104
doi: 10.3390/rs13245104
[25]
RODGER M, GUIDA R. Classification-aided SAR and AIS data fusion for space-based maritime surveillance[J]. Remote Sensing, 2020, 13 (1): 104
doi: 10.3390/rs13010104
[26]
LIU J M, CHEN H, WANG Y. Multi-source remote sensing image fusion for ship target detection and recognition[J]. Remote Sensing, 2021, 13 (23): 4852
doi: 10.3390/rs13234852