Journal of ZheJiang University (Engineering Science)  2022, Vol. 56 Issue (12): 2392-2402    DOI: 10.3785/j.issn.1008-973X.2022.12.008
    
Ship detection algorithm in complex backgrounds via multi-head self-attention
Nan-jing YU1, Xiao-biao FAN1, Tian-min DENG2,*, Guo-tao MAO2
1. School of Shipping and Naval Architecture, Chongqing Jiaotong University, Chongqing 400074, China
2. College of Traffic and Transportation, Chongqing Jiaotong University, Chongqing 400074, China

Abstract  

A ship object detection algorithm, MHSA-YOLO, was proposed based on the multi-head self-attention (MHSA) mechanism and the YOLO network to address the complex backgrounds, large inter-class scale differences and numerous small objects typical of inland rivers and ports. In the feature extraction stage, a parallel self-attention residual module (PARM) based on MHSA was designed to suppress interference from complex background information and strengthen the feature information of ship objects. In the feature fusion stage, a simplified two-way feature pyramid was developed to strengthen feature fusion and representation ability. Experimental results on the Seaships dataset showed that MHSA-YOLO had better learning ability than other state-of-the-art object detection methods, achieved a mean average precision of 97.59%, and was more effective at detecting ships in complex backgrounds and small objects. Experimental results on a self-made dataset showed that MHSA-YOLO generalized well.
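As a rough illustration of the mechanism the abstract names, the sketch below applies standard scaled dot-product multi-head self-attention to a convolutional feature map and adds the result back through a residual connection. It is not the authors' PARM: the parallel branch layout, positional encoding and normalization of that module are not described on this page, and the function names, shapes and random weights here are assumptions chosen only for the example.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def multi_head_self_attention(feat, w_q, w_k, w_v, num_heads):
    """Standard MHSA over a (C, H, W) feature map.

    Every spatial position is one token; w_q, w_k, w_v are (C, C)
    projection matrices (hypothetical, randomly initialized below).
    """
    c, h, w = feat.shape
    assert c % num_heads == 0, "channels must split evenly across heads"
    d = c // num_heads
    n = h * w
    tokens = feat.reshape(c, n).T                      # (N, C)

    def split_heads(x):
        # (N, C) -> (heads, N, d)
        return x.reshape(n, num_heads, d).transpose(1, 0, 2)

    q = split_heads(tokens @ w_q)
    k = split_heads(tokens @ w_k)
    v = split_heads(tokens @ w_v)

    scores = q @ k.transpose(0, 2, 1) / np.sqrt(d)     # (heads, N, N)
    attn = softmax(scores, axis=-1)                    # each row sums to 1
    out = attn @ v                                     # (heads, N, d)
    out = out.transpose(1, 0, 2).reshape(n, c)         # merge heads -> (N, C)
    return out.T.reshape(c, h, w), attn

def residual_attention_block(feat, w_q, w_k, w_v, num_heads):
    # Residual connection: the attended features refine, not replace, the input.
    attended, _ = multi_head_self_attention(feat, w_q, w_k, w_v, num_heads)
    return feat + attended

# Toy example: 8 channels on a 4x4 grid, 2 heads.
rng = np.random.default_rng(0)
feat = rng.standard_normal((8, 4, 4))
w_q, w_k, w_v = (rng.standard_normal((8, 8)) * 0.1 for _ in range(3))
refined = residual_attention_block(feat, w_q, w_k, w_v, num_heads=2)
print(refined.shape)
```

Because every position attends to every other position, the attention matrix grows as (H·W)², which is one reason such layers are usually applied to low-resolution feature maps.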



Key words: intelligent navigation, object detection, complex background, self-attention mechanism, multi-scale fusion
Received: 11 January 2022      Published: 03 January 2023
CLC:  TU 675.79  
  TP 391.41  
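Fund:  National Key Research and Development Program of China (SQ2020YFF0418521); Chongqing Technology Innovation and Application Development Special Key Project (cstc2020jscx-dxwtBX0019); Sichuan-Chongqing Jointly Implemented Key Research and Development Project (cstc2020jscx-cylhX0005, cstc2020jscx-cylhX0007)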
Corresponding Authors: Tian-min DENG     E-mail: yunanjing527@163.com;dtianmin@cqjtu.edu.cn
Cite this article:

Nan-jing YU,Xiao-biao FAN,Tian-min DENG,Guo-tao MAO. Ship detection algorithm in complex backgrounds via multi-head self-attention. Journal of ZheJiang University (Engineering Science), 2022, 56(12): 2392-2402.

URL:

https://www.zjujournals.com/eng/10.3785/j.issn.1008-973X.2022.12.008     OR     https://www.zjujournals.com/eng/Y2022/V56/I12/2392


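Ship detection algorithm for complex backgrounds based on multi-head self-attention

To address the complex backgrounds, large inter-class scale differences and many small-object instances found in inland rivers and ports, a ship object detection algorithm (MHSA-YOLO) based on the multi-head self-attention mechanism (MHSA) and the YOLO network is proposed. In the feature extraction process, a parallel self-attention residual module (PARM) is designed on the basis of MHSA to weaken interference from complex background information and strengthen the feature information of ship objects. In the feature fusion process, a simplified two-way feature pyramid structure is developed to strengthen the fusion and representation of feature information. Experimental results on the Seaships dataset show that, compared with other state-of-the-art object detection methods, MHSA-YOLO has better learning ability and achieves a mean average precision of 97.59% in detection accuracy, and that it is more effective at detecting ships in complex backgrounds and small-size objects. Experimental results on a self-made dataset show that MHSA-YOLO generalizes well.


Key words: intelligent navigation, object detection, complex background, self-attention mechanism, multi-scale feature fusion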
Fig.1 Overall structure of ship object detection algorithm based on multi-head self-attention mechanism and YOLO network
Fig.2 Structure of self-attention layer
Fig.3 Structure of parallel self-attention residual module
Fig.4 Structure of path aggregation network
Fig.5 Structure of simplified two-way feature pyramid
Fig.6 Total loss curves of two algorithms
Algorithm                 p_ma/%   F1     FPS/(frame·s^-1)   M/MB   FLOPs/10^9   Parameters
YOLOv5                    96.30    0.86   26                 13.7   16.4         7 277 027
YOLOv5 + simplified FPN   96.73    0.91   25                 15.7   17.8         8 145 079
YOLOv5 + PARM             96.51    0.93   27                 13.1   16.1         6 760 099
MHSA-YOLO                 97.59    0.94   25                 15.1   17.5         7 828 151
Tab.1 Comparison of ablation experimental results of algorithm performance
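To make the ablation trade-offs in Tab.1 easier to read, the following sketch recomputes each variant's change relative to the YOLOv5 baseline. The numbers are copied from the table; the dictionary layout, key names and the `relative_change` helper are assumptions made for this example.

```python
# Rows of Tab.1: mAP in %, model size in MB, FLOPs in 10^9, parameter count.
ablation = {
    "YOLOv5":                {"mAP": 96.30, "F1": 0.86, "fps": 26, "size_mb": 13.7, "gflops": 16.4, "params": 7_277_027},
    "YOLOv5+simplified FPN": {"mAP": 96.73, "F1": 0.91, "fps": 25, "size_mb": 15.7, "gflops": 17.8, "params": 8_145_079},
    "YOLOv5+PARM":           {"mAP": 96.51, "F1": 0.93, "fps": 27, "size_mb": 13.1, "gflops": 16.1, "params": 6_760_099},
    "MHSA-YOLO":             {"mAP": 97.59, "F1": 0.94, "fps": 25, "size_mb": 15.1, "gflops": 17.5, "params": 7_828_151},
}

def relative_change(new, base):
    """Percentage change of `new` with respect to `base`."""
    return 100.0 * (new - base) / base

baseline = ablation["YOLOv5"]
for name, row in ablation.items():
    d_map = row["mAP"] - baseline["mAP"]                     # percentage points
    d_f1 = row["F1"] - baseline["F1"]
    d_params = relative_change(row["params"], baseline["params"])
    print(f"{name:24s} mAP {d_map:+.2f} pt   F1 {d_f1:+.2f}   params {d_params:+.1f}%")
```

Read this way, the PARM variant is the only one that lowers the parameter count relative to the baseline, while the full MHSA-YOLO gives the largest mAP gain at the cost of one frame per second.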
Fig.7 Feature visualization comparison before and after introducing parallel self-attention residual module (PARM)
Fig.8 Ship detection results of different algorithms on Seaships
Algorithm                  p_ma/%   p_a/%
                                    OC      BCC     GCS     CS      FB      PS
Faster(VGG16)[21]          90.12    89.44   90.34   90.73   90.87   88.76   90.57
Faster(ResNet18)[21]       90.63    90.37   89.78   90.45   90.91   87.17   88.93
Faster(ResNet50)[21]       91.65    92.38   90.88   92.46   92.91   89.27   90.93
Faster(ResNet101)[21]      92.40    93.68   90.22   93.87   93.41   89.96   91.78
SSD300(MobileNet)[21]      77.66    64.77   76.69   87.43   90.77   71.00   75.32
SSD300(VGG16)[21]          79.37    75.03   76.66   87.66   90.71   71.79   74.35
SSD512(VGG16)[21]          86.73    83.99   83.00   87.08   90.81   85.85   89.65
YOLOv2 random=0[21]        77.51    83.01   79.36   80.60   88.90   62.70   70.48
YOLOv2 random=1[21]        79.06    83.16   82.07   83.21   88.31   64.74   72.89
YOLOv3[23]                 87.00    86.00   86.20   87.10   87.10   88.00   90.00
YOLOv4[23]                 90.70    90.80   90.70   90.80   90.90   90.60   90.50
MHSA-YOLO                  97.59    98.73   98.42   96.41   96.53   98.51   96.94
Tab.2 Comparison of ship detection results of different convolutional neural networks
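The p_ma column can be related to the per-class p_a columns: mean average precision is conventionally the unweighted mean of the per-class average precisions. The short check below applies that definition to two rows of Tab.2 (MHSA-YOLO and SSD512). The variable names and the `mean_ap` helper are assumptions for illustration, and only the two rows included here were checked.

```python
CLASSES = ("OC", "BCC", "GCS", "CS", "FB", "PS")

# (reported p_ma, per-class p_a in CLASSES order), copied from Tab.2.
rows = {
    "SSD512(VGG16)": (86.73, (83.99, 83.00, 87.08, 90.81, 85.85, 89.65)),
    "MHSA-YOLO":     (97.59, (98.73, 98.42, 96.41, 96.53, 98.51, 96.94)),
}

def mean_ap(per_class_ap):
    """Unweighted mean of per-class average precision values."""
    return sum(per_class_ap) / len(per_class_ap)

for name, (reported, per_class) in rows.items():
    computed = mean_ap(per_class)
    print(f"{name:14s} reported {reported:.2f}  computed {computed:.2f}")
```

For both rows the computed mean agrees with the reported p_ma to two decimal places.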
Algorithm    p_ma/%   p_a/%
                      OC     BCC    GCS    CS     FB     PS
YOLOv5s      86.1     89.8   90.2   82.0   97.9   89.5   67.4
MHSA-YOLO    92.4     94.5   96.6   84.6   99.5   97.5   81.8
Tab.3 Comparison of detection accuracy on self-made dataset
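To see where the improvement on the self-made dataset comes from, the per-class gains in Tab.3 can be tabulated directly. This is simple arithmetic on the reported values; the variable names are assumptions for the example.

```python
CLASSES = ("OC", "BCC", "GCS", "CS", "FB", "PS")

# Per-class p_a (%) from Tab.3, in CLASSES order.
yolov5s   = (89.8, 90.2, 82.0, 97.9, 89.5, 67.4)
mhsa_yolo = (94.5, 96.6, 84.6, 99.5, 97.5, 81.8)

# Gain in percentage points, rounded to the table's precision.
gains = {c: round(m - b, 1) for c, b, m in zip(CLASSES, yolov5s, mhsa_yolo)}
largest = max(gains, key=gains.get)

for cls, gain in gains.items():
    print(f"{cls:4s} {gain:+.1f} pt")
print("largest gain:", largest)
```

Every class improves, and the gain is largest for the class on which the YOLOv5s baseline was weakest (PS).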
Fig.9 Ship detection results of different algorithms on self-made dataset based on Singapore maritime dataset
[1]   ZHANG T W, ZHANG X L, SHI J, et al. Depthwise separable convolution neural network for high-speed SAR ship detection[J]. Remote Sensing, 2019, 11(21): 2483
doi: 10.3390/rs11212483
[2]   ZHANG T W, ZHANG X L. Injection of traditional hand-crafted features into modern CNN-based models for SAR ship classification: what, why, where, and how[J]. Remote Sensing, 2021, 13(11): 2091
doi: 10.3390/rs13112091
[3]   XU Cheng-ji, WANG Xiao-feng, YANG Ya-dong. Attention-YOLO: YOLO detection algorithm that introduces attention mechanism[J]. Computer Engineering and Applications, 2019, 55(6): 13-23 (in Chinese)
[4]   OKSUZ K, CAM B C, KALKAN S, et al. Imbalance problems in object detection: a review[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2021, 43(10): 3388-3415
doi: 10.1109/TPAMI.2020.2981890
[5]   QI Liang, LI Bang-yu, CHEN Lian-kai. Ship target detection algorithm based on improved Faster R-CNN[J]. Shipbuilding of China, 2020, 61(Suppl.1): 40-51 (in Chinese)
doi: 10.3969/j.issn.1000-4882.2020.z1.006
[6]   TANG Li-dan. Research on object detection of USV based on images [D]. Harbin: Harbin Institute of Technology, 2018. (in Chinese)
[7]   SHAO Z, WANG L, WANG Z, et al. Saliency-aware convolution neural network for ship detection in surveillance video[J]. IEEE Transactions on Circuits and Systems for Video Technology, 2020, 30(3): 781-794
doi: 10.1109/TCSVT.2019.2897980
[8]   LI H, DENG L B, YANG C, et al. Enhanced YOLO v3 tiny network for real-time ship detection from visual image[J]. IEEE Access, 2021, 9: 16692-16706
doi: 10.1109/ACCESS.2021.3053956
[9]   GAN Xing-wang, WEI Han-di, XIAO Long-fei, et al. Research on vision-based data fusion algorithm for environment perception of ships[J]. Shipbuilding of China, 2021, 62(2): 201-210 (in Chinese)
doi: 10.3969/j.issn.1000-4882.2021.02.018
[10]   FENG Y C, DIAO W H, SUN X, et al. Towards automated ship detection and category recognition from high-resolution aerial images[J]. Remote Sensing, 2019, 11(16): 1901
doi: 10.3390/rs11161901
[11]   KIM M, JEONG J, KIM S. ECAP-YOLO: efficient channel attention pyramid YOLO for small object detection in aerial image[J]. Remote Sensing, 2021, 13(23): 4851
doi: 10.3390/rs13234851
[12]   CHEN L Q, SHI W X, DENG D X. Improved YOLOv3 based on attention mechanism for fast and accurate ship detection in optical remote sensing images[J]. Remote Sensing, 2021, 13(4): 660
doi: 10.3390/rs13040660
[13]   YU J M, ZHOU G Y, ZHOU S B, et al. A fast and lightweight detection network for multi-scale SAR ship detection under complex backgrounds[J]. Remote Sensing, 2021, 14(1): 31
doi: 10.3390/rs14010031
[14]   LIN Z H, FENG M W, NOGUEIRA C, et al. A structured self-attentive sentence embedding [EB/OL]. [2021-09-17]. https://arxiv.org/abs/1703.03130.pdf.
[15]   VASWANI A, SHAZEER N, PARMAR N, et al. Attention is all you need [C]// Proceedings of the 31st International Conference on Neural Information Processing Systems. Long Beach: NIPS, 2017: 5998-6008.
[16]   SRINIVAS A, LIN T Y, PARMAR N, et al. Bottleneck transformers for visual recognition [C]// 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Nashville: IEEE, 2021: 16514-16524.
[17]   LIN T Y, DOLLÁR P, GIRSHICK R, et al. Feature pyramid networks for object detection [C]// 2017 IEEE Conference on Computer Vision and Pattern Recognition. Honolulu: IEEE, 2017: 937-944.
[18]   LIU S, QI L, QIN H, et al. Path aggregation network for instance segmentation [C]// 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Salt Lake City: IEEE, 2018: 8759-8768.
[19]   ZHANG Y, SHENG W, JIANG J, et al. Priority branches for ship detection in optical remote sensing images[J]. Remote Sensing, 2020, 12(7): 1196
[20]   TAN M, PANG R, LE Q V. EfficientDet: scalable and efficient object detection [C]// 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Seattle: IEEE, 2020: 10781-10790.
[21]   SHAO Z F, WU W J, WANG Z Y, et al. SeaShips: a large-scale precisely annotated dataset for ship detection[J]. IEEE Transactions on Multimedia, 2018, 20(10): 2593-2604
doi: 10.1109/TMM.2018.2865686
[22]   ZHENG Z H, WANG P, LIU W, et al. Distance-IoU loss: faster and better learning for bounding box regression [C]// Proceedings of the 34th AAAI Conference on Artificial Intelligence. New York: AAAI, 2020: 12993-13000.
[23]   ZHAO Yu-rong, GUO Hui-ming, JIAO Han, et al. Application of YOLOv4 with mixed-domain attention in ship detection[J]. Computer and Modernization, 2021, (9): 75-82 (in Chinese)
doi: 10.3969/j.issn.1006-2475.2021.09.012
[24]   LEI S L, LU D D, QIU X L, et al. SRSDD-v1.0: a high-resolution SAR rotation ship detection dataset[J]. Remote Sensing, 2021, 13(24): 5104
doi: 10.3390/rs13245104
[25]   RODGER M, GUIDA R. Classification-aided SAR and AIS data fusion for space-based maritime surveillance[J]. Remote Sensing, 2020, 13(1): 104
doi: 10.3390/rs13010104
[26]   LIU J M, CHEN H, WANG Y. Multi-source remote sensing image fusion for ship target detection and recognition[J]. Remote Sensing, 2021, 13(23): 4852
doi: 10.3390/rs13234852