Journal of Zhejiang University (Engineering Science)  2022, Vol. 56 Issue (12): 2392-2402    DOI: 10.3785/j.issn.1008-973X.2022.12.008
Computer Technology
Ship detection algorithm in complex backgrounds via multi-head self-attention
Nan-jing YU¹, Xiao-biao FAN¹, Tian-min DENG²,*, Guo-tao MAO²
1. School of Shipping and Naval Architecture, Chongqing Jiaotong University, Chongqing 400074, China
2. College of Traffic and Transportation, Chongqing Jiaotong University, Chongqing 400074, China
Abstract:

A ship object detection algorithm, MHSA-YOLO, was proposed based on the multi-head self-attention (MHSA) mechanism and the YOLO network to address the complex backgrounds, large inter-class scale differences and numerous small object instances found in inland rivers and ports. In the feature extraction stage, a parallel self-attention residual module (PARM) was designed on the basis of MHSA to weaken the interference of complex background information and strengthen the feature information of ship objects. In the feature fusion stage, a simplified bidirectional feature pyramid structure was developed to strengthen the fusion and representation of feature information. Experimental results on the Seaships dataset showed that, compared with other state-of-the-art object detection methods, MHSA-YOLO had better learning ability, achieved a mean average precision of 97.59%, and detected ships in complex backgrounds and small objects more effectively. Experimental results on a self-made dataset showed that MHSA-YOLO generalized well.

Key words: intelligent navigation    object detection    complex background    self-attention mechanism    multi-scale fusion
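
The multi-head self-attention step named in the abstract can be illustrated with a minimal, standard-library Python sketch. It is not the authors' PARM: it assumes a feature map already flattened into `n` positions with `d` channels, uses random untrained projection matrices, and omits the output projection and any positional encoding. The function and variable names are hypothetical.

```python
import math
import random


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def softmax(row):
    """Numerically stable softmax over one list of scores."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def multi_head_self_attention(tokens, w_q, w_k, w_v, num_heads):
    """tokens: n x d feature vectors (e.g. a flattened H*W feature map).

    w_q, w_k, w_v: d x d projection matrices (assumed, untrained here).
    Returns an n x d list: per-head outputs concatenated along channels.
    """
    d = len(tokens[0])
    assert d % num_heads == 0, "channels must divide evenly across heads"
    d_head = d // num_heads
    q, k, v = matmul(tokens, w_q), matmul(tokens, w_k), matmul(tokens, w_v)
    out = [[] for _ in tokens]
    for h in range(num_heads):
        lo, hi = h * d_head, (h + 1) * d_head
        qh = [r[lo:hi] for r in q]
        kh = [r[lo:hi] for r in k]
        vh = [r[lo:hi] for r in v]
        # Scaled dot-product scores between every pair of positions.
        kh_t = [list(c) for c in zip(*kh)]
        scores = matmul(qh, kh_t)
        attn = [softmax([s / math.sqrt(d_head) for s in row]) for row in scores]
        # Each position becomes an attention-weighted mix of all positions.
        head_out = matmul(attn, vh)
        for i, row in enumerate(head_out):
            out[i].extend(row)
    return out


def random_matrix(rows, cols, scale):
    """Hypothetical stand-in for learned weights."""
    return [[random.uniform(-scale, scale) for _ in range(cols)] for _ in range(rows)]


random.seed(0)
n, d = 4, 8  # a 2 x 2 feature map flattened to 4 positions, 8 channels
tokens = random_matrix(n, d, 1.0)
out = multi_head_self_attention(
    tokens, random_matrix(d, d, 0.5), random_matrix(d, d, 0.5),
    random_matrix(d, d, 0.5), num_heads=2,
)
print(len(out), len(out[0]))  # 4 8
```

Because every position attends to every other position, the output at a ship pixel can draw on distant context, which is the property the abstract relies on to suppress background clutter.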
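Received: 2022-01-11 Published: 2023-01-03
CLC:  TU 675.79  
Funding: National Key Research and Development Program of China (SQ2020YFF0418521); Chongqing Technology Innovation and Application Development Special Key Project (cstc2020jscx-dxwtBX0019); Sichuan-Chongqing Jointly Implemented Key Research and Development Project (cstc2020jscx-cylhX0005, cstc2020jscx-cylhX0007)
Corresponding author: Tian-min DENG     E-mail: yunanjing527@163.com; dtianmin@cqjtu.edu.cn
About the author: Nan-jing YU (born 1998), female, master's student, engaged in object detection research. orcid.org/0000-0001-7617-4478. E-mail: yunanjing527@163.com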

Cite this article:

Nan-jing YU, Xiao-biao FAN, Tian-min DENG, Guo-tao MAO. Ship detection algorithm in complex backgrounds via multi-head self-attention. Journal of Zhejiang University (Engineering Science), 2022, 56(12): 2392-2402.

Link to this article:

https://www.zjujournals.com/eng/CN/10.3785/j.issn.1008-973X.2022.12.008        https://www.zjujournals.com/eng/CN/Y2022/V56/I12/2392

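Fig. 1  Overall structure of the ship object detection algorithm based on the multi-head self-attention mechanism and YOLO network
Fig. 2  Structure of the self-attention layer
Fig. 3  Structure of the parallel self-attention residual module
Fig. 4  Structure of the path aggregation network
Fig. 5  Structure of the simplified bidirectional feature pyramid
Fig. 6  Total loss curves of the two algorithms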
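| Algorithm | pma/% | F1 | FPS/(frame·s⁻¹) | M/MB | FLOPS/10⁹ | Par |
|---|---|---|---|---|---|---|
| YOLOv5 | 96.30 | 0.86 | 26 | 13.7 | 16.4 | 7 277 027 |
| YOLOv5 + simplified FPN | 96.73 | 0.91 | 25 | 15.7 | 17.8 | 8 145 079 |
| YOLOv5 + PARM | 96.51 | 0.93 | 27 | 13.1 | 16.1 | 6 760 099 |
| MHSA-YOLO | 97.59 | 0.94 | 25 | 15.1 | 17.5 | 7 828 151 |

Table 1  Comparison of ablation experiment results for algorithm performance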
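
To make the trade-offs in Table 1 easier to read, the short Python sketch below computes each variant's change relative to the YOLOv5 baseline. The numbers are copied from the table; the helper name `delta_vs_baseline` and the reading of the `pma` and `Par` columns as mean average precision and parameter count are assumptions made for illustration.

```python
# Rows copied from Table 1:
# (pma %, F1, FPS, model size MB, FLOPS / 10^9, parameter count)
ablation = {
    "YOLOv5": (96.30, 0.86, 26, 13.7, 16.4, 7_277_027),
    "YOLOv5 + simplified FPN": (96.73, 0.91, 25, 15.7, 17.8, 8_145_079),
    "YOLOv5 + PARM": (96.51, 0.93, 27, 13.1, 16.1, 6_760_099),
    "MHSA-YOLO": (97.59, 0.94, 25, 15.1, 17.5, 7_828_151),
}


def delta_vs_baseline(name, baseline="YOLOv5"):
    """Return (pma gain in percentage points, parameter change in percent)."""
    base, row = ablation[baseline], ablation[name]
    pma_gain = round(row[0] - base[0], 2)
    param_change = round(100 * (row[5] - base[5]) / base[5], 2)
    return pma_gain, param_change


for name in ablation:
    print(name, delta_vs_baseline(name))
```

Read this way, PARM alone lowers the parameter count while raising precision, and the full MHSA-YOLO gains 1.29 percentage points for roughly 7.6% more parameters.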
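Fig. 7  Visual comparison of shallow feature maps before and after introducing the parallel self-attention residual module (PARM)
Fig. 8  Comparison of detection results of different algorithms on the Seaships test set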
| Algorithm | pma/% | pa/% (OC) | pa/% (BCC) | pa/% (GCS) | pa/% (CS) | pa/% (FB) | pa/% (PS) |
|---|---|---|---|---|---|---|---|
| Faster (VGG16)[21] | 90.12 | 89.44 | 90.34 | 90.73 | 90.87 | 88.76 | 90.57 |
| Faster (ResNet18)[21] | 90.63 | 90.37 | 89.78 | 90.45 | 90.91 | 87.17 | 88.93 |
| Faster (ResNet50)[21] | 91.65 | 92.38 | 90.88 | 92.46 | 92.91 | 89.27 | 90.93 |
| Faster (ResNet101)[21] | 92.40 | 93.68 | 90.22 | 93.87 | 93.41 | 89.96 | 91.78 |
| SSD300 (MobileNet)[21] | 77.66 | 64.77 | 76.69 | 87.43 | 90.77 | 71.00 | 75.32 |
| SSD300 (VGG16)[21] | 79.37 | 75.03 | 76.66 | 87.66 | 90.71 | 71.79 | 74.35 |
| SSD512 (VGG16)[21] | 86.73 | 83.99 | 83.00 | 87.08 | 90.81 | 85.85 | 89.65 |
| YOLOv2 random=0[21] | 77.51 | 83.01 | 79.36 | 80.60 | 88.90 | 62.70 | 70.48 |
| YOLOv2 random=1[21] | 79.06 | 83.16 | 82.07 | 83.21 | 88.31 | 64.74 | 72.89 |
| YOLOv3[23] | 87.00 | 86.00 | 86.20 | 87.10 | 87.10 | 88.00 | 90.00 |
| YOLOv4[23] | 90.70 | 90.80 | 90.70 | 90.80 | 90.90 | 90.60 | 90.50 |
| MHSA-YOLO | 97.59 | 98.73 | 98.42 | 96.41 | 96.53 | 98.51 | 96.94 |

Table 2  Comparison of ship detection results of different convolutional neural networks
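
The relationship between the `pma` and `pa` columns can be checked with a few lines of Python. The sketch assumes that `pma` is the unweighted mean of the six per-class `pa` values, which this page does not state explicitly; the MHSA-YOLO rows of Tables 2 and 3 are consistent with that reading. The function name is hypothetical.

```python
# Per-class pa (%) for MHSA-YOLO, copied from Table 2 (Seaships)
# and Table 3 (self-made dataset).
seaships_pa = {"OC": 98.73, "BCC": 98.42, "GCS": 96.41,
               "CS": 96.53, "FB": 98.51, "PS": 96.94}
selfmade_pa = {"OC": 94.5, "BCC": 96.6, "GCS": 84.6,
               "CS": 99.5, "FB": 97.5, "PS": 81.8}


def mean_average_precision(per_class_pa):
    """Unweighted mean of per-class average precision (assumed definition)."""
    return sum(per_class_pa.values()) / len(per_class_pa)


print(round(mean_average_precision(seaships_pa), 2))  # 97.59
print(round(mean_average_precision(selfmade_pa), 1))  # 92.4
```

Both results match the reported `pma` values, so every class contributes equally to the headline figure regardless of how many instances it has.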
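| Algorithm | pma/% | pa/% (OC) | pa/% (BCC) | pa/% (GCS) | pa/% (CS) | pa/% (FB) | pa/% (PS) |
|---|---|---|---|---|---|---|---|
| YOLOv5s | 86.1 | 89.8 | 90.2 | 82.0 | 97.9 | 89.5 | 67.4 |
| MHSA-YOLO | 92.4 | 94.5 | 96.6 | 84.6 | 99.5 | 97.5 | 81.8 |

Table 3  Comparison of detection accuracy on the self-made dataset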
Fig. 9  Comparison of detection results of different algorithms on the self-made dataset based on the Singapore Maritime Dataset
References:
[1] ZHANG T W, ZHANG X L, SHI J, et al. Depthwise separable convolution neural network for high-speed SAR ship detection[J]. Remote Sensing, 2019, 11(21): 2483. doi: 10.3390/rs11212483
[2] ZHANG T W, ZHANG X L. Injection of traditional hand-crafted features into modern CNN-based models for SAR ship classification: what, why, where, and how[J]. Remote Sensing, 2021, 13(11): 2091. doi: 10.3390/rs13112091
[3] XU Cheng-ji, WANG Xiao-feng, YANG Ya-dong. Attention-YOLO: YOLO detection algorithm that introduces attention mechanism[J]. Computer Engineering and Applications, 2019, 55(6): 13-23.
[4] OKSUZ K, CAM B C, KALKAN S, et al. Imbalance problems in object detection: a review[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2021, 43(10): 3388-3415. doi: 10.1109/TPAMI.2020.2981890
[5] QI Liang, LI Bang-yu, CHEN Lian-kai. Ship target detection algorithm based on improved Faster R-CNN[J]. Shipbuilding of China, 2020, 61(Suppl.1): 40-51. doi: 10.3969/j.issn.1000-4882.2020.z1.006
[6] TANG Li-dan. Research on object detection of USV based on images[D]. Harbin: Harbin Institute of Technology, 2018.
[7] SHAO Z, WANG L, WANG Z, et al. Saliency-aware convolution neural network for ship detection in surveillance video[J]. IEEE Transactions on Circuits and Systems for Video Technology, 2020, 30(3): 781-794. doi: 10.1109/TCSVT.2019.2897980
[8] LI H, DENG L B, YANG C, et al. Enhanced YOLO v3 tiny network for real-time ship detection from visual image[J]. IEEE Access, 2021, 9: 16692-16706. doi: 10.1109/ACCESS.2021.3053956
[9] GAN Xing-wang, WEI Han-di, XIAO Long-fei, et al. Research on vision-based data fusion algorithm for environment perception of ships[J]. Shipbuilding of China, 2021, 62(2): 201-210. doi: 10.3969/j.issn.1000-4882.2021.02.018
[10] FENG Y C, DIAO W H, SUN X, et al. Towards automated ship detection and category recognition from high-resolution aerial images[J]. Remote Sensing, 2019, 11(16): 1901. doi: 10.3390/rs11161901
[11] KIM M, JEONG J, KIM S. ECAP-YOLO: efficient channel attention pyramid YOLO for small object detection in aerial image[J]. Remote Sensing, 2021, 13(23): 4851. doi: 10.3390/rs13234851
[12] CHEN L Q, SHI W X, DENG D X. Improved YOLOv3 based on attention mechanism for fast and accurate ship detection in optical remote sensing images[J]. Remote Sensing, 2021, 13(4): 660. doi: 10.3390/rs13040660
[13] YU J M, ZHOU G Y, ZHOU S B, et al. A fast and lightweight detection network for multi-scale SAR ship detection under complex backgrounds[J]. Remote Sensing, 2021, 14(1): 31. doi: 10.3390/rs14010031
[14] LIN Z H, FENG M W, NOGUEIRA C, et al. A structured self-attentive sentence embedding[EB/OL]. [2021-09-17]. https://arxiv.org/abs/1703.03130.pdf.
[15] VASWANI A, SHAZEER N, PARMAR N, et al. Attention is all you need[C]// Proceedings of the 31st International Conference on Neural Information Processing Systems. Long Beach: NIPS, 2017: 5998-6008.
[16] SRINIVAS A, LIN T Y, PARMAR N, et al. Bottleneck transformers for visual recognition[C]// 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Nashville: IEEE, 2021: 16514-16524.
[17] LIN T Y, DOLLÁR P, GIRSHICK R, et al. Feature pyramid networks for object detection[C]// 2017 IEEE Conference on Computer Vision and Pattern Recognition. Honolulu: IEEE, 2017: 937-944.
[18] LIU S, QI L, QIN H, et al. Path aggregation network for instance segmentation[C]// 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Salt Lake City: IEEE, 2018: 8759-8768.
[19] ZHANG Y, SHENG W, JIANG J, et al. Priority branches for ship detection in optical remote sensing images[J]. Remote Sensing, 2020, 12(7): 1196.
[20] TAN M, PANG R, LE Q V. EfficientDet: scalable and efficient object detection[C]// 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Seattle: IEEE, 2020: 10781-10790.
[21] SHAO Z F, WU W J, WANG Z Y, et al. SeaShips: a large-scale precisely annotated dataset for ship detection[J]. IEEE Transactions on Multimedia, 2018, 20(10): 2593-2604. doi: 10.1109/TMM.2018.2865686
[22] ZHENG Z H, WANG P, LIU W, et al. Distance-IoU loss: faster and better learning for bounding box regression[C]// Proceedings of the 34th AAAI Conference on Artificial Intelligence. New York: AAAI, 2020: 12993-13000.
[23] ZHAO Yu-rong, GUO Hui-ming, JIAO Han, et al. Application of YOLOv4 with mixed-domain attention in ship detection[J]. Computer and Modernization, 2021, (9): 75-82. doi: 10.3969/j.issn.1006-2475.2021.09.012
[24] LEI S L, LU D D, QIU X L, et al. SRSDD-v1.0: a high-resolution SAR rotation ship detection dataset[J]. Remote Sensing, 2021, 13(24): 5104. doi: 10.3390/rs13245104
[25] RODGER M, GUIDA R. Classification-aided SAR and AIS data fusion for space-based maritime surveillance[J]. Remote Sensing, 2020, 13(1): 104. doi: 10.3390/rs13010104
[26] LIU J M, CHEN H, WANG Y. Multi-source remote sensing image fusion for ship target detection and recognition[J]. Remote Sensing, 2021, 13(23): 4852. doi: 10.3390/rs13234852