Journal of Zhejiang University (Engineering Science)  2022, Vol. 56 Issue (11): 2215-2223    DOI: 10.3785/j.issn.1008-973X.2022.11.012
Computer Technology
Remote sensing image target detection combining multi-scale and attention mechanism
Yun-zuo ZHANG1,2, Wei GUO1, Zhao-quan CAI3, Wen-bo LI1
1. School of Information Science and Technology, Shijiazhuang Tiedao University, Shijiazhuang 050043, China
2. Hebei Key Laboratory of Electromagnetic Environmental Effects and Information Processing, Shijiazhuang Tiedao University, Shijiazhuang 050043, China
3. Shanwei Institute of Technology, Shanwei 516600, China
Abstract:

Remote sensing images pose challenges such as complex backgrounds, large differences in target scale, and dense target distribution, which degrade the detection performance of existing algorithms. A remote sensing image target detection algorithm combining multi-scale features and an attention mechanism was proposed. The atrous spatial pyramid pooling (ASPP) module was improved to enlarge the receptive field for images of different sizes. An attention module was proposed to learn the channel information and the spatial location information of feature maps, improving feature extraction for target regions in remote sensing images with complex backgrounds. A weighted bidirectional feature pyramid network (BiFPN) structure was combined with the backbone network to improve the fusion of multi-level features. A distance-based non-maximum suppression method was used for post-processing to alleviate the problem of overlapping detection boxes. Experimental results on the DIOR and NWPU VHR-10 datasets showed that the mean average precision (mAP) of the proposed algorithm reached 71.6% and 91.6%, which were 2.9 and 1.5 percentage points higher than those of the mainstream YOLOv5s algorithm, respectively. The proposed algorithm achieved better detection results on complex remote sensing images.

Key words: remote sensing image    target detection    YOLOv5s algorithm    multi-scale feature    attention module    feature fusion    non-maximum suppression
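
The distance-based non-maximum suppression step mentioned in the abstract can be illustrated with a minimal sketch. It assumes the DIoU-NMS formulation (IoU penalized by the normalized squared distance between box centers), which the abstract does not confirm. The names `diou` and `diou_nms`, the (x1, y1, x2, y2) box format, and the 0.5 threshold are illustrative assumptions, not the authors' code.

```python
from typing import List, Sequence, Tuple

Box = Tuple[float, float, float, float]  # assumed format: (x1, y1, x2, y2)


def diou(a: Box, b: Box) -> float:
    """IoU minus the normalized squared distance between the box centers."""
    # Intersection area
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    iou = inter / union if union > 0 else 0.0
    # Squared distance between the two box centers
    d2 = ((a[0] + a[2]) / 2 - (b[0] + b[2]) / 2) ** 2 \
        + ((a[1] + a[3]) / 2 - (b[1] + b[3]) / 2) ** 2
    # Squared diagonal of the smallest box enclosing both boxes
    c2 = (max(a[2], b[2]) - min(a[0], b[0])) ** 2 \
        + (max(a[3], b[3]) - min(a[1], b[1])) ** 2
    return iou - (d2 / c2 if c2 > 0 else 0.0)


def diou_nms(boxes: Sequence[Box], scores: Sequence[float],
             threshold: float = 0.5) -> List[int]:
    """Return indices of kept boxes, highest score first."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep: List[int] = []
    while order:
        best = order.pop(0)
        keep.append(best)
        # Keep only candidates whose DIoU with the selected box is below the threshold
        order = [i for i in order if diou(boxes[best], boxes[i]) < threshold]
    return keep


# Hypothetical detections: two heavily overlapping boxes and one distant box
boxes = [(0.0, 0.0, 10.0, 10.0), (1.0, 1.0, 11.0, 11.0), (20.0, 20.0, 30.0, 30.0)]
scores = [0.9, 0.8, 0.7]
kept = diou_nms(boxes, scores)
print(kept)  # [0, 2]
```

Because the center-distance penalty lowers the overlap measure for boxes whose centers are far apart, two boxes with the same IoU are less likely to suppress each other when they cover different, densely packed targets.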
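Received: 2021-11-30    Published: 2022-12-02
CLC:  TP 751.1
Funding: Key-Area Research and Development Program of Guangdong Province (2019B010137002); National Natural Science Foundation of China (61702347, 62027801); Natural Science Foundation of Hebei Province (F2022210007, F2017210161); Science and Technology Research Project of Higher Education Institutions of Hebei Province (ZD2022100, QN2017132); Central Government Guided Local Science and Technology Development Fund (226Z0501G)
About the author: ZHANG Yun-zuo (born 1984), male, associate professor, doctoral supervisor; research interests: image processing, intelligent video analysis and big data processing. orcid.org/0000-0001-7499-4835. E-mail: zhangyunzuo888@sina.com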
Cite this article:

Yun-zuo ZHANG, Wei GUO, Zhao-quan CAI, Wen-bo LI. Remote sensing image target detection combining multi-scale and attention mechanism[J]. Journal of Zhejiang University (Engineering Science), 2022, 56(11): 2215-2223.

Link to this article:

https://www.zjujournals.com/eng/CN/10.3785/j.issn.1008-973X.2022.11.012        https://www.zjujournals.com/eng/CN/Y2022/V56/I11/2215

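Fig. 1  Network structure of the algorithm combining multi-scale and attention mechanism
Fig. 2  ASPP+ module
Fig. 3  Attention module
Fig. 4  BiFPN structure
Fig. 5  Samples from the DIOR dataset
Fig. 6  Analysis of the DIOR dataset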
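
The weighted feature fusion behind the BiFPN structure (Fig. 4) can be sketched as follows. This is a minimal illustration that assumes the fast normalized fusion rule from EfficientDet (reference 20); whether the paper uses exactly this variant is not stated on this page. The function name `fast_normalized_fusion`, the flat-list feature representation, and the `eps` value are assumptions made for the example.

```python
from typing import List, Sequence


def fast_normalized_fusion(features: Sequence[Sequence[float]],
                           weights: Sequence[float],
                           eps: float = 1e-4) -> List[float]:
    """Fuse same-shaped feature vectors with learnable, non-negative weights.

    output = sum_i(w_i * feature_i) / (eps + sum_i(w_i)), with w_i = max(0, raw_i)
    """
    # ReLU keeps every weight non-negative so the normalization stays stable
    w = [max(0.0, x) for x in weights]
    total = sum(w) + eps
    length = len(features[0])
    return [sum(wi * f[j] for wi, f in zip(w, features)) / total
            for j in range(length)]


# Two hypothetical feature vectors from different pyramid levels
# (already resized to the same shape)
shallow = [1.0, 2.0]
deep = [3.0, 4.0]

equal = fast_normalized_fusion([shallow, deep], [1.0, 1.0])
# A negative raw weight is clipped to zero, so only the first input contributes
clipped = fast_normalized_fusion([shallow, deep], [2.0, -1.0])
print(equal, clipped)
```

With equal weights the result is close to the element-wise average; as training shifts the weights, the network can favor whichever level carries more useful information for a given scale.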
Class | Name | Class | Name
C1 | airplane | C11 | ground track field
C2 | airport | C12 | harbor
C3 | baseball field | C13 | overpass
C4 | basketball court | C14 | ship
C5 | bridge | C15 | stadium
C6 | chimney | C16 | storage tank
C7 | dam | C17 | tennis court
C8 | expressway service area | C18 | train station
C9 | expressway toll station | C19 | vehicle
C10 | golf field | C20 | wind mill
Table 1  Class information of the DIOR dataset
Model | mAP | C1/C11 | C2/C12 | C3/C13 | C4/C14 | C5/C15 | C6/C16 | C7/C17 | C8/C18 | C9/C19 | C10/C20
RetinaNet[22] | 65.7 | 53.7/74.2 | 77.3/50.7 | 69.0/59.6 | 81.3/71.2 | 44.1/69.3 | 72.3/44.8 | 62.5/81.3 | 76.2/54.2 | 66.0/45.1 | 77.7/83.4
PANet[23] | 66.1 | 60.2/73.4 | 72.0/45.3 | 70.6/56.9 | 80.5/71.7 | 43.6/70.4 | 72.3/62.0 | 61.4/80.9 | 72.1/57.0 | 66.7/47.2 | 72.0/84.5
CBD-E[24] | 67.8 | 54.2/79.5 | 77.0/47.5 | 71.5/59.3 | 87.1/69.1 | 44.6/69.7 | 75.4/64.3 | 63.5/84.5 | 76.2/59.4 | 65.3/44.7 | 79.3/83.1
YOLOv5s | 68.7 | 78.3/73.1 | 65/58.3 | 74.3/57.4 | 90.6/91.8 | 44.3/67.9 | 80.1/82.7 | 48.9/89.1 | 57.7/49.7 | 63.2/55.4 | 68.6/78.1
Ours | 71.6 | 85.8/75.7 | 74.2/59.9 | 78.9/58.6 | 89.8/89.7 | 46.1/71.9 | 77.8/78.7 | 60.5/89.5 | 65.1/55.4 | 65.3/56.4 | 75.6/78.1
Table 2  Comparison of different algorithm models on the DIOR test set (mAP and per-class AP in %)
Fig. 7  Comparison of detection results between the YOLOv5s algorithm and the proposed algorithm
Model | mAP | airplane | ship | storage tank | baseball diamond | tennis court | basketball court | ground track field | harbor | bridge | vehicle
RetinaNet[22] | 84.3 | 91.2 | 82.8 | 88.5 | 93.8 | 83.0 | 85.9 | 79.4 | 73.5 | 78.8 | 86.0
Ref. [25] | 83.8 | 90.2 | 86.2 | 90.1 | 96.7 | 89.8 | 68.5 | 91.0 | 81.4 | 63.9 | 79.2
Ref. [26] | 84.8 | 93.0 | 84.5 | 87.1 | 92.8 | 82.0 | 89.0 | 78.0 | 76.0 | 81.0 | 84.5
YOLOv5s | 90.1 | 94.6 | 90.3 | 81.8 | 92.2 | 90.5 | 88.7 | 99.5 | 93.1 | 82.1 | 88.2
Ours | 91.6 | 95.3 | 91.9 | 88.7 | 95.8 | 91.2 | 88.5 | 99.5 | 92.4 | 85.1 | 87.6
Table 3  Comparison of different algorithm models on the NWPU VHR-10 test set (mAP and per-class AP in %)
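
The mAP column in the table above can be checked as the arithmetic mean of the ten per-class AP values. The short sketch below does this for the YOLOv5s and proposed-algorithm rows, using only numbers copied from the table; the variable and function names are illustrative.

```python
from statistics import mean

# Per-class AP (%) on NWPU VHR-10, copied from the table above, in column order:
# airplane, ship, storage tank, baseball diamond, tennis court,
# basketball court, ground track field, harbor, bridge, vehicle
ap_per_class = {
    "YOLOv5s": [94.6, 90.3, 81.8, 92.2, 90.5, 88.7, 99.5, 93.1, 82.1, 88.2],
    "Ours": [95.3, 91.9, 88.7, 95.8, 91.2, 88.5, 99.5, 92.4, 85.1, 87.6],
}


def mean_average_precision(aps):
    """mAP is the mean of the per-class AP values, rounded to one decimal."""
    return round(mean(aps), 1)


map_scores = {name: mean_average_precision(aps) for name, aps in ap_per_class.items()}
gain = round(map_scores["Ours"] - map_scores["YOLOv5s"], 1)

print(map_scores)  # {'YOLOv5s': 90.1, 'Ours': 91.6}
print(gain)        # 1.5
```

The resulting gain of 1.5 percentage points matches the improvement over YOLOv5s reported in the abstract for NWPU VHR-10.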
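Columns: Baseline | ASPP | ASPP+ (1,3,5) | ASPP+ (3,6,9) | ASPP+ (6,12,18) | CBAM | AM | mAP/%
mAP/% by row (the marks showing which modules each row enables did not survive extraction): 68.7 | 68.8 | 69.1 | 70.3 | 69.8 | 69.2 | 70.9
Table 4  Accuracy comparison of ASPP+ and attention modules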
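
To make the ASPP+ settings in Table 4 more concrete, the sketch below computes the effective kernel size of a dilated (atrous) convolution for each listed tuple. It assumes that (1,3,5), (3,6,9) and (6,12,18) are the dilation rates of parallel branches and that each branch uses a 3×3 kernel; neither assumption is confirmed on this page. The function name `effective_kernel_size` is illustrative.

```python
def effective_kernel_size(kernel_size: int, dilation: int) -> int:
    """Spatial extent covered by one dilated convolution kernel.

    A kernel with k taps and dilation r spans k + (k - 1) * (r - 1) pixels.
    """
    return kernel_size + (kernel_size - 1) * (dilation - 1)


# Dilation-rate tuples listed in Table 4 (assumed to be per-branch rates)
rate_sets = [(1, 3, 5), (3, 6, 9), (6, 12, 18)]

# Assumed 3x3 kernels in every branch
extents = {rates: [effective_kernel_size(3, r) for r in rates] for rates in rate_sets}

for rates, sizes in extents.items():
    print(rates, sizes)
# (1, 3, 5) [3, 7, 11]
# (3, 6, 9) [7, 13, 19]
# (6, 12, 18) [13, 25, 37]
```

Larger rates cover more context per branch without adding parameters, but very large rates sample the input sparsely, which offers one possible reading of why the intermediate setting scores highest among the mAP values listed in Table 4.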
Model | P/% | R/% | mAP/% | FPS/(frame·s⁻¹)
YOLOv5s | 65.3 | 70.2 | 68.7 | 28.1
YOLOv5s-ASPP+ | 64.4 | 71.0 | 70.3 | 27.4
YOLOv5s-ASPP+-AM | 63.7 | 72.2 | 70.9 | 25.9
YOLOv5s-ASPP+-AM-BiFPN | 67.0 | 72.5 | 71.6 | 25.4
Table 5  Experimental results after adding each module
References:
[1] JIANG Xin, CHEN Wu-xiong, NIE Hai-tao, et al. Real-time ships target detection based on aerial remote sensing images[J]. Optics and Precision Engineering, 2020, 28(10): 2360-2369. doi: 10.37188/OPE.20202810.2360
[2] NIE Guang-tao, HUANG Hua. A survey of object detection in optical remote sensing images[J]. Acta Automatica Sinica, 2021, 47(8): 1749-1768. doi: 10.16383/j.aas.c200596
[3] WANG Chang, ZHANG Yong-sheng, WANG Xu, et al. Remote sensing image change detection method based on deep neural networks[J]. Journal of Zhejiang University: Engineering Science, 2020, 54(11): 2138-2148.
[4] YANG X, SUN H, SUN X, et al. Position detection and direction prediction for arbitrary-oriented ships via multitask rotation region convolutional neural network[J]. IEEE Access, 2018, 6: 50839-50849. doi: 10.1109/ACCESS.2018.2869884
[5] FENG J, LIANG Y P, YE Z W, et al. Small object detection in optical remote sensing video with motion guided R-CNN[C]// IEEE International Geoscience and Remote Sensing Symposium. Waikoloa: IEEE, 2020: 272-275.
[6] GUAN H Y, YU Y T, LI D L, et al. RoadCapsFPN: capsule feature pyramid network for road extraction from VHR optical remote sensing imagery[J]. IEEE Transactions on Intelligent Transportation Systems, 2021: 1-11.
[7] COURTRAI L, PHAM M T, LEFEVRE S. Small object detection in remote sensing images based on super-resolution with auxiliary generative adversarial networks[J]. Remote Sensing, 2020, 12(19): 3152. doi: 10.3390/rs12193152
[8] ZHANG X D, ZHU K, CHEN G Z, et al. Geospatial object detection on high resolution remote sensing imagery based on double multi-scale feature pyramid network[J]. Remote Sensing, 2019, 11(7): 755. doi: 10.3390/rs11070755
[9] LI L L, CHENG L, GUO X H, et al. Deep adaptive proposal network in optical remote sensing images objective detection[C]// IEEE International Geoscience and Remote Sensing Symposium. Waikoloa: IEEE, 2020: 2651-2654.
[10] CHEN C Y, GONG W G, CHEN Y L, et al. Object detection in remote sensing images based on a scene-contextual feature pyramid network[J]. Remote Sensing, 2019, 11(3): 339. doi: 10.3390/rs11030339
[11] HE W P, HUANG Z, WEI Z F, et al. TF-YOLO: an improved incremental network for real-time object detection[J]. Applied Sciences, 2019, 9(16): 3225. doi: 10.3390/app9163225
[12] SHAMSOLMOALI P, CHANUSSOT J, ZAREAPOOR M, et al. Multi-patch feature pyramid network for weakly supervised object detection in optical remote sensing images[J]. IEEE Transactions on Geoscience and Remote Sensing, 2021, 60: 1-13.
[13] CHEN L C, PAPANDREOU G, KOKKINOS I, et al. DeepLab: semantic image segmentation with deep convolutional nets, atrous convolution, and fully connected CRFs[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2018, 40(4): 834-848. doi: 10.1109/TPAMI.2017.2699184
[14] BERTASIUS G, TORRESANI L, YU S X, et al. Convolutional random walk networks for semantic image segmentation[C]// 2017 IEEE Conference on Computer Vision and Pattern Recognition. Honolulu: IEEE, 2017: 858-866.
[15] CHEN L C, PAPANDREOU G, SCHROFF F, et al. Rethinking atrous convolution for semantic image segmentation[EB/OL]. [2022-01-14]. https://arxiv.53yu.com/abs/1706.05587v3.
[16] HU J, SHEN L, SUN G. Squeeze-and-excitation networks[C]// IEEE Conference on Computer Vision and Pattern Recognition. Salt Lake City: Computer Vision Foundation, 2018: 7132-7141.
[17] WOO S, PARK J, LEE J Y, et al. CBAM: convolutional block attention module[C]// European Conference on Computer Vision. Berlin: Springer, 2018: 3-19.
[18] ZHOU Yong, CHEN Si-lin, ZHAO Jia-qi, et al. Weakly semantic based attention network for interpretable object detection in remote sensing imagery[J]. Acta Electronica Sinica, 2021, 49(4): 679-689. doi: 10.12263/DZXB.20200554
[19] ZHANG Y N, KONG J, QI M, et al. Object detection based on multiple information fusion net[J]. Applied Sciences, 2020, 10(1): 418. doi: 10.3390/app10010418
[20] TAN M X, PANG R M, LE Q V. EfficientDet: scalable and efficient object detection[C]// IEEE Conference on Computer Vision and Pattern Recognition. Seattle: Computer Vision Foundation, 2020: 10778-10787.
[21] LI K, WAN G, CHENG G, et al. Object detection in optical remote sensing images: a survey and a new benchmark[J]. ISPRS Journal of Photogrammetry and Remote Sensing, 2020, 159: 296-307. doi: 10.1016/j.isprsjprs.2019.11.023
[22] LIN T Y, GOYAL P, GIRSHICK R, et al. Focal loss for dense object detection[C]// IEEE International Conference on Computer Vision. Venice: IEEE, 2017: 2980-2988.
[23] LIU S, QI L, QIN H F, et al. Path aggregation network for instance segmentation[C]// IEEE Conference on Computer Vision and Pattern Recognition. Salt Lake City: Computer Vision Foundation, 2018: 8759-8768.
[24] ZHANG J, XIE C M, XU X, et al. A contextual bidirectional enhancement method for remote sensing image object detection[J]. IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, 2020, 13: 4518-4531. doi: 10.1109/JSTARS.2020.3015049
[25] WANG C, BAI X, WANG S A, et al. Multiscale visual attention networks for object detection in VHR remote sensing images[J]. IEEE Geoscience and Remote Sensing Letters, 2018, 16(2): 310-314.