Remote sensing image target detection combining multi-scale and attention mechanism
Yun-zuo ZHANG1,2, Wei GUO1, Zhao-quan CAI3, Wen-bo LI1
1. School of Information Science and Technology, Shijiazhuang Tiedao University, Shijiazhuang 050043, China
2. Hebei Key Laboratory of Electromagnetic Environmental Effects and Information Processing, Shijiazhuang Tiedao University, Shijiazhuang 050043, China
3. Shanwei Institute of Technology, Shanwei 516600, China
Abstract Remote sensing images are characterized by complex backgrounds, large differences in target scale, and densely distributed targets, which degrade the detection performance of existing algorithms. A remote sensing image object detection algorithm that combined multi-scale features and attention mechanisms was proposed. The atrous spatial pyramid pooling module was improved to enlarge the receptive field for images of different sizes. An attention module was proposed that learned both the channel information and the spatial location information of the feature map, improving the feature extraction ability for target regions of remote sensing images under complex backgrounds. A weighted bidirectional feature pyramid network structure was introduced and combined with the backbone network to improve the fusion of multi-level features. A distance-based non-maximum suppression method was used for postprocessing to alleviate the tendency of detection boxes to overlap. Experimental results on the DIOR and NWPU VHR-10 datasets showed that the mean average precision (mAP) of the proposed algorithm reached 71.6% and 91.6%, respectively, which were 2.9% and 1.5% higher than those of the mainstream YOLOv5s algorithm. The proposed algorithm achieved better detection results on complex remote sensing images.
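The postprocessing step can be illustrated with a short sketch. The abstract only states that the suppression is distance-based; the code below assumes the distance term is the center-distance penalty of Distance-IoU (DIoU), which may differ from the authors' exact formulation. The function names `diou_to_others` and `diou_nms`, the `[x1, y1, x2, y2]` box layout, and the default threshold of 0.5 are hypothetical choices made for illustration.

```python
import numpy as np


def diou_to_others(box, boxes):
    """DIoU between one box and an array of boxes, all given as [x1, y1, x2, y2]."""
    eps = 1e-9
    # Intersection rectangle and IoU.
    ix1 = np.maximum(box[0], boxes[:, 0])
    iy1 = np.maximum(box[1], boxes[:, 1])
    ix2 = np.minimum(box[2], boxes[:, 2])
    iy2 = np.minimum(box[3], boxes[:, 3])
    inter = np.clip(ix2 - ix1, 0, None) * np.clip(iy2 - iy1, 0, None)
    area_a = (box[2] - box[0]) * (box[3] - box[1])
    area_b = (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1])
    iou = inter / (area_a + area_b - inter + eps)
    # Squared distance between box centers.
    dx = (box[0] + box[2]) / 2 - (boxes[:, 0] + boxes[:, 2]) / 2
    dy = (box[1] + box[3]) / 2 - (boxes[:, 1] + boxes[:, 3]) / 2
    center_dist2 = dx ** 2 + dy ** 2
    # Squared diagonal of the smallest box enclosing both boxes.
    ew = np.maximum(box[2], boxes[:, 2]) - np.minimum(box[0], boxes[:, 0])
    eh = np.maximum(box[3], boxes[:, 3]) - np.minimum(box[1], boxes[:, 1])
    diag2 = ew ** 2 + eh ** 2
    # Assumed distance term: overlap is discounted when centers are far apart.
    return iou - center_dist2 / (diag2 + eps)


def diou_nms(boxes, scores, threshold=0.5):
    """Greedy NMS that suppresses a box only if its DIoU with a kept box exceeds threshold."""
    boxes = np.asarray(boxes, dtype=float)
    scores = np.asarray(scores, dtype=float)
    order = np.argsort(-scores)  # indices sorted by descending score
    keep = []
    while order.size > 0:
        best = order[0]
        keep.append(int(best))
        rest = order[1:]
        if rest.size == 0:
            break
        diou = diou_to_others(boxes[best], boxes[rest])
        order = rest[diou <= threshold]
    return keep
```

Under this assumption, two boxes that overlap with an IoU slightly above the threshold can both survive when their centers are far enough apart, which is the behavior one would want for densely packed targets; plain IoU-based suppression would discard the lower-scoring one.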
Received: 30 November 2021
Published: 02 December 2022
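Fund: Key-Area Research and Development Program of Guangdong Province (2019B010137002); National Natural Science Foundation of China (61702347, 62027801); Natural Science Foundation of Hebei Province (F2022210007, F2017210161); Science and Technology Research Project of Higher Education Institutions of Hebei Province (ZD2022100, QN2017132); Central Government Guided Local Science and Technology Development Fund (226Z0501G)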
Keywords: remote sensing image, object detection, YOLOv5s algorithm, multi-scale feature, attention module, feature fusion, non-maximum suppression