Remote sensing image target detection combining multi-scale and attention mechanism

doi:10.3785/j.issn.1008-973X.2022.11.012

Journal of ZheJiang University (Engineering Science)

2022, Vol. 56, Issue (11): 2215-2223

Yun-zuo ZHANG1,2, Wei GUO1, Zhao-quan CAI3, Wen-bo LI1

1. School of Information Science and Technology, Shijiazhuang Tiedao University, Shijiazhuang 050043, China
2. Hebei Key Laboratory of Electromagnetic Environmental Effects and Information Processing, Shijiazhuang Tiedao University, Shijiazhuang 050043, China
3. Shanwei Institute of Technology, Shanwei 516600, China

Abstract

Remote sensing images feature complex backgrounds, large variations in target scale, and densely distributed targets, which degrade the detection performance of existing algorithms. A remote sensing image object detection algorithm combining multi-scale features and attention mechanisms was proposed. The atrous spatial pyramid pooling (ASPP) module was improved to enlarge the receptive field for images of different sizes. An attention module was proposed to learn the channel information and the spatial location information of feature maps, improving feature extraction for target regions of remote sensing images under complex backgrounds. A weighted bidirectional feature pyramid network (BiFPN) structure was introduced and combined with the backbone network to improve the fusion of multi-level features. A distance-based non-maximum suppression method was used for postprocessing, which alleviated the problem of overlapping detection boxes. Experimental results on the DIOR and NWPU VHR-10 datasets showed that the mean average precision (mAP) of the proposed algorithm reached 71.6% and 91.6%, respectively, which were 2.9% and 1.5% higher than those of the mainstream YOLOv5s algorithm. The proposed algorithm achieved better detection results on complex remote sensing images.
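The abstract does not specify which distance-based non-maximum suppression variant is used. As an illustration only, the sketch below assumes the widely used DIoU-NMS formulation, in which the overlap score is IoU minus the squared center distance normalized by the squared diagonal of the smallest enclosing box. The function names `diou` and `diou_nms`, the box layout `(x1, y1, x2, y2)`, and the threshold of 0.5 are assumptions made for this example and are not taken from the paper.

```python
from typing import List, Sequence, Tuple

Box = Tuple[float, float, float, float]  # assumed layout: (x1, y1, x2, y2)


def diou(a: Box, b: Box) -> float:
    """Return IoU minus the normalized squared distance between box centers."""
    # Intersection rectangle (zero area if the boxes do not overlap).
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    iou = inter / union if union > 0 else 0.0

    # Squared distance between the two box centers.
    dx = (a[0] + a[2]) / 2 - (b[0] + b[2]) / 2
    dy = (a[1] + a[3]) / 2 - (b[1] + b[3]) / 2
    d2 = dx * dx + dy * dy

    # Squared diagonal of the smallest box enclosing both boxes.
    cw = max(a[2], b[2]) - min(a[0], b[0])
    ch = max(a[3], b[3]) - min(a[1], b[1])
    c2 = cw * cw + ch * ch

    return iou - (d2 / c2 if c2 > 0 else 0.0)


def diou_nms(boxes: Sequence[Box], scores: Sequence[float],
             threshold: float = 0.5) -> List[int]:
    """Greedy NMS that suppresses a box only if its DIoU with a kept box exceeds threshold."""
    # Visit candidates from highest to lowest confidence.
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep: List[int] = []
    while order:
        best = order.pop(0)
        keep.append(best)
        # Retain only candidates that are not too close to the box just kept.
        order = [i for i in order if diou(boxes[best], boxes[i]) <= threshold]
    return keep
```

Because the center-distance penalty lowers the score of boxes whose centers are far apart, two neighboring targets whose boxes overlap slightly above the IoU threshold can both survive, which is the behavior one would want for densely packed objects. For instance, the boxes `(0, 0, 10, 4)` and `(3, 0, 13, 4)` have an IoU of about 0.54 but a DIoU of about 0.49, so both are kept at a threshold of 0.5.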

Key words: remote sensing image, target detection, YOLOv5s algorithm, multi-scale feature, attention module, feature fusion, non-maximum suppression

Received: 30 November 2021 Published: 02 December 2022

CLC: TP 751.1

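Fund: Key-Area Research and Development Program of Guangdong Province (2019B010137002); National Natural Science Foundation of China (61702347, 62027801); Natural Science Foundation of Hebei Province (F2022210007, F2017210161); Science and Technology Research Project of Higher Education Institutions of Hebei Province (ZD2022100, QN2017132); Central Government Guided Local Science and Technology Development Fund (226Z0501G)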

Cite this article:

Yun-zuo ZHANG,Wei GUO,Zhao-quan CAI,Wen-bo LI. Remote sensing image target detection combining multi-scale and attention mechanism. Journal of ZheJiang University (Engineering Science), 2022, 56(11): 2215-2223.

URL:

https://www.zjujournals.com/eng/10.3785/j.issn.1008-973X.2022.11.012 OR https://www.zjujournals.com/eng/Y2022/V56/I11/2215

Fig.1 Network structure block diagram of joint multiscale and attention mechanism algorithm

Fig.2 ASPP+ module

Fig.3 Attention module

Fig.4 BiFPN structure diagram

Fig.5 Sample DIOR data set

Fig.6 DIOR data set analysis

Tab.1 DIOR data set category information

Tab.2 Comparison of different algorithm models on DIOR test set (%)

Fig.7 Comparison of detection effect between YOLOv5s algorithm and proposed algorithm

Tab.3 Comparison of different algorithm models on NWPU VHR-10 test set (%)

Tab.4 Performance comparison of ASPP+ and attention module in terms of mAP

Tab.5 Experimental results after adding each module

