Semantic segmentation network for remote sensing image based on multi-scale mutual attention

doi:10.3785/j.issn.1008-973X.2023.07.008

Journal of ZheJiang University (Engineering Science)

2023, Vol. 57, Issue 7: 1335-1344

Chun-juan LIU1, Ze QIAO1, Hao-wen YAN2,3, Xiao-suo WU1,3,*, Jia-wei WANG1, Yu-qiang XIN1

1. School of Electronic and Information Engineering, Lanzhou Jiaotong University, Lanzhou 730070, China
2. School of Surveying, Mapping and Geographic Information, Lanzhou Jiaotong University, Lanzhou 730070, China
3. Academician Expert Workstation of Gansu Dayu Jiuzhou Space Information Technology Limited Company, Lanzhou 730070, China

Abstract

A network with multi-scale mutual attention and guided upsampling was proposed to address the loss of segmentation accuracy in remote sensing images caused by the large scale differences between target objects and by the loss of spatial detail. The multi-scale mutual attention module captures the pixel relations between images at different scales and balances the weights of objects of different scales, which improves the segmentation of small-scale objects. The coding guidance upsampling module uses information from the encoding structure to guide the upsampling process and fuses spatial details, which improves the classification of pixels on target object boundaries. The proposed network achieved mIoU scores of 85.52% on the Potsdam dataset and 86.59% on the Jiage dataset, improvements of 1.32% and 1.46%, respectively, over the second-best network.

Key words: remote sensing image, semantic segmentation, multi-scale mutual attention, small-scale object, coding guidance upsampling
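
The abstract reports results as mIoU (mean intersection over union). As a minimal sketch of how this metric is commonly computed, and not the authors' evaluation code, the example below builds a confusion matrix from flattened label lists and averages the per-class IoU. The function names `confusion_matrix` and `mean_iou` are hypothetical, and skipping classes that appear in neither the labels nor the predictions is an assumption; the paper's exact evaluation protocol is not stated on this page.

```python
def confusion_matrix(y_true, y_pred, num_classes):
    """Count pixels: rows are ground-truth classes, columns are predicted classes."""
    cm = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        cm[t][p] += 1
    return cm


def mean_iou(cm):
    """Average per-class IoU = TP / (TP + FP + FN) over classes that occur.

    Assumption: a class with no ground-truth and no predicted pixels is skipped
    rather than counted as IoU 0.
    """
    n = len(cm)
    ious = []
    for c in range(n):
        tp = cm[c][c]
        fn = sum(cm[c]) - tp                       # class-c pixels predicted as another class
        fp = sum(cm[r][c] for r in range(n)) - tp  # other pixels predicted as class c
        denom = tp + fp + fn
        if denom > 0:
            ious.append(tp / denom)
    return sum(ious) / len(ious) if ious else 0.0


# Toy example with three classes and six "pixels" (illustrative data only).
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 1, 1, 1, 2, 0]
cm = confusion_matrix(y_true, y_pred, num_classes=3)
print(mean_iou(cm))
```

In the toy example the per-class IoU values are 1/3, 2/3 and 1/2, so a single poorly segmented class pulls the mean down regardless of how many pixels it covers. This is why mIoU is sensitive to small-scale object classes.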

Received: 25 July 2022 Published: 17 July 2023

CLC: TP 751

Fund: Natural Science Foundation of Gansu Province (21JR7RA289); Key Research and Development Program of Gansu Province (20YF8GA035)

Corresponding Authors: Xiao-suo WU E-mail: liuchj@mail.lzjtu.cn;43452740@qq.com

Cite this article:

Chun-juan LIU,Ze QIAO,Hao-wen YAN,Xiao-suo WU,Jia-wei WANG,Yu-qiang XIN. Semantic segmentation network for remote sensing image based on multi-scale mutual attention. Journal of ZheJiang University (Engineering Science), 2023, 57(7): 1335-1344.

URL:

https://www.zjujournals.com/eng/10.3785/j.issn.1008-973X.2023.07.008 OR https://www.zjujournals.com/eng/Y2023/V57/I7/1335

Fig.1 Multi-scale mutual attention and guided upsampling network structure

Fig.2 Structure of multi-scale mutual attention module

Fig.3 Structure of code-guided upsampling module

Tab.1 Abbreviation for all experimental strategies

Tab.2 Results of ablation experiments on Potsdam dataset

Fig.4 Local visual comparison results of ablation experiments on Potsdam dataset

Tab.3 Results of ablation experiments of various categories on Potsdam dataset

Tab.4 Results of ablation experiments on Jiage dataset

Tab.5 Results of ablation experiments of various categories on Jiage dataset

Fig.5 Local visual comparison results of ablation experiments on Jiage dataset

Tab.6 Quantitative comparison with 8 state-of-the-art methods on Potsdam dataset

Fig.6 Local visual comparison results of PSPNet, CCNet, MagNet and DCED-MMA-CGU on Potsdam dataset

Tab.7 Quantitative comparison with 8 state-of-the-art methods on Jiage dataset

Fig.7 Local visual comparison results of PSPNet, CCNet, MagNet and DCED-MMA-CGU on Jiage dataset

