Semantic segmentation network for remote sensing image based on multi-scale mutual attention
Chun-juan LIU1, Ze QIAO1, Hao-wen YAN2,3, Xiao-suo WU1,3,*, Jia-wei WANG1, Yu-qiang XIN1
1. School of Electronic and Information Engineering, Lanzhou Jiaotong University, Lanzhou 730070, China
2. School of Surveying, Mapping and Geographic Information, Lanzhou Jiaotong University, Lanzhou 730070, China
3. Academician Expert Workstation of Gansu Dayu Jiuzhou Space Information Technology Limited Company, Lanzhou 730070, China
Abstract: A network with multi-scale mutual attention and guided upsampling is proposed to address the loss of segmentation accuracy in the semantic segmentation of remote sensing images, which is caused by large scale differences between target objects and by the loss of spatial detail. The multi-scale mutual attention module captures the pixel relations between images at different scales and balances the weights of objects of different scales, which improves the segmentation of small-scale objects. The coding guidance upsampling module uses information from the encoding structure to guide the upsampling process and fuses spatial details, which improves the classification of pixels on target object boundaries. The proposed network achieved mIoU scores of 85.52% on the Potsdam dataset and 86.59% on the Jiage dataset, which are 1.32% and 1.46% higher, respectively, than those of the second-best network.
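The abstract reports results as mIoU (mean intersection over union) but does not say how the score was computed. The sketch below shows the standard definition, assuming per-pixel class labels flattened into lists; the function names `confusion_matrix` and `mean_iou` and the toy labels are illustrative assumptions, not taken from the paper.

```python
def confusion_matrix(y_true, y_pred, num_classes):
    """Count pixels: rows are ground-truth classes, columns are predicted classes."""
    matrix = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        matrix[t][p] += 1
    return matrix


def mean_iou(matrix):
    """Average the per-class IoU = TP / (TP + FP + FN).

    Classes that appear in neither the ground truth nor the prediction
    are skipped (an assumption; evaluation protocols differ on this).
    """
    n = len(matrix)
    ious = []
    for c in range(n):
        tp = matrix[c][c]
        fn = sum(matrix[c]) - tp                        # true class c, predicted otherwise
        fp = sum(matrix[r][c] for r in range(n)) - tp   # predicted c, true class differs
        union = tp + fp + fn
        if union > 0:
            ious.append(tp / union)
    return sum(ious) / len(ious) if ious else 0.0


# Toy example: flattened label maps of a 2x3 image with 3 classes.
truth = [0, 0, 1, 1, 2, 2]
pred = [0, 1, 1, 1, 2, 0]
print(mean_iou(confusion_matrix(truth, pred, 3)))
```

In this toy case the per-class IoU values are 1/3, 2/3 and 1/2, so the mean is 0.5. Because every class counts equally regardless of its pixel area, the metric is sensitive to errors on small-scale objects, the case the abstract emphasizes.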
Received: 25 July 2022
Published: 17 July 2023
Fund: Natural Science Foundation of Gansu Province (21JR7RA289); Key Research and Development Program of Gansu Province (20YF8GA035)
Corresponding Authors:
Xiao-suo WU
E-mail: liuchj@mail.lzjtu.cn;43452740@qq.com
Keywords: remote sensing image, semantic segmentation, multi-scale mutual attention, small-scale object, coding guidance upsampling