Journal of Zhejiang University (Engineering Science)  2023, Vol. 57 Issue (7): 1335-1344    DOI: 10.3785/j.issn.1008-973X.2023.07.008
Automation Technology
Semantic segmentation network for remote sensing image based on multi-scale mutual attention
Chun-juan LIU1, Ze QIAO1, Hao-wen YAN2,3, Xiao-suo WU1,3,*, Jia-wei WANG1, Yu-qiang XIN1
1. School of Electronic and Information Engineering, Lanzhou Jiaotong University, Lanzhou 730070, China
2. School of Surveying, Mapping and Geographic Information, Lanzhou Jiaotong University, Lanzhou 730070, China
3. Academician Expert Workstation of Gansu Dayu Jiuzhou Space Information Technology Limited Company, Lanzhou 730070, China
Abstract:

A network with multi-scale mutual attention and guidance upsampling is proposed to address the loss of segmentation accuracy in remote sensing image semantic segmentation caused by the large scale differences between target objects and by the loss of spatial detail. The multi-scale mutual attention module captures the pixel relations between images at different scales and balances the weights of objects of different scales, which improves the segmentation of small-scale objects. The coding guidance upsampling module uses information from the encoder to guide image upsampling and fuses spatial detail, which improves the classification of pixels on object boundaries. The proposed network achieves mIoU scores of 85.52% on the Potsdam dataset and 86.59% on the Jiage dataset, which are 1.32 and 1.46 percentage points higher than the second-best network on the respective dataset.

Key words: remote sensing image    semantic segmentation    multi-scale mutual attention    small scale object    coding guidance upsampling
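
The abstract describes attention computed between images at different scales. The following is a minimal sketch of that general idea using plain scaled dot-product attention applied in both directions. It is not the authors' module: the function names, the toy features, and the reading of "mutual" as two-way cross-attention are all assumptions made for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(queries, keys_values):
    """Attend from each query pixel to every pixel of another scale.

    Both arguments are lists of feature vectors (lists of floats) with the
    same channel count. Returns one fused vector per query pixel.
    """
    dim = len(queries[0])
    fused = []
    for q in queries:
        # Scaled dot-product similarity between this pixel and the other scale.
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(dim)
                  for k in keys_values]
        weights = softmax(scores)
        # Weighted sum of the other scale's features.
        fused.append([sum(w * kv[c] for w, kv in zip(weights, keys_values))
                      for c in range(dim)])
    return fused


def mutual_attention(fine, coarse):
    """Hypothetical two-way variant: each scale attends to the other."""
    return cross_attention(fine, coarse), cross_attention(coarse, fine)


# Toy input (assumed): 4 fine-scale pixels and 2 coarse-scale pixels, 2 channels.
fine = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, 0.5]]
coarse = [[1.0, 0.0], [0.0, 1.0]]
fine_out, coarse_out = mutual_attention(fine, coarse)
print(len(fine_out), len(coarse_out))
```

Each output keeps the pixel count of its own scale, so the fused features can be passed on at their original resolution.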
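Received: 2022-07-25    Published: 2023-07-17
CLC:  TP 751
Funding: Natural Science Foundation of Gansu Province (21JR7RA289); Key Research and Development Program of Gansu Province (20YF8GA035)
Corresponding author: Xiao-suo WU     E-mail: liuchj@mail.lzjtu.cn; 43452740@qq.com
About the author: Chun-juan LIU (b. 1973), female, professor, works on silicon-based optical devices and remote sensing image processing. orcid.org/0000-0001-5118-3327. E-mail: liuchj@mail.lzjtu.cn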

Cite this article:

Chun-juan LIU, Ze QIAO, Hao-wen YAN, Xiao-suo WU, Jia-wei WANG, Yu-qiang XIN. Semantic segmentation network for remote sensing image based on multi-scale mutual attention[J]. Journal of Zhejiang University (Engineering Science), 2023, 57(7): 1335-1344.

Link to this article:

https://www.zjujournals.com/eng/CN/10.3785/j.issn.1008-973X.2023.07.008        https://www.zjujournals.com/eng/CN/Y2023/V57/I7/1335

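Fig. 1  Structure of the multi-scale mutual attention and guidance upsampling network
Fig. 2  Structure of the multi-scale mutual attention module
Fig. 3  Structure of the coding guidance upsampling module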
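
One well-known way for an encoder to guide upsampling is SegNet-style max-unpooling, in which the decoder reuses the positions recorded by the encoder's max-pooling layers. The sketch below illustrates only that known technique. It is not claimed to be the coding guidance upsampling module shown in Fig. 3, whose internals are not given here, and the function names and toy feature map are assumptions.

```python
def max_pool_2x2(feature):
    """2x2 max pooling that also records where each maximum came from.

    Assumes the feature map has even height and width.
    """
    pooled, indices = [], []
    for r in range(0, len(feature), 2):
        pooled_row, index_row = [], []
        for c in range(0, len(feature[0]), 2):
            window = [(feature[r + dr][c + dc], (r + dr, c + dc))
                      for dr in (0, 1) for dc in (0, 1)]
            value, position = max(window)
            pooled_row.append(value)
            index_row.append(position)
        pooled.append(pooled_row)
        indices.append(index_row)
    return pooled, indices


def guided_unpool(decoder_map, indices, height, width):
    """Place each decoder value at the position the encoder remembered."""
    output = [[0.0] * width for _ in range(height)]
    for r, index_row in enumerate(indices):
        for c, (rr, cc) in enumerate(index_row):
            output[rr][cc] = decoder_map[r][c]
    return output


# Toy 4x4 encoder feature map (assumed values).
encoder_feature = [
    [1.0, 3.0, 2.0, 0.0],
    [4.0, 2.0, 1.0, 5.0],
    [0.0, 1.0, 7.0, 2.0],
    [6.0, 2.0, 3.0, 1.0],
]
pooled, indices = max_pool_2x2(encoder_feature)
restored = guided_unpool(pooled, indices, 4, 4)
for row in restored:
    print(row)
```

Because the positions come from the encoder, each value returns to the exact pixel it was pooled from, which is the kind of spatial detail that plain interpolation discards.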
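Abbreviation    Description
DCED            Deep convolutional encoder-decoder network with single-scale input and a VGG16 backbone
DCED-MMA        DCED with the multi-scale mutual attention (MMA) module added
DCED-CGU        DCED with the coding guidance upsampling (CGU) module added
DCED-MMA-CGU    DCED with both the MMA and CGU modules added
Table 1  Abbreviations of all experimental strategies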
Model           F1/%    mIoU/%  PA/%
DCED            84.85   74.33   86.29
DCED-MMA        91.36   84.21   91.39
DCED-CGU        90.56   82.87   90.92
DCED-MMA-CGU    92.15   85.52   92.33
Table 2  Ablation results on the Potsdam dataset
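
The three columns reported in the ablation tables (F1, mIoU and PA) can all be derived from a confusion matrix. The sketch below uses the standard definitions and averages F1 over classes. Whether the paper averages F1 in exactly this way is an assumption, and the toy matrix is invented for illustration.

```python
def segmentation_metrics(confusion):
    """Return (mean F1, mIoU, pixel accuracy), each in percent.

    confusion[i][j] counts pixels of true class i predicted as class j.
    """
    n = len(confusion)
    total = sum(sum(row) for row in confusion)
    correct = sum(confusion[c][c] for c in range(n))
    ious, f1s = [], []
    for c in range(n):
        tp = confusion[c][c]
        fn = sum(confusion[c]) - tp                         # missed pixels
        fp = sum(confusion[r][c] for r in range(n)) - tp    # false alarms
        denom = tp + fp + fn
        ious.append(tp / denom if denom else 0.0)
        f1s.append(2 * tp / (2 * tp + fp + fn) if denom else 0.0)
    return 100 * sum(f1s) / n, 100 * sum(ious) / n, 100 * correct / total


# Toy two-class confusion matrix (assumed values, not from the paper).
toy = [[50, 10],
       [5, 35]]
f1, miou, pa = segmentation_metrics(toy)
print(round(f1, 2), round(miou, 2), round(pa, 2))
```

PA counts every pixel equally, so large classes dominate it, while mIoU weights each class equally. That difference is consistent with small-object gains showing up more strongly in the mIoU column than in the PA column.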
Fig. 4  Local visual comparison of the ablation results on the Potsdam dataset
Model           IoU/%
                Background, Car, Impervious surface, Low vegetation, Building
DCED            54.57   76.05   79.37   72.02   74.97   89.02
DCED-MMA        81.96   81.22   87.87   79.37   81.92   92.91
DCED-CGU        81.26   77.39   86.04   79.19   81.64   91.71
DCED-MMA-CGU    83.21   82.42   87.89   83.09   83.79   92.71
Table 3  Per-class ablation results on the Potsdam dataset
Model           F1/%    mIoU/%  PA/%
DCED            84.71   75.25   91.89
DCED-MMA        91.34   84.50   94.73
DCED-CGU        90.93   83.86   94.44
DCED-MMA-CGU    92.66   86.59   95.13
Table 4  Ablation results on the Jiage dataset
Model           IoU/%
                Background, Vegetation, Road, Building
DCED            77.04   92.84   43.91   83.91   78.56
DCED-MMA        83.11   95.84   69.38   90.34   83.82
DCED-CGU        82.57   95.41   67.68   90.64   82.99
DCED-MMA-CGU    84.15   95.51   75.31   91.53   86.44
Table 5  Per-class ablation results on the Jiage dataset
Fig. 5  Local visual comparison of the ablation results on the Jiage dataset
Model           IoU/%                                           mIoU/%
                Background, Car, Impervious surface, Low vegetation, Building
SegNet          69.49   59.85   83.44   52.97   79.26   80.36   70.90
PSPNet          78.33   65.84   86.78   56.21   81.55   88.32   76.17
DeeplabV3       78.86   67.57   85.63   60.38   80.57   87.51   76.75
MSRF            77.22   73.86   85.56   73.40   79.60   90.66   80.05
EMANet          77.40   75.60   85.60   80.70   82.10   89.30   81.80
CCNet           76.39   78.79   87.60   79.62   82.24   89.71   82.39
DANNet          82.19   77.35   87.28   82.57   82.62   92.51   84.09
MagNet          79.54   82.09   88.67   79.85   83.00   92.07   84.20
DCED-MMA-CGU    83.21   82.42   87.89   83.09   83.79   92.71   85.52
Table 6  Quantitative comparison with 8 state-of-the-art methods on the Potsdam dataset
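
As a consistency check on Table 6, the reported mIoU of each model should equal the mean of its six per-class IoU values, and the gap between the two best models should match the 1.32-point gain stated in the abstract. The sketch below recomputes both from the table rows. The dictionary layout and variable names are illustrative choices.

```python
# Per-class IoU values and reported mIoU, copied from Table 6 (Potsdam).
table6 = {
    "SegNet":       ([69.49, 59.85, 83.44, 52.97, 79.26, 80.36], 70.90),
    "PSPNet":       ([78.33, 65.84, 86.78, 56.21, 81.55, 88.32], 76.17),
    "DeeplabV3":    ([78.86, 67.57, 85.63, 60.38, 80.57, 87.51], 76.75),
    "MSRF":         ([77.22, 73.86, 85.56, 73.40, 79.60, 90.66], 80.05),
    "EMANet":       ([77.40, 75.60, 85.60, 80.70, 82.10, 89.30], 81.80),
    "CCNet":        ([76.39, 78.79, 87.60, 79.62, 82.24, 89.71], 82.39),
    "DANNet":       ([82.19, 77.35, 87.28, 82.57, 82.62, 92.51], 84.09),
    "MagNet":       ([79.54, 82.09, 88.67, 79.85, 83.00, 92.07], 84.20),
    "DCED-MMA-CGU": ([83.21, 82.42, 87.89, 83.09, 83.79, 92.71], 85.52),
}


def mean(values):
    return sum(values) / len(values)


# Largest gap between the recomputed and the reported mIoU across all rows.
max_gap = max(abs(mean(ious) - reported) for ious, reported in table6.values())

# Rank models by reported mIoU and measure the margin of the best one.
ranked = sorted(table6, key=lambda name: table6[name][1], reverse=True)
best, runner_up = ranked[0], ranked[1]
margin = table6[best][1] - table6[runner_up][1]

print(best, runner_up, round(margin, 2), round(max_gap, 3))
```

The recomputed means agree with the reported column to within rounding of the published values, so the mIoU column is the unweighted mean over the six per-class columns.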
Fig. 6  Local visual comparison of PSPNet, CCNet, MagNet and DCED-MMA-CGU on the Potsdam dataset
Model           IoU/%                                   mIoU/%
                Background, Vegetation, Road, Building
SegNet          61.42   87.27   91.44   45.42   66.58   70.42
PSPNet          79.08   89.91   96.25   48.81   81.27   79.06
DeeplabV3       80.83   88.67   95.27   56.51   78.66   79.99
EMANet          81.93   88.37   95.13   63.88   82.52   82.37
MSRF            80.62   87.49   94.19   69.51   81.37   82.64
CCNet           81.29   90.86   95.30   67.06   81.64   83.23
MagNet          82.37   91.31   95.70   70.47   82.78   84.52
DANNet          81.33   90.51   94.58   75.28   83.96   85.13
DCED-MMA-CGU    84.15   91.53   95.51   75.31   86.44   86.59
Table 7  Quantitative comparison with 8 state-of-the-art methods on the Jiage dataset
Fig. 7  Local visual comparison of PSPNet, CCNet, MagNet and DCED-MMA-CGU on the Jiage dataset