Journal of Zhejiang University (Engineering Science), 2024, Vol. 58, Issue (4): 696-707    DOI: 10.3785/j.issn.1008-973X.2024.04.005
Computer and Control Engineering
Global guidance multi-feature fusion network based on remote sensing image road extraction
Hai HUAN1(),Yu SHENG2,Chenxi GU1
1. School of Artificial Intelligence, Nanjing University of Information Science and Technology, Nanjing 210044, China
2. School of Integrated Circuit Science and Engineering, Nanjing University of Posts and Telecommunications, Nanjing 210003, China
Abstract:

Due to the high inter-class similarity between buildings and roads in remote sensing images, as well as the presence of shadows and occlusion, existing deep-learning semantic segmentation networks generally suffer from a high false segmentation rate in road segmentation. A global guidance multi-feature fusion network (GGMNet) was proposed for road extraction from remote sensing images. To reduce the network's misjudgment rate for road-like objects around roads, the feature map was divided into several local features, which were then multiplied by the global context information to strengthen the extraction of each category's features. Multi-stage features were fused to locate roads accurately in space and to reduce the probability of identifying other ground objects as roads. An adaptive global channel attention (AGCA) module was designed, in which global information guides local information so as to enrich the contextual information of each pixel. In the decoding stage, a multi-feature fusion module (MFM) was designed to make full use of the location and semantic information in the feature maps of the four backbone stages, and the correlations between layers were exploited to improve segmentation accuracy. The network was trained and tested on the CITY-OSM, DeepGlobe road extraction, and CHN6-CUG datasets. Test results show that GGMNet achieves excellent road segmentation performance, and its ability to reduce the road false segmentation rate is stronger than that of the comparison networks.
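The abstract describes two mechanisms that can be made concrete in code: the adaptive global channel attention (AGCA) module, in which a global channel descriptor guides statistics computed over local regions of the feature map, and the decoder-side multi-feature fusion module (MFM), which combines the four backbone stage outputs. The following PyTorch-style sketch is only one illustrative reading of that description: the class names, the interpretation of the hyperparameter s as the size of the local-region grid, and the ResNet-style channel sizes are assumptions for illustration, not the authors' implementation.

import torch
import torch.nn as nn
import torch.nn.functional as F


class GlobalGuidedLocalAttention(nn.Module):
    # Illustrative sketch of the AGCA idea (not the authors' code): a global
    # channel descriptor guides channel statistics gathered on an s x s grid
    # of local regions, and the result reweights the input feature map.
    def __init__(self, channels, s=4, reduction=16):
        super().__init__()
        self.s = s
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
            nn.Sigmoid(),
        )

    def forward(self, x):
        b, c, h, w = x.shape
        # Global context: channel weights from global average pooling.
        g = self.fc(x.mean(dim=(2, 3))).view(b, c, 1, 1)
        # Local context: per-region channel statistics, upsampled back to (h, w).
        local = F.adaptive_avg_pool2d(x, self.s)
        local = F.interpolate(local, size=(h, w), mode="nearest")
        # Global information guides the local response, which then modulates x.
        return x * torch.sigmoid(local * g)


class MultiStageFusion(nn.Module):
    # Illustrative sketch of decoder-side multi-feature fusion: the four
    # backbone stage outputs (Res-1 to Res-4) are projected to a common width,
    # aligned to the finest resolution and fused by a 3 x 3 convolution.
    # Channel sizes assume a ResNet-50/101-style backbone.
    def __init__(self, in_channels=(256, 512, 1024, 2048), out_channels=256):
        super().__init__()
        self.proj = nn.ModuleList(
            [nn.Conv2d(c, out_channels, kernel_size=1) for c in in_channels]
        )
        self.fuse = nn.Sequential(
            nn.Conv2d(4 * out_channels, out_channels, kernel_size=3, padding=1),
            nn.BatchNorm2d(out_channels),
            nn.ReLU(inplace=True),
        )

    def forward(self, feats):
        # feats: [Res-1, Res-2, Res-3, Res-4], finest spatial resolution first.
        size = feats[0].shape[-2:]
        aligned = [
            F.interpolate(p(f), size=size, mode="bilinear", align_corners=False)
            for p, f in zip(self.proj, feats)
        ]
        return self.fuse(torch.cat(aligned, dim=1))


if __name__ == "__main__":
    x = torch.randn(1, 256, 64, 64)
    print(GlobalGuidedLocalAttention(256, s=4)(x).shape)  # torch.Size([1, 256, 64, 64])
    stages = [torch.randn(1, c, 64 // 2 ** i, 64 // 2 ** i)
              for i, c in enumerate((256, 512, 1024, 2048))]
    print(MultiStageFusion()(stages).shape)               # torch.Size([1, 256, 64, 64])

Under this reading, s in Tables 1-3 controls how finely the feature map is partitioned before the global descriptor reweights each region; the reported ablations peak at s = 4 on all three datasets.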

Key words: remote sensing image    deep learning    road extraction    attention mechanism    context information
Received: 2023-03-20    Published: 2024-03-27
CLC:  TP 751.1  
About the author: HUAN Hai (1978—), male, associate professor, master's supervisor, engaged in research on artificial intelligence. orcid.org/0000-0002-2158-3386. E-mail: 002274@nuist.edu.cn

Cite this article:

宦海, 盛宇, 顾晨曦. 基于遥感图像道路提取的全局指导多特征融合网络 [J]. 浙江大学学报(工学版), 2024, 58(4): 696-707.

Hai HUAN, Yu SHENG, Chenxi GU. Global guidance multi-feature fusion network based on remote sensing image road extraction [J]. Journal of Zhejiang University (Engineering Science), 2024, 58(4): 696-707.

Link to this article:

https://www.zjujournals.com/eng/CN/10.3785/j.issn.1008-973X.2024.04.005        https://www.zjujournals.com/eng/CN/Y2024/V58/I4/696

Fig. 1 Overall structure of the global guidance multi-feature fusion network
Fig. 2 Overall structure of the adaptive global channel attention module
Fig. 3 Overall structure of the multi-feature fusion module
Unit: %
Method              IoU (background)   IoU (building)   IoU (road)   MIoU
Baseline network    83.44              48.61            76.30        69.45
s = 1               83.86              51.36            77.06        70.76
s = 2               83.78              51.82            76.88        70.83
s = 3               83.71              50.43            77.06        70.40
s = 4               84.03              53.02            77.16        71.40
s = 5               83.65              52.64            76.51        70.94
Table 1 Comparison of hyperparameter values of the adaptive global channel attention module on the CITY-OSM dataset
Fig. 4 Visual comparison of results for different hyperparameter values on the CITY-OSM dataset
Unit: %
Method              IoU (background)   IoU (road)   MIoU
Baseline network    98.06              62.24        80.15
s = 1               98.06              62.37        80.22
s = 2               98.08              62.58        80.33
s = 3               98.10              62.45        80.28
s = 4               98.10              62.80        80.45
s = 5               98.06              62.44        80.25
Table 2 Comparison of hyperparameter values of the adaptive global channel attention module on the DeepGlobe dataset
Fig. 5 Visual comparison of results for different hyperparameter values on the DeepGlobe dataset
Unit: %
Method              IoU (background)   IoU (road)   MIoU
Baseline network    97.12              60.13        78.63
s = 1               97.16              60.22        78.69
s = 2               97.24              60.62        78.93
s = 3               97.25              60.52        78.86
s = 4               97.29              61.94        79.62
s = 5               97.16              60.37        78.77
Table 3 Comparison of hyperparameter values of the adaptive global channel attention module on the CHN6-CUG dataset
Fig. 6 Visual comparison of results for different hyperparameter values on the CHN6-CUG dataset
Unit: %
Method                      IoU (background)   IoU (building)   IoU (road)   MIoU
Baseline network            83.44              48.61            76.30        69.45
+AGCA                       84.03              53.02            77.16        71.40
+AGCA+MFM (Res-1, Res-4)    84.12              53.76            77.18        71.69
+AGCA+MFM                   84.31              54.99            77.68        72.33
Table 4 Module effectiveness analysis on the CITY-OSM dataset
Fig. 7 Visual comparison of module effectiveness analysis results on the CITY-OSM dataset
Unit: %
Method                      IoU (background)   IoU (road)   MIoU
Baseline network            98.06              62.24        80.15
+AGCA                       98.10              62.80        80.45
+AGCA+MFM (Res-1, Res-4)    98.10              62.92        80.51
+AGCA+MFM                   98.14              63.11        80.63
Table 5 Module effectiveness analysis on the DeepGlobe dataset
Unit: %
Method                      IoU (background)   IoU (road)   MIoU
Baseline network            97.12              60.13        78.63
+AGCA                       97.29              61.94        79.62
+AGCA+MFM (Res-1, Res-4)    97.28              62.16        79.72
+AGCA+MFM                   97.37              63.21        80.29
Table 6 Module effectiveness analysis on the CHN6-CUG dataset
Fig. 8 Visual comparison of module effectiveness analysis results on the DeepGlobe dataset
Fig. 9 Visual comparison of module effectiveness analysis results on the CHN6-CUG dataset
Unit: %
Network       IoU (background)   IoU (building)   IoU (road)   MIoU
DeepLabV3     83.44              48.61            76.30        69.45
APCNet        83.75              49.42            76.77        69.98
CCNet         83.32              52.76            76.50        70.86
DANet         81.76              47.61            73.04        67.47
EMANet        83.76              53.34            77.03        71.38
DNLNet        83.95              53.00            77.12        71.36
CRANet        83.29              51.35            76.84        70.49
SANet         84.26              54.78            77.55        72.20
GGMNet        84.31              54.99            77.68        72.33
Table 7 Comparison of segmentation performance of different networks on the CITY-OSM dataset
Fig. 10 Comparison of segmentation results of different networks on the CITY-OSM dataset
Unit: %
Network       IoU (background)   IoU (road)   MIoU
DeepLabV3     98.06              62.24        80.15
APCNet        98.03              59.78        78.91
CCNet         98.09              61.77        79.93
DANet         97.98              61.77        79.88
EMANet        98.06              61.45        79.76
DNLNet        98.10              62.19        80.15
CRANet        98.05              62.04        80.04
SANet         98.15              63.05        80.60
GGMNet        98.14              63.11        80.63
Table 8 Comparison of segmentation performance of different networks on the DeepGlobe dataset
Fig. 11 Comparison of segmentation results of different networks on the DeepGlobe dataset
Unit: %
Network       IoU (background)   IoU (road)   MIoU
DeepLabV3     97.12              60.13        78.63
APCNet        97.24              61.90        79.57
CCNet         97.26              61.58        79.42
DANet         97.23              60.44        78.83
EMANet        97.18              62.04        79.61
DNLNet        97.32              62.50        79.91
CRANet        97.32              62.88        80.10
SANet         97.35              63.08        80.22
GGMNet        97.37              63.21        80.29
Table 9 Comparison of segmentation performance of different networks on the CHN6-CUG dataset
Fig. 12 Comparison of segmentation results of different networks on the CHN6-CUG dataset
1 QUAN B, LIU B, FU D, et al. Improved DeepLabV3 for better road segmentation in remote sensing images [C]// 2021 International Conference on Computer Engineering and Artificial Intelligence . Shanghai: IEEE, 2021: 331–334.
2 ZHANG J, LI Y, SI Y, et al. A low-grade road extraction method using SDG-DenseNet based on the fusion of optical and SAR images at decision level [J]. Remote Sensing, 2022, 14(12): 2870
doi: 10.3390/rs14122870
3 胡春安, 陈玉玲. 基于Gabor和改进LDA的人耳识别 [J]. 计算机工程与科学, 2015, 37(7): 1355-1359
HU Chun’an, CHEN Yuling. An ear recognition algorithm based on Gabor features and improved LDA [J]. Computer Engineering and Science, 2015, 37(7): 1355-1359
4 邢军. 基于Sobel算子数字图像的边缘检测 [J]. 微机发展, 2005, 15(9): 48-49
XING Jun. Edge detection of Sobel-based digital image [J]. Microcomputer Development, 2005, 15(9): 48-49
5 SUN Q, LIU Q. The target fish’s population detection based on the improved watershed algorithm [C]// 2022 7th International Conference on Intelligent Computing and Signal Processing (ICSP) . Xi’an: IEEE, 2022: 507-510.
6 QIN J, HE Z S. A SVM face recognition method based on Gabor-featured key points [C]// 2005 International Conference on Machine Learning and Cybernetics . Guangzhou: IEEE, 2005: 5144–5149.
7 董师师, 黄哲学. 随机森林理论浅析 [J]. 集成技术, 2013, 2(1): 1-7
DONG Shishi, HUANG Zhexue. A brief theoretical overview of random forests [J]. Journal of Integration Technology, 2013, 2(1): 1-7
8 GU J, WANG Z, KUEN J, et al. Recent advances in convolutional neural networks [J]. Pattern Recognition, 2018, 77: 354-377
doi: 10.1016/j.patcog.2017.10.013
9 ZHANG Z, LIU Q, WANG Y. Road extraction by deep residual U-Net [J]. IEEE Geoscience and Remote Sensing Letters, 2018, 15(5): 749-753
doi: 10.1109/LGRS.2018.2802944
10 LONG J, SHELHAMER E, DARRELL T. Fully convolutional networks for semantic segmentation [C]// 2015 IEEE Conference on Computer Vision and Pattern Recognition . Boston: IEEE, 2015: 3431–3440.
11 LIN G, MILAN A, SHEN C, et al. RefineNet: multi-path refinement networks for high-resolution semantic segmentation [C]// 2017 IEEE Conference on Computer Vision and Pattern Recognition . Honolulu: IEEE, 2017: 1925–1934.
12 HE K, ZHANG X, REN S, et al. Deep residual learning for image recognition [C]// 2016 IEEE Conference on Computer Vision and Pattern Recognition . Las Vegas: IEEE, 2016: 770–778.
13 ZHAO H, SHI J, QI X, et al. Pyramid scene parsing network [C]// 2017 IEEE Conference on Computer Vision and Pattern Recognition . Honolulu: IEEE, 2017: 2881–2890.
14 CHEN L C, ZHU Y, PAPANDREOU G, et al. Encoder-decoder with atrous separable convolution for semantic image segmentation [C]// European Conference on Computer Vision . [S. l.]: Springer, 2018: 833–851.
15 HU J, SHEN L, SUN G. Squeeze-and-excitation networks [C]// 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition . Salt Lake City: IEEE, 2018: 7132–7141.
16 WOO S, PARK J, LEE J Y, et al. CBAM: convolutional block attention module [C]// European Conference on Computer Vision . [S. l.]: Springer, 2018: 3–19.
17 FU J, LIU J, TIAN H, et al. Dual attention network for scene segmentation [C]// 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition . Long Beach: IEEE, 2019: 3146–3154.
18 ZHANG W, HUANG Z, LUO G, et al. TopFormer: token pyramid transformer for mobile semantic segmentation [C]// 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition . New Orleans: IEEE, 2022: 12083–12093.
19 KAISER P, WEGNER J D, LUCCHI A, et al. Learning aerial image segmentation from online maps [J]. IEEE Transactions on Geoscience and Remote Sensing, 2017, 55(11): 6054-6068
doi: 10.1109/TGRS.2017.2719738
20 DEMIR I, KOPERSKI K, LINDENBAUM D, et al. Deepglobe 2018: a challenge to parse the earth through satellite images [C]// 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops . Salt Lake City: IEEE, 2018: 172-181.
21 ZHU Q, ZHANG Y, WANG L, et al. A global context-aware and batch-independent network for road extraction from VHR satellite imagery [J]. ISPRS Journal of Photogrammetry and Remote Sensing, 2021, 175: 353-365
doi: 10.1016/j.isprsjprs.2021.03.016
22 HE T, ZHANG Z, ZHANG H, et al. Bag of tricks for image classification with convolutional neural networks [C]// 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition . Long Beach: IEEE, 2019: 558–567.
23 CHEN L C, PAPANDREOU G, SCHROFF F, et al. Rethinking atrous convolution for semantic image segmentation [EB/OL]. (2017-12-05)[2023-05-10]. https://arxiv.org/pdf/1706.05587.pdf.
24 HE J, DENG Z, ZHOU L, et al. Adaptive pyramid context network for semantic segmentation [C]// 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition . Long Beach: IEEE, 2019: 7519–7528.
25 HUANG Z, WANG X, HUANG L, et al. CCNet: criss-cross attention for semantic segmentation [C]// 2019 IEEE/CVF International Conference on Computer Vision . Seoul: IEEE, 2019: 603–612.
26 LI X, ZHONG Z, WU J, et al. Expectation-maximization attention networks for semantic segmentation [C]// 2019 IEEE/CVF International Conference on Computer Vision . Seoul: IEEE, 2019: 9167–9176.
27 YIN M, YAO Z, CAO Y, et al. Disentangled non-local neural networks [C]// European Conference on Computer Vision . [S. l.]: Springer, 2020: 191–207.
28 LI S, LIAO C, DING Y, et al. Cascaded residual attention enhanced road extraction from remote sensing images [J]. ISPRS International Journal of Geo-Information, 2022, 11(1): 9