Global guidance multi-feature fusion network based on remote sensing image road extraction

doi:10.3785/j.issn.1008-973X.2024.04.005

Journal of ZheJiang University (Engineering Science)

2024, Vol. 58

Issue (4): 696-707 DOI: 10.3785/j.issn.1008-973X.2024.04.005

Global guidance multi-feature fusion network based on remote sensing image road extraction

Hai HUAN1(

),Yu SHENG2,Chenxi GU1

1. School of Artificial Intelligence, Nanjing University of Information Science and Technology, Nanjing 210044, China
2. School of Integrated Circuit Science and Engineering, Nanjing University of Posts and Telecommunications, Nanjing 210003, China

Download:

HTML

PDF(2780KB) HTML
Export: BibTeX | EndNote (RIS)

Abstract

Due to the high similarity between buildings and roads in remote sensing images, as well as the existence of shadows and occlusion, the existing deep learning semantic segmentation network generally has a high false segmentation rate when it comes to road segmentation. A global guide multi-feature fusion network (GGMNet) was proposed for road extraction in remote sensing images. To reduce the network’s misjudgment rate of similar features around the road, the feature map was divided into several local features, and then the features were multiplied by the global context information to strengthen the extraction of various features. The method of integrating multi-stage features was used to accurate spatial positioning of roads and reduce the probability of identifying other ground objects as roads. An adaptive global channel attention module was designed, and the global information was used to guide the local information, so as to enrich the context information of each pixel. In the decoding stage, a multi-feature fusion module was designed to make full use of the location information and the semantic information in the feature map of the four stages in the backbone network, and the correlations between layers were uncovered to improve the segmentation accuracy. The network was trained and tested using CITY-OSM dataset, DeepGlobe Road extraction dataset and CHN6-CUG dataset. Test results show that GGMNet has excellent road segmentation performance, and the ability to reduce the false segmentation rate of road segmentation is better than comparing networks.

Key words： remote sensing image deep learning road extraction attention mechanism context information

Received: 20 March 2023 Published: 27 March 2024

CLC:

TP 751.1

	Service
	E-mail this article
	Add to my bookshelf
	Add to citation manager
	E-mail Alert
	RSS
	Articles by authors
	Hai HUAN
	Yu SHENG
	Chenxi GU

Cite this article:

Hai HUAN,Yu SHENG,Chenxi GU. Global guidance multi-feature fusion network based on remote sensing image road extraction. Journal of ZheJiang University (Engineering Science), 2024, 58(4): 696-707.

URL:

https://www.zjujournals.com/eng/10.3785/j.issn.1008-973X.2024.04.005 OR https://www.zjujournals.com/eng/Y2024/V58/I4/696

基于遥感图像道路提取的全局指导多特征融合网络

在遥感图像中，建筑与道路的类间相似度高，且存在阴影和遮挡，使得现有深度学习语义分割网络在分割道路时误分割率高，为此提出全局指导多特征融合网络(GGMNet)用于提取遥感图像中的道路. 将特征图分为若干个局部特征，再将全局上下文信息与局部特征相乘，强化各类别特征的提取，以降低网络对道路周边相似地物的误判率. 采用融合多阶段特征的方法准确定位道路空间，降低将其余地物识别为道路的概率. 设计自适应全局通道注意力模块，利用全局信息指导局部信息，丰富每个像素的上下文信息. 在解码阶段，设计多特征融合模块，充分利用并融合骨干网络4个阶段的特征图中的位置信息与语义信息，发掘层与层之间的关联性以提升分割精度. 使用CITY-OSM数据集、DeepGlobe道路提取数据集和CHN6-CUG数据集对网络进行训练和测试. 测试结果表明，GGMNet具有优秀的道路分割性能，降低道路误分割率的能力比对比网络强.

关键词： 遥感图像, 深度学习, 道路提取, 注意力机制, 上下文信息

Fig.1 Overall structure of global guide multi-feature fusion network

Fig.2 Overall structure of adaptive globe channel attention module

Fig.3 Overall structure of multi-feature fusion module

Tab.1 Comparison of hyperparameter values for adaptive globe channel attention module based on CITY-OSM dataset

Fig.4 Comparison of hyperparameter-value visualization results based on CITY-OSM dataset

Tab.2 Comparison of hyperparameter values for adaptive globe channel attention module based on DeepGlobe dataset

Fig.5 Comparison of hyperparameter-value visualization results based on DeepGlobe dataset

Tab.3 Comparison of hyperparameter values for adaptive globe channel attention module based on CHN6-CUG dataset

Fig.6 Comparison of hyper parameter-value visualization results based on CHN6-CUG dataset

Tab.4 Module validity analysis based on CITY-OSM dataset

Fig.7 Comparison of visualization results for module validity analysis based on CITY-OSM dataset

Tab.5 Module validity analysis based on DeepGlobe dataset

Tab.6 Module validity analysis based on CHN6-CUG dataset

Fig.8 Comparison of visualization results for module validity analysis based on DeepGlobe dataset

Fig.9 Comparison of visualization results for module validity analysis based on CHN6-CUG dataset

Tab.7 Segmentation performance comparison of different networks based on CITY-OSM dataset

Fig.10 Segmentation results comparison of different networks based on CITY-OSM dataset

Tab.8 Segmentation performance comparison of different networks based on DeepGlobe dataset

Fig.11 Segmentation results comparison of different networks based on DeepGlobe dataset

Tab.9 Segmentation performance comparison of different networks based on CHN6-CUG dataset

Fig.12 Segmentation results comparison of different networks based on CHN6-CUG dataset


[1]	QUAN B, LIU B, FU D, et al. Improved DeepLabV3 for better road segmentation in remote sensing images [C]// 2021 International Conference on Computer Engineering and Artificial Intelligence . Shanghai: IEEE, 2021: 331–334.

[2]	ZHANG J, LI Y, SI Y, et al A low-grade road extraction method using SDG-DenseNet based on the fusion of optical and SAR images at decision level[J]. Remote Sensing, 2022, 14 (12): 2870 doi: 10.3390/rs14122870

[3]	胡春安, 陈玉玲基于Gabor和改进 LDA的人耳识别[J]. 计算机工程与科学, 2015, 37 (7): 1355- 1359 HU Chun’an, CHEN Yuling An ear recognition algorithm based on gabor features and improved LDA[J]. Computer Engineering and Science, 2015, 37 (7): 1355- 1359

[4]	邢军基于Sobel算子数字图像的边缘检测[J]. 微机发展, 2005, 15 (9): 48- 49 XING Jun Edge detection of Sobel-based digital image[J]. Microcomputer Development, 2005, 15 (9): 48- 49

[5]	SUN Q, LIU Q. The target fish’s population detection based on the improved watershed algorithm [C]// 2022 7th International Conference on Intelligent Computing and Signal Processing (ICSP) . Xi’an: IEEE, 2022: 507-510.

[6]	QIN J, HE Z S. A SVM face recognition method based on Gabor-featured key points [C]// 2005 International Conference on Machine Learning and Cybernetics . Guangzhou: IEEE, 2005: 5144–5149.

[7]	董师师, 黄哲学随机森林理论浅析[J]. 集成技术, 2013, 2 (1): 1- 7 DONG Shishi, HUANG Zhexue A brief theoretical overview of random forests[J]. Journal of Integration Technology, 2013, 2 (1): 1- 7

[8]	GU J, WANG Z, KUEN J, et al Recent advances in convolutional neural networks[J]. Pattern Recognition, 2018, 77: 354- 377 doi: 10.1016/j.patcog.2017.10.013

[9]	ZHANG Z, LIU Q, WANG Y Road extraction by deep residual U-Net[J]. IEEE Geoscience and Remote Sensing Letters, 2018, 15 (5): 749- 753 doi: 10.1109/LGRS.2018.2802944

[10]	LONG J, SHELHAMER E, DARRELL T. Fully convolutional networks for semantic segmentation [C]// 2015 IEEE Conference on Computer Vision and Pattern Recognition . Boston: IEEE, 2015: 3431–3440.

[11]	LIN G, MILAN A, SHEN C, et al. RefineNet: multi-path refinement networks for high-resolution semantic segmentation [C]// 2017 IEEE Conference on Computer Vision and Pattern Recognition . Honolulu: IEEE, 2017: 1925–1934.

[12]	HE K, ZHANG X, REN S, et al. Deep residual learning for image recognition [C]// 2016 IEEE Conference on Computer Vision and Pattern Recognition . Las Vegas: IEEE, 2016: 770–778.

[13]	ZHAO H, SHI J, QI X, et al. Pyramid scene parsing network [C]// 2017 IEEE Conference on Computer Vision and Pattern Recognition . Honolulu: IEEE, 2017: 2881–2890.

[14]	CHEN L C, ZHU Y, PAPANDREOU G, et al. Encoder-decoder with atrous separable convolution for semantic image segmentation [C]// European Conference on Computer Vision . [S. l.]: Springer, 2018: 833–851.

[15]	HU J, SHEN L, SUN G. Squeeze-and-excitation networks [C]// 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition . Salt Lake City: IEEE, 2018: 7132–7141.

[16]	WOO S, PARK J, LEE J Y, et al. CBAM: convolutional block attention module [C]// European Conference on Computer Vision . [S. l.]: Springer, 2018: 3–19.

[17]	FU J, LIU J, TIAN H, et al. Dual attention network for scene segmentation [C]// 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition . Long Beach: IEEE, 2019: 3146–3154.

[18]	ZHANG W, HUANG Z, LUO G, et al. TopFormer: token pyramid transformer for mobile semantic segmentation [C]// 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition . New Orleans: IEEE, 2022: 12083–12093.

[19]	KAISER P, WEGNER J D, LUCCHI A, et al Learning aerial image segmentation from online maps[J]. IEEE Transactions on Geoscience and Remote Sensing, 2017, 55 (11): 6054- 6068 doi: 10.1109/TGRS.2017.2719738

[20]	DEMIR I, KOPERSKI K, LINDENBAUM D, et al. Deepglobe 2018: a challenge to parse the earth through satellite images [C]// 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops . Salt Lake City: IEEE, 2018: 172-181.

[21]	ZHU Q, ZHANG Y, WANG L, et al A global context-aware and batch-independent network for road extraction from VHR satellite imagery[J]. ISPRS Journal of Photogrammetry and Remote Sensing, 2021, 175: 353- 365 doi: 10.1016/j.isprsjprs.2021.03.016

[22]	HE T, ZHANG Z, ZHANG H, et al. Bag of tricks for image classification with convolutional neural networks [C]// 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition . Long Beach: IEEE, 2019: 558–567.

[23]	CHEN L C, PAPANDREOU G, SCHROFF F, et al. Rethinking atrous convolution for semantic image segmentation [EB/OL]. (2017-12-05)[2023-05-10]. https://arxiv.org/pdf/1706.05587.pdf.

[24]	HE J, DENG Z, ZHOU L, et al. Adaptive pyramid context network for semantic segmentation [C]// 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition . Long Beach: IEEE, 2019: 7519–7528.

[25]	HUANG Z, WANG X, HUANG L, et al. CCNet: criss-cross attention for semantic segmentation [C]// 2019 IEEE/CVF International Conference on Computer Vision . Seoul: IEEE, 2019: 603–612.

[26]	LI X, ZHONG Z, WU J, et al. Expectation-maximization attention networks for semantic segmentation [C]// 2019 IEEE/CVF International Conference on Computer Vision . Seoul: IEEE, 2019: 9167–9176.

[27]	YIN M, YAO Z, CAO Y, et al. Disentangled non-local neural networks [C]// European Conference on Computer Vision . [S. l.]: Springer, 2020: 191–207.

[28]	LI S, LIAO C, DING Y, et al Cascaded residual attention enhanced road extraction from remote sensing images[J]. ISPRS International Journal of Geo-Information, 2022, 11 (1): 9

[1]	Mingjun SONG,Wen YAN,Yizhao DENG,Junran ZHANG,Haiyan TU. Light-weight algorithm for real-time robotic grasp detection[J]. Journal of ZheJiang University (Engineering Science), 2024, 58(3): 599-610.

[2]	Qingjie QIAN,Junhe YU,Hongfei ZHAN,Rui WANG,Jian HU. Dimension prediction method of injection molded parts based on multi-feature fusion of DL-BiGRU[J]. Journal of ZheJiang University (Engineering Science), 2024, 58(3): 646-654.

[3]	Xinhua YAO,Tao YU,Senwen FENG,Zijian MA,Congcong LUAN,Hongyao SHEN. Recognition method of parts machining features based on graph neural network[J]. Journal of ZheJiang University (Engineering Science), 2024, 58(2): 349-359.

[4]	Siyi QIN,Shaoyan GAI,Feipeng DA. Video object detection algorithm based on multi-level feature aggregation under mixed sampler[J]. Journal of ZheJiang University (Engineering Science), 2024, 58(1): 10-19.

[5]	Zhicheng FENG,Jie YANG,Zhichao CHEN. Urban road network extraction method based on lightweight Transformer[J]. Journal of ZheJiang University (Engineering Science), 2024, 58(1): 40-49.

[6]	Xuefei SUN,Ruifeng ZHANG,Xin GUAN,Qiang LI. Lightweight and efficient human pose estimation with enhanced priori skeleton structure[J]. Journal of ZheJiang University (Engineering Science), 2024, 58(1): 50-60.

[7]	Chao-hao ZHENG,Zhi-wei YIN,Gang-feng ZENG,Yue-ping XU,Peng ZHOU,Li LIU. Post-processing of numerical precipitation forecast based on spatial-temporal deep learning model[J]. Journal of ZheJiang University (Engineering Science), 2023, 57(9): 1756-1765.

[8]	Hai-feng LI,Xue-ying ZHANG,Shu-fei DUAN,Hai-rong JIA,Hui-zhi LIANG. Fusing generative adversarial network and temporal convolutional network for Mandarin emotion recognition[J]. Journal of ZheJiang University (Engineering Science), 2023, 57(9): 1865-1875.

[9]	Xiao-qiang ZHAO,Ze WANG,Zhao-yang SONG,Hong-mei JIANG. Image super-resolution reconstruction based on dynamic attention network[J]. Journal of ZheJiang University (Engineering Science), 2023, 57(8): 1487-1494.

[10]	Hui-xin WANG,Xiang-rong TONG. Research progress of recommendation system based on knowledge graph[J]. Journal of ZheJiang University (Engineering Science), 2023, 57(8): 1527-1540.

[11]	Xiu-lan SONG,Zhao-hang DONG,Hang-guan SHAN,Wei-jie LU. Vehicle trajectory prediction based on temporal-spatial multi-head attention mechanism[J]. Journal of ZheJiang University (Engineering Science), 2023, 57(8): 1636-1643.

[12]	Xiao-yan LI,Peng WANG,Jia GUO,Xue LI,Meng-yu SUN. Multi branch Siamese network target tracking based on double attention mechanism[J]. Journal of ZheJiang University (Engineering Science), 2023, 57(7): 1307-1316.

[13]	Zhe YANG,Hong-wei GE,Ting LI. Framework of feature fusion and distribution with mixture of experts for parallel recommendation algorithm[J]. Journal of ZheJiang University (Engineering Science), 2023, 57(7): 1317-1325.

[14]	Yun-hong LI,Jiao-jiao DUAN,Xue-ping SU,Lei-tao ZHANG,Hui-kang YU,Xing-rui LIU. Calligraphy generation algorithm based on improved generative adversarial network[J]. Journal of ZheJiang University (Engineering Science), 2023, 57(7): 1326-1334.

[15]	Chun-juan LIU,Ze QIAO,Hao-wen YAN,Xiao-suo WU,Jia-wei WANG,Yu-qiang XIN. Semantic segmentation network for remote sensing image based on multi-scale mutual attention[J]. Journal of ZheJiang University (Engineering Science), 2023, 57(7): 1335-1344.

Viewed

Full text

Abstract

Cited

Shared

Discussed