Remote sensing road extraction by fusing multi-scale resolution and strip feature

doi:10.3785/j.issn.1008-973X.2026.03.014

Journal of ZheJiang University (Engineering Science)

2026, Vol. 60, Issue (3): 585-593

Guoyan LI, Penghui LI, Rong LIU*, Yupeng MEI, Minghui ZHANG

College of Computer and Information Engineering, Tianjin Chengjian University, Tianjin 300384, China


Abstract

A multi-scale resolution and strip feature fusion network (MSRSF-Net) was proposed to address the fragmentation and loss of fine detail that occur when long-range topological road features are extracted from remote sensing imagery. A strip-shaped attention mechanism was designed to enhance the feature representation of elongated roads. The encoder integrated a dual channel-spatial attention mechanism with multi-resolution residual branches to achieve collaborative cross-scale feature extraction. The decoder adopted a feature fusion architecture combining strip and square convolutions, which improved the topological continuity of the extracted roads. Experimental results on the Massachusetts, DeepGlobe and SpaceNet datasets showed that MSRSF-Net achieved IoU scores of 73.76%, 68.57% and 59.98% and APLS scores of 69.78%, 60.27% and 62.17%, respectively, indicating better preservation of road connectivity than mainstream segmentation models.
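
The intuition behind combining strip and square convolutions can be illustrated with a small sketch. It is not the MSRSF-Net implementation: the averaging kernels, the element-wise mean used as the fusion rule, and the helper names `conv2d_same`, `mean_kernel` and `fuse_mean` are all assumptions made for illustration. The example applies a horizontal 1 x 5 strip kernel and a 5 x 5 square kernel to a tiny binary image containing a horizontal road with a one-pixel gap, then compares the responses at the gap.

```python
# Illustrative sketch only; this is NOT the MSRSF-Net implementation.
# All helper names and the fusion rule below are hypothetical.

def conv2d_same(image, kernel):
    """Zero-padded 2D cross-correlation that keeps the input size."""
    h, w = len(image), len(image[0])
    kh, kw = len(kernel), len(kernel[0])
    pad_h, pad_w = kh // 2, kw // 2
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            acc = 0.0
            for i in range(kh):
                for j in range(kw):
                    yy, xx = y + i - pad_h, x + j - pad_w
                    # Pixels outside the image count as zero.
                    if 0 <= yy < h and 0 <= xx < w:
                        acc += image[yy][xx] * kernel[i][j]
            out[y][x] = acc
    return out


def mean_kernel(height, width):
    """Averaging kernel of the given shape (weights sum to 1)."""
    weight = 1.0 / (height * width)
    return [[weight] * width for _ in range(height)]


def fuse_mean(a, b):
    """Element-wise average of two maps (an assumed fusion rule)."""
    return [[(x + y) / 2.0 for x, y in zip(row_a, row_b)]
            for row_a, row_b in zip(a, b)]


# 5 x 7 binary image: a horizontal road on row 2 with a gap at column 3.
road = [[0] * 7 for _ in range(5)]
for col in range(7):
    road[2][col] = 1
road[2][3] = 0

strip_map = conv2d_same(road, mean_kernel(1, 5))   # horizontal strip kernel
square_map = conv2d_same(road, mean_kernel(5, 5))  # square kernel
fused_map = fuse_mean(strip_map, square_map)

gap_strip = strip_map[2][3]    # about 0.80: 4 of 5 strip pixels are road
gap_square = square_map[2][3]  # about 0.16: 4 of 25 square pixels are road
gap_fused = fused_map[2][3]    # about 0.48: mean of the two responses

print(round(gap_strip, 2), round(gap_square, 2), round(gap_fused, 2))
```

In this toy case the strip kernel aligned with the road gathers evidence from both sides of the gap, while the square kernel dilutes it with background pixels. A learned network would use trained weights and several strip orientations, so the numbers here only indicate the direction of the effect.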

Key words: road extraction; strip-shaped convolution; multi-scale feature fusion; attention mechanism; ResNet residual structure

Received: 06 April 2025 Published: 04 February 2026

CLC: TP 751

Fund: Tianjin Science and Technology Special Commissioner Funded Project (24YDTPJC00410).

Corresponding Authors: Rong LIU E-mail: ligy@tcu.edu.cn;lr@tcu.edu.cn


Cite this article:

Guoyan LI, Penghui LI, Rong LIU, Yupeng MEI, Minghui ZHANG. Remote sensing road extraction by fusing multi-scale resolution and strip feature. Journal of ZheJiang University (Engineering Science), 2026, 60(3): 585-593.

URL:

https://www.zjujournals.com/eng/10.3785/j.issn.1008-973X.2026.03.014 OR https://www.zjujournals.com/eng/Y2026/V60/I3/585

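Remote sensing road extraction fusing multi-scale resolution and strip features

To address the connectivity breaks and loss of detail that existing deep learning methods exhibit when extracting long-range topological road features from remote sensing images, a multi-scale resolution and strip feature fusion network (MSRSF-Net) is proposed. The network uses a strip-shaped morphological attention mechanism to strengthen its focus on the features of elongated roads. The encoder integrates a dual channel-spatial attention mechanism with multi-resolution residual branches to extract cross-scale features collaboratively. The decoder adopts a feature fusion architecture that combines strip convolution and square convolution, improving the topological coherence of road extraction. Experiments on the Massachusetts, DeepGlobe and SpaceNet datasets show that MSRSF-Net reaches IoU values of 73.76%, 68.57% and 59.98% and APLS values of 69.78%, 60.27% and 62.17%, respectively, and preserves road continuity better than mainstream segmentation models.

Key words: road extraction, strip convolution, multi-scale feature fusion, attention mechanism, ResNet residual structure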
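
The IoU metric reported in the abstract is, in its usual form for binary segmentation, the ratio of the intersection to the union of the predicted and ground-truth road pixels. The sketch below assumes that standard definition; the abstract does not spell out the exact evaluation protocol, so thresholding, per-image versus per-dataset averaging, and the handling of empty masks are assumptions here. The function name `binary_iou` and the toy masks are hypothetical.

```python
# Minimal sketch of intersection over union (IoU) for binary road masks.
# Assumes the standard definition; not taken from the paper's code.

def binary_iou(pred, target):
    """IoU between two equally sized binary masks (nested lists of 0/1)."""
    intersection = 0
    union = 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            if p and t:
                intersection += 1
            if p or t:
                union += 1
    # Assumed convention: two empty masks count as a perfect match.
    return intersection / union if union else 1.0


# Ground truth: a vertical road through the middle column.
target = [
    [0, 1, 0],
    [0, 1, 0],
    [0, 1, 0],
]
# Prediction: misses the middle road pixel and adds one false positive.
pred = [
    [0, 1, 0],
    [0, 0, 0],
    [0, 1, 1],
]

iou = binary_iou(pred, target)
print(iou)  # 2 shared pixels / 4 pixels in the union = 0.5
```

The example also shows why IoU alone does not capture connectivity: the prediction breaks the road in the middle, which a topology-oriented metric such as APLS is designed to penalize more directly.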

Fig.1 U-network with multi-scale resolution and fusion of strip feature

Fig.2 Multi-directional strip attention mechanism

Fig.3 Multi-resolution feature fusion encoder

Fig.4 Multi-directional strip feature reduction decoder

Tab.1 Ablation experiment of different modules on DeepGlobe dataset

Tab.2 Ablation experiment of different modules on Massachusetts dataset

Tab.3 Ablation experiment of different modules on SpaceNet dataset

Fig.5 Comparison of visualization result of MSRSF-Net with several other advanced models

Tab.4 Comparison between proposed model and several other state-of-the-art road extraction methods on DeepGlobe dataset

Tab.5 Comparison between proposed model and several other state-of-the-art road extraction methods on Massachusetts dataset

Tab.6 Comparison between proposed model and several other state-of-the-art road extraction methods on SpaceNet dataset

Tab.7 Analysis of model complexity


