Building extraction based on multiple multiscale-feature fusion attention network
Dong-jie YANG1, Xian-jun GAO1,*, Shu-hao RAN1, Guang-bin ZHANG1, Ping WANG2,3, Yuan-wei YANG1,4,5
1. School of Geosciences, Yangtze University, Wuhan 430100, China
2. Aerospace Information Research Institute, Chinese Academy of Sciences, Beijing 100094, China
3. Key Laboratory of Earth Observation of Hainan Province, Sanya 572029, China
4. Hunan Provincial Key Laboratory of Geo-Information Engineering in Surveying, Mapping and Remote Sensing, Hunan University of Science and Technology, Xiangtan 411201, China
5. Beijing Key Laboratory of Urban Spatial Information Engineering, Beijing Institute of Surveying and Mapping, Beijing 100045, China
Abstract A neural network named multiple multiscale-feature fusion attention network (MMFA-Net) was proposed for building segmentation from high-resolution remote sensing images, addressing the over-segmentation and internal cavities that fully convolutional networks tend to produce in building extraction. U-Net was used as the backbone and combined with two modules: multiple-extract efficient channel attention (MECA) and multiscale-feature fusion attention (MFA). The MECA module was placed in the skip connections to strengthen effective feature information through weight allocation, which avoided the excessive allocation of attention to invalid features. Multiple feature extraction was adopted to reduce the loss of effective features. The MFA module was embedded at the bottom of the model. It combined parallel, consecutive small- and medium-scale atrous convolutions with channel attention to obtain different spatial features and spectral-dimension features, which alleviated the pixel loss of large buildings caused by atrous convolution. By integrating the MECA and MFA modules, MMFA-Net improved the integrity and accuracy of the building extraction results. MMFA-Net was verified on the WHU, Massachusetts, and self-drawn building datasets, where it outperformed the five comparison methods in quantitative evaluation. The F1-score and IoU of MMFA-Net reached 93.33% and 87.50% on the WHU dataset, 85.38% and 74.49% on the Massachusetts dataset, and 88.46% and 79.31% on the self-drawn dataset, respectively.
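The two reported metrics can be illustrated with a minimal sketch. It assumes the standard definitions for the building (positive) class, F1 = 2TP / (2TP + FP + FN) and IoU = TP / (TP + FP + FN), computed from the same pixel counts; the abstract does not state the exact evaluation protocol, so this is an assumption. The function names `f1_and_iou` and `iou_from_f1` and the toy masks are hypothetical and not taken from the paper. Under these definitions IoU = F1 / (2 - F1), and the reported pairs are consistent with that relation to rounding precision.

```python
import numpy as np


def f1_and_iou(pred, truth):
    """Return (F1, IoU) for the positive (building) class of two binary masks."""
    pred = np.asarray(pred, dtype=bool)
    truth = np.asarray(truth, dtype=bool)
    tp = int(np.sum(pred & truth))    # building pixels predicted as building
    fp = int(np.sum(pred & ~truth))   # background pixels predicted as building
    fn = int(np.sum(~pred & truth))   # building pixels that were missed
    union = tp + fp + fn
    if union == 0:
        # Assumption: two empty masks are treated as a perfect match.
        return 1.0, 1.0
    f1 = 2 * tp / (2 * tp + fp + fn)
    iou = tp / union
    return f1, iou


def iou_from_f1(f1):
    """IoU implied by F1 when both are computed from the same TP/FP/FN counts."""
    return f1 / (2.0 - f1)


# Hypothetical toy masks (1 = building, 0 = background).
truth = np.array([[1, 1, 0, 0],
                  [1, 1, 0, 0],
                  [0, 0, 0, 0]])
pred = np.array([[1, 1, 1, 0],
                 [1, 0, 0, 0],
                 [0, 0, 0, 0]])

f1, iou = f1_and_iou(pred, truth)
print(f"F1 = {f1:.4f}, IoU = {iou:.4f}")

# (F1, IoU) pairs reported in the abstract, as fractions.
reported = {
    "WHU": (0.9333, 0.8750),
    "Massachusetts": (0.8538, 0.7449),
    "self-drawn": (0.8846, 0.7931),
}
for name, (rep_f1, rep_iou) in reported.items():
    print(f"{name}: IoU implied by F1 = {iou_from_f1(rep_f1):.4f}, reported = {rep_iou:.4f}")
```

Because IoU is a monotone function of F1 under these definitions, the two metrics rank methods identically on a given dataset; reporting both mainly eases comparison with other work.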
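Received: 05 January 2022
Published: 25 October 2022
Fund: Open Fund of the Key Laboratory of Earth Observation of Hainan Province (2020LDE001); Open Fund of the Key Laboratory of National Geographic Census and Monitoring, Ministry of Natural Resources (2020NGCM07); Open Research Fund of the National Engineering Laboratory for Digital Construction and Evaluation Technology of Urban Rail Transit (2021ZH02); Open Fund of the Hunan Provincial Key Laboratory of Geo-Information Engineering in Surveying, Mapping and Remote Sensing, Hunan University of Science and Technology (E22133); Fund of the Beijing Key Laboratory of Urban Spatial Information Engineering (20210205)
Corresponding Authors: Xian-jun GAO
E-mail: 2021710420@yangtzeu.edu.cn; junxgao@yangtzeu.edu.cn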
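Keywords: deep learning, high-resolution remote sensing image, building extraction, multiscale feature fusion, efficient channel attention module, U-Net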