Journal of Zhejiang University (Engineering Science)  2025, Vol. 59 Issue (5): 938-946    DOI: 10.3785/j.issn.1008-973X.2025.05.007
Computer Technology, Information Engineering
Super-resolution reconstruction of remote sensing image based on CNN and Transformer aggregation
Mingzhi HU, Jun SUN*, Biao YANG, Kairong CHANG, Junlong YANG
School of Information Engineering and Automation, Kunming University of Science and Technology, Kunming 650500, China
Abstract:

Existing remote sensing image super-resolution models rarely account for the effects of noise, blur, JPEG compression, and other degradations on image reconstruction, and Transformer modules are limited in their ability to reconstruct high-frequency information. To address these problems, a multi-layer degradation module is proposed, and a network that aggregates a CNN and a Transformer is designed: the CNN captures the high-frequency information of the image, and the Transformer extracts global information. An attention-based aggregation module fuses the two branches, markedly improving the reconstruction accuracy of local high-frequency detail while preserving global structural coherence. The model was evaluated on six randomly selected scenes from the AID dataset and compared with the MM-realSR model on PSNR and SSIM. On average, the proposed model improves PSNR by 1.61 dB and SSIM by 0.023 over MM-realSR.

Key words: remote sensing image    super-resolution reconstruction    multi-layer degradation module    high-frequency information    global information    aggregation module
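The abstract reports its results in PSNR (dB). As a reminder of what that number measures, here is a minimal sketch of the standard PSNR definition, 10·log10(MAX² / MSE), in plain Python. The function name `psnr`, the nested-list image format, and the 8-bit peak value of 255 are illustrative assumptions; the paper's own evaluation code and settings (for example colour space or border cropping) are not given on this page.

```python
import math

def psnr(reference, estimate, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized images.

    Images are nested lists (rows of pixel intensities); this format is an
    assumption made for the sake of a dependency-free example.
    """
    ref_flat = [p for row in reference for p in row]
    est_flat = [p for row in estimate for p in row]
    if len(ref_flat) != len(est_flat):
        raise ValueError("images must have the same number of pixels")
    # Mean squared error over all pixels.
    mse = sum((r - e) ** 2 for r, e in zip(ref_flat, est_flat)) / len(ref_flat)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)

# Toy 3x3 example: the estimate differs from the reference by at most 2 levels.
reference = [[52, 55, 61], [59, 79, 61], [76, 62, 59]]
estimate = [[54, 55, 60], [59, 77, 63], [75, 62, 59]]
value = psnr(reference, estimate)
print(f"PSNR = {value:.2f} dB")  # roughly 46.2 dB
```

Under this definition, for a single image at a fixed peak value, a gain of 1.61 dB corresponds to a mean squared error smaller by a factor of about 10^0.161 ≈ 1.45.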
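Received: 2024-05-28    Published: 2025-04-25
CLC:  TP 751
Funding: National Natural Science Foundation of China (62363019); Yunnan Fundamental Research Projects (202401AT070355).
Corresponding author: Jun SUN     E-mail: 1404481618@qq.com;31408891@qq.com
About the author: Mingzhi HU (born 1998), male, master's student, working on image processing and analysis. orcid.org/0009-0001-9556-3396. E-mail: 1404481618@qq.com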

Cite this article:

Mingzhi HU, Jun SUN, Biao YANG, Kairong CHANG, Junlong YANG. Super-resolution reconstruction of remote sensing image based on CNN and Transformer aggregation. Journal of Zhejiang University (Engineering Science), 2025, 59(5): 938-946.

Link to this article:

https://www.zjujournals.com/eng/CN/10.3785/j.issn.1008-973X.2025.05.007
https://www.zjujournals.com/eng/CN/Y2025/V59/I5/938

Fig. 1  Data synthesis process of the single-layer and multi-layer degradation modules
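The page does not reproduce the figure or the degradation details, so the following is only a rough sketch of how single-layer versus multi-layer synthetic degradation could be written. Everything in it is an assumption made for illustration: the blur → downsample → noise order, the 3x3 box blur, the fixed factor of 2 per layer, the noise level, and the function names. JPEG compression, which the abstract lists as a degradation factor, is omitted to keep the example dependency-free. It should not be read as the authors' pipeline.

```python
import random

def box_blur(img):
    """3x3 mean filter with edge clamping (stand-in for a blur kernel)."""
    h, w = len(img), len(img[0])
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            acc = 0.0
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    yy = min(max(y + dy, 0), h - 1)
                    xx = min(max(x + dx, 0), w - 1)
                    acc += img[yy][xx]
            row.append(acc / 9.0)
        out.append(row)
    return out

def add_noise(img, sigma, rng):
    """Additive Gaussian noise, clipped to the [0, 255] range."""
    return [[min(255.0, max(0.0, p + rng.gauss(0.0, sigma))) for p in row]
            for row in img]

def downsample(img, factor):
    """Keep every `factor`-th pixel in both directions."""
    return [row[::factor] for row in img[::factor]]

def degrade(img, layers, factor=2, sigma=5.0, seed=0):
    """Apply blur -> downsample -> noise `layers` times (hypothetical order)."""
    rng = random.Random(seed)
    for _ in range(layers):
        img = add_noise(downsample(box_blur(img), factor), sigma, rng)
    return img

# Synthetic 16x16 "high-resolution" image used only for the demonstration.
hr = [[float((x * 7 + y * 13) % 256) for x in range(16)] for y in range(16)]
single = degrade(hr, layers=1)  # one pass: 16x16 -> 8x8
multi = degrade(hr, layers=2)   # two passes: 16x16 -> 4x4
print(len(single), len(single[0]), len(multi), len(multi[0]))
```

The point of the sketch is structural: a multi-layer scheme repeats the degradation chain, so blur and noise from one pass are themselves blurred and resampled by the next, which yields more varied low-resolution training inputs than a single pass.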
Fig. 2  Overall network structure and structure of the deep feature extraction module
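| Model | Airport PSNR/dB | Airport SSIM | City PSNR/dB | City SSIM | Farmland PSNR/dB | Farmland SSIM | Parking lot PSNR/dB | Parking lot SSIM | Playground PSNR/dB | Playground SSIM | Port PSNR/dB | Port SSIM |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Bicubic | 26.22 | 0.6871 | 23.73 | 0.6182 | 29.49 | 0.7347 | 19.73 | 0.5451 | 25.45 | 0.6905 | 22.23 | 0.6699 |
| SwinIR | 24.43 | 0.6598 | 22.19 | 0.5867 | 28.15 | 0.7142 | 17.79 | 0.5126 | 24.20 | 0.6666 | 20.09 | 0.6401 |
| CDC | 24.82 | 0.6272 | 22.50 | 0.5867 | 26.87 | 0.6601 | 20.04 | 0.5547 | 24.11 | 0.6550 | 21.71 | 0.6691 |
| DAN | 25.70 | 0.6922 | 23.60 | 0.6277 | 28.75 | 0.7307 | 19.72 | 0.5711 | 25.31 | 0.6960 | 21.45 | 0.6682 |
| Real-ESRGAN | 27.81 | 0.7296 | 24.82 | 0.6768 | 30.20 | 0.7664 | 21.06 | 0.6468 | 26.33 | 0.7338 | 22.84 | 0.7316 |
| BSRGAN | 27.74 | 0.7318 | 25.24 | 0.6754 | 30.80 | 0.7704 | 21.79 | 0.6413 | 27.10 | 0.7351 | 23.57 | 0.7328 |
| MM-realSR | 27.83 | 0.7649 | 25.64 | 0.7225 | 30.44 | 0.7852 | 22.42 | 0.6949 | 27.64 | 0.7826 | 23.95 | 0.7699 |
| realHAT-TG | 27.76 | 0.7470 | 25.34 | 0.6933 | 30.53 | 0.7739 | 21.96 | 0.6622 | 26.93 | 0.7495 | 23.54 | 0.7426 |
| Proposed model | 29.50 | 0.7857 | 27.27 | 0.7462 | 32.43 | 0.8068 | 23.45 | 0.7292 | 29.57 | 0.8060 | 25.39 | 0.7860 |

Table 1  PSNR and SSIM of different models on six random scenes of the AID test set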
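The headline figures in the abstract (average gains of 1.61 dB in PSNR and 0.023 in SSIM over MM-realSR) can be checked directly against the two relevant rows of Table 1. The short script below does that arithmetic; the variable and function names are illustrative, and the values are copied from the table.

```python
# (PSNR in dB, SSIM) per scene, copied from the MM-realSR and proposed-model
# rows of Table 1, in the table's column order.
SCENES = ["Airport", "City", "Farmland", "Parking lot", "Playground", "Port"]
MM_REALSR = [(27.83, 0.7649), (25.64, 0.7225), (30.44, 0.7852),
             (22.42, 0.6949), (27.64, 0.7826), (23.95, 0.7699)]
PROPOSED = [(29.50, 0.7857), (27.27, 0.7462), (32.43, 0.8068),
            (23.45, 0.7292), (29.57, 0.8060), (25.39, 0.7860)]

def mean_gain(ours, baseline, index):
    """Average per-scene difference for metric `index` (0 = PSNR, 1 = SSIM)."""
    diffs = [o[index] - b[index] for o, b in zip(ours, baseline)]
    return sum(diffs) / len(diffs)

psnr_gain = mean_gain(PROPOSED, MM_REALSR, 0)
ssim_gain = mean_gain(PROPOSED, MM_REALSR, 1)
print(f"mean PSNR gain: {psnr_gain:.3f} dB")  # 1.615 dB
print(f"mean SSIM gain: {ssim_gain:.4f}")     # 0.0233
```

Both values agree with the abstract to the precision it reports. The per-scene PSNR gains range from about 1.0 dB (Parking lot) to about 2.0 dB (Farmland).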
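Fig. 3  Visual comparison of the reconstruction results of different models: quantitative PSNR/SSIM evaluation on 6 samples from the AID test set
Fig. 4  Visual comparison of the reconstruction results of different models: quantitative PSNR/SSIM evaluation on 4 samples from the WHU-RS19 dataset
Fig. 5  Effect of the degradation module on reconstruction quality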
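| Method | Airport PSNR/dB | Airport SSIM | City PSNR/dB | City SSIM | Farmland PSNR/dB | Farmland SSIM | Parking lot PSNR/dB | Parking lot SSIM | Playground PSNR/dB | Playground SSIM | Port PSNR/dB | Port SSIM |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| B | 26.88 | 0.7046 | 24.65 | 0.6516 | 29.67 | 0.7355 | 20.63 | 0.6037 | 26.47 | 0.7148 | 22.82 | 0.7018 |
| B+H | 28.55 | 0.7504 | 26.31 | 0.6929 | 31.47 | 0.7794 | 22.28 | 0.6422 | 28.30 | 0.7600 | 24.56 | 0.7343 |
| B+H+G | 28.60 | 0.7511 | 26.32 | 0.7026 | 31.53 | 0.7790 | 22.67 | 0.6774 | 28.53 | 0.7668 | 24.67 | 0.7512 |
| B+H+G+A1 | 29.07 | 0.7717 | 26.82 | 0.7283 | 32.12 | 0.7963 | 23.07 | 0.7018 | 29.08 | 0.7883 | 25.10 | 0.7724 |
| B+H+G+A | 29.50 | 0.7857 | 27.27 | 0.7462 | 32.43 | 0.8068 | 23.45 | 0.7292 | 29.57 | 0.8060 | 25.39 | 0.7860 |

Table 2  PSNR and SSIM of different ablation configurations on six scenes of the AID test set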
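To read the ablation in Table 2 at a glance, it helps to average PSNR over the six scenes for each configuration and look at the change from one row to the next. The sketch below does this with the PSNR columns copied from the table. The row labels (B, H, G, A1, A) are reproduced as given; this page does not define them, so the code treats them as opaque names.

```python
# PSNR (dB) per scene for each ablation row of Table 2, in the table's
# column order: Airport, City, Farmland, Parking lot, Playground, Port.
ABLATION_PSNR = {
    "B":        [26.88, 24.65, 29.67, 20.63, 26.47, 22.82],
    "B+H":      [28.55, 26.31, 31.47, 22.28, 28.30, 24.56],
    "B+H+G":    [28.60, 26.32, 31.53, 22.67, 28.53, 24.67],
    "B+H+G+A1": [29.07, 26.82, 32.12, 23.07, 29.08, 25.10],
    "B+H+G+A":  [29.50, 27.27, 32.43, 23.45, 29.57, 25.39],
}

# Mean PSNR over the six scenes for each configuration.
means = {name: sum(vals) / len(vals) for name, vals in ABLATION_PSNR.items()}

# Difference between each row and the row listed before it.
names = list(means)
steps = {names[i]: means[names[i]] - means[names[i - 1]]
         for i in range(1, len(names))}

for name in names:
    delta = f" ({steps[name]:+.2f} dB vs previous row)" if name in steps else ""
    print(f"{name:<9} {means[name]:.2f} dB{delta}")
```

The means rise monotonically down the table, from about 25.2 dB for B to about 27.9 dB for B+H+G+A, with row-to-row changes of roughly 1.7, 0.1, 0.5, and 0.4 dB. The largest single step is the one from B to B+H. The last two rows may be alternative variants of the final stage rather than cumulative additions, which this page does not clarify.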
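| Nb | Test set PSNR/dB | Test set SSIM |
|---|---|---|
| 1 | 27.76 | 0.7690 |
| 2 | 27.78 | 0.7724 |
| 3 | 27.94 | 0.7769 |
| 4 | 27.86 | 0.7740 |

Table 3  Average PSNR and SSIM over six scenes of the AID dataset for different numbers of deep feature extraction modules (Nb)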
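Table 3 is small enough to read by eye, but the selection it implies can be written down explicitly. The snippet below, with illustrative names, picks the number of deep feature extraction modules that maximizes each metric, using the values copied from the table.

```python
# Nb -> (mean PSNR in dB, mean SSIM), copied from Table 3.
TABLE3 = {
    1: (27.76, 0.7690),
    2: (27.78, 0.7724),
    3: (27.94, 0.7769),
    4: (27.86, 0.7740),
}

best_psnr = max(TABLE3, key=lambda n: TABLE3[n][0])
best_ssim = max(TABLE3, key=lambda n: TABLE3[n][1])
print(best_psnr, best_ssim)  # 3 3
```

Both metrics peak at Nb = 3 and drop slightly at Nb = 4, so adding a fourth module does not help on this test set.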
Fig. 6  Visualization of the input and output feature maps of the high-frequency module and the global module