Journal of ZheJiang University (Engineering Science)  2025, Vol. 59 Issue (7): 1434-1442    DOI: 10.3785/j.issn.1008-973X.2025.07.011
    
Super-resolution reconstruction of lung CT images based on an improved Transformer
Jie LIU1, You WU1, Jiahe TIAN2, Ke HAN3
1. School of Measurement-Control Technology and Communications Engineering, Harbin University of Science and Technology, Harbin 150080, China
2. Rongcheng Campus, Harbin University of Science and Technology, Weihai 264300, China
3. School of Computer and Information Engineering, Harbin University of Commerce, Harbin 150028, China

Abstract  

A super-resolution reconstruction network for lung CT images based on a locally enhanced Transformer and U-Net was proposed to address the insufficient feature extraction and poor reconstruction of detail caused by the rich grey levels of lung CT images. Dilated convolution was used for deep feature extraction over multiple receptive fields: global image information was obtained from dilated convolution layers with different dilation rates, and the feature information from these different receptive fields was fused. The original features extracted by a 3×3 convolutional layer were fed into an encoder-decoder structure combined with the proposed network, in which the locally enhanced window module reduced computation and captured local information. In the decoding stage, skip connections were used together with a segmentation attention block that fused spatial and channel attention, discarding irrelevant information and exploiting useful information to obtain high-quality reconstructed images. Experimental results showed that, on the SARS-CoV-2 dataset, the proposed network improved the structural similarity index measure and the peak signal-to-noise ratio of 4-fold super-resolution by 0.029 and 0.186 dB, respectively, compared with the Transformer network.
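To make the multi-receptive-field feature extraction described above concrete, the following PyTorch-style sketch applies parallel 3×3 dilated convolutions and fuses their outputs with a 1×1 convolution. The module name, channel width, residual connection and the dilation rates (1, 2, 5) are illustrative assumptions (the rates follow the best setting in Tab.1); this is not the authors' released code.

```python
# Minimal sketch of a multi-receptive-field feature extraction block.
# Module name, channel size, residual add and dilation rates are assumptions.
import torch
import torch.nn as nn


class MultiReceptiveFieldBlock(nn.Module):
    def __init__(self, channels: int = 64, dilations=(1, 2, 5)):
        super().__init__()
        # One 3x3 dilated convolution per dilation rate; padding=dilation keeps
        # the spatial size unchanged so the branches can be concatenated.
        self.branches = nn.ModuleList([
            nn.Conv2d(channels, channels, kernel_size=3, padding=d, dilation=d)
            for d in dilations
        ])
        # A 1x1 convolution fuses the features from the different receptive fields.
        self.fuse = nn.Conv2d(channels * len(dilations), channels, kernel_size=1)
        self.act = nn.ReLU(inplace=True)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        feats = [self.act(branch(x)) for branch in self.branches]
        return self.fuse(torch.cat(feats, dim=1)) + x  # residual connection (assumed)


if __name__ == "__main__":
    block = MultiReceptiveFieldBlock()
    y = block(torch.randn(1, 64, 48, 48))
    print(y.shape)  # torch.Size([1, 64, 48, 48])
```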



Key words: lung CT image; super-resolution reconstruction; Transformer; dilated convolution; segmentation attention
Received: 02 September 2024      Published: 25 July 2025
CLC:  TP 391  
Fund: Natural Science Foundation of Heilongjiang Province (LH2023E086); Science and Technology Project of the Heilongjiang Provincial Department of Transportation (HJK2024B002).
Cite this article:

Jie LIU, You WU, Jiahe TIAN, Ke HAN. Super-resolution reconstruction of lung CT images based on an improved Transformer. Journal of ZheJiang University (Engineering Science), 2025, 59(7): 1434-1442.

URL:

https://www.zjujournals.com/eng/10.3785/j.issn.1008-973X.2025.07.011     OR     https://www.zjujournals.com/eng/Y2025/V59/I7/1434


Fig.1 Structure comparison of proposed super-resolution reconstruction network with U-Net
Fig.2 Diagram of multi-receptive-field feature extraction block
Fig.3 Diagram of encoder-decoder block integrated with U-Net structure
Fig.4 Diagram of segmentation attention block
Dataset       SSIM
              l=(1,1,1)   l=(1,2,5)   l=(1,2,3)   l=(1,3,5)   l=(2,4,8)   l=(6,6,6)
COVID-CT      0.694       0.792       0.776       0.783       0.723       0.670
SARS-CoV-2    0.812       0.849       0.837       0.824       0.737       0.704

Dataset       PSNR/dB
              l=(1,1,1)   l=(1,2,5)   l=(1,2,3)   l=(1,3,5)   l=(2,4,8)   l=(6,6,6)
COVID-CT      26.938      27.210      27.114      27.275      27.012      26.701
SARS-CoV-2    26.976      28.362      28.175      28.190      27.244      26.917
Tab.1 Objective evaluation results of image quality for proposed network at six dilation rates in two datasets
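For reference, the SSIM and PSNR values reported in these tables can be computed with standard implementations. The sketch below uses scikit-image on a pair of grayscale slices; the file names are placeholders, not files from the paper.

```python
# Computing PSNR and SSIM between a reconstructed CT slice and its reference.
# File names are placeholders; both images are normalised to floats in [0, 1].
from skimage import img_as_float
from skimage.io import imread
from skimage.metrics import peak_signal_noise_ratio, structural_similarity

reference = img_as_float(imread("hr_slice.png", as_gray=True))
reconstructed = img_as_float(imread("sr_slice.png", as_gray=True))

psnr = peak_signal_noise_ratio(reference, reconstructed, data_range=1.0)
ssim = structural_similarity(reference, reconstructed, data_range=1.0)
print(f"PSNR = {psnr:.3f} dB, SSIM = {ssim:.3f}")
```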
Backbone network   COVID-CT              SARS-CoV-2
                   SSIM      PSNR/dB     SSIM      PSNR/dB
Transformer        0.802     27.037      0.833     27.687
U-Net              0.662     26.101      0.714     26.115
DCUformer          0.831     27.548      0.862     28.873
Tab.2 Comparison of objective evaluation results for reconstructed image quality in different backbone networks
Skip connection    COVID-CT              SARS-CoV-2
                   SSIM      PSNR/dB     SSIM      PSNR/dB
Concat             0.801     27.325      0.827     27.580
SPA                0.791     27.130      0.833     27.325
CA                 0.812     27.353      0.849     27.585
DCUformer          0.831     27.548      0.862     28.873
Tab.3 Effect of different skip connections on quality of reconstructed images
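Tab.3 compares plain concatenation, spatial attention (SPA), channel attention (CA) and the full segmentation attention block on the skip connections. The sketch below illustrates one plausible way to fuse channel and spatial attention on a skip connection; it is only an assumption about the general idea, not the authors' exact segmentation attention design.

```python
# Illustrative fusion of channel and spatial attention on a skip connection.
# This is an assumed sketch, not the paper's exact segmentation attention block.
import torch
import torch.nn as nn


class ChannelSpatialAttention(nn.Module):
    def __init__(self, channels: int, reduction: int = 8):
        super().__init__()
        # Channel attention: squeeze spatial dimensions, then re-weight channels.
        self.channel_gate = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(channels, channels // reduction, 1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels // reduction, channels, 1),
            nn.Sigmoid(),
        )
        # Spatial attention: a 7x7 convolution over pooled channel statistics.
        self.spatial_gate = nn.Sequential(
            nn.Conv2d(2, 1, kernel_size=7, padding=3),
            nn.Sigmoid(),
        )

    def forward(self, skip: torch.Tensor, decoder: torch.Tensor) -> torch.Tensor:
        x = skip + decoder                    # fuse encoder and decoder features
        x = x * self.channel_gate(x)          # keep informative channels
        pooled = torch.cat([x.mean(dim=1, keepdim=True),
                            x.amax(dim=1, keepdim=True)], dim=1)
        return x * self.spatial_gate(pooled)  # keep informative spatial positions


if __name__ == "__main__":
    att = ChannelSpatialAttention(channels=64)
    out = att(torch.randn(1, 64, 32, 32), torch.randn(1, 64, 32, 32))
    print(out.shape)  # torch.Size([1, 64, 32, 32])
```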
No.   MFFEB   Local enhancement   SAB   COVID-CT              SARS-CoV-2
                                        SSIM      PSNR/dB     SSIM      PSNR/dB
1     ×       ×                   ×     0.802     27.037      0.833     27.687
2     √       ×                   ×     0.812     27.210      0.849     27.362
3     ×       √                   ×     0.824     27.385      0.855     27.507
4     ×       ×                   √     0.826     27.401      0.858     27.798
5     √       √                   √     0.831     27.548      0.862     27.873
Tab.4 Modular ablation experiments of proposed network in two datasets
Fig.5 Comparison of convergence speed and objective index for different networks
Network        COVID-CT              SARS-CoV-2
               PSNR/dB    SSIM       PSNR/dB    SSIM
PBPN           32.061     0.792      32.368     0.815
Transformer    32.508     0.856      32.992     0.882
SwinIR         33.143     0.863      33.613     0.887
Restormer      32.896     0.865      33.502     0.899
HNCT           33.129     0.868      33.418     0.900
CuNeRF         33.247     0.860      33.506     0.896
SARGD          33.164     0.857      33.452     0.882
DCUformer      33.868     0.883      34.300     0.929
Tab.5 Quantitative comparison of different networks in two datasets (×2LR)
Network        COVID-CT              SARS-CoV-2
               PSNR/dB    SSIM       PSNR/dB    SSIM
PBPN           28.845     0.762      29.404     0.780
Transformer    29.493     0.824      29.845     0.850
SwinIR         29.832     0.830      30.181     0.855
Restormer      29.865     0.834      30.246     0.868
HNCT           29.883     0.836      30.149     0.868
CuNeRF         29.874     0.839      30.375     0.890
SARGD          29.679     0.826      30.167     0.872
DCUformer      30.296     0.852      31.182     0.892
Tab.6 Quantitative comparison of different networks in two datasets (×3LR)
Network        COVID-CT              SARS-CoV-2
               PSNR/dB    SSIM       PSNR/dB    SSIM
PBPN           26.498     0.735      26.813     0.756
Transformer    27.037     0.802      27.687     0.833
SwinIR         27.211     0.813      27.728     0.840
Restormer      27.220     0.809      27.731     0.842
HNCT           27.224     0.811      27.726     0.841
CuNeRF         27.435     0.824      27.671     0.856
SARGD          27.217     0.807      27.440     0.847
DCUformer      27.548     0.831      27.873     0.862
Tab.7 Quantitative comparison of different networks in two datasets (×4LR)
Fig.6 Comparison of visual effects for different image super-resolution reconstruction networks (×2LR)
Fig.7 Comparison of visual effects for different image super-resolution reconstruction networks (×3LR)
Fig.8 Comparison of visual effects for different image super-resolution reconstruction networks (×4LR)
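The ×2/×3/×4 low-resolution inputs used in such comparisons are commonly obtained by bicubic downsampling of the high-resolution slices. The snippet below shows this assumed preprocessing step with Pillow; the paper's exact degradation pipeline is not reproduced here, and the file names are placeholders.

```python
# Assumed preprocessing: generate x2/x3/x4 low-resolution inputs from a
# high-resolution CT slice by bicubic downsampling (a common SR protocol).
from PIL import Image

hr = Image.open("hr_slice.png").convert("L")  # 8-bit grayscale CT slice

for scale in (2, 3, 4):
    lr = hr.resize((hr.width // scale, hr.height // scale),
                   Image.Resampling.BICUBIC)
    lr.save(f"lr_x{scale}.png")
```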
[1]   FAN Jinhe. Research on super-resolution CT image reconstruction algorithm based on deep learning [D]. Mianyang: Southwest University of Science and Technology, 2023: 1–66.
[2]   ZHAO Xiaoqiang, WANG Ze, SONG Zhaoyang, et al. Image super-resolution reconstruction based on dynamic attention network [J]. Journal of Zhejiang University: Engineering Science, 2023, 57(8): 1487–1494.
[3]   ZHENG Yuekun, GE Mingfeng, CHANG Zhimin, et al. Super-resolution reconstruction for colorectal endoscopic images based on a residual network [J]. Chinese Optics, 2023, 16(5): 1022–1033. doi: 10.37188/CO.2022-0247
[4]   LI Yan, REN Wenqi, ZHANG Changqing, et al. Super-resolution of endoscopic images based on real degradation estimation and high-frequency guidance [J]. Acta Automatica Sinica, 2024, 50(2): 334–347.
[5]   SONG Quanbo, LI Yangke, FAN Yeying, et al. CBCT tooth images super-resolution method based on GAN prior [J]. Journal of Computer-Aided Design and Computer Graphics, 2023, 35(11): 1751–1759.
[6]   VASWANI A, SHAZEER N, PARMAR N, et al. Attention is all you need [C]// Conference on Neural Information Processing Systems. Long Beach: MIT Press, 2017: 6000–6010.
[7]   LIANG J, CAO J, SUN G, et al. SwinIR: image restoration using swin transformer [C]// Proceedings of the IEEE/CVF International Conference on Computer Vision Workshops. Montreal: IEEE, 2021: 1833–1844.
[8]   LV Xindong, LI Jiao, DENG Zhennan, et al. Structured image super-resolution network based on improved Transformer [J]. Journal of Zhejiang University: Engineering Science, 2023, 57(5): 865–874.
[9]   YU P, ZHANG H, KANG H, et al. RPLHR-CT dataset and transformer baseline for volumetric super-resolution from CT scans [C]// Medical Image Computing and Computer Assisted Intervention. [S. l.]: Springer, 2022: 344–353.
[10]   ZHAO Kaiguang. Super-resolution reconstruction of lung CT images based on deep learning [D]. Changchun: Changchun University of Science and Technology, 2022: 1–55.
[11]   LIU Wei. 3D head MRI super-resolution reconstruction based on deep learning [D]. Guilin: Guilin University of Electronic Technology, 2022: 1–54.
[12]   LI Guangyuan. Deep learning-based magnetic resonance imaging super-resolution reconstruction [D]. Yantai: Yantai University, 2023: 1–77.
[13]   LI Zhong, WANG Yajing, MA Qiaomei. Super-resolution reconstruction algorithm of medical images based on dilated convolution [J]. Journal of Computer Applications, 2023, 43(9): 2940–2947.
[14]   YANG X, HE X, ZHAO J, et al. COVID-CT-dataset: a CT scan dataset about COVID-19 [EB/OL]. (2020−06−17)[2024−07−18]. https://arxiv.org/pdf/2003.13865.
[15]   SOARES E, ANGELOV P, BIASO S, et al. SARS-CoV-2 CT-scan dataset: a large dataset of real patients CT scans for SARS-CoV-2 identification [EB/OL]. (2020−05−14)[2024−07−18]. https://www.medrxiv.org/content/10.1101/2020.04.24.20078584v3.full.pdf.
[16]   WANG C, LV X, SHAO M, et al. A novel fuzzy hierarchical fusion attention convolution neural network for medical image super-resolution reconstruction [J]. Information Sciences, 2023, 622: 424–436. doi: 10.1016/j.ins.2022.11.140
[17]   WANG Z, BOVIK A C, SHEIKH H R, et al. Image quality assessment: from error visibility to structural similarity [J]. IEEE Transactions on Image Processing, 2004, 13(4): 600–612. doi: 10.1109/TIP.2003.819861
[18]   WANG P, CHEN P, YUAN Y, et al. Understanding convolution for semantic segmentation [C]// Proceedings of the IEEE Winter Conference on Applications of Computer Vision. Lake Tahoe: IEEE, 2018: 1451–1460.
[19]   SONG Z, ZHAO X, HUI Y, et al. Progressive back-projection network for COVID-CT super-resolution [J]. Computer Methods and Programs in Biomedicine, 2021, 208: 106193. doi: 10.1016/j.cmpb.2021.106193
[20]   ZAMIR S W, ARORA A, KHAN S, et al. Restormer: efficient transformer for high-resolution image restoration [C]// Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. New Orleans: IEEE, 2022: 5718–5729.
[21]   FANG J, LIN H, CHEN X, et al. A hybrid network of CNN and transformer for lightweight image super-resolution [C]// Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops. New Orleans: IEEE, 2022: 1102–1111.
[22]   CHEN Z, YANG L, LAI J H, et al. CuNeRF: cube-based neural radiance field for zero-shot medical image arbitrary-scale super resolution [C]// Proceedings of the IEEE/CVF International Conference on Computer Vision. Paris: IEEE, 2023: 21128–21138.