Multi-scale residual learning combined with Dilformer for dual-stream medical image registration network

doi:10.3785/j.issn.1008-973X.2026.05.017

Journal of Zhejiang University (Engineering Science)

2026, Vol. 60, Issue (5): 1082-1091

Computer Technology, Control Engineering

Jing PENG, Jiarong YAN, Jiaying LIU, Ziyi WEI, Shan BAI, Yahong DENG

School of Electronic and Information Engineering, Lanzhou Jiaotong University, Lanzhou 730070, China

Abstract:

Existing medical image registration algorithms suffer from low accuracy under complex deformations and from limited generalization ability. To address these problems, a dual-stream registration network that integrates multi-scale residual learning with a multi-dilated perception Transformer (Dilformer) was proposed. First, a multi-scale residual learning block (MSR) was introduced to enhance feature representation in the dual-stream pyramid feature extraction stage. Second, the Dilformer module was designed to build a heterogeneous receptive field interaction mechanism from dilated convolutions with multiple dilation rates, improving the model's global modeling capacity at low-resolution scales. Third, a separable residual fusion block (SRF) was developed to fuse multi-scale features and improve the accuracy of the predicted deformation field. Finally, a multi-resolution loss function was introduced to constrain network training at several scales, further improving registration performance. Experiments on two 3D brain MRI datasets, LPBA40 and the preprocessed IXI, show that the proposed network achieves higher registration accuracy than the existing models it was compared with. On the IXI dataset, the proposed network achieves a Dice similarity coefficient of 0.769, a 95th percentile Hausdorff distance of 8.937, a negative Jacobian determinant ratio of 0.029, and an inference time of 0.29 s. These results demonstrate the effectiveness and practicality of the proposed network for medical image registration with complex deformations.

Key words: image registration; dilated convolution; Transformer; MRI image; multi-resolution loss
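
The abstract describes Dilformer as using dilated convolutions with several dilation rates to obtain heterogeneous receptive fields. The 1D sketch below illustrates only that general idea and is not the authors' module, whose internals are not given on this page. The function names, the 3-tap kernel, and the dilation rates 1, 2, and 3 are all assumptions chosen for illustration.

```python
def dilated_conv1d(signal, kernel, dilation):
    """Same-size 1D cross-correlation with zero padding and a dilation rate."""
    center = len(kernel) // 2
    out = []
    for i in range(len(signal)):
        acc = 0.0
        for j, weight in enumerate(kernel):
            # Kernel taps are spaced `dilation` samples apart.
            idx = i + (j - center) * dilation
            if 0 <= idx < len(signal):
                acc += weight * signal[idx]
        out.append(acc)
    return out


def receptive_field(kernel_size, dilation):
    """Span of input samples seen by one dilated kernel."""
    return dilation * (kernel_size - 1) + 1


def multi_dilation_branches(signal, kernel, rates=(1, 2, 3)):
    """Apply the same kernel at several (hypothetical) dilation rates."""
    return {rate: dilated_conv1d(signal, kernel, rate) for rate in rates}
```

Feeding a unit impulse through `multi_dilation_branches` shows each branch responding at a different spacing around the impulse, which is the sense in which parallel branches with different rates see differently sized neighborhoods without adding kernel weights.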

Received: 2025-06-03 Published: 2026-05-06

CLC: TP391

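Funding: National Natural Science Foundation of China (62241106, 61861025); Intelligent Tunnel Supervision Robot Research Project (中铁科研院字2020-KJ016-Z016-A2); Gansu Provincial Key Research and Development Program (甘科计[2024]10号-24YFGA037); Gansu Provincial Science and Technology Commissioner Special Project (甘科计[2023]18号-23CXGA0008).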

About the author: Jing PENG (born 1981), female, associate professor, engaged in image processing research. E-mail: pj@mail.lzjtu.cn

Cite this article:

Jing PENG, Jiarong YAN, Jiaying LIU, Ziyi WEI, Shan BAI, Yahong DENG. Multi-scale residual learning combined with Dilformer for dual-stream medical image registration network. Journal of Zhejiang University (Engineering Science), 2026, 60(5): 1082-1091.

Link to this article:

https://www.zjujournals.com/eng/CN/10.3785/j.issn.1008-973X.2026.05.017 or https://www.zjujournals.com/eng/CN/Y2026/V60/I5/1082

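Fig. 1 Unsupervised dual-stream medical image registration framework

Fig. 2 Dual-stream medical image registration network combining multi-scale residual learning with Dilformer

Fig. 3 Multi-scale residual learning module

Fig. 4 Multi-dilated perception Transformer module

Fig. 5 Two-dimensional slices from the IXI and LPBA40 datasets

Fig. 6 Preprocessing pipeline for the IXI dataset

Tab. 1 Quantitative results of medical image registration models on the IXI dataset

Fig. 7 Qualitative results of medical image registration models on the IXI dataset

Tab. 2 Dilation rate analysis on the IXI dataset

Fig. 8 Module ablation results on the IXI dataset

Tab. 3 Comparison of performance metrics in the module ablation experiments on the IXI dataset

Fig. 9 Model generalization results on the LPBA40 dataset

Tab. 4 Model generalization validation data on the LPBA40 dataset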
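
The quantitative tables listed above report metrics named in the abstract, including the Dice similarity coefficient and the negative Jacobian determinant ratio. The sketch below shows one common way such metrics are computed. It is not the authors' evaluation code, and it makes several assumptions: the Jacobian check is written for a 2D displacement field for brevity (the paper works with 3D volumes), it uses forward differences, it counts determinants less than or equal to zero, and both function names are hypothetical.

```python
def dice_coefficient(seg_a, seg_b, label):
    """Dice overlap 2|A∩B| / (|A| + |B|) for one label in two flat label lists."""
    in_a = [value == label for value in seg_a]
    in_b = [value == label for value in seg_b]
    intersection = sum(a and b for a, b in zip(in_a, in_b))
    size_sum = sum(in_a) + sum(in_b)
    # Convention assumed here: two empty regions count as perfect overlap.
    return 1.0 if size_sum == 0 else 2.0 * intersection / size_sum


def nonpositive_jacobian_ratio(ux, uy):
    """Fraction of grid cells where det(J) <= 0 for phi(x) = x + u(x).

    ux and uy are 2D lists indexed [row][col] that hold the x and y
    displacement components; derivatives use forward differences.
    """
    rows, cols = len(ux), len(ux[0])
    bad = total = 0
    for r in range(rows - 1):
        for c in range(cols - 1):
            dux_dx = ux[r][c + 1] - ux[r][c]
            dux_dy = ux[r + 1][c] - ux[r][c]
            duy_dx = uy[r][c + 1] - uy[r][c]
            duy_dy = uy[r + 1][c] - uy[r][c]
            det = (1 + dux_dx) * (1 + duy_dy) - dux_dy * duy_dx
            bad += det <= 0
            total += 1
    return bad / total
```

A zero displacement field gives a ratio of 0 because the mapping is the identity, while a field that mirrors the x axis folds every cell and gives a ratio of 1. Lower values indicate a smoother, more plausible deformation.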
