Journal of Zhejiang University (Science Edition)  2023, Vol. 50 Issue (6): 745-753    DOI: 10.3785/j.issn.1008-9497.2023.06.009
CCF CAD/CG 2023     
LK-CAUNet: Large kernel multi-scale deformable medical image registration network based on cross-attention
Tianqi CHENG1,Lei WANG1(),Xinping GUO1,Yuwei WANG1,Chunxiang LIU2,Bin LI3
1.School of Computer Science and Technology,Shandong University of Technology,Zibo 255000,Shandong Province,China
2.School of Resources and Environmental Engineering,Shandong University of Technology,Zibo 255000,Shandong Province,China
3.School of Automation Science and Engineering,South China University of Technology,Guangzhou 510641,China

Abstract  

The UNet network can be used to predict the dense displacement field in the full-resolution spatial domain and has achieved great success in medical image registration. However, for three-dimensional images with large deformations, it still suffers from long running time, an inability to effectively preserve the topological structure, and easy loss of spatial features. To address these shortcomings, a large kernel multi-scale deformable medical image registration network based on cross-attention (LK-CAUNet) is proposed. Building on the classical UNet network, a cross-attention module is introduced to achieve efficient, multi-level semantic feature fusion; large kernel asymmetric parallel convolutions give the network the ability to learn multi-scale features and complex structures; and an additional squaring-and-scaling module makes the transformation topology-preserving and invertible. Experiments on a brain MRI dataset demonstrate that the proposed method significantly improves registration performance compared with eighteen classical registration methods. In particular, compared with the state-of-the-art TransMorph method, the Dice score is improved by 8%, while the number of parameters is only one fifth of that of TransMorph.
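To make the large-kernel component concrete: the idea is to run several 3D convolution branches with different receptive fields in parallel and sum their outputs, so that a single block can capture both fine detail and large deformations. The sketch below is a minimal PyTorch illustration of such a block, assuming a 5x5x5 large kernel decomposed into asymmetric (1,5,5)/(5,1,5)/(5,5,1) branches alongside a standard 3x3x3 path; the class name LKBlock, the branch sizes, and the residual sum are illustrative assumptions, not the authors' exact design.

import torch
import torch.nn as nn

class LKBlock(nn.Module):
    """Illustrative large-kernel multi-scale conv block (assumed design).

    Parallel branches with different receptive fields are summed so the
    block can see both fine detail and large deformations.
    """

    def __init__(self, channels: int, large_kernel: int = 5):
        super().__init__()
        k = large_kernel
        p = k // 2
        # standard small-kernel path
        self.conv3 = nn.Conv3d(channels, channels, kernel_size=3, padding=1)
        # asymmetric decompositions of the k x k x k large kernel
        self.conv_kxy = nn.Conv3d(channels, channels, kernel_size=(1, k, k),
                                  padding=(0, p, p))
        self.conv_kxz = nn.Conv3d(channels, channels, kernel_size=(k, 1, k),
                                  padding=(p, 0, p))
        self.conv_kyz = nn.Conv3d(channels, channels, kernel_size=(k, k, 1),
                                  padding=(p, p, 0))
        self.act = nn.LeakyReLU(0.2)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        out = (self.conv3(x) + self.conv_kxy(x)
               + self.conv_kxz(x) + self.conv_kyz(x) + x)  # residual sum of branches
        return self.act(out)

# quick shape check on a toy feature volume
if __name__ == "__main__":
    feat = torch.randn(1, 16, 32, 32, 32)   # (batch, channels, D, H, W)
    print(LKBlock(16)(feat).shape)          # torch.Size([1, 16, 32, 32, 32])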



Key words: medical image; image registration; UNet network; cross-attention; large kernel convolution
Received: 12 June 2023      Published: 30 November 2023
CLC:  TP 391  
Corresponding Authors: Lei WANG     E-mail: wanglei0511@sdut.edu.cn
Cite this article:

Tianqi CHENG,Lei WANG,Xinping GUO,Yuwei WANG,Chunxiang LIU,Bin LI. LK-CAUNet: Large kernel multi-scale deformable medical image registration network based on cross-attention. Journal of Zhejiang University (Science Edition), 2023, 50(6): 745-753.

URL:

https://www.zjujournals.com/sci/EN/Y2023/V50/I6/745


LK-CAUNet: Large kernel multi-scale deformable medical image registration network based on cross-attention

The classical UNet network can be used to predict the dense displacement field in the full-resolution spatial domain and has achieved great success in medical image registration. However, for the registration of three-dimensional images with large deformations, it still suffers from long running time, failure to effectively preserve the topological structure, and easy loss of spatial features. To address this, a large kernel multi-scale deformable medical image registration network based on cross-attention (LK-CAUNet) is proposed. On the basis of the classical UNet model, a cross-attention module is introduced to achieve efficient, multi-level semantic feature fusion; large kernel asymmetric parallel convolutions equip the network with multi-scale features and the ability to learn complex structures; and a squaring-and-scaling module is added to make the transformation topology-preserving and invertible. On a brain MRI dataset, LK-CAUNet was compared with 18 classical image registration models. The results show that its registration performance is clearly better than that of the other models: its Dice score is 8% higher than that of the TransMorph method, while its number of parameters is only 1/5 of TransMorph's.
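The squaring-and-scaling module mentioned in the abstract integrates a stationary velocity field into a diffeomorphic deformation by dividing the field by a power of two and then repeatedly composing it with itself, which is what keeps the transform topology-preserving and invertible. Below is a minimal PyTorch sketch of that integration, assuming the velocity field is a voxel-displacement tensor of shape (N, 3, D, H, W); the helper names warp and scaling_and_squaring and the choice of 7 steps are illustrative assumptions, not the authors' implementation.

import torch
import torch.nn.functional as F

def warp(field: torch.Tensor, flow: torch.Tensor) -> torch.Tensor:
    """Warp `field` (N, C, D, H, W) by displacement `flow` (N, 3, D, H, W) given in voxels."""
    n, _, d, h, w = flow.shape
    # identity sampling grid in voxel coordinates (z, y, x)
    zz, yy, xx = torch.meshgrid(
        torch.arange(d, device=flow.device),
        torch.arange(h, device=flow.device),
        torch.arange(w, device=flow.device),
        indexing="ij",
    )
    grid = torch.stack((zz, yy, xx), dim=0).float().unsqueeze(0)  # (1, 3, D, H, W)
    coords = grid + flow
    # normalise to [-1, 1] and reorder to (x, y, z) as grid_sample expects
    coords = torch.stack(
        (2 * coords[:, 2] / (w - 1) - 1,
         2 * coords[:, 1] / (h - 1) - 1,
         2 * coords[:, 0] / (d - 1) - 1), dim=-1)                 # (N, D, H, W, 3)
    return F.grid_sample(field, coords, align_corners=True)

def scaling_and_squaring(velocity: torch.Tensor, steps: int = 7) -> torch.Tensor:
    """Integrate a stationary velocity field into a displacement field."""
    flow = velocity / (2 ** steps)          # scaling step
    for _ in range(steps):                  # squaring step: phi <- phi o phi
        flow = flow + warp(flow, flow)
    return flow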


Key words: medical image; image registration; UNet network; cross-attention; large kernel convolution
Fig.1 Single image network registration process
Fig.2 Feature fusion and feature matching based on cross-attention
Fig.3 Architecture of LK-CAUNet model
Fig.4 Large kernel multi-scale feature extraction convolution block
Fig.5 The structure of cross-attention module
Model | Dice score
UNet | 0.7625
LK-UNet | 0.7651
LK-CAUNet | 0.7720
Table 1 Ablation experiment
Fig.6 Visualization results of ablation experiment
Model | Average Dice score | Percentage of |J| ≤ 0 /% | Parameters /M
Affine | 0.386±0.195 | - | -
SyN | 0.639±0.151 | <0.0001 | -
NiftyReg | 0.640±0.166 | <0.0001 | -
LDDMM | 0.675±0.135 | <0.0001 | -
deedsBCV | 0.733±0.126 | 0.147±0.050 | -
VoxelMorph | 0.723±0.130 | 1.590±0.339 | 1.10
VoxelMorph-diff | 0.577±0.165 | <0.0001 | 1.23
CycleMorph | 0.730±0.124 | 1.719±0.382 | 0.36
MIDIR | 0.736±0.129 | <0.0001 | 0.27
ViT-V-Net | 0.728±0.124 | 1.609±0.319 | 9.82
CoTr | 0.721±0.128 | 1.858±0.314 | 38.72
PVT | 0.729±0.135 | 1.292±0.342 | 58.80
nnFormer | 0.740±0.134 | 1.595±0.358 | 34.40
TransMorph | 0.746±0.128 | 1.579±0.328 | 46.80
TransMorph-Bayes | 0.746±0.123 | 1.560±0.333 | 21.20
TransMorph-bspl | 0.752±0.128 | <0.0001 | 46.80
TransMorph-diff | 0.599±0.156 | <0.0001 | 46.60
UNet | 0.727±0.126 | 1.524±0.353 | 0.28
LK-CAUNet | 0.828±0.138 | <0.0001 | 9.06
Table 2 Comparison of the results of different registration methods
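For reference, the two quality columns in Tables 1 and 2 can be computed as follows: the Dice score measures label overlap between the warped moving segmentation and the fixed segmentation, and the |J| ≤ 0 column is the percentage of voxels whose deformation Jacobian determinant is non-positive (a proxy for folding). The NumPy sketch below shows both metrics under the assumption that label maps and the displacement field are given as arrays; the function names are illustrative, not the authors' evaluation code.

import numpy as np

def dice(labels_moved: np.ndarray, labels_fixed: np.ndarray, classes) -> float:
    """Mean Dice overlap over the given anatomical label classes."""
    scores = []
    for c in classes:
        a, b = labels_moved == c, labels_fixed == c
        denom = a.sum() + b.sum()
        if denom > 0:
            scores.append(2.0 * np.logical_and(a, b).sum() / denom)
    return float(np.mean(scores))

def percent_nonpositive_jacobian(flow: np.ndarray) -> float:
    """Percentage of voxels with det(J(phi)) <= 0; flow has shape (3, D, H, W) in voxels."""
    # Jacobian of phi(x) = x + u(x): J = I + grad(u)
    grads = [np.stack(np.gradient(flow[i]), axis=0) for i in range(3)]  # each (3, D, H, W)
    J = np.stack(grads, axis=0)                                         # (3, 3, D, H, W)
    J[0, 0] += 1.0
    J[1, 1] += 1.0
    J[2, 2] += 1.0
    det = (J[0, 0] * (J[1, 1] * J[2, 2] - J[1, 2] * J[2, 1])
           - J[0, 1] * (J[1, 0] * J[2, 2] - J[1, 2] * J[2, 0])
           + J[0, 2] * (J[1, 0] * J[2, 1] - J[1, 1] * J[2, 0]))
    return 100.0 * float((det <= 0).mean())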
Fig.7 Visualization results of different registration methods
[1] LIU X, LI Z, ISHII M, et al. SAGE: Slam with appearance and geometry prior for endoscopy[C]// 2022 International Conference on Robotics and Automation (ICRA). Philadelphia: IEEE, 2022: 5587-5593. DOI:10.1109/icra46639.2022.9812257
[2] GAZIV G, BELIY R, GRANOT N, et al. Self-supervised natural image reconstruction and large-scale semantic classification from brain activity[J]. NeuroImage, 2022, 254: 119121. DOI:10.1016/j.neuroimage.2022.119121
[3] XIE Y T, ZHANG J P, SHEN C H, et al. CoTr: Efficiently bridging CNN and transformer for 3D medical image segmentation[C]// The 24th International Conference on Medical Image Computing and Computer Assisted Intervention (MICCAI). Strasbourg: Springer International Publishing, 2021: 171-180. DOI:10.1007/978-3-030-87199-4_16
[4] SOTIRAS A, DAVATZIKOS C, PARAGIOS N. Deformable medical image registration: A survey[J]. IEEE Transactions on Medical Imaging, 2013, 32(7): 1153-1190. DOI:10.1109/TMI.2013.2265603
[5] RUECKERT D, SONODA L, HAYES C, et al. Nonrigid registration using free-form deformations: Application to breast MR images[J]. IEEE Transactions on Medical Imaging, 1999, 18(8): 712-721. DOI:10.1109/42.796284
[6] VERCAUTEREN T, PENNEC X, PERCHANT A, et al. Diffeomorphic demons: Efficient non-parametric image registration[J]. NeuroImage, 2009, 45(1): S61-S72. DOI:10.1016/j.neuroimage.2008.10.040
[7] AVANTS B B, TUSTISON N J, SONG G, et al. A reproducible evaluation of ANTs similarity metric performance in brain image registration[J]. NeuroImage, 2011, 54(3): 2033-2044. DOI:10.1016/j.neuroimage.2010.09.025
[8] ZHANG M, FLETCHER P T. Fast diffeomorphic image registration via fourier-approximated lie algebras[J]. International Journal of Computer Vision, 2019, 127: 61-73. DOI:10.1007/s11263-018-1099-x
[9] THORLEY A, JIA X, CHANG H J, et al. Nesterov accelerated ADMM for fast diffeomorphic image registration[C]// Medical Image Computing and Computer Assisted Intervention - MICCAI 2021: 24th International Conference. Strasbourg: Springer International Publishing, 2021: 150-160. DOI:10.1007/978-3-030-87202-1_15
[10] HERING A, HANSEN L, MOK T C W, et al. Learn2Reg: Comprehensive multi-task medical image registration challenge, dataset and evaluation in the era of deep learning[J]. IEEE Transactions on Medical Imaging, 2023, 42(3): 697-712. DOI:10.1109/TMI.2022.3213983
[11] BALAKRISHNAN G, ZHAO A, SABUNCU M R, et al. VoxelMorph: A learning framework for deformable medical image registration[J]. IEEE Transactions on Medical Imaging, 2019, 38(8): 1788-1800. DOI:10.1109/TMI.2019.2897538
[12] SUN S L, HAN K, KONG D Y, et al. Topology-preserving shape reconstruction and registration via neural diffeomorphic flow[C]// Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. New Orleans: IEEE, 2022: 20845-20855. DOI:10.1109/CVPR52688.2022.02018
[13] ZHAO S Y, DONG Y, CHANG E, et al. Recursive cascaded networks for unsupervised medical image registration[C]// Proceedings of the IEEE/CVF International Conference on Computer Vision. Seoul: IEEE, 2019: 10600-10610. DOI:10.1109/ICCV.2019.01070
[14] MOK T C W, CHUNG A. Fast symmetric diffeomorphic image registration with convolutional neural networks[C]// Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Seattle: IEEE, 2020: 4644-4653. DOI:10.1109/CVPR42600.2020.00470
[15] JIA X, THORLEY A, CHEN W, et al. Learning a model-driven variational network for deformable image registration[J]. IEEE Transactions on Medical Imaging, 2021, 41(1): 199-212. DOI:10.1109/TMI.2021.3108881
[16] KIM B, KIM D H, PARK S H, et al. CycleMorph: Cycle consistent unsupervised deformable image registration[J]. Medical Image Analysis, 2021, 71: 102036. DOI:10.1016/j.media.2021.102036
[17] CHEN J, FREY E C, HE Y, et al. TransMorph: Transformer for unsupervised medical image registration[J]. Medical Image Analysis, 2022, 82: 102615. DOI:10.1016/j.media.2022.102615
[18] JIA X, BARTLETT J, ZHANG T Y, et al. UNet vs transformer: Is UNet outdated in medical image registration?[C]// LIAN C F, CAO X H, REKIK I, et al. Machine Learning in Medical Imaging. Cham: Springer, 2022: 151-160. DOI:10.1007/978-3-031-21014-3_16
[19] CHEN J, HE Y, FREY E C, et al. ViT-V-Net: Vision transformer for unsupervised volumetric medical image registration[Z]. (2021-04-13). https://arxiv.org/abs/2104.06468
[20] ZHANG J, LIU X F, YANG B. Medical image registration based on dual-stream cascaded attention network[J]. Computer Engineering and Design, 2021, 42(10): 2894-2901. DOI:10.16208/j.issn1000-7024.2021.10.026
[21] QIN T W, ZHAO P C, QIN P L, et al. Point cloud registration algorithm based on residual attention mechanism[J]. Journal of Computer Applications, 2022, 42(7): 2184-2191.
[22] BALAKRISHNAN G, ZHAO A, SABUNCU M R, et al. An unsupervised learning model for deformable medical image registration[C]// Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Salt Lake City: IEEE, 2018: 9252-9260. DOI:10.1109/CVPR.2018.00964
[23] VOS B D D, BERENDSEN F F, VIERGEVER M A, et al. End-to-end unsupervised deformable image registration with a convolutional neural network[C]// Deep Learning in Medical Image Analysis and Multimodal Learning for Clinical Decision Support. Cham: Springer, 2017: 204-212. DOI:10.1007/978-3-319-67558-9_24
[24] WANG X, GIRSHICK R, GUPTA A, et al. Non-local neural networks[C]// Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Salt Lake City: IEEE, 2018: 7794-7803. DOI:10.1109/CVPR.2018.00813
[25] MARCUS D S, WANG T H, PARKER J, et al. Open access series of imaging studies (OASIS): Cross-sectional MRI data in young, middle aged, nondemented, and demented older adults[J]. Journal of Cognitive Neuroscience, 2007, 19(9): 1498-1507. DOI:10.1162/jocn.2007.19.9.1498
[26] AVANTS B B, TUSTISON N J, WU J, et al. An open source multivariate framework for n-tissue segmentation with evaluation on public data[J]. NeuroInformatics, 2011, 9: 381-400. DOI:10.1007/s12021-011-9109-y
[27] AVANTS B B, EPSTEIN C L, GROSSMAN M, et al. Symmetric diffeomorphic image registration with cross-correlation: Evaluating automated labeling of elderly and neurodegenerative brain[J]. Medical Image Analysis, 2008, 12(1): 26-41. DOI:10.1016/j.media.2007.06.004
[28] MODAT M, RIDGWAY G R, TAYLOR Z A, et al. Fast free-form deformation using graphics processing units[J]. Computer Methods and Programs in Biomedicine, 2010, 98(3): 278-284. DOI:10.1016/j.cmpb.2009.09.002
[29] BEG M F, MILLER M I, TROUVÉ A, et al. Computing large deformation metric mappings via geodesic flows of diffeomorphisms[J]. International Journal of Computer Vision, 2005, 61: 139-157. DOI:10.1023/B:VISI.0000043755.93987.aa
[30] HEINRICH M P, MAIER O, HANDELS H. Multi-modal multi-atlas segmentation using discrete optimisation and self-similarities[J]. VISCERAL Challenge@ISBI, 2015, 1390: 27.
[31] DALCA A V, BALAKRISHNAN G, GUTTAG J, et al. Unsupervised learning of probabilistic diffeomorphic registration for images and surfaces[J]. Medical Image Analysis, 2019, 57: 226-236. DOI:10.1016/j.media.2019.07.006
[32] QIU H Q, QIN C, SCHUH A, et al. Learning diffeomorphic and modality-invariant registration using B-splines[C]// International Conference on Medical Imaging with Deep Learning. Lubeck: MIDL, 2021: 1-20.
[33] WANG W, XIE E, LI X, et al. Pyramid vision transformer: A versatile backbone for dense prediction without convolutions[C]// 2021 IEEE/CVF International Conference on Computer Vision (ICCV). Montreal: IEEE, 2021: 548-558. DOI:10.1109/ICCV48922.2021.00061