Journal of Zhejiang University (Science Edition)  2023, Vol. 50 Issue (6): 745-753    DOI: 10.3785/j.issn.1008-9497.2023.06.009
CCF CAD/CG 2023     
LK-CAUNet: Large kernel multi-scale deformable medical image registration network based on cross-attention
Tianqi CHENG1,Lei WANG1(),Xinping GUO1,Yuwei WANG1,Chunxiang LIU2,Bin LI3
1.School of Computer Science and Technology,Shandong University of Technology,Zibo 255000,Shandong Province,China
2.School of Resources and Environmental Engineering,Shandong University of Technology,Zibo 255000,Shandong Province,China
3.School of Automation Science and Engineering,South China University of Technology,Guangzhou 510641,China

Abstract  

The UNet network can be used to predict the dense displacement field in the full-resolution spatial domain and has achieved great success in medical image registration. However, for three-dimensional images with large deformations, it still suffers from long running time, an inability to effectively preserve the topological structure, and easy loss of spatial features. To address these shortcomings, a large kernel multi-scale deformable medical image registration network based on cross-attention (LK-CAUNet) is proposed. Building on the classical UNet network, a cross-attention module is introduced to achieve efficient, multi-level semantic feature fusion; large kernel asymmetric parallel convolutions give the network the ability to learn multi-scale features and complex structures; and an additional squaring-and-scaling module makes the transformation topology-preserving and invertible. Experiments on a brain MRI dataset demonstrate that the proposed method significantly improves registration performance compared with eighteen classical registration methods. In particular, compared with the state-of-the-art TransMorph method, the Dice score is improved by 8%, while the number of parameters is only one fifth of that of TransMorph.
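To make the large-kernel component concrete: the idea is to run several 3D convolution branches with different receptive fields in parallel and sum their outputs, so that a single block can capture both fine detail and large deformations. The sketch below is a minimal PyTorch illustration of such a block, assuming a 5x5x5 large kernel decomposed into asymmetric (1,5,5)/(5,1,5)/(5,5,1) branches alongside a standard 3x3x3 path; the class name LKBlock, the branch sizes, and the residual sum are illustrative assumptions, not the authors' exact design.

import torch
import torch.nn as nn

class LKBlock(nn.Module):
    """Illustrative large-kernel multi-scale conv block (assumed design).

    Parallel branches with different receptive fields are summed so the
    block can see both fine detail and large deformations.
    """

    def __init__(self, channels: int, large_kernel: int = 5):
        super().__init__()
        k = large_kernel
        p = k // 2
        # standard small-kernel path
        self.conv3 = nn.Conv3d(channels, channels, kernel_size=3, padding=1)
        # asymmetric decompositions of the k x k x k large kernel
        self.conv_kxy = nn.Conv3d(channels, channels, kernel_size=(1, k, k),
                                  padding=(0, p, p))
        self.conv_kxz = nn.Conv3d(channels, channels, kernel_size=(k, 1, k),
                                  padding=(p, 0, p))
        self.conv_kyz = nn.Conv3d(channels, channels, kernel_size=(k, k, 1),
                                  padding=(p, p, 0))
        self.act = nn.LeakyReLU(0.2)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        out = (self.conv3(x) + self.conv_kxy(x)
               + self.conv_kxz(x) + self.conv_kyz(x) + x)  # residual sum of branches
        return self.act(out)

# quick shape check on a toy feature volume
if __name__ == "__main__":
    feat = torch.randn(1, 16, 32, 32, 32)   # (batch, channels, D, H, W)
    print(LKBlock(16)(feat).shape)          # torch.Size([1, 16, 32, 32, 32])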



Key words: medical image; image registration; UNet network; cross-attention; large kernel convolution
Received: 12 June 2023      Published: 30 November 2023
CLC:  TP 391  
Corresponding Authors: Lei WANG     E-mail: wanglei0511@sdut.edu.cn
Cite this article:

Tianqi CHENG,Lei WANG,Xinping GUO,Yuwei WANG,Chunxiang LIU,Bin LI. LK-CAUNet: Large kernel multi-scale deformable medical image registration network based on cross-attention. Journal of Zhejiang University (Science Edition), 2023, 50(6): 745-753.

URL:

https://www.zjujournals.com/sci/EN/Y2023/V50/I6/745


LK-CAUNet: Large kernel multi-scale deformable medical image registration network based on cross-attention

The classical UNet network can be used to predict the dense displacement field in the full-resolution spatial domain and has achieved great success in medical image registration. However, for the registration of three-dimensional images with large deformations, it still suffers from long running time, failure to effectively preserve the topological structure, and easy loss of spatial features. To address this, a large kernel multi-scale deformable medical image registration network based on cross-attention (LK-CAUNet) is proposed. On the basis of the classical UNet model, a cross-attention module is introduced to achieve efficient, multi-level semantic feature fusion; large kernel asymmetric parallel convolutions equip the network with multi-scale features and the ability to learn complex structures; and a squaring-and-scaling module is added to make the transformation topology-preserving and invertible. On a brain MRI dataset, LK-CAUNet was compared with 18 classical image registration models. The results show that its registration performance is clearly better than that of the other models: its Dice score is 8% higher than that of the TransMorph method, while its number of parameters is only 1/5 of TransMorph's.
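The squaring-and-scaling module mentioned in the abstract integrates a stationary velocity field into a diffeomorphic deformation by dividing the field by a power of two and then repeatedly composing it with itself, which is what keeps the transform topology-preserving and invertible. Below is a minimal PyTorch sketch of that integration, assuming the velocity field is a voxel-displacement tensor of shape (N, 3, D, H, W); the helper names warp and scaling_and_squaring and the choice of 7 steps are illustrative assumptions, not the authors' implementation.

import torch
import torch.nn.functional as F

def warp(field: torch.Tensor, flow: torch.Tensor) -> torch.Tensor:
    """Warp `field` (N, C, D, H, W) by displacement `flow` (N, 3, D, H, W) given in voxels."""
    n, _, d, h, w = flow.shape
    # identity sampling grid in voxel coordinates (z, y, x)
    zz, yy, xx = torch.meshgrid(
        torch.arange(d, device=flow.device),
        torch.arange(h, device=flow.device),
        torch.arange(w, device=flow.device),
        indexing="ij",
    )
    grid = torch.stack((zz, yy, xx), dim=0).float().unsqueeze(0)  # (1, 3, D, H, W)
    coords = grid + flow
    # normalise to [-1, 1] and reorder to (x, y, z) as grid_sample expects
    coords = torch.stack(
        (2 * coords[:, 2] / (w - 1) - 1,
         2 * coords[:, 1] / (h - 1) - 1,
         2 * coords[:, 0] / (d - 1) - 1), dim=-1)                 # (N, D, H, W, 3)
    return F.grid_sample(field, coords, align_corners=True)

def scaling_and_squaring(velocity: torch.Tensor, steps: int = 7) -> torch.Tensor:
    """Integrate a stationary velocity field into a displacement field."""
    flow = velocity / (2 ** steps)          # scaling step
    for _ in range(steps):                  # squaring step: phi <- phi o phi
        flow = flow + warp(flow, flow)
    return flow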


Key words: medical image; image registration; UNet network; cross-attention; large kernel convolution
Fig.1 Single image network registration process
Fig.2 Feature fusion and feature matching based on cross-attention
Fig.3 Architecture of LK-CAUNet model
Fig.4 Large kernel multi-scale feature extraction convolution block
Fig.5 The structure of cross-attention module
Model | Dice score
UNet | 0.7625
LK-UNet | 0.7651
LK-CAUNet | 0.7720
Table 1 Ablation experiment
Fig.6 Visualization results of ablation experiment
Model | Average Dice score | Percentage of |J| ≤ 0 /% | Parameters /M
Affine | 0.386±0.195 | - | -
SyN | 0.639±0.151 | <0.0001 | -
NiftyReg | 0.640±0.166 | <0.0001 | -
LDDMM | 0.675±0.135 | <0.0001 | -
deedsBCV | 0.733±0.126 | 0.147±0.050 | -
VoxelMorph | 0.723±0.130 | 1.590±0.339 | 1.10
VoxelMorph-diff | 0.577±0.165 | <0.0001 | 1.23
CycleMorph | 0.730±0.124 | 1.719±0.382 | 0.36
MIDIR | 0.736±0.129 | <0.0001 | 0.27
ViT-V-Net | 0.728±0.124 | 1.609±0.319 | 9.82
CoTr | 0.721±0.128 | 1.858±0.314 | 38.72
PVT | 0.729±0.135 | 1.292±0.342 | 58.80
nnFormer | 0.740±0.134 | 1.595±0.358 | 34.40
TransMorph | 0.746±0.128 | 1.579±0.328 | 46.80
TransMorph-Bayes | 0.746±0.123 | 1.560±0.333 | 21.20
TransMorph-bspl | 0.752±0.128 | <0.0001 | 46.80
TransMorph-diff | 0.599±0.156 | <0.0001 | 46.60
UNet | 0.727±0.126 | 1.524±0.353 | 0.28
LK-CAUNet | 0.828±0.138 | <0.0001 | 9.06
Table 2 Comparison of the results of different registration methods
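For reference, the two quality columns in Tables 1 and 2 can be computed as follows: the Dice score measures label overlap between the warped moving segmentation and the fixed segmentation, and the |J| ≤ 0 column is the percentage of voxels whose deformation Jacobian determinant is non-positive (a proxy for folding). The NumPy sketch below shows both metrics under the assumption that label maps and the displacement field are given as arrays; the function names are illustrative, not the authors' evaluation code.

import numpy as np

def dice(labels_moved: np.ndarray, labels_fixed: np.ndarray, classes) -> float:
    """Mean Dice overlap over the given anatomical label classes."""
    scores = []
    for c in classes:
        a, b = labels_moved == c, labels_fixed == c
        denom = a.sum() + b.sum()
        if denom > 0:
            scores.append(2.0 * np.logical_and(a, b).sum() / denom)
    return float(np.mean(scores))

def percent_nonpositive_jacobian(flow: np.ndarray) -> float:
    """Percentage of voxels with det(J(phi)) <= 0; flow has shape (3, D, H, W) in voxels."""
    # Jacobian of phi(x) = x + u(x): J = I + grad(u)
    grads = [np.stack(np.gradient(flow[i]), axis=0) for i in range(3)]  # each (3, D, H, W)
    J = np.stack(grads, axis=0)                                         # (3, 3, D, H, W)
    J[0, 0] += 1.0
    J[1, 1] += 1.0
    J[2, 2] += 1.0
    det = (J[0, 0] * (J[1, 1] * J[2, 2] - J[1, 2] * J[2, 1])
           - J[0, 1] * (J[1, 0] * J[2, 2] - J[1, 2] * J[2, 0])
           + J[0, 2] * (J[1, 0] * J[2, 1] - J[1, 1] * J[2, 0]))
    return 100.0 * float((det <= 0).mean())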
Fig.7 Visualization results of different registration methods
[1] LIU X, LI Z, ISHII M, et al. SAGE: Slam with appearance and geometry prior for endoscopy[C]// 2022 International Conference on Robotics and Automation (ICRA). Philadelphia: IEEE, 2022: 5587-5593. DOI:10.1109/icra46639.2022.9812257
[2] GAZIV G, BELIY R, GRANOT N, et al. Self-supervised natural image reconstruction and large-scale semantic classification from brain activity[J]. NeuroImage, 2022, 254: 119121. DOI:10.1016/j.neuroimage.2022.119121
[3] XIE Y T, ZHANG J P, SHEN C H, et al. CoTr: Efficiently bridging CNN and transformer for 3D medical image segmentation[C]// The 24th International Conference on Medical Image Computing and Computer Assisted Intervention (MICCAI). Strasbourg: Springer International Publishing, 2021: 171-180. DOI:10.1007/978-3-030-87199-4_16
[4] SOTIRAS A, DAVATZIKOS C, PARAGIOS N. Deformable medical image registration: A survey[J]. IEEE Transactions on Medical Imaging, 2013, 32(7): 1153-1190. DOI:10.1109/TMI.2013.2265603
[5] RUECKERT D, SONODA L, HAYES C, et al. Nonrigid registration using free-form deformations: Application to breast MR images[J]. IEEE Transactions on Medical Imaging, 1999, 18(8): 712-721. DOI:10.1109/42.796284
[6] VERCAUTEREN T, PENNEC X, PERCHANT A, et al. Diffeomorphic demons: Efficient non-parametric image registration[J]. NeuroImage, 2009, 45(1): S61-S72. DOI:10.1016/j.neuroimage.2008.10.040
[7] AVANTS B B, TUSTISON N J, SONG G, et al. A reproducible evaluation of ANTs similarity metric performance in brain image registration[J]. NeuroImage, 2011, 54(3): 2033-2044. DOI:10.1016/j.neuroimage.2010.09.025
[8] ZHANG M, FLETCHER P T. Fast diffeomorphic image registration via fourier-approximated lie algebras[J]. International Journal of Computer Vision, 2019, 127: 61-73. DOI:10.1007/s11263-018-1099-x
[9] THORLEY A, JIA X, CHANG H J, et al. Nesterov accelerated ADMM for fast diffeomorphic image registration[C]// Medical Image Computing and Computer Assisted Intervention - MICCAI 2021: 24th International Conference. Strasbourg: Springer International Publishing, 2021: 150-160. DOI:10.1007/978-3-030-87202-1_15
[10] HERING A, HANSEN L, MOK T C W, et al. Learn2Reg: Comprehensive multi-task medical image registration challenge, dataset and evaluation in the era of deep learning[J]. IEEE Transactions on Medical Imaging, 2023, 42(3): 697-712. DOI:10.1109/TMI.2022.3213983
[11] BALAKRISHNAN G, ZHAO A, SABUNCU M R, et al. VoxelMorph: A learning framework for deformable medical image registration[J]. IEEE Transactions on Medical Imaging, 2019, 38(8): 1788-1800. DOI:10.1109/TMI.2019.2897538
[12] SUN S L, HAN K, KONG D Y, et al. Topology-preserving shape reconstruction and registration via neural diffeomorphic flow[C]// Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. New Orleans: IEEE, 2022: 20845-20855. DOI:10.1109/CVPR52688.2022.02018
[13] ZHAO S Y, DONG Y, CHANG E, et al. Recursive cascaded networks for unsupervised medical image registration[C]// Proceedings of the IEEE/CVF International Conference on Computer Vision. Seoul: IEEE, 2019: 10600-10610. DOI:10.1109/ICCV.2019.01070
[14] MOK T C W, CHUNG A. Fast symmetric diffeomorphic image registration with convolutional neural networks[C]// Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Seattle: IEEE, 2020: 4644-4653. DOI:10.1109/CVPR42600.2020.00470
[15] JIA X, THORLEY A, CHEN W, et al. Learning a model-driven variational network for deformable image registration[J]. IEEE Transactions on Medical Imaging, 2021, 41(1): 199-212. DOI:10.1109/TMI.2021.3108881
[16] KIM B, KIM D H, PARK S H, et al. CycleMorph: Cycle consistent unsupervised deformable image registration[J]. Medical Image Analysis, 2021, 71: 102036. DOI:10.1016/j.media.2021.102036
[17] CHEN J, FREY E C, HE Y, et al. TransMorph: Transformer for unsupervised medical image registration[J]. Medical Image Analysis, 2022, 82: 102615. DOI:10.1016/j.media.2022.102615
[18] JIA X, BARTLETT J, ZHANG T Y, et al. UNet vs transformer: Is UNet outdated in medical image registration?[C]// LIAN C F, CAO X H, REKIK I, et al. Machine Learning in Medical Imaging. Cham: Springer, 2022: 151-160. DOI:10.1007/978-3-031-21014-3_16
[19] CHEN J, HE Y, FREY E C, et al. ViT-V-Net: Vision transformer for unsupervised volumetric medical image registration[Z]. (2021-04-13). https://arxiv.org/abs/2104.06468
[20] ZHANG J, LIU X F, YANG B. Medical image registration based on dual-stream cascaded attention network[J]. Computer Engineering and Design, 2021, 42(10): 2894-2901. DOI:10.16208/j.issn1000-7024.2021.10.026
[21] QIN T W, ZHAO P C, QIN P L, et al. Point cloud registration algorithm based on residual attention mechanism[J]. Journal of Computer Applications, 2022, 42(7): 2184-2191.
[22] BALAKRISHNAN G, ZHAO A, SABUNCU M R, et al. An unsupervised learning model for deformable medical image registration[C]// Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Salt Lake City: IEEE, 2018: 9252-9260. DOI:10.1109/CVPR.2018.00964
[23] VOS B D D, BERENDSEN F F, VIERGEVER M A, et al. End-to-end unsupervised deformable image registration with a convolutional neural network[C]// Deep Learning in Medical Image Analysis and Multimodal Learning for Clinical Decision Support. Cham: Springer, 2017: 204-212. DOI:10.1007/978-3-319-67558-9_24
[24] WANG X, GIRSHICK R, GUPTA A, et al. Non-local neural networks[C]// Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Salt Lake City: IEEE, 2018: 7794-7803. DOI:10.1109/CVPR.2018.00813
[25] MARCUS D S, WANG T H, PARKER J, et al. Open access series of imaging studies (OASIS): Cross-sectional MRI data in young, middle aged, nondemented, and demented older adults[J]. Journal of Cognitive Neuroscience, 2007, 19(9): 1498-1507. DOI:10.1162/jocn.2007.19.9.1498
[26] AVANTS B B, TUSTISON N J, WU J, et al. An open source multivariate framework for n-tissue segmentation with evaluation on public data[J]. NeuroInformatics, 2011, 9: 381-400. DOI:10.1007/s12021-011-9109-y
[27] AVANTS B B, EPSTEIN C L, GROSSMAN M, et al. Symmetric diffeomorphic image registration with cross-correlation: Evaluating automated labeling of elderly and neurodegenerative brain[J]. Medical Image Analysis, 2008, 12(1): 26-41. DOI:10.1016/j.media.2007.06.004
[28] MODAT M, RIDGWAY G R, TAYLOR Z A, et al. Fast free-form deformation using graphics processing units[J]. Computer Methods and Programs in Biomedicine, 2010, 98(3): 278-284. DOI:10.1016/j.cmpb.2009.09.002
[29] BEG M F, MILLER M I, TROUVÉ A, et al. Computing large deformation metric mappings via geodesic flows of diffeomorphisms[J]. International Journal of Computer Vision, 2005, 61: 139-157. DOI:10.1023/B:VISI.0000043755.93987.aa
[30] HEINRICH M P, MAIER O, HANDELS H. Multi-modal multi-atlas segmentation using discrete optimisation and self-similarities[J]. VISCERAL Challenge@ISBI, 2015, 1390: 27.
[31] DALCA A V, BALAKRISHNAN G, GUTTAG J, et al. Unsupervised learning of probabilistic diffeomorphic registration for images and surfaces[J]. Medical Image Analysis, 2019, 57: 226-236. DOI:10.1016/j.media.2019.07.006
[32] QIU H Q, QIN C, SCHUH A, et al. Learning diffeomorphic and modality-invariant registration using B-splines[C]// International Conference on Medical Imaging with Deep Learning. Lubeck: MIDL, 2021: 1-20.
[33] WANG W, XIE E, LI X, et al. Pyramid vision transformer: A versatile backbone for dense prediction without convolutions[C]// 2021 IEEE/CVF International Conference on Computer Vision (ICCV). Montreal: IEEE, 2021: 548-558. DOI:10.1109/ICCV48922.2021.00061