Journal of ZheJiang University (Engineering Science)  2026, Vol. 60 Issue (7): 1427-1437    DOI: 10.3785/j.issn.1008-973X.2026.07.006
    
Multi-path collaboration-based and spatial-spectral prior-based hyperspectral and multispectral image fusion
Yanchun YANG, Jialong LI
School of Electronic and Information Engineering, Lanzhou Jiaotong University, Lanzhou 730070, China

Abstract  

A multi-path collaboration-based and spatial-spectral prior-based fusion method was proposed for hyperspectral and multispectral images, to address the challenges of insufficient global modeling and local detail capture in hyperspectral-multispectral image fusion, as well as the difficulty in exploring correlations between adjacent spectral bands. Firstly, the backbone network integrated a Local Bottleneck Control Unit and a Transformer in a parallel architecture. The Local Bottleneck Control Unit learned local structures while suppressing redundant features, whereas the Transformer handled long-range dependencies. A bidirectional interactive fusion mechanism was adopted to enhance the comprehension of both local details and global contexts. Secondly, the spatial-spectral joint prior module employed a dual-path pooling strategy for spatial attention and introduced an intra-spectral grouped attention mechanism to quantify inter-band correlations. Finally, the multi-path aggregation network consolidated features through residual blocks and a progressive fusion strategy. Experimental results demonstrated that the proposed method achieved average improvements of 4.5% in PSNR and 0.7% in SSIM compared to eight other methods on the CAVE dataset, exhibiting superior performance in capturing local-global features and integrating spatial-spectral prior information.
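The two attention ideas named in the abstract can be illustrated with a minimal NumPy sketch. The abstract gives no layer-level details, so everything here is an assumption made for illustration: the function names, the merging of the two pooling paths by a plain average (a trained network would more likely use a learned convolution), and the use of dot-product similarity between bands inside each group. It is not the authors' implementation.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def dual_pool_spatial_attention(feat):
    """feat: (C, H, W). Returns the reweighted features and the (H, W) map."""
    avg_map = feat.mean(axis=0)  # path 1: average pooling over channels
    max_map = feat.max(axis=0)   # path 2: max pooling over channels
    # Assumption: the two paths are merged by a simple average.
    attn = sigmoid(0.5 * (avg_map + max_map))
    return feat * attn[None, :, :], attn


def grouped_spectral_attention(feat, group_size):
    """Softmax attention among the bands inside each group of adjacent bands."""
    bands_total, height, width = feat.shape
    assert bands_total % group_size == 0, "bands must split evenly into groups"
    out = np.empty_like(feat)
    weights = []
    for start in range(0, bands_total, group_size):
        bands = feat[start:start + group_size].reshape(group_size, -1)
        # (g, g) similarity between bands of the same group (assumed form).
        scores = bands @ bands.T / np.sqrt(bands.shape[1])
        scores = scores - scores.max(axis=1, keepdims=True)  # stable softmax
        w = np.exp(scores)
        w = w / w.sum(axis=1, keepdims=True)
        out[start:start + group_size] = (w @ bands).reshape(
            group_size, height, width
        )
        weights.append(w)
    return out, weights


rng = np.random.default_rng(0)
x = rng.standard_normal((8, 4, 4))          # toy cube: 8 bands, 4 x 4 pixels
y, attn = dual_pool_spatial_attention(x)
z, ws = grouped_spectral_attention(x, group_size=4)
print(y.shape, attn.shape, z.shape, len(ws))
```

Each matrix in `ws` is a row-normalized band-to-band weight table for one group, which is one way to read "quantify inter-band correlations"; restricting attention to groups keeps the cost quadratic in the group size rather than in the full band count.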



Key words: hyperspectral and multispectral image fusion; local and global collaboration; Transformer; joint spatial and spectral priors; spectral grouping attention mechanism
Received: 16 April 2025      Published: 23 May 2026
CLC:  TP 391  
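Fund:  National Natural Science Foundation of China (62462043, 62067006); Key Research and Development Program of Gansu Province (25YFGA047); Natural Science Foundation of Gansu Province (23JRRA847, 21JR7RA300).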
Cite this article:

Yanchun YANG,Jialong LI. Multi-path collaboration-based and spatial-spectral prior-based hyperspectral and multispectral image fusion. Journal of ZheJiang University (Engineering Science), 2026, 60(7): 1427-1437.

URL:

https://www.zjujournals.com/eng/10.3785/j.issn.1008-973X.2026.07.006     OR     https://www.zjujournals.com/eng/Y2026/V60/I7/1427


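Hyperspectral and multispectral image fusion based on multi-path collaboration and spatial-spectral priors

To address insufficient global modeling and local detail capture in hyperspectral and multispectral image fusion, and the difficulty of exploring correlations between adjacent bands along the spectral dimension, a hyperspectral and multispectral image fusion method based on multi-path collaboration and spatial-spectral priors is proposed. The backbone network places a local bottleneck control unit in parallel with a Transformer: the local bottleneck control unit learns local structure and suppresses redundant features, the Transformer handles long-range dependencies, and a bidirectional interactive fusion mechanism strengthens the understanding of local detail and global context. In the joint spatial-spectral prior module, spatial attention uses a dual-path pooling strategy, and an intra-spectral grouped attention mechanism measures the degree of correlation between bands. The multi-path aggregation network integrates features through residual blocks and a layer-by-layer progressive fusion strategy. Experiments show that on the CAVE dataset the method improves PSNR and SSIM by an average of 4.5% and 0.7%, respectively, over eight other methods, with clear advantages in capturing local and global features and in fusing spatial-spectral prior information.


Keywords: hyperspectral and multispectral image fusion; local and global collaboration; Transformer; joint spatial and spectral priors; spectral grouping attention mechanism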
Fig.1 Overall architecture of fusion network
Fig.2 Local bottleneck control unit and Transformer
Fig.3 Spatial and spectral prior module
Fig.4 Spectral grouping attention mechanism
Fig.5 Experimental results on CAVE dataset
Method       PSNR    SAM    ERGAS   SSIM
MHF-Net      40.08   7.41   2.72    0.9709
DBIN         42.12   5.34   2.88    0.9828
CNN-FUS      43.93   3.90   1.24    0.9857
UAL          44.87   3.28   1.53    0.9845
Fusformer    44.52   4.12   1.06    0.9833
DCT          44.41   2.66   0.93    0.9871
SDAGE        45.32   2.78   0.82    0.9898
EDIP         44.46   2.59   0.85    0.9862
Proposed     45.63   2.57   0.76    0.9903
Tab.1 Mean values of evaluation indicators for results of CAVE experiment
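The abstract's "average improvements of 4.5% in PSNR and 0.7% in SSIM" can be checked against the CAVE values in Tab.1. The page does not say how the average was taken; the sketch below assumes it is the mean of the per-method relative gains, and under that assumption the reported figures are reproduced. The helper name `mean_relative_gain` is introduced here for illustration only.

```python
# Mean PSNR and SSIM on CAVE, copied from Tab.1: (PSNR, SSIM).
baselines = {
    "MHF-Net": (40.08, 0.9709),
    "DBIN": (42.12, 0.9828),
    "CNN-FUS": (43.93, 0.9857),
    "UAL": (44.87, 0.9845),
    "Fusformer": (44.52, 0.9833),
    "DCT": (44.41, 0.9871),
    "SDAGE": (45.32, 0.9898),
    "EDIP": (44.46, 0.9862),
}
proposed = (45.63, 0.9903)


def mean_relative_gain(ours, others):
    """Average of (ours - other) / other over all compared methods, in percent."""
    gains = [(ours - other) / other * 100.0 for other in others]
    return sum(gains) / len(gains)


psnr_gain = mean_relative_gain(proposed[0], [v[0] for v in baselines.values()])
ssim_gain = mean_relative_gain(proposed[1], [v[1] for v in baselines.values()])
print(f"PSNR: {psnr_gain:.1f}%  SSIM: {ssim_gain:.1f}%")
# prints "PSNR: 4.5%  SSIM: 0.7%"
```

Most of the PSNR average comes from the two weakest baselines (about 13.8% over MHF-Net and 8.3% over DBIN); the gain over the strongest baseline, SDAGE, is about 0.7%.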
Fig.6 Experimental results on Harvard dataset
Method       PSNR    SAM    ERGAS   SSIM
MHF-Net      38.25   6.75   2.80    0.9688
DBIN         41.41   3.54   1.93    0.9742
CNN-FUS      40.89   3.13   1.71    0.9732
UAL          42.84   2.93   1.22    0.9803
Fusformer    41.19   3.61   1.63    0.9811
DCT          42.06   2.97   1.11    0.9826
SDAGE        42.41   2.89   1.27    0.9834
EDIP         41.93   3.03   1.76    0.9799
Proposed     43.57   2.85   1.02    0.9840
Tab.2 Mean values of evaluation indicators for results of Harvard experiment
Fig.7 Experimental results on Pavia University dataset
Method       PSNR    SAM    ERGAS   SSIM
MHF-Net      31.63   6.27   7.81    0.9546
DBIN         32.20   3.44   5.67    0.9771
CNN-FUS      30.65   3.67   5.13    0.9755
UAL          35.47   3.86   3.36    0.9808
Fusformer    31.15   5.15   6.85    0.9681
DCT          30.89   3.58   4.92    0.9739
SDAGE        33.39   3.46   3.88    0.9786
EDIP         32.61   3.74   3.52    0.9763
Proposed     34.42   3.32   3.43    0.9821
Tab.3 Mean values of evaluation indicators for results of Pavia University experiment
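For readers interpreting Tab.1 to Tab.3, the sketch below gives common textbook definitions of three of the four indicators: PSNR (higher is better), SAM and ERGAS (lower is better). The page does not state the exact variants, value range, units, or scale ratio used in the experiments, so the `peak` and `ratio` parameters, the per-band averaging of PSNR, and SAM in degrees are assumptions. SSIM is left out because it needs windowed local statistics.

```python
import numpy as np


def psnr(ref, est, peak=1.0):
    """Mean per-band PSNR in dB; ref and est are (bands, H, W) arrays."""
    mse = ((ref - est) ** 2).mean(axis=(1, 2))
    return float(np.mean(10.0 * np.log10(peak ** 2 / mse)))


def sam(ref, est, eps=1e-12):
    """Mean spectral angle, in degrees, between per-pixel spectra."""
    r = ref.reshape(ref.shape[0], -1)
    e = est.reshape(est.shape[0], -1)
    cos = (r * e).sum(axis=0) / (
        np.linalg.norm(r, axis=0) * np.linalg.norm(e, axis=0) + eps
    )
    return float(np.degrees(np.arccos(np.clip(cos, -1.0, 1.0))).mean())


def ergas(ref, est, ratio):
    """ERGAS with spatial scale ratio `ratio` (an assumed parameter)."""
    mse = ((ref - est) ** 2).mean(axis=(1, 2))
    mean_sq = ref.mean(axis=(1, 2)) ** 2
    return float(100.0 / ratio * np.sqrt(np.mean(mse / mean_sq)))


rng = np.random.default_rng(0)
ref = rng.uniform(0.1, 0.9, size=(31, 16, 16))  # toy 31-band reference cube
est = ref + 0.01                                # constant error of 0.01
print(psnr(ref, est), sam(ref, est), ergas(ref, est, ratio=4))
```

With a constant error of 0.01 and `peak=1.0`, every band has an MSE of 1e-4, so the PSNR is 40 dB, which is in the same range as the CAVE results reported above.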
Fig.8 PSNR visualization of experimental results by band
Fig.9 Results of ablation experiments
Method   PSNR    SSIM     SAM    ERGAS
1)       40.21   0.9834   3.74   3.11
2)       41.88   0.9883   3.12   2.01
3)       42.71   0.9897   5.56   3.85
4)       43.67   0.9904   2.81   1.33
5)       45.32   0.9913   2.42   0.82
Tab.4 Quantitative analysis of ablation experiments
Module   Params/10^6   FLOPs/10^9
A        5.03          19.95
B        5.12          20.67
C        5.26          21.47
D        3.81          15.57
E        5.39          21.63
Tab.5 Analysis of parameter quantities and computational complexity for each module