Journal of ZheJiang University (Engineering Science)  2025, Vol. 59 Issue (12): 2539-2544    DOI: 10.3785/j.issn.1008-973X.2025.12.008
    
Asymmetric structure based hyperspectral and LiDAR image classification model
Mingwan LI, Sheng FANG*, Zhe LI
College of Computer Science and Engineering, Shandong University of Science and Technology, Qingdao 266590, China

Abstract  

An asymmetric dual-branch modeling method was proposed to address the modality discrepancy and heterogeneous information structures in the joint classification of hyperspectral and LiDAR images. Separate feature extractors were designed for the dominant and auxiliary modalities. In the hyperspectral branch, a serial structure combining a vision transformer and a convolutional neural network was constructed. A central-focus Mamba module was introduced to enhance perception of the central region by modeling context along spiral paths. A spatial-spectral refinement module was applied to improve the quality of feature representation through fine-grained optimization. In the LiDAR branch, a lightweight convolutional structure was used to extract structural and elevation information, reducing redundant modeling while maintaining scale alignment. Experiments were conducted on three benchmark remote sensing datasets. The proposed method outperformed existing methods in overall accuracy, average accuracy, and Kappa coefficient, and showed strong robustness and generalization ability. The results show that the coordinated design of modality-specific modeling and region-aware enhancement mechanisms significantly improves classification performance.
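The abstract describes modeling context along spiral paths so that the central region of a patch receives more attention. The exact path used by the authors is not given on this page, so the following is only a minimal sketch under assumptions: a square patch with odd side length, a clockwise square spiral that starts at the center pixel, and hypothetical helper names `spiral_order` and `spiral_tokens`. It shows how such an ordering could turn a patch into a 1-D token sequence for a sequence model.

```python
import numpy as np

def spiral_order(size: int) -> list[tuple[int, int]]:
    """Return (row, col) coordinates of a size x size patch, visited in a
    square spiral that starts at the center pixel and winds outward.
    Hypothetical helper; the paper's actual scan path may differ."""
    if size < 1 or size % 2 == 0:
        raise ValueError("size must be a positive odd integer")
    r = c = size // 2
    coords = [(r, c)]
    # Directions in image coordinates: right, down, left, up.
    moves = [(0, 1), (1, 0), (0, -1), (-1, 0)]
    step, d = 1, 0
    while len(coords) < size * size:
        for _ in range(2):  # each step length is used for two directions
            dr, dc = moves[d % 4]
            for _ in range(step):
                r, c = r + dr, c + dc
                # The last arm overshoots the patch; skip out-of-bounds cells.
                if 0 <= r < size and 0 <= c < size:
                    coords.append((r, c))
            d += 1
        step += 1
    return coords

def spiral_tokens(patch: np.ndarray) -> np.ndarray:
    """Flatten an (H, H, C) patch into an (H*H, C) sequence in spiral order."""
    rows, cols = zip(*spiral_order(patch.shape[0]))
    return patch[list(rows), list(cols)]

# Usage on a toy 5 x 5 patch with 4 channels (illustrative data only).
patch = np.arange(5 * 5 * 4).reshape(5, 5, 4)
tokens = spiral_tokens(patch)
```

In this ordering the center pixel is the first token and neighbors follow by increasing ring distance, which is one simple way to make a sequence model see the central region first.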



Key words: multimodal remote sensing image classification, asymmetric strategy, hyperspectral image, LiDAR image, Mamba, ViT-CNN framework
Received: 15 July 2025      Published: 25 November 2025
CLC:  TP 751.1  
Fund:  Supported by the Natural Science Foundation of Shandong Province (ZR2024MF113, ZR2022MF325).
Corresponding Authors: Sheng FANG     E-mail: limingwanwan@163.com;fangsheng@tsinghua.org.cn
Cite this article:

Mingwan LI,Sheng FANG,Zhe LI. Asymmetric structure based hyperspectral and LiDAR image classification model. Journal of ZheJiang University (Engineering Science), 2025, 59(12): 2539-2544.

URL:

https://www.zjujournals.com/eng/10.3785/j.issn.1008-973X.2025.12.008     OR     https://www.zjujournals.com/eng/Y2025/V59/I12/2539


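Asymmetric-structure model for hyperspectral and LiDAR image classification

To address the pronounced modality differences and heterogeneous information structures in the joint classification of hyperspectral and LiDAR images, an asymmetric dual-branch modeling method is proposed that adapts feature extraction separately to the dominant and auxiliary modalities. In the hyperspectral branch, a serial structure combining a vision transformer and a convolutional neural network is built, and a center-focused Mamba module is introduced that models context along spiral paths to strengthen perception of the central region; a fine-grained optimization module over the spatial and spectral dimensions further improves the quality of feature representation. In the LiDAR branch, a lightweight convolutional structure extracts structural and elevation information, reducing redundant modeling while keeping scales aligned. Experiments on three typical remote sensing datasets show that the proposed method outperforms existing methods in overall accuracy, average accuracy, and the Kappa coefficient, with strong robustness and generalization ability. The results indicate that the coordinated design of differentiated modeling and region-aware enhancement mechanisms can significantly improve multimodal remote sensing image classification performance.


Keywords: multimodal remote sensing image classification, asymmetric strategy, hyperspectral image, LiDAR image, Mamba, ViT-CNN framework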
Fig.1 Overall framework based on asymmetric structure with hyperspectral modality modeled by ViT-CNN structure
Fig.2 Comparison between CNN-ViT and ViT-CNN architectures
Fig.3 Detailed implementation of ViT-CNN in HSI branch of layer i
Fig.4 Illustration of CFSS structure and SSS process
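| Dataset | Image size | Number of HSI bands | Spatial resolution |
| --- | --- | --- | --- |
| Houston2013 | $349\times 1905$ | 144 | 2.5 m |
| Augsburg | $332\times 485$ | 180 | 30 m |
| MUUFL | $325\times 220$ | 64 | 0.54 m $\times$ 1.0 m |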
Tab.1 Overview of experimental datasets
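| Architecture | Model | Houston2013 OA/% | Houston2013 AA/% | Houston2013 Kappa/% | MUUFL OA/% | MUUFL AA/% | MUUFL Kappa/% | Augsburg OA/% | Augsburg AA/% | Augsburg Kappa/% |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| CNN | ENDNet | 88.05 | 87.86 | 87.07 | 80.75 | 80.33 | 75.24 | 65.83 | 54.14 | 55.14 |
| CNN | HybridSN | 86.22 | 87.40 | 85.17 | 62.46 | 58.36 | 54.25 | 58.73 | 53.85 | 46.63 |
| CNN | S2ENet | 94.59 | 95.40 | 94.16 | 79.23 | 79.72 | 73.57 | 74.75 | 66.20 | 66.14 |
| ViT | SpectralFormer | 69.33 | 70.66 | 66.89 | 76.34 | 76.12 | 69.94 | 39.76 | 53.09 | 28.94 |
| ViT | MFT | 92.31 | 93.42 | 91.70 | 73.04 | 73.46 | 66.53 | 71.54 | 65.81 | 62.61 |
| CNN-ViT | S2EFT | 86.94 | 86.30 | 85.82 | 79.19 | 75.13 | 73.07 | 62.57 | 57.19 | 49.63 |
| CNN-ViT | HCTNet | 94.72 | 95.68 | 94.30 | 74.52 | 73.42 | 67.93 | 73.94 | 66.64 | 65.29 |
| CNN-ViT | MHST | 94.22 | 95.18 | 93.75 | 76.85 | 77.25 | 70.71 | 66.54 | 66.55 | 56.95 |
| ViT-CNN | Proposed method | 97.60 | 97.97 | 97.41 | 83.95 | 84.83 | 79.47 | 75.47 | 67.97 | 67.18 |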
Tab.2 Quantitative comparison of different methods on three datasets
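The three metrics reported above (overall accuracy OA, average accuracy AA, and the Kappa coefficient) are all derived from a confusion matrix. The sketch below uses their standard definitions and assumes rows are true classes and columns are predicted classes; the function name `classification_metrics` and the toy matrix are illustrative and are not taken from the paper.

```python
import numpy as np

def classification_metrics(conf) -> tuple[float, float, float]:
    """Compute (OA, AA, Kappa) from a confusion matrix.
    Assumes conf[i, j] counts samples of true class i predicted as class j."""
    conf = np.asarray(conf, dtype=float)
    total = conf.sum()
    # OA: fraction of all samples classified correctly.
    oa = np.trace(conf) / total
    # AA: mean of the per-class recalls, so small classes weigh as much as large ones.
    aa = (np.diag(conf) / conf.sum(axis=1)).mean()
    # Kappa: agreement corrected for the agreement expected by chance (pe).
    pe = (conf.sum(axis=0) * conf.sum(axis=1)).sum() / total**2
    kappa = (oa - pe) / (1.0 - pe)
    return float(oa), float(aa), float(kappa)

# Toy two-class confusion matrix (hypothetical numbers, not from the paper).
oa, aa, kappa = classification_metrics([[50, 0], [10, 40]])
```

Because AA averages per-class recall, it can fall well below OA when rare classes are poorly classified, which is one plausible reading of the gap between OA and AA on the Augsburg columns.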
Fig.5 Classification maps generated by different methods on Houston2013 dataset
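| HSI branch | LiDAR branch | OA/% | AA/% | Kappa/% |
| --- | --- | --- | --- | --- |
| ViT-CNN | ViT-CNN | 96.56 | 97.19 | 96.29 |
| CNN | ViT-CNN | 95.85 | 96.64 | 95.51 |
| ViT-CNN | CNN | 97.60 | 97.97 | 97.41 |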
Tab.3 Ablation study of asymmetric strategy
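| HSI branch architecture | OA/% | AA/% | Kappa/% |
| --- | --- | --- | --- |
| CNN followed by ViT | 93.36 | 94.43 | 92.82 |
| CNN and ViT in parallel | 95.75 | 96.44 | 95.41 |
| ViT followed by CNN | 97.60 | 97.97 | 97.41 |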
Tab.4 Ablation study of HSI branch architecture (ViT-CNN architecture)
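| Components marked × (among CFMamba, SSRM spatial branch, SSRM spectral branch) | OA/% | AA/% | Kappa/% |
| --- | --- | --- | --- |
| × | 95.35 | 96.17 | 94.98 |
| × × | 94.51 | 95.41 | 94.08 |
| × | 97.02 | 97.56 | 96.78 |
| × | 96.09 | 96.78 | 95.78 |
| none | 97.60 | 97.97 | 97.41 |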
Tab.5 Ablation study of CFMamba and SSRM