Journal of Zhejiang University (Engineering Science)  2026, Vol. 60, Issue (1): 52-60    DOI: 10.3785/j.issn.1008-973X.2026.01.005
Computer Technology
Multi-modal gait recognition based on SMPL model decomposition and embedding fusion
Yue WU1, Zheng LIANG1, Wei GAO1, Maoda YANG1, Peisen ZHAO1, Hongxia DENG1, Yuanyuan CHANG2,*
1. College of Computer Science and Technology (College of Data Science), Taiyuan University of Technology, Taiyuan 030024, China
2. School of Physical Education and Health Engineering, Taiyuan University of Technology, Taiyuan 030024, China
Abstract:

A multimodal gait recognition method based on skinned multi-person linear (SMPL) modal decomposition and embedding fusion was proposed to address two limitations of current gait recognition research: insufficient mining of gait information and inadequate cross-modal feature alignment, both of which restrict recognition performance in real-world scenarios. The SMPL model was decomposed into a shape branch and a pose branch to comprehensively extract static body-shape features and dynamic motion characteristics. An adaptive frame-joint attention module was constructed to focus adaptively on key frames and significant joints, thereby enhancing pose feature representation. A modality embedding fusion module was designed to project features of different modalities into a unified semantic space, and a modality consistency loss function was built to optimize cross-modal feature alignment and improve fusion effectiveness. Experimental results on the Gait3D dataset demonstrated that the proposed method achieved a Rank-1 accuracy of 70.4%, outperforming six silhouette-based methods, two skeleton-based methods, and five multimodal approaches combining silhouettes with skeletons or SMPL models. The method exhibited superior robustness in complex real-world scenarios, validating its effectiveness in modal feature extraction and cross-modal feature alignment.
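The abstract gives no closed form for the modality consistency loss. A minimal sketch of one plausible formulation, assuming linear projections into the unified semantic space and a cosine-similarity alignment objective (the projection shapes and the exact loss are assumptions, not the paper's definition):

```python
import numpy as np

def project(x, w, b):
    # Project modality features into the shared semantic space.
    return x @ w + b

def modality_consistency_loss(z_a, z_b, eps=1e-8):
    # 1 - mean cosine similarity of paired cross-modal embeddings;
    # reaches 0 when the two modalities are perfectly aligned.
    za = z_a / (np.linalg.norm(z_a, axis=1, keepdims=True) + eps)
    zb = z_b / (np.linalg.norm(z_b, axis=1, keepdims=True) + eps)
    return 1.0 - float(np.mean(np.sum(za * zb, axis=1)))

rng = np.random.default_rng(0)
x_sil = rng.normal(size=(4, 256))    # silhouette features, batch of 4 (dims assumed)
x_smpl = rng.normal(size=(4, 128))   # SMPL features (dims assumed)
z_sil = project(x_sil, rng.normal(size=(256, 64)), np.zeros(64))
z_smpl = project(x_smpl, rng.normal(size=(128, 64)), np.zeros(64))
loss = modality_consistency_loss(z_sil, z_smpl)
assert 0.0 <= loss <= 2.0
assert modality_consistency_loss(z_sil, z_sil) < 1e-6
```

Minimizing such a term pulls the two projected embeddings of the same subject together, which is the alignment effect the abstract attributes to the modality consistency loss.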

Key words: gait recognition    SMPL model    adaptive attention    feature alignment    modality fusion
Received: 2025-03-13    Published: 2025-12-15
CLC number: TP 393
Funding: Central Government Guided Local Science and Technology Development Fund of Shanxi Province (YDZJSX2022A016); Key Research and Development Program of Shanxi Province (2022ZDYF128); Science and Technology Strategic Project of Shanxi Province (202404030401080).
Corresponding author: Yuanyuan CHANG    E-mail: 18634898755@163.com; changyuanyuan@tyut.edu.cn
About the author: Yue WU (2000—), female, master's student, engaged in gait recognition research. orcid.org/0009-0009-8894-720X. E-mail: 18634898755@163.com

Cite this article:

Yue WU, Zheng LIANG, Wei GAO, Maoda YANG, Peisen ZHAO, Hongxia DENG, Yuanyuan CHANG. Multi-modal gait recognition based on SMPL model decomposition and embedding fusion. Journal of Zhejiang University (Engineering Science), 2026, 60(1): 52-60.

Link to this article:

https://www.zjujournals.com/eng/CN/10.3785/j.issn.1008-973X.2026.01.005        https://www.zjujournals.com/eng/CN/Y2026/V60/I1/52

Fig. 1  DFGait network architecture
Fig. 2  Structure of the silhouette branch
Fig. 3  Dual-branch SMPL structure
Fig. 4  Implementation details of the adaptive frame-joint attention module
Fig. 5  Schematic of modal feature alignment
Fig. 6  Modality embedding fusion process
| Module | Structure | Output dimensions |
| Block0 | $ \left[3\times 3, 64\right],\; {\rm{stride}}=1 $ | 30×64×64×44 |
| Block1 | $ \left[\begin{array}{c}3\times 3, 64\\ 3\times 3, 64\end{array}\right],\;{\rm{stride}}=1 $; $ \left[3\times 1\times 1, 64\right], \;{\rm{stride}}=1 $ | 30×128×64×44 |
| Block2 | $ \left[\begin{array}{c}3\times 3, 64\\ 3\times 3, 128\end{array}\right],\;{\rm{stride}}=2 $; $ \left[3\times 1\times 1, 128\right], \;{\rm{stride}}=1 $ | 30×128×32×22 |
| Block3 | $ \left[\begin{array}{c}3\times 3, 128\\ 3\times 3, 256\end{array}\right],\;{\rm{stride}}=2 $; $ \left[3\times 1\times 1, 256\right],\; {\rm{stride}}=1 $ | 30×256×16×11 |
| Block4 | $ \left[\begin{array}{c}3\times 3, 256\\ 3\times 3, 256\end{array}\right],\;{\rm{stride}}=1 $; $ \left[3\times 1\times 1, 256\right],\; {\rm{stride}}=1 $ | 30×256×16×11 |
Table 1  ResNet-like backbone of the silhouette branch
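The spatial sizes in Table 1 follow directly from the block strides: each stride-2 block halves the height and width of the 64×44 silhouette input, while stride-1 blocks leave them unchanged. A quick check:

```python
# Strides of Block0-Block4, taken from the table.
strides = [1, 1, 2, 2, 1]
h, w = 64, 44          # input silhouette resolution
sizes = []
for s in strides:
    h, w = h // s, w // s
    sizes.append((h, w))
print(sizes)  # [(64, 44), (64, 44), (32, 22), (16, 11), (16, 11)]
```

These match the H×W components of the output-dimension column (the leading 30 is the frame count and the second value the channel count).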
| Module | Structure | Output dimensions |
| Block0 | Batch normalization | 30×24×3 |
| Block1 | Basic layer | 30×24×64 |
|        | Bottleneck layer | 30×24×64 |
|        | Bottleneck layer | 30×24×32 |
| Block2 | Bottleneck layer | 30×24×64 |
|        | Bottleneck layer | 30×24×128 |
|        | Bottleneck layer | 30×24×256 |
|        | Bottleneck layer | 30×24×256 |
| Block3 | Max pooling | 1×256 |
Table 2  ResGCN backbone of the SMPL pose branch
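The ResGCN pose branch in Table 2 builds on graph convolutions over the 24 SMPL joints (in the style of ref. [26]). A minimal sketch of one such layer, assuming a simple chain adjacency for illustration; the actual model wires joints according to the SMPL kinematic tree:

```python
import numpy as np

def gcn_layer(x, a, w):
    # One graph convolution (Kipf & Welling [26]): add self-loops,
    # symmetrically normalize the adjacency, then linear transform + ReLU.
    a_hat = a + np.eye(a.shape[0])
    d_inv_sqrt = np.diag(1.0 / np.sqrt(a_hat.sum(axis=1)))
    return np.maximum(d_inv_sqrt @ a_hat @ d_inv_sqrt @ x @ w, 0.0)

J = 24                      # SMPL joints (Table 2: per-frame input 24×3)
a = np.zeros((J, J))
for j in range(1, J):       # hypothetical chain connectivity, for illustration only
    a[j - 1, j] = a[j, j - 1] = 1.0

rng = np.random.default_rng(0)
x = rng.normal(size=(J, 3))   # one frame of 3-D joint coordinates
w = rng.normal(size=(3, 64))  # 3 -> 64 channels, matching Block1's output width
h = gcn_layer(x, a, w)
print(h.shape)  # (24, 64)
```

Stacking such layers per frame yields the 30×24×C tensors listed for Block1 and Block2, which Block3 then pools to a single 1×256 sequence descriptor.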
| Modality | Method | Source | Rank-1/% | Rank-5/% | mAP/% | mINP/% |
| Silhouette | GaitSet[5] | AAAI2019 | 36.70 | 58.30 | 30.01 | 17.30 |
| | GaitPart[6] | CVPR2020 | 28.20 | 47.60 | 21.58 | 12.36 |
| | GaitGL[8] | ICCV2021 | 29.70 | 48.50 | 22.29 | 13.26 |
| | GaitGCI[9] | CVPR2023 | 50.30 | 68.50 | 39.50 | 24.30 |
| | GaitBase[10] | CVPR2023 | 64.20 | 79.50 | 54.51 | 36.36 |
| | DyGait[11] | ICCV2023 | 66.30 | 80.80 | 56.40 | 37.30 |
| Skeleton | GaitGraph[13] | ICIP2021 | 8.30 | 16.60 | 7.14 | 4.80 |
| | GPGait[14] | ICCV2023 | 22.50 | — | — | — |
| Silhouette + skeleton/SMPL | MSAFF[17] | IJCB2023 | 48.10 | 66.60 | 38.45 | 23.49 |
| | GaitRef[18] | IJCB2023 | 49.00 | 69.30 | 40.69 | 25.26 |
| | GaitSTR[19] | T-BIOM2024 | 65.10 | 81.30 | 55.59 | 36.84 |
| | SMPLGait[20] | CVPR2022 | 46.30 | 64.50 | 37.16 | 22.23 |
| | HybridGait[21] | AAAI2024 | 53.30 | 72.00 | 43.29 | 26.65 |
| | DFGait | This study | 70.40 | 85.00 | 61.04 | 41.27 |
Table 3  Comparison of recognition performance of different methods on the Gait3D dataset
| Methods | R-1/% | R-5/% | mAP/% | mINP/% |
| GaitGraph | 8.30 | 16.60 | 7.14 | 4.80 |
| GaitGraph+AFJAtt | 11.30 | 22.50 | 9.87 | 6.56 |
| SMPL pose branch | 6.20 | 12.60 | 4.92 | 2.94 |
| SMPL pose branch+AFJAtt | 8.10 | 15.70 | 5.77 | 3.69 |
Table 4  Ablation study of the adaptive frame-joint attention module
Fig. 7  Heat map of adaptive frame-joint attention weights
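Fig. 7 visualizes learned frame and joint weights; the core re-weighting idea of the adaptive frame-joint attention module can be sketched as below. The scoring vectors `w_t` and `w_j` are hypothetical stand-ins for the module's learned parameters, and the pooling scheme is an assumption rather than the design in Fig. 4:

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def frame_joint_attention(feats, w_t, w_j):
    # feats: (T, J, C) pose features over T frames and J joints.
    # Frame/joint descriptors are mean-pooled over the other axis,
    # scored by the learned vectors w_t, w_j, and the resulting
    # softmax weights re-scale the corresponding slices of feats.
    a_t = softmax(feats.mean(axis=1) @ w_t)   # (T,) frame weights
    a_j = softmax(feats.mean(axis=0) @ w_j)   # (J,) joint weights
    out = feats * a_t[:, None, None] * a_j[None, :, None]
    return out, a_t, a_j

T, J, C = 30, 24, 64                 # dims matching Table 2
rng = np.random.default_rng(0)
feats = rng.normal(size=(T, J, C))
out, a_t, a_j = frame_joint_attention(feats, rng.normal(size=C), rng.normal(size=C))
assert out.shape == (T, J, C)
```

Because each weight vector sums to 1, the module amplifies key frames and discriminative joints without changing the feature tensor's shape, which is what the heat map in Fig. 7 depicts.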
| Silhouette branch | SMPL branch | AFJAtt | EFusion | MCLoss | R-1/% | R-5/% | mAP/% | mINP/% |
| | ✓ | | | | 26.40 | 41.90 | 17.44 | 10.23 |
| ✓ | | | | | 64.90 | 82.20 | 54.96 | 35.70 |
| ✓ | ✓ | | | | 66.10 | 83.20 | 55.64 | 36.44 |
| ✓ | ✓ | ✓ | | | 68.90 | 84.20 | 58.94 | 39.11 |
| ✓ | ✓ | ✓ | ✓ | | 69.50 | 85.10 | 60.61 | 41.22 |
| ✓ | ✓ | ✓ | ✓ | ✓ | 70.40 | 85.00 | 61.04 | 41.27 |
Table 5  Ablation study of the multimodal structure, adaptive frame-joint attention module, and modality embedding fusion module (EFusion and MCLoss together form MEFusion)
1 MAHMOUD M, KASEM M S, KANG H S. A comprehensive survey of masked faces: recognition, detection, and unmasking[J]. Applied Sciences, 2024, 14 (19): 8781
doi: 10.3390/app14198781
2 JIA Z, HUANG C, WANG Z, et al. Finger recovery transformer: toward better incomplete fingerprint identification[J]. IEEE Transactions on Information Forensics and Security, 2024, 19: 8860-8874
doi: 10.1109/TIFS.2024.3419690
3 KUEHLKAMP A, BOYD A, CZAJKA A, et al. Interpretable deep learning-based forensic iris segmentation and recognition [C]// IEEE/CVF Winter Conference on Applications of Computer Vision Workshops. Waikoloa: IEEE, 2022: 359–368.
4 ZHAO Xiaodong, LIU Zuojun, CHEN Lingling, et al. Approach of running gait recognition for lower limb amputees[J]. Journal of Zhejiang University: Engineering Science, 2018, 52 (10): 1980-1988
5 CHAO H, WANG K, HE Y, et al. GaitSet: cross-view gait recognition through utilizing gait as a deep set[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2022, 44 (7): 3467-3478
6 FAN C, PENG Y, CAO C, et al. GaitPart: temporal part-based model for gait recognition [C]// IEEE/CVF Conference on Computer Vision and Pattern Recognition. Seattle: IEEE, 2020: 14213–14221.
7 HUANG Z, XUE D, SHEN X, et al. 3D local convolutional neural networks for gait recognition [C]// IEEE/CVF International Conference on Computer Vision. Montreal: IEEE, 2021: 14900–14909.
8 LIN B, ZHANG S, YU X. Gait recognition via effective global-local feature representation and local temporal aggregation [C]// IEEE/CVF International Conference on Computer Vision. Montreal: IEEE, 2021: 14628–14636.
9 DOU H, ZHANG P, SU W, et al. GaitGCI: generative counterfactual intervention for gait recognition [C]// IEEE/CVF Conference on Computer Vision and Pattern Recognition. Vancouver: IEEE, 2023: 5578–5588.
10 FAN C, LIANG J, SHEN C, et al. OpenGait: revisiting gait recognition toward better practicality [C]// IEEE/CVF Conference on Computer Vision and Pattern Recognition. Vancouver: IEEE, 2023: 9707–9716.
11 WANG M, GUO X, LIN B, et al. DyGait: exploiting dynamic representations for high-performance gait recognition [C]// IEEE/CVF International Conference on Computer Vision. Paris: IEEE, 2023: 13378–13387.
12 LIAO R, YU S, AN W, et al. A model-based gait recognition method with body pose and human prior knowledge[J]. Pattern Recognition, 2020, 98: 107069
doi: 10.1016/j.patcog.2019.107069
13 TEEPE T, KHAN A, GILG J, et al. GaitGraph: graph convolutional network for skeleton-based gait recognition [C]// IEEE International Conference on Image Processing. Anchorage: IEEE, 2021: 2314–2318.
14 FU Y, MENG S, HOU S, et al. GPGait: generalized pose-based gait recognition [C]// IEEE/CVF International Conference on Computer Vision. Paris: IEEE, 2023: 19538–19547.
15 ZHANG C, CHEN X P, HAN G Q, et al. Spatial transformer network on skeleton-based gait recognition[J]. Expert Systems, 2023, 40 (6): e13244
doi: 10.1111/exsy.13244
16 SUN Y, FENG X, MA L, et al. TriGait: aligning and fusing skeleton and silhouette gait data via a tri-branch network [C]// IEEE International Joint Conference on Biometrics. Ljubljana: IEEE, 2023: 1–9.
17 ZOU S, XIONG J, FAN C, et al. A multi-stage adaptive feature fusion neural network for multimodal gait recognition [C]// IEEE International Joint Conference on Biometrics. Ljubljana: IEEE, 2023: 1–10.
18 ZHU H, ZHENG W, ZHENG Z, et al. GaitRef: gait recognition with refined sequential skeletons [C]// IEEE International Joint Conference on Biometrics. Ljubljana: IEEE, 2023: 1–10.
19 ZHENG W, ZHU H, ZHENG Z, et al. GaitSTR: gait recognition with sequential two-stream refinement[J]. IEEE Transactions on Biometrics, Behavior, and Identity Science, 2024, 6 (4): 528-538
doi: 10.1109/TBIOM.2024.3390626
20 ZHENG J, LIU X, LIU W, et al. Gait recognition in the wild with dense 3D representations and a benchmark [C]// IEEE/CVF Conference on Computer Vision and Pattern Recognition. New Orleans: IEEE, 2022: 20196–20205.
21 DONG Y, YU C, HA R, et al. HybridGait: a benchmark for spatial-temporal cloth-changing gait recognition with hybrid explorations [C]// AAAI Conference on Artificial Intelligence. Palo Alto: AAAI Press, 2024: 1600–1608.
22 LOPER M, MAHMOOD N, ROMERO J, et al. SMPL: a skinned multi-person linear model[J]. ACM Transactions on Graphics, 2015, 34 (6): 248
23 YU S, TAN D, TAN T. A framework for evaluating the effect of view angle, clothing and carrying condition on gait recognition [C]// International Conference on Pattern Recognition. Hong Kong: IEEE, 2006: 441–444.
24 TAKEMURA N, MAKIHARA Y, MURAMATSU D, et al. Multi-view large population gait dataset and its performance evaluation for cross-view gait recognition[J]. IPSJ Transactions on Computer Vision and Applications, 2018, 10 (1): 4
doi: 10.1186/s41074-018-0039-6
25 ZHU Z, GUO X, YANG T, et al. Gait recognition in the wild: a benchmark [C]// IEEE/CVF International Conference on Computer Vision. Montreal: IEEE, 2021: 14789–14799.
26 KIPF T N, WELLING M. Semi-supervised classification with graph convolutional networks [C]// International Conference on Learning Representations. Toulon: [s. n. ], 2017.
27 LI J, ZHANG Y, SHAN H, et al. Gaitcotr: improved spatial-temporal representation for gait recognition with a hybrid convolution-transformer framework [C]// 2023 IEEE International Conference on Acoustics, Speech and Signal Processing. Rhodes Island: IEEE, 2023: 1–5.
28 SONG Y F, ZHANG Z, SHAN C, et al. Stronger, faster and more explainable: a graph convolutional baseline for skeleton-based action recognition [C]// ACM International Conference on Multimedia. Seattle: ACM, 2020: 1625–1633.