Imitation learning for bipedal robots based on simplified probabilistic framework for options
Wen XUE, Shuo ZHAO, Yongqiang LI

Tab. 3 Hyperparameter settings for imitation learning training

| Parameter | Value |
| --- | --- |
| Optimizer | Adam |
| $ {\lambda }_{{\mathrm{phys}}} $ | 0.01 |
| Batch size | 64 |
| Gradient clipping threshold | 1.0 |
| Training epochs (maximum) | 3000 |
| Initial learning rate $ \alpha $ | $ 3\times {10}^{-4} $ |
| Torque PD control $ {K}_{{\mathrm{p}}},{K}_{{\mathrm{d}}} $ | 160, 18 |
| Early stopping | 28, 60, 60, 60, 28 |
| $ {w}_{{\mathrm{p}}},{w}_{{\mathrm{e}}},{w}_{{\mathrm{m}}},{w}_{{\mathrm{f}}},{w}_{{\mathrm{trk}}} $ | 1.0, 0.1, 0.5, 0.3, 0.8 |
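
For readers who want to reuse these settings, the values in Tab. 3 could be collected into a single configuration object. The following is a minimal sketch, not the authors' implementation: the class name `TrainConfig`, its field names, and the helper `pd_torque` are hypothetical, and the PD law shown is the textbook form $ \tau = {K}_{{\mathrm{p}}}(q_{\mathrm{des}} - q) + {K}_{{\mathrm{d}}}(\dot{q}_{\mathrm{des}} - \dot{q}) $, which the paper may apply differently. The five early-stopping values are stored as listed in the table, without interpreting what each one denotes.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class TrainConfig:
    """Hypothetical container for the Tab. 3 values (field names are assumptions)."""
    optimizer: str = "Adam"
    lambda_phys: float = 0.01          # lambda_phys
    batch_size: int = 64
    grad_clip: float = 1.0             # gradient clipping threshold
    max_epochs: int = 3000             # maximum number of training epochs
    learning_rate: float = 3e-4        # initial learning rate alpha
    kp: float = 160.0                  # PD gain K_p
    kd: float = 18.0                   # PD gain K_d
    # Early-stopping row copied verbatim from the table; meaning of each entry not assumed.
    early_stopping: Tuple[int, ...] = (28, 60, 60, 60, 28)
    # Weights in table order: w_p, w_e, w_m, w_f, w_trk.
    loss_weights: Tuple[float, ...] = (1.0, 0.1, 0.5, 0.3, 0.8)


def pd_torque(cfg: TrainConfig, q_des: float, q: float,
              qd_des: float = 0.0, qd: float = 0.0) -> float:
    """Textbook PD torque for one joint (an assumed form, not taken from the paper)."""
    return cfg.kp * (q_des - q) + cfg.kd * (qd_des - qd)


cfg = TrainConfig()
tau = pd_torque(cfg, q_des=0.1, q=0.0, qd_des=0.0, qd=0.5)
```

Freezing the dataclass keeps the settings immutable during a run, so an experiment that changes a value has to build a new `TrainConfig` explicitly.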