TD3 mapless navigation algorithm guided by dynamic window approach
Jiale LIU, Yali XUE, Shan CUI, Jun HONG
Table 2. Average reward value of the trained models
Method         Success rate   Steps   Reward
PPO            0.76           44.29   31.92
DDPG           0.87           43.88   67.33
TD3            0.90           52.27   55.71
DWA-TD3        0.87           35.93   63.96
LSTM-TD3       0.91           44.75   60.07
DWA-LSTM TD3   0.91           36.89   70.19