Research on robot constant force control of surface tracking based on reinforcement learning |
| Tie ZHANG, Meng XIAO, Yan-biao ZOU, Jia-dong XIAO |
| Fig. 12 Reward function values predicted by PILCO and actual reward function values at the 1st and 8th iterations |
|