Research on robot constant force control of surface tracking based on reinforcement learning
Tie ZHANG, Meng XIAO, Yan-biao ZOU, Jia-dong XIAO
Fig. 12 Reward function values predicted by PILCO and actual reward function values at the 1st and 8th iterations