Journal of ZheJiang University (Engineering Science)  2019, Vol. 53 Issue (10): 1865-1873    DOI: 10.3785/j.issn.1008-973X.2019.10.003
Mechanical and Energy Engineering     
Research on robot constant force control of surface tracking based on reinforcement learning
Tie ZHANG, Meng XIAO, Yan-biao ZOU, Jia-dong XIAO
School of Mechanical and Automotive Engineering, South China University of Technology, Guangzhou 510640, China

Abstract  

A contact model between the robot end effector and the curved workpiece was established to address the difficulty of obtaining a constant contact force when the end effector is in contact with a curved workpiece. The relationship between the contact force coordinate system of the curved surface and the measurement coordinate system of the robot sensor was constructed. Probabilistic inference for learning control (PILCO), a reinforcement learning algorithm based on a probabilistic dynamics model, was used to learn the relationship between the output parameters of the model and the contact state. Part of the contact state was predicted, and the reinforcement learning algorithm optimized the displacement input parameters of the robot according to the predicted state in order to obtain the desired tracking force. In the experiments, the input state of the reinforcement learning algorithm was changed to the average of the state over a period of time, which reduced the effect of signal disturbances during contact. The experimental results showed that the PILCO algorithm obtained a relatively stable force after 8 iterations. Compared with the fuzzy iterative algorithm, it converged faster, and the mean absolute force error was reduced by 29%.
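The averaged input state mentioned in the abstract can be pictured as a windowed mean over raw state samples. The sketch below is only an illustration under stated assumptions, not the authors' implementation: the function name `average_state`, the use of non-overlapping windows, the window length, and the sample force values are all hypothetical.

```python
import numpy as np


def average_state(samples, window):
    """Average raw state samples over consecutive, non-overlapping windows.

    samples: array of shape (T,) or (T, D), T time steps of a D-dimensional state.
    window:  number of consecutive samples merged into one input state
             (hypothetical parameter; the abstract does not give its value).
    Returns an array of shape (T // window, D); leftover samples are dropped.
    """
    if window < 1:
        raise ValueError("window must be a positive integer")
    samples = np.asarray(samples, dtype=float)
    if samples.ndim == 1:
        samples = samples[:, None]  # treat a 1-D signal as T samples of a 1-D state
    n_blocks = samples.shape[0] // window
    trimmed = samples[: n_blocks * window]
    # Group the samples into blocks of `window` rows and average each block.
    return trimmed.reshape(n_blocks, window, samples.shape[1]).mean(axis=1)


# Hypothetical noisy force readings in newtons, for illustration only.
raw_force = np.array([9.0, 11.0, 10.5, 9.5, 12.0, 8.0])
averaged = average_state(raw_force, window=2)
```

Averaging trades time resolution for a smoother input: a longer window suppresses more measurement noise but makes the learned policy react to a coarser view of the contact state.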



Key words: robot, contour tracking, force control, probabilistic inference for learning control (PILCO), reinforcement learning
Received: 02 August 2018      Published: 30 September 2019
CLC:  TP 242  
Cite this article:

Tie ZHANG,Meng XIAO,Yan-biao ZOU,Jia-dong XIAO. Research on robot constant force control of surface tracking based on reinforcement learning. Journal of ZheJiang University (Engineering Science), 2019, 53(10): 1865-1873.

URL:

http://www.zjujournals.com/eng/10.3785/j.issn.1008-973X.2019.10.003     OR     http://www.zjujournals.com/eng/Y2019/V53/I10/1865


Fig.1 Robotic constant force tracking experiment platform
Fig.2 Partial detail of robot end effector
Fig.3 Force analysis of robot end effector
Fig.4 Explicit force control based on position control
Fig.5 Structure chart of policy-based reinforcement learning
Fig.6 Structure diagram of PILCO algorithm
Fig.7 Robotic constant force tracking experimental setup
Fig.8 Workpiece for tracking
Fig.9 Experimental flow chart of constant force tracking based on PILCO
Fig.10 First to fourth iterative processes of PILCO algorithm
Fig.11 Fifth to eighth iterative processes of PILCO algorithm
| Iteration | Force error $\lvert\Delta f\rvert_{\max}$ (N) | Force error $\lvert\Delta \bar f\rvert$ (N) | Force error $\Delta f_s$ (N) |
| --- | --- | --- | --- |
| 1 | 16.4427 | 1.7079 | 3.1081 |
| 2 | 10.1849 | 1.1349 | 1.7589 |
| 3 | 11.6478 | 0.9448 | 1.5100 |
| 4 | 12.2922 | 0.9581 | 1.6046 |
| 5 | 12.3974 | 1.1205 | 1.7936 |
| 6 | 10.8174 | 1.0855 | 1.6199 |
| 7 | 11.9842 | 0.8906 | 1.5017 |
| 8 | 11.3955 | 0.8554 | 1.4408 |

Tab.1 Comparison of force error parameters during PILCO iterations
Fig.12 Predicted reward and actual reward of 1st and 8th PILCO iterations
Fig.13 Comparison of results between PILCO algorithm and fuzzy iterative algorithm
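| Algorithm | Force error $\lvert\Delta f\rvert_{\max}$ (N) | Force error $\lvert\Delta \bar f\rvert$ (N) | Force error $\Delta f_s$ (N) |
| --- | --- | --- | --- |
| PILCO | 11.3955 | 0.8554 | 1.4408 |
| Fuzzy iterative | 3.3132 | 1.2092 | 1.3423 |

Tab.2 Experimental results comparison between PILCO algorithm and fuzzy iterative algorithm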
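The 29% reduction quoted in the abstract can be checked against the mean absolute force errors in Tab.2. The short calculation below is a sketch for the reader: the variable and function names are illustrative, and the two values are copied from the table with the digit-grouping spaces removed.

```python
# Mean absolute force errors in newtons, copied from Tab.2.
pilco_mean_abs_error = 0.8554
fuzzy_mean_abs_error = 1.2092


def relative_reduction(baseline, improved):
    """Fraction by which `improved` is smaller than `baseline`."""
    return (baseline - improved) / baseline


reduction = relative_reduction(fuzzy_mean_abs_error, pilco_mean_abs_error)
print(f"{reduction:.1%}")  # prints 29.3%
```

The result, about 29.3%, agrees with the rounded figure of 29% reported in the abstract.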