Computer Technology
Output feedback reinforcement learning control method based on reference model
HAO Chuan-chuan1, FANG Zhou2, LI Ping2
1. Department of Control Science and Engineering, Zhejiang University, Hangzhou 310027, China
2. School of Aeronautics and Astronautics, Zhejiang University, Hangzhou 310027, China