To address the poor adaptability and robustness of conventional self-balancing control methods for the linear motion of the wheeled inverted pendulum (WIP), a wheeled inverted pendulum reactive cognitive system (WIP-RCS) was established that can independently produce self-balancing control rules for the WIP through interaction with the environment. The WIP-RCS consisted of a perception module (PM), an execution module (EM) and a cognitive module (CM). The PM and EM were responsible for the input and output of the system, while the CM contained the knowledge model and the learning strategy. The knowledge model was composed of a team of continuous-action learning automata and served to describe the control rules; the learning strategy was a learning algorithm driven by uncertainty motivation and served to optimize the knowledge model. The structure, working principle and learning algorithm of the WIP-RCS were described in detail. The convergence of the learning algorithm was proved theoretically, and the self-learning ability of the WIP-RCS was verified by simulation experiments. The adaptability and robustness of the system were compared with those of conventional PID and LQR controllers. The simulation results show that the system can independently produce self-balancing control rules, exhibits good learning and cognitive ability, and has better adaptability and robustness.
ZHAO Chuan-song, REN Hong-ge, SHI Tao, LI Fu-jin. Wheeled inverted pendulum reactive cognitive system with internal motivation. JOURNAL OF ZHEJIANG UNIVERSITY (ENGINEERING SCIENCE), 2018, 52(6): 1073-1080.
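The knowledge model described in the abstract is built from continuous-action learning automata whose action distributions are reshaped by a reinforcement signal. The sketch below is a minimal, assumption-laden illustration of that idea in Python: a single automaton keeps a Gaussian policy over one bounded control output (e.g. a wheel torque), samples an action, and nudges its mean and exploration width according to how the received reward compares with a running baseline. The class name, the parameter names (step_mu, step_sigma, sigma_min) and the toy reward are illustrative only; they are not taken from the paper's learning algorithm or its uncertainty-motivation formulation.

    import numpy as np

    class ContinuousActionLearningAutomaton:
        """Minimal sketch of one continuous-action learning automaton with a
        Gaussian action policy (an assumption; the paper's exact update rule
        and uncertainty-motivation term are not reproduced here)."""

        def __init__(self, mu=0.0, sigma=1.0, step_mu=0.05, step_sigma=0.02, sigma_min=0.05):
            self.mu = mu                  # mean of the action distribution
            self.sigma = sigma            # exploration width
            self.step_mu = step_mu        # learning rate for the mean
            self.step_sigma = step_sigma  # learning rate for the width
            self.sigma_min = sigma_min    # floor that keeps some exploration alive

        def act(self, rng):
            # Sample a candidate control action from N(mu, sigma^2).
            return rng.normal(self.mu, self.sigma)

        def update(self, action, reward, baseline):
            # Reinforce actions that beat the running baseline and weaken those
            # that fall below it; adapt the exploration width as outcomes become
            # predictable (a crude stand-in for uncertainty-motivated learning).
            advantage = reward - baseline
            self.mu += self.step_mu * advantage * (action - self.mu)
            self.sigma = max(self.sigma_min,
                             self.sigma + self.step_sigma * advantage * (abs(action - self.mu) - self.sigma))

    # Toy usage: the automaton learns to emit an action near a hypothetical
    # target value of 0.7 under a hand-made reward; no WIP dynamics are simulated.
    rng = np.random.default_rng(0)
    automaton = ContinuousActionLearningAutomaton()
    baseline = 0.0
    for _ in range(2000):
        a = automaton.act(rng)
        r = -abs(a - 0.7)                  # illustrative reinforcement signal
        automaton.update(a, r, baseline)
        baseline += 0.1 * (r - baseline)   # running average used as the baseline
    print(round(automaton.mu, 2))          # the mean typically drifts toward 0.7

In the full system, as the abstract describes it, the CM would hold a team of such automata to represent the control rules, with the PM supplying the sensed state and the EM applying the sampled actions to the WIP.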