Journal of Zhejiang University (Engineering Science)  2018, Vol. 52 Issue (6): 1073-1080    DOI: 10.3785/j.issn.1008-973X.2018.06.005
Computer and Communication Technology
Wheeled inverted pendulum reactive cognitive system with internal motivation
ZHAO Chuan-song, REN Hong-ge, SHI Tao, LI Fu-jin
College of Electrical Engineering, North China University of Science and Technology, Tangshan 063000, China
Abstract:

Conventional self-balancing control methods for the straight-line motion of the wheeled inverted pendulum (WIP) suffer from poor adaptability and poor robustness. To address this problem, a wheeled inverted pendulum reactive cognitive system (WIP-RCS) was established, in which self-balancing control rules emerge through interaction with the environment. The WIP-RCS consisted of a perception module (PM), an execution module (EM) and a cognitive module (CM). The PM and the EM were responsible for the input and the output of the system, respectively. The CM involved a knowledge model and a learning strategy. The knowledge model was composed of a team of continuous-action learning automata and served to describe the control rules. The learning strategy was a learning algorithm driven by uncertainty motivation and served to optimize the knowledge model. The structure, working principle and learning algorithm of the WIP-RCS were described in detail. The convergence of the learning algorithm was proved theoretically, and the self-learning ability of the WIP-RCS was verified by simulation experiments. The adaptability and robustness of the system were verified by simulation experiments in combination with the conventional PID and LQR algorithms. The simulation results show that the system can autonomously produce self-balancing control rules, exhibits good autonomous learning and cognitive skills, and has good adaptability and robustness.
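
The abstract does not give the update rules of the knowledge model, so the following is only a minimal sketch of a single generic continuous-action learning automaton (CALA) in its textbook form, not the learning algorithm of the WIP-RCS. The function names, the toy reward, and all parameter values are assumptions made for illustration; the uncertainty-motivation term described in the paper is not modeled here.

```python
import math
import random


def cala_maximize(reward, mu=0.0, sigma=1.0, steps=5000,
                  step_size=0.05, sigma_min=0.05, penalty=1.0, seed=0):
    """Generic CALA update (illustrative); returns the final (mean, spread)."""
    rng = random.Random(seed)
    for _ in range(steps):
        spread = max(sigma, sigma_min)      # lower-bound the exploration noise
        action = rng.gauss(mu, spread)      # sample a continuous action
        # Reinforcement of the sampled action relative to the mean action.
        gain = (reward(action) - reward(mu)) / spread
        z = (action - mu) / spread          # normalized deviation of the sample
        mu += step_size * gain * z
        # The penalty term pulls sigma back toward sigma_min over time.
        sigma += step_size * (gain * (z * z - 1.0) - penalty * (sigma - sigma_min))
    return mu, max(sigma, sigma_min)


def toy_reward(action):
    # Hypothetical stand-in for a balance score in [0, 1]; the best action is 1.0.
    return math.exp(-(action - 1.0) ** 2)


mean, spread = cala_maximize(toy_reward)
print(f"learned mean action: {mean:.3f}, exploration spread: {spread:.3f}")
```

In this sketch the automaton keeps a Gaussian over actions and shifts its mean toward actions that score better than the current mean, while the spread shrinks toward `sigma_min`. A team of such automata, one per control rule, would be one plausible reading of the knowledge model, but that mapping is an assumption.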

Received: 2017-03-10    Published: 2018-06-20
CLC:  TP391  
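Funding:

National Natural Science Foundation of China (61203343); Natural Science Foundation of Hebei Province (E2014209106); Youth Foundation for Science and Technology Research of Higher Education Institutions of Hebei Province (QN2016102, QN2016105).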

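Corresponding author: REN Hong-ge, female, associate professor, Ph.D. orcid.org/0000-0002-3841-5952. E-mail: renhg@ncst.edu.cn
About the author: ZHAO Chuan-song (1991-), male, master's student, engaged in research on developmental robotics. orcid.org/00000-0002-7896-0330. E-mail: 2465049224@qq.com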

Cite this article:

ZHAO Chuan-song, REN Hong-ge, SHI Tao, LI Fu-jin. Wheeled inverted pendulum reactive cognitive system with internal motivation[J]. Journal of Zhejiang University (Engineering Science), 2018, 52(6): 1073-1080.

Link to this article:

http://www.zjujournals.com/eng/CN/10.3785/j.issn.1008-973X.2018.06.005
http://www.zjujournals.com/eng/CN/Y2018/V52/I6/1073

