Front. Inform. Technol. Electron. Eng.  2014, Vol. 15 Issue (1): 43-50    DOI: 10.1631/jzus.C1300145
    
Adaptive dynamic programming for linear impulse systems
Xiao-hua Wang, Juan-juan Yu, Yao Huang, Hua Wang, Zhong-hua Miao
School of Mechatronics Engineering and Automation, Shanghai University, Shanghai 200072, China; Shanghai Key Laboratory of Power Station Automation Technology, Shanghai University, Shanghai 200072, China
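Summary:
Objective: For the optimal control of linear impulse systems, a recursive method based on adaptive dynamic programming is studied. A neural network approximates the optimal objective function, from which the optimal control law is derived. The solution procedure applies to general impulse systems and requires no initial stabilizing controller, providing a theoretical basis for the optimal control of such systems.
Innovation: Current research on adaptive dynamic programming is limited to continuous and discrete systems; impulse systems have received little attention. This paper studies the optimal control problem of linear impulse systems, applies the adaptive dynamic programming approach, completes the related theoretical proofs for impulse systems, and confirms the convergence of the method. A neural network approximates the optimal objective function; once the iteration settles, the network parameters become steady and the optimal impulse control law is obtained at the same time.
Method highlights: The optimal objective function of a linear impulse system is a quadratic form of the state, but the P matrix in it takes a jumping, impulsive form. Based on this finding, the iterative-learning-based adaptive dynamic programming method is suitable for solving for the optimal impulses. The proposed method avoids the matrix inversion required by direct iteration, greatly reducing the computational load.
Main conclusions: The optimal objective function of a linear impulse system is a quadratic form of the state and can be solved iteratively by adaptive dynamic programming, with a stable solution process. Approximating the optimal objective function with a neural network avoids matrix inversion and greatly reduces the computational load.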
Abstract: We investigate the optimization of linear impulse systems with the reinforcement-learning-based adaptive dynamic programming (ADP) method. For linear impulse systems, the optimal objective function is shown to be a quadratic form of the pre-impulse states. The ADP method provides solutions that iteratively converge to the optimal objective function: if the initial guess of the pre-impulse objective function is chosen as a quadratic form of the pre-impulse states, the objective function iteratively converges to the optimal one. Although the quadratic objective function can in theory be used directly within the ADP method, the matrix inversion it involves may cause numerical singularity problems as the system dimensionality increases. A neural-network-based ADP method circumvents this problem. A neural network with polynomial activation functions is selected to approximate the pre-impulse objective function and is trained iteratively using the ADP method to achieve optimal control. After successful training, the optimal impulse control can be derived. Simulations are presented for illustrative purposes.
Key words: Adaptive dynamic programming (ADP); Impulse system; Optimal control; Neural network
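The iteration on a quadratic objective function described in the abstract can be illustrated with a minimal sketch. The model below is an assumption made for illustration and is not taken from the paper: the pre-impulse state is assumed to evolve as x_{k+1} = F (x_k + B u_k), where u_k is the impulse applied at the k-th instant and F is the flow between impulses, with stage cost x_k' Q x_k + u_k' R u_k. The function name `impulse_value_iteration` and all numerical values are hypothetical. The sketch is the direct quadratic-form variant, which solves a linear system at every step; it is not the neural-network variant that the abstract proposes in order to avoid that inversion.

```python
import numpy as np

def impulse_value_iteration(F, B, Q, R, iters=500, tol=1e-12):
    """Iterate V_i(x) = x' P_i x for the ASSUMED pre-impulse model
    x_{k+1} = F (x_k + B u_k) with stage cost x' Q x + u' R u.
    Returns the final P and the gain G of the impulse law u = -G x."""
    n = F.shape[0]
    P = np.zeros((n, n))  # initial guess V_0 = 0, a (trivial) quadratic form
    for _ in range(iters):
        M = F.T @ P @ F  # cost-to-go as seen from the post-impulse state
        # Minimising u'Ru + (x + Bu)' M (x + Bu) over u gives u = -G x.
        G = np.linalg.solve(R + B.T @ M @ B, B.T @ M)
        P_next = Q + M - M @ B @ G
        converged = np.max(np.abs(P_next - P)) < tol
        P = P_next
        if converged:
            break
    M = F.T @ P @ F
    G = np.linalg.solve(R + B.T @ M @ B, B.T @ M)
    return P, G

# Hypothetical example data: an unstable flow map and a single impulse input.
F = np.array([[1.1, 0.2],
              [0.0, 0.9]])
B = np.array([[0.0],
              [1.0]])
Q = np.eye(2)
R = np.array([[1.0]])

P, G = impulse_value_iteration(F, B, Q, R)
closed_loop = F @ (np.eye(2) - B @ G)  # pre-impulse map under u = -G x
print("P =\n", P)
print("G =", G)
print("closed-loop spectral radius:", max(abs(np.linalg.eigvals(closed_loop))))
```

Starting from P = 0 needs no stabilizing initial controller, which matches the property claimed in the summary. The `np.linalg.solve` call on R + B' M B is the step that can become ill-conditioned in higher dimensions, which is the motivation the abstract gives for the neural-network approximation.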
Received: 2013-05-27    Published: 2014-01-07
CLC:  TP273.1  

Cite this article:

Xiao-hua Wang, Juan-juan Yu, Yao Huang, Hua Wang, Zhong-hua Miao. Adaptive dynamic programming for linear impulse systems. Front. Inform. Technol. Electron. Eng., 2014, 15(1): 43-50.

Link to this article:

http://www.zjujournals.com/xueshu/fitee/CN/10.1631/jzus.C1300145
http://www.zjujournals.com/xueshu/fitee/CN/Y2014/V15/I1/43
