Adaptive dynamic programming for linear impulse systems

Front. Inform. Technol. Electron. Eng., 2014, Vol. 15, Issue 1: 43-50

DOI: 10.1631/jzus.C1300145

Xiao-hua Wang, Juan-juan Yu, Yao Huang, Hua Wang, Zhong-hua Miao

School of Mechatronics Engineering and Automation, Shanghai University, Shanghai 200072, China; Shanghai Key Laboratory of Power Station Automation Technology, Shanghai University, Shanghai 200072, China

Abstract: We investigate the optimization of linear impulse systems using the reinforcement-learning-based adaptive dynamic programming (ADP) method. For linear impulse systems, the optimal objective function is shown to be a quadratic form of the pre-impulse states. The ADP method provides solutions that iteratively converge to the optimal objective function: if the initial guess of the pre-impulse objective function is chosen as a quadratic form of the pre-impulse states, the objective function converges to the optimal one through the ADP iteration. Although the quadratic objective function of the states can in theory be used directly within the ADP method, the matrix inversion it requires may become numerically singular as the system dimensionality increases. A neural-network-based ADP method circumvents this problem. A neural network with polynomial activation functions is selected to approximate the pre-impulse objective function and is trained iteratively using the ADP method to achieve optimal control. After successful training, the optimal impulse control can be derived. Simulations are presented for illustrative purposes.
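As a rough illustration of the direct quadratic-form iteration mentioned in the abstract, the following Python sketch iterates on the matrix P of V(x) = x'Px. The abstract does not state the system model or the cost, so everything concrete here is an assumption: a flow dx/dt = Ax between impulses, equally spaced impulses with period T that add B u_k to the state, and a cost that sums x_k'Qx_k + u_k'Ru_k over the pre-impulse states x_k. The example matrices and the names `expm_series` and `impulse_value_iteration` are hypothetical.

```python
import numpy as np

def expm_series(M, terms=30):
    """Truncated Taylor series for the matrix exponential (adequate for small ||M||)."""
    out = np.eye(M.shape[0])
    term = np.eye(M.shape[0])
    for k in range(1, terms):
        term = term @ M / k          # M^k / k!
        out = out + term
    return out

def impulse_value_iteration(A, B, Q, R, T, iters=500, tol=1e-10):
    """Value iteration on P, starting from the initial guess P_0 = 0.

    Assumed pre-impulse dynamics: x_{k+1} = F x_k + G u_k,
    with F = exp(A*T) and G = F @ B (impulse applied, then free flow for T).
    """
    F = expm_series(A * T)
    G = F @ B
    P = np.zeros_like(Q, dtype=float)
    for _ in range(iters):
        # Greedy impulse gain for the current quadratic objective.
        K = np.linalg.solve(R + G.T @ P @ G, G.T @ P @ F)
        P_next = Q + F.T @ P @ F - F.T @ P @ G @ K
        P_next = 0.5 * (P_next + P_next.T)   # keep P symmetric
        done = np.max(np.abs(P_next - P)) < tol
        P = P_next
        if done:
            break
    K = np.linalg.solve(R + G.T @ P @ G, G.T @ P @ F)
    return P, K

# Hypothetical two-state example that is unstable between impulses.
A = np.array([[0.0, 1.0], [-1.0, 0.2]])
B = np.array([[0.0], [1.0]])
Q = np.eye(2)
R = np.array([[1.0]])
P, K = impulse_value_iteration(A, B, Q, R, T=0.5)
print("P =", P)
print("impulse law u_k = -K x_k with K =", K)
```

The linear solve against R + G'PG is one place where a matrix inversion enters such an iteration; it is presumably the kind of step the abstract warns may become ill-conditioned as the dimension grows.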

Key words: Adaptive dynamic programming (ADP), Impulse system, Optimal control, Neural network

Received: 27 May 2013 Published: 07 January 2014

CLC: TP273.1

Cite this article:

Xiao-hua Wang, Juan-juan Yu, Yao Huang, Hua Wang, Zhong-hua Miao. Adaptive dynamic programming for linear impulse systems. Front. Inform. Technol. Electron. Eng., 2014, 15(1): 43-50.

URL:

http://www.zjujournals.com/xueshu/fitee/10.1631/jzus.C1300145 OR http://www.zjujournals.com/xueshu/fitee/Y2014/V15/I1/43

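Summary (translated from Chinese): Adaptive dynamic programming method for linear impulse systems

Objective: For the optimal control of linear impulse systems, a recursive method based on adaptive dynamic programming is studied. A neural network approximates the optimal objective function, from which the optimal control law is obtained. The solution procedure applies to general impulse systems and requires no initial stabilizing controller, providing a theoretical basis for the optimal control of such systems.
Innovation: Research on adaptive dynamic programming has so far been largely confined to continuous and discrete systems, with little work on impulse systems. This paper studies the optimal control problem of linear impulse systems, applies the adaptive dynamic programming approach, completes the relevant theoretical proofs for impulse systems, and verifies the convergence of the method. A neural network approximates the optimal objective function; once the iteration settles, the network parameters stabilize and the optimal impulse control law is obtained.
Method highlights: The optimal objective function of a linear impulse system is a quadratic form of the state, but its matrix P takes a jumping, impulsive form. Based on this finding, adaptive dynamic programming built on iterative learning is suitable for solving the optimal impulse problem. The proposed method avoids the matrix inversion required by direct iteration, greatly reducing the computational load.
Conclusions: The optimal objective function of a linear impulse system is a quadratic form of the state and can be solved iteratively by adaptive dynamic programming, with a stable solution process. Approximating the optimal objective function with a neural network avoids matrix inversion and greatly reduces the amount of computation.

Keywords: Impulse system, Adaptive dynamic programming, Optimal control, Neural network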
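To make the idea of approximating the objective function concrete, here is a minimal Python sketch in which a linear-in-weights model with quadratic monomial features stands in for the network with polynomial activation functions. It is not the paper's training rule, which this page does not describe. The assumptions are the same kind as any illustration of this setting needs: pre-impulse dynamics x_{k+1} = F x_k + G u_k, a cost summing x_k'Qx_k + u_k'Ru_k, hypothetical matrices F, G, Q, R, and hypothetical helper names. The greedy impulse is still obtained from a small linear solve (its size is the number of impulse inputs), which is a simplification of this sketch.

```python
import numpy as np

def features(X):
    """Quadratic monomials [x1^2, x1*x2, x2^2] for each row of X (two states)."""
    x1, x2 = X[:, 0], X[:, 1]
    return np.column_stack([x1 * x1, x1 * x2, x2 * x2])

def weights_to_matrix(w):
    """Symmetric P such that x' P x equals the dot product of w and features(x)."""
    return np.array([[w[0], w[1] / 2.0], [w[1] / 2.0, w[2]]])

def train_critic(F, G, Q, R, n_samples=200, iters=300, seed=0):
    """Fit critic weights by repeated least squares on sampled pre-impulse states."""
    rng = np.random.default_rng(seed)
    X = rng.uniform(-1.0, 1.0, size=(n_samples, 2))
    Phi = features(X)
    w = np.zeros(3)                                  # initial guess V_0 = 0
    for _ in range(iters):
        P = weights_to_matrix(w)
        # Greedy impulse under the current critic: u = -K x.
        K = np.linalg.solve(R + G.T @ P @ G, G.T @ P @ F)
        U = -X @ K.T
        X_next = X @ F.T + U @ G.T                   # next pre-impulse states
        stage = (np.einsum("ij,jk,ik->i", X, Q, X)
                 + np.einsum("ij,jk,ik->i", U, R, U))
        target = stage + features(X_next) @ w        # one-step backup
        w, *_ = np.linalg.lstsq(Phi, target, rcond=None)
    return w

# Hypothetical pre-impulse map, unstable without impulses.
F = np.array([[1.0, 0.5], [-0.4, 1.1]])
G = np.array([[0.2], [1.0]])
Q = np.eye(2)
R = np.array([[1.0]])
w = train_critic(F, G, Q, R)
P_hat = weights_to_matrix(w)
print("critic weights:", w)
print("equivalent P:", P_hat)
```

Because the backed-up targets are exactly quadratic in the state, the quadratic features can represent them without error, so in this toy case the fitted weights reproduce the quadratic-form iteration; for a nonlinear system the fit would only be approximate.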
