Front. Inform. Technol. Electron. Eng.  2010, Vol. 11 Issue (9): 718-723    DOI: 10.1631/jzus.C0910486
    
Modified reward function on abstract features in inverse reinforcement learning
Shen-yi Chen*, Hui Qian, Jia Fan, Zhuo-jun Jin, Miao-liang Zhu
School of Computer Science and Technology, Zhejiang University, Hangzhou 310027, China

Abstract  We improve inverse reinforcement learning (IRL) by applying dimension reduction methods to automatically extract abstract features from human-demonstrated policies, which handles cases where the features are either unknown or numerous. The importance rating of each abstract feature is incorporated into the reward function. Simulations are performed on a five-lane highway driving task in which the controlled car has the highest fixed speed of all the cars. On average, performance is almost 10.6% better with importance ratings than without them.
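The idea in the abstract can be illustrated with a minimal sketch. The abstract does not say which dimension reduction method is used or how the importance rating is defined, so everything below is an assumption made for illustration: PCA (via SVD) stands in for the dimension reduction step, the explained-variance ratio of each component stands in for its importance rating, and the reward is taken to be linear in the abstract features. The function names `extract_abstract_features` and `reward` are hypothetical and do not come from the paper.

```python
import numpy as np


def extract_abstract_features(demo_features, k):
    """Assumed stand-in for the paper's dimension reduction step: PCA via SVD.

    demo_features: (n_samples, d) raw feature vectors observed along
                   human demonstrations.
    Returns the feature mean, the top-k principal directions (k, d), and an
    assumed importance rating per direction (its explained-variance ratio).
    """
    mean = demo_features.mean(axis=0)
    centered = demo_features - mean
    # Singular values come back sorted in descending order.
    _, s, vt = np.linalg.svd(centered, full_matrices=False)
    variance = s ** 2
    importance = variance[:k] / variance.sum()
    return mean, vt[:k], importance


def reward(raw_features, mean, components, weights, importance=None):
    """Linear reward on abstract features (an assumption, not the paper's form).

    If importance ratings are given, each abstract feature is scaled by its
    rating before the learned weights are applied.
    """
    abstract = components @ (raw_features - mean)
    if importance is not None:
        abstract = importance * abstract
    return float(weights @ abstract)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    demos = rng.normal(size=(200, 8))        # synthetic placeholder data
    mean, comps, rating = extract_abstract_features(demos, k=3)
    w = np.array([1.0, -0.5, 0.25])          # hypothetical learned IRL weights
    print(reward(demos[0], mean, comps, w))           # without ratings
    print(reward(demos[0], mean, comps, w, rating))   # with ratings
```

In a full IRL pipeline the weights would be learned from the demonstrations rather than fixed by hand; here they are placeholders so that the two reward variants can be compared on the same input.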

Key words: Importance rating, Abstract feature, Feature extraction, Inverse reinforcement learning (IRL), Markov decision process (MDP)
Received: 07 August 2009      Published: 07 September 2010
CLC:  TP181  
Cite this article:

Shen-yi Chen, Hui Qian, Jia Fan, Zhuo-jun Jin, Miao-liang Zhu. Modified reward function on abstract features in inverse reinforcement learning. Front. Inform. Technol. Electron. Eng., 2010, 11(9): 718-723.

URL:

http://www.zjujournals.com/xueshu/fitee/10.1631/jzus.C0910486     OR     http://www.zjujournals.com/xueshu/fitee/Y2010/V11/I9/718

