Modified reward function on abstract features in inverse reinforcement learning

Front. Inform. Technol. Electron. Eng., 2010, Vol. 11, Issue 9: 718-723

DOI: 10.1631/jzus.C0910486

Shen-yi Chen^*, Hui Qian, Jia Fan, Zhuo-jun Jin, Miao-liang Zhu

School of Computer Science and Technology, Zhejiang University, Hangzhou 310027, China

Abstract: We improve inverse reinforcement learning (IRL) by applying dimension reduction methods to automatically extract abstract features from human-demonstrated policies, which handles cases where the features are either unknown or numerous. The importance rating of each abstract feature is incorporated into the reward function. Simulation is performed on a task of driving on a five-lane highway, where the controlled car has the highest fixed speed among all the cars. On average, performance is almost 10.6% better with importance ratings than without them.

Key words: Importance rating; Abstract feature; Feature extraction; Inverse reinforcement learning (IRL); Markov decision process (MDP)

Received: 2009-08-07; Published: 2010-09-07

CLC: TP181

Corresponding author: Shen-yi CHEN, E-mail: charles_csy@zju.edu.cn

Cite this article:

Shen-yi Chen, Hui Qian, Jia Fan, Zhuo-jun Jin, Miao-liang Zhu. Modified reward function on abstract features in inverse reinforcement learning. Front. Inform. Technol. Electron. Eng., 2010, 11(9): 718-723.

Link to this article:

http://www.zjujournals.com/xueshu/fitee/CN/10.1631/jzus.C0910486 or http://www.zjujournals.com/xueshu/fitee/CN/Y2010/V11/I9/718
