TY - CONF
T1 - Transfer of task representation in Reinforcement Learning using policy-based proto-value functions
AU - Ferrante, Eliseo
AU - Lazaric, Alessandro
AU - Restelli, Marcello
PY - 2008
Y1 - 2008
AB - Reinforcement Learning research is traditionally devoted to solving single-task problems. Therefore, whenever a new task is faced, learning must be restarted from scratch. Recently, several studies have addressed the issue of reusing the knowledge acquired in solving previous related tasks by transferring information about policies and value functions. In this paper, we analyze the use of proto-value functions from a transfer learning perspective. Proto-value functions are effective basis functions for the approximation of value functions defined over the graph obtained by a random walk on the environment. The definition of this graph is a key aspect in transfer problems in which both the reward function and the dynamics change. Therefore, we introduce policy-based proto-value functions, which can be obtained by considering the graph generated by a random walk guided by the optimal policy of one of the tasks at hand. We compare the effectiveness of policy-based and standard proto-value functions on different transfer problems defined on a simple grid-world environment.
KW - Proto-value functions
KW - Reinforcement Learning
KW - Transfer Learning
UR - http://www.scopus.com/inward/record.url?scp=84899902330&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84899902330&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:84899902330
SN - 9781605604701
T3 - Proceedings of the International Joint Conference on Autonomous Agents and Multiagent Systems, AAMAS
SP - 1301
EP - 1304
BT - 7th International Joint Conference on Autonomous Agents and Multiagent Systems, AAMAS 2008
PB - International Foundation for Autonomous Agents and Multiagent Systems (IFAAMAS)
T2 - 7th International Joint Conference on Autonomous Agents and Multiagent Systems, AAMAS 2008
Y2 - 12 May 2008 through 16 May 2008
ER -