TY - GEN
T1 - Learning to Explore in Motion and Interaction Tasks
AU - Bogdanovic, Miroslav
AU - Righetti, Ludovic
N1 - Publisher Copyright:
© 2019 IEEE.
PY - 2019/11
Y1 - 2019/11
AB - Model-free reinforcement learning suffers from the high sample complexity inherent to robotic manipulation or locomotion tasks. Most successful approaches typically use random sampling strategies, which lead to slow policy convergence. In this paper, we present a novel approach for efficient exploration that leverages previously learned tasks. We exploit the fact that the same system is used across many tasks and build a generative model for exploration, based on data from previously solved tasks, to improve the learning of new tasks. The approach also enables continuous learning of improved exploration strategies as novel tasks are learned. Extensive simulations of a robot manipulator performing a variety of motion and contact-interaction tasks demonstrate the capabilities of the approach. In particular, our experiments suggest that the exploration strategy can more than double learning speed, especially when rewards are sparse. Moreover, the algorithm is robust to task variations and parameter tuning, making it beneficial for complex robotic problems.
UR - http://www.scopus.com/inward/record.url?scp=85081163845&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85081163845&partnerID=8YFLogxK
U2 - 10.1109/IROS40897.2019.8968584
DO - 10.1109/IROS40897.2019.8968584
M3 - Conference contribution
AN - SCOPUS:85081163845
T3 - IEEE International Conference on Intelligent Robots and Systems
SP - 2686
EP - 2692
BT - 2019 IEEE/RSJ International Conference on Intelligent Robots and Systems, IROS 2019
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 2019 IEEE/RSJ International Conference on Intelligent Robots and Systems, IROS 2019
Y2 - 3 November 2019 through 8 November 2019
ER -