TY - JOUR
T1 - Model-free reinforcement learning for robust locomotion using demonstrations from trajectory optimization
AU - Bogdanovic, Miroslav
AU - Khadiv, Majid
AU - Righetti, Ludovic
N1 - Funding Information:
This work was supported by New York University, the European Union’s Horizon 2020 research and innovation program (grant agreement 780684) and the National Science Foundation (grants 1825993, 1932187 and 1925079).
Publisher Copyright:
Copyright © 2022 Bogdanovic, Khadiv and Righetti.
PY - 2022/8/31
Y1 - 2022/8/31
AB - We present a general, two-stage reinforcement learning approach to create robust policies that can be deployed on real robots without any additional training using a single demonstration generated by trajectory optimization. The demonstration is used in the first stage as a starting point to facilitate initial exploration. In the second stage, the relevant task reward is optimized directly and a policy robust to environment uncertainties is computed. We demonstrate and examine in detail the performance and robustness of our approach on highly dynamic hopping and bounding tasks on a quadruped robot.
KW - contact uncertainty
KW - deep reinforcement learning
KW - legged locomotion
KW - robust control policies
KW - trajectory optimization
UR - http://www.scopus.com/inward/record.url?scp=85138256712&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85138256712&partnerID=8YFLogxK
U2 - 10.3389/frobt.2022.854212
DO - 10.3389/frobt.2022.854212
M3 - Article
AN - SCOPUS:85138256712
SN - 2296-9144
VL - 9
JO - Frontiers in Robotics and AI
JF - Frontiers in Robotics and AI
M1 - 854212
ER -