TY - GEN
T1 - Automatic Data Augmentation for Generalization in Reinforcement Learning
AU - Raileanu, Roberta
AU - Goldstein, Max
AU - Yarats, Denis
AU - Kostrikov, Ilya
AU - Fergus, Rob
N1 - Publisher Copyright:
© 2021 Neural information processing systems foundation. All rights reserved.
PY - 2021
Y1 - 2021
N2 - Deep reinforcement learning (RL) agents often fail to generalize beyond their training environments. To alleviate this problem, recent work has proposed the use of data augmentation. However, different tasks tend to benefit from different types of augmentations and selecting the right one typically requires expert knowledge. In this paper, we introduce three approaches for automatically finding an effective augmentation for any RL task. These are combined with two novel regularization terms for the policy and value function, required to make the use of data augmentation theoretically sound for actor-critic algorithms. Our method achieves a new state-of-the-art on the Procgen benchmark and outperforms popular RL algorithms on DeepMind Control tasks with distractors. In addition, our agent learns policies and representations which are more robust to changes in the environment that are irrelevant for solving the task, such as the background. Our code is available at https://github.com/rraileanu/auto-drac.
AB - Deep reinforcement learning (RL) agents often fail to generalize beyond their training environments. To alleviate this problem, recent work has proposed the use of data augmentation. However, different tasks tend to benefit from different types of augmentations and selecting the right one typically requires expert knowledge. In this paper, we introduce three approaches for automatically finding an effective augmentation for any RL task. These are combined with two novel regularization terms for the policy and value function, required to make the use of data augmentation theoretically sound for actor-critic algorithms. Our method achieves a new state-of-the-art on the Procgen benchmark and outperforms popular RL algorithms on DeepMind Control tasks with distractors. In addition, our agent learns policies and representations which are more robust to changes in the environment that are irrelevant for solving the task, such as the background. Our code is available at https://github.com/rraileanu/auto-drac.
UR - http://www.scopus.com/inward/record.url?scp=85125770472&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85125770472&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:85125770472
T3 - Advances in Neural Information Processing Systems
SP - 5402
EP - 5415
BT - Advances in Neural Information Processing Systems 34 - 35th Conference on Neural Information Processing Systems, NeurIPS 2021
A2 - Ranzato, Marc'Aurelio
A2 - Beygelzimer, Alina
A2 - Dauphin, Yann
A2 - Liang, Percy S.
A2 - Wortman Vaughan, Jenn
PB - Neural information processing systems foundation
T2 - 35th Conference on Neural Information Processing Systems, NeurIPS 2021
Y2 - 6 December 2021 through 14 December 2021
ER -