Abstract
Domain adaptation in imitation learning represents an essential step towards improving generalizability. However, even in the restricted setting of third-person imitation where transfer is between isomorphic Markov Decision Processes, there are no strong guarantees on the performance of transferred policies. We present problem-dependent, statistical learning guarantees for third-person imitation from observation in an offline setting, and a lower bound on performance in an online setting.
Original language | English (US) |
---|---|
Pages (from-to) | 1228-1237 |
Number of pages | 10 |
Journal | Proceedings of Machine Learning Research |
Volume | 124 |
State | Published - 2020 |
Event | 36th Conference on Uncertainty in Artificial Intelligence, UAI 2020 - Virtual, Online. Duration: Aug 3 2020 → Aug 6 2020 |
ASJC Scopus subject areas
- Artificial Intelligence
- Software
- Control and Systems Engineering
- Statistics and Probability