Abstract
We study the stochastic multi-armed bandits problem in the presence of adversarial corruption. We present a new algorithm for this problem whose regret is nearly optimal, substantially improving upon previous work. Our algorithm is agnostic to the level of adversarial contamination and can tolerate a significant amount of corruption with virtually no degradation in performance.
Original language | English (US) |
---|---|
Pages (from-to) | 1562-1578 |
Number of pages | 17 |
Journal | Proceedings of Machine Learning Research |
Volume | 99 |
State | Published - 2019 |
Event | 32nd Conference on Learning Theory, COLT 2019 - Phoenix, United States Duration: Jun 25 2019 → Jun 28 2019 |
ASJC Scopus subject areas
- Artificial Intelligence
- Software
- Control and Systems Engineering
- Statistics and Probability