TY - JOUR
T1 - Corralling Stochastic Bandit Algorithms
AU - Arora, Raman
AU - Marinov, Teodor V.
AU - Mohri, Mehryar
N1 - Funding Information:
This research was supported in part by NSF BIG-DATA awards IIS-1546482, IIS-1838139, NSF CAREER award IIS-1943251, and by NSF CCF-1535987, NSF IIS-1618662, and a Google Research Award. RA would like to acknowledge support provided by the Institute for Advanced Study and the Johns Hopkins Institute for Assured Autonomy. We warmly thank Julian Zimmert for insightful discussions regarding the Tsallis-INF approach.
Publisher Copyright:
Copyright © 2021 by the author(s)
PY - 2021
Y1 - 2021
AB - We study the problem of corralling stochastic bandit algorithms, that is, combining multiple bandit algorithms designed for a stochastic environment, with the goal of devising a corralling algorithm that performs almost as well as the best base algorithm. We give two general algorithms for this setting, which we show benefit from favorable regret guarantees. We show that the regret of the corralling algorithms is no worse than that of the best algorithm containing the arm with the highest reward, and depends on the gap between the highest reward and other rewards.
UR - http://www.scopus.com/inward/record.url?scp=85161884279&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85161884279&partnerID=8YFLogxK
M3 - Conference article
AN - SCOPUS:85161884279
SN - 2640-3498
VL - 130
SP - 2116
EP - 2124
JO - Proceedings of Machine Learning Research
JF - Proceedings of Machine Learning Research
T2 - 24th International Conference on Artificial Intelligence and Statistics, AISTATS 2021
Y2 - 13 April 2021 through 15 April 2021
ER -