TY - GEN
T1 - A time and space efficient algorithm for contextual linear bandits
AU - Bento, José
AU - Ioannidis, Stratis
AU - Muthukrishnan, S.
AU - Yan, Jinyun
PY - 2013
Y1 - 2013
N2 - We consider a multi-armed bandit problem where payoffs are a linear function of an observed stochastic contextual variable. In the scenario where there exists a gap between optimal and suboptimal rewards, several algorithms have been proposed that achieve O(logT) regret after T time steps. However, proposed methods either have a computation complexity per iteration that scales linearly with T or achieve regrets that grow linearly with the number of contexts |χ|. We propose an ε-greedy type of algorithm that solves both limitations. In particular, when contexts are variables in ℝd, we prove that our algorithm has a constant computation complexity per iteration of O(poly(d)) and can achieve a regret of O(poly(d) log T) even when |χ| = Ω(2d). In addition, unlike previous algorithms, its space complexity scales like O(Kd2) and does not grow with T.
AB - We consider a multi-armed bandit problem where payoffs are a linear function of an observed stochastic contextual variable. In the scenario where there exists a gap between optimal and suboptimal rewards, several algorithms have been proposed that achieve O(logT) regret after T time steps. However, proposed methods either have a computation complexity per iteration that scales linearly with T or achieve regrets that grow linearly with the number of contexts |χ|. We propose an ε-greedy type of algorithm that solves both limitations. In particular, when contexts are variables in ℝd, we prove that our algorithm has a constant computation complexity per iteration of O(poly(d)) and can achieve a regret of O(poly(d) log T) even when |χ| = Ω(2d). In addition, unlike previous algorithms, its space complexity scales like O(Kd2) and does not grow with T.
KW - Contextual Linear Bandits
KW - Space and Time Efficiency
UR - http://www.scopus.com/inward/record.url?scp=84886538344&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84886538344&partnerID=8YFLogxK
U2 - 10.1007/978-3-642-40988-2_17
DO - 10.1007/978-3-642-40988-2_17
M3 - Conference contribution
AN - SCOPUS:84886538344
SN - 9783642409875
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 257
EP - 272
BT - Machine Learning and Knowledge Discovery in Databases - European Conference, ECML PKDD 2013, Proceedings
T2 - European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, ECML PKDD 2013
Y2 - 23 September 2013 through 27 September 2013
ER -