TY - GEN

T1 - A time and space efficient algorithm for contextual linear bandits

AU - Bento, José

AU - Ioannidis, Stratis

AU - Muthukrishnan, S.

AU - Yan, Jinyun

PY - 2013

Y1 - 2013

N2 - We consider a multi-armed bandit problem where payoffs are a linear function of an observed stochastic contextual variable. In the scenario where there exists a gap between optimal and suboptimal rewards, several algorithms have been proposed that achieve O(log T) regret after T time steps. However, proposed methods either have a computation complexity per iteration that scales linearly with T or achieve regrets that grow linearly with the number of contexts |χ|. We propose an ε-greedy type of algorithm that solves both limitations. In particular, when contexts are variables in ℝ^d, we prove that our algorithm has a constant computation complexity per iteration of O(poly(d)) and can achieve a regret of O(poly(d) log T) even when |χ| = Ω(2^d). In addition, unlike previous algorithms, its space complexity scales like O(Kd^2) and does not grow with T.

AB - We consider a multi-armed bandit problem where payoffs are a linear function of an observed stochastic contextual variable. In the scenario where there exists a gap between optimal and suboptimal rewards, several algorithms have been proposed that achieve O(log T) regret after T time steps. However, proposed methods either have a computation complexity per iteration that scales linearly with T or achieve regrets that grow linearly with the number of contexts |χ|. We propose an ε-greedy type of algorithm that solves both limitations. In particular, when contexts are variables in ℝ^d, we prove that our algorithm has a constant computation complexity per iteration of O(poly(d)) and can achieve a regret of O(poly(d) log T) even when |χ| = Ω(2^d). In addition, unlike previous algorithms, its space complexity scales like O(Kd^2) and does not grow with T.

KW - Contextual Linear Bandits

KW - Space and Time Efficiency

UR - http://www.scopus.com/inward/record.url?scp=84886538344&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84886538344&partnerID=8YFLogxK

U2 - 10.1007/978-3-642-40988-2_17

DO - 10.1007/978-3-642-40988-2_17

M3 - Conference contribution

AN - SCOPUS:84886538344

SN - 9783642409875

T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

SP - 257

EP - 272

BT - Machine Learning and Knowledge Discovery in Databases - European Conference, ECML PKDD 2013, Proceedings

T2 - European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, ECML PKDD 2013

Y2 - 23 September 2013 through 27 September 2013

ER -