Batched bandit problems

Vianney Perchet, Philippe Rigollet, Sylvain Chassang, Erik Snowberg

    Research output: Contribution to journalArticlepeer-review

    Abstract

    Motivated by practical applications, chiefly clinical trials, we study the regret achievable for stochastic bandits under the constraint that the employed policy must split trials into a small number of batches. We propose a simple policy, and show that a very small number of batches gives close to minimax optimal regret bounds. As a byproduct, we derive optimal policies with low switching cost for stochastic bandits.

    Original languageEnglish (US)
    Pages (from-to)660-681
    Number of pages22
    JournalAnnals of Statistics
    Volume44
    Issue number2
    DOIs
    StatePublished - Apr 2016

    Keywords

    • Batches
    • Grouped clinical trials, sample size determination, switching cost
    • Multi-armed bandit problems
    • Multi-phase allocation
    • Regret bounds

    ASJC Scopus subject areas

    • Statistics and Probability
    • Statistics, Probability and Uncertainty

    Fingerprint

    Dive into the research topics of 'Batched bandit problems'. Together they form a unique fingerprint.

    Cite this