Generating and Adapting to Diverse Ad Hoc Partners in Hanabi

Rodrigo Canaan, Xianbo Gao, Julian Togelius, Andy Nealen, Stefan Menzel

    Research output: Contribution to journal › Article › peer-review

    Abstract

    Hanabi is a cooperative game that brings the problem of modeling other players to the forefront. In this game, coordinated groups of players can leverage preestablished conventions to great effect. In this article, we focus on ad hoc settings with no previous coordination between partners. We introduce a 'Bayesian Meta-Agent' that maintains a belief distribution over hypotheses of partner policies. The policies that serve as initial hypotheses are generated using MAP-Elites, to ensure behavioral diversity. We evaluate an 'Adaptive' version of the agent, which selects a response policy based on the updated belief distribution, and a 'Generalist' version, which selects a response based on the uniform prior. In short episodes of ten games with a consistent partner, the 'Adaptive' version outperforms the 'Generalist' when the training and evaluation populations are the same. This presents a first step toward an agent that can model its partner and adapt within a time frame that is compatible with human interaction.
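
The belief update and the Adaptive/Generalist distinction described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the hypothesis names, response names, action likelihoods, and expected scores below are all made-up placeholders, and the abstract does not say how likelihoods or response values are actually computed. The sketch only assumes a discrete set of partner hypotheses, a Bayes-rule update after each observed partner action, and a response chosen to maximize expected score under the current belief.

```python
def update_belief(belief, likelihoods):
    """One Bayes-rule step over partner hypotheses.

    belief:      dict hypothesis -> prior probability
    likelihoods: dict hypothesis -> P(observed partner action | hypothesis)
    """
    unnormalized = {h: belief[h] * likelihoods[h] for h in belief}
    total = sum(unnormalized.values())
    if total == 0.0:
        # No hypothesis explains the observation; keep the prior unchanged.
        return dict(belief)
    return {h: w / total for h, w in unnormalized.items()}


def choose_response(belief, expected_score):
    """Pick the response with the highest expected score under `belief`.

    expected_score: dict response -> dict hypothesis -> score (hypothetical values)
    """
    def value(response):
        return sum(belief[h] * expected_score[response][h] for h in belief)

    return max(expected_score, key=value)


# Hypothetical partner hypotheses and response policies (illustrative only).
hypotheses = ["cautious_partner", "risky_partner"]
uniform_prior = {h: 1.0 / len(hypotheses) for h in hypotheses}

# Made-up expected scores of each response against each hypothesis.
expected_score = {
    "resp_a": {"cautious_partner": 20.0, "risky_partner": 10.0},
    "resp_b": {"cautious_partner": 12.0, "risky_partner": 19.0},
    "resp_c": {"cautious_partner": 17.0, "risky_partner": 16.0},
}

# 'Generalist': responds to the uniform prior and never updates it.
generalist_choice = choose_response(uniform_prior, expected_score)

# 'Adaptive': updates the belief after each observed partner action.
belief = dict(uniform_prior)
observed_action_likelihoods = [
    {"cautious_partner": 0.2, "risky_partner": 0.8},  # made-up likelihoods
    {"cautious_partner": 0.2, "risky_partner": 0.8},
]
for likelihoods in observed_action_likelihoods:
    belief = update_belief(belief, likelihoods)
adaptive_choice = choose_response(belief, expected_score)

print("Generalist picks:", generalist_choice)
print("Adaptive picks:", adaptive_choice, "with belief", belief)
```

With these placeholder numbers, the response that does best on average against the uniform prior differs from the one chosen once the belief concentrates on a single hypothesis, which is the kind of gap between the two versions that the evaluation measures.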

    Original language: English (US)
    Pages (from-to): 228-241
    Number of pages: 14
    Journal: IEEE Transactions on Games
    Volume: 15
    Issue number: 2
    State: Published - Jun 1 2023

    Keywords

    • Computational and artificial intelligence - Evolutionary computation
    • Learning (artificial intelligence) - Naive Bayes methods

    ASJC Scopus subject areas

    • Software
    • Artificial Intelligence
    • Electrical and Electronic Engineering
    • Control and Systems Engineering
