Dopamine transients encode reward prediction errors independent of learning rates

Andrew Mah, Carla E.M. Golden, Christine M. Constantinople

Research output: Contribution to journal › Article › peer-review

Abstract

Biological accounts of reinforcement learning posit that dopamine encodes reward prediction errors (RPEs), which are multiplied by a learning rate to update state or action values. These values are thought to be represented by corticostriatal synaptic weights, which are updated by dopamine-dependent plasticity. This suggests that dopamine release reflects the product of the learning rate and RPE. Here, we characterize dopamine encoding of learning rates in the nucleus accumbens core (NAcc) in a volatile environment. Using a task with semi-observable states offering different rewards, we find that rats adjust how quickly they initiate trials across states using RPEs. Computational modeling and behavioral analyses show that learning rates are higher following state transitions and scale with trial-by-trial changes in beliefs about hidden states, approximating normative Bayesian strategies. Notably, dopamine release in the NAcc encodes RPEs independent of learning rates, suggesting that dopamine-independent mechanisms instantiate dynamic learning rates.
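To make concrete the distinction the abstract draws between an RPE and the product of learning rate and RPE, here is a minimal Python sketch. It assumes a standard delta-rule value update; the function name, the learning rates, and the reward values are hypothetical illustrations and are not taken from the paper, whose behavioral model may differ.

```python
def delta_rule_update(value, reward, learning_rate):
    """One delta-rule step (illustrative assumption, not the paper's model).

    Returns (new_value, rpe, scaled_update).
    """
    rpe = reward - value                 # reward prediction error
    scaled_update = learning_rate * rpe  # term actually added to the value
    return value + scaled_update, rpe, scaled_update


# Same starting value and reward under two hypothetical learning rates:
# the RPE is identical, while the scaled update differs.
for alpha in (0.1, 0.5):
    new_value, rpe, update = delta_rule_update(
        value=10.0, reward=20.0, learning_rate=alpha
    )
    print(f"alpha={alpha}: RPE={rpe}, update={update}, new value={new_value}")
```

In this sketch, a signal that tracks `rpe` is unchanged when the learning rate changes, whereas a signal that tracks `scaled_update` grows with it. The abstract reports that dopamine release in the NAcc resembles the former.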

Original language: English (US)
Article number: 114840
Journal: Cell Reports
Volume: 43
Issue number: 10
State: Published - Oct 22 2024

Keywords

  • CP: Neuroscience
  • changepoint detection
  • dopamine
  • dynamic learning rate
  • nucleus accumbens core
  • reinforcement learning

ASJC Scopus subject areas

  • General Biochemistry, Genetics and Molecular Biology
