TY - JOUR
T1 - Learning in noise
T2 - Dynamic decision-making in a variable environment
AU - Gureckis, Todd M.
AU - Love, Bradley C.
N1 - Funding Information: This work was supported in part by NIH-NIMH Math Modeling Post-Doctoral Training Grant T32 MH019879-12 to T.M. Gureckis and AFOSR grant FA9550-04-1-0226, and NSF CAREER grant 0349101 to B.C. Love. Additional data collection and writing were supported by startup funds provided by New York University to T.M. Gureckis. We thank Nathaniel Daw, Yael Niv, A. Ross Otto, and Lisa Zaval for helpful conversations in the development of this work. Author contributions: TG and BL designed research, TG collected and analyzed data, implemented and tested models, and wrote the paper.
PY - 2009/6
Y1 - 2009/6
N2 - In engineering systems, noise is a curse, obscuring important signals and increasing the uncertainty associated with measurement. However, the negative effects of noise are not universal. In this paper, we examine how people learn sequential control strategies given different sources and amounts of feedback variability. In particular, we consider people's behavior in a task where short- and long-term rewards are placed in conflict (i.e., the best option in the short term is worst in the long term). Consistent with a model based on reinforcement learning principles [Gureckis, T., & Love, B.C. Short term gains, long term pains: How cues about state aid learning in dynamic environments. Cognition (in press)], we find that learners differentially weight information predictive of the current task state. In particular, when cues that signal state are noisy, we find that participants' ability to identify an optimal strategy is strongly impaired relative to equivalent amounts of noise that obscure the rewards/valuations of those states. In other situations, we find that noise in reward signals may paradoxically improve performance by encouraging exploration. Our results demonstrate how experimentally manipulated task variability can be used to test predictions about the mechanisms that learners engage in dynamic decision-making tasks.
AB - In engineering systems, noise is a curse, obscuring important signals and increasing the uncertainty associated with measurement. However, the negative effects of noise are not universal. In this paper, we examine how people learn sequential control strategies given different sources and amounts of feedback variability. In particular, we consider people's behavior in a task where short- and long-term rewards are placed in conflict (i.e., the best option in the short term is worst in the long term). Consistent with a model based on reinforcement learning principles [Gureckis, T., & Love, B.C. Short term gains, long term pains: How cues about state aid learning in dynamic environments. Cognition (in press)], we find that learners differentially weight information predictive of the current task state. In particular, when cues that signal state are noisy, we find that participants' ability to identify an optimal strategy is strongly impaired relative to equivalent amounts of noise that obscure the rewards/valuations of those states. In other situations, we find that noise in reward signals may paradoxically improve performance by encouraging exploration. Our results demonstrate how experimentally manipulated task variability can be used to test predictions about the mechanisms that learners engage in dynamic decision-making tasks.
UR - http://www.scopus.com/inward/record.url?scp=67349207841&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=67349207841&partnerID=8YFLogxK
U2 - 10.1016/j.jmp.2009.02.004
DO - 10.1016/j.jmp.2009.02.004
M3 - Article
AN - SCOPUS:67349207841
SN - 0022-2496
VL - 53
SP - 180
EP - 193
JO - Journal of Mathematical Psychology
JF - Journal of Mathematical Psychology
IS - 3
ER -