Dataset Information

Multiple associative structures created by reinforcement and incidental statistical learning mechanisms.

ABSTRACT: Learning the structure of the world can be driven by reinforcement but also occurs incidentally through experience. Reinforcement learning theory has provided insight into how prediction errors drive updates in beliefs but less attention has been paid to the knowledge resulting from such learning. Here we contrast associative structures formed through reinforcement and experience of task statistics. BOLD neuroimaging in human volunteers demonstrates rigid representations of rewarded sequences in temporal pole and posterior orbito-frontal cortex, which are constructed backwards from reward. By contrast, medial prefrontal cortex and a hippocampal-amygdala border region carry reward-related knowledge but also flexible statistical knowledge of the currently relevant task model. Intriguingly, ventral striatum encodes prediction error responses but not the full RL- or statistically derived task knowledge. In summary, representations of task knowledge are derived via multiple learning processes operating at different time scales that are associated with partially overlapping and partially specialized anatomical regions.

SUBMITTER: Klein-Flügge MC

PROVIDER: S-EPMC6811627 | biostudies-literature | 2019 Oct

REPOSITORIES: biostudies-literature

Publications

Multiple associative structures created by reinforcement and incidental statistical learning mechanisms.

Klein-Flügge MC, Wittmann MK, Shpektor A, Jensen DEA, Rushworth MFS

Nature Communications, 2019 Oct 23, issue 1

PMID: 31645545

Similar Datasets

Project description: Finding rewards and avoiding punishments are powerful goals of behavior. To maximize reward and minimize punishment, it is beneficial to learn about the stimuli that predict their occurrence, and decades of research have provided insight into the brain processes underlying such associative reinforcement learning. In addition, it is well known in experimental psychology, yet often unacknowledged in neighboring scientific disciplines, that subjects also learn about the stimuli that predict the absence of reinforcement. Here we evaluate evidence for both these learning processes. We focus on two study cases that both provide a baseline level of behavior against which the effects of associative learning can be assessed. Firstly, we report pertinent evidence from Drosophila larvae. A re-analysis of the literature reveals that through paired presentations of an odor A and a sugar reward (A+) the animals learn that the reward can be found where the odor is, and therefore show an above-baseline preference for the odor. In contrast, through unpaired training (A/+) the animals learn that the reward can be found precisely where the odor is not, and accordingly these larvae show a below-baseline preference for it (the same is the case, with inverted signs, for learning through taste punishment). In addition, we present previously unpublished data demonstrating that both these learning processes also take place in larvae during a two-odor, differential conditioning protocol (A+/B), i.e., the animals learn about both the rewarded stimulus A and the non-rewarded stimulus B (again, this is likewise the case for differential conditioning with taste punishment).
Secondly, after briefly discussing published evidence from adult Drosophila, honeybees, and rats, we report an unpublished data set showing that relative to baseline behavior after truly random presentations of a visual stimulus A and punishment, rats exhibit memories of opposite valence upon paired and unpaired training. Collectively, the evidence conforms to classical findings in experimental psychology and suggests that across species animals associatively learn both through paired and through unpaired presentations of stimuli with reinforcement - with opposite valence. While the brain mechanisms of unpaired learning for the most part still need to be uncovered, the immediate implication is that using unpaired procedures as a mnemonically neutral control for associative reinforcement learning may be leading analyses astray.

