Dataset Information

Learning reward frequency over reward probability: A tale of two learning rules.

ABSTRACT: Learning about the expected value of choice alternatives associated with reward is critical for adaptive behavior. Although human choice preferences are affected by the presentation frequency of reward-related alternatives, this may not be captured by some dominant models of value learning, such as the delta rule. In this study, we examined whether reward learning is driven more by learning the probability of reward provided by each option, or by how frequently each option has been rewarded, and assessed how well models based on average reward (e.g., the Delta model) and models based on cumulative reward (e.g., the Decay model) can account for choice preferences. In a binary-outcome choice task, participants selected between pairs of options that had reward probabilities of 0.65 (A) versus 0.35 (B) or 0.75 (C) versus 0.25 (D). Crucially, during training there were twice as many AB trials as CD trials, such that option A was associated with higher cumulative reward, while option C gave higher average reward. Participants then decided between novel combinations of options (e.g., AC). Most participants preferred option A over C, a result predicted by the Decay model but not by the Delta model. We also compared the Delta and Decay models to both simpler models and more complex models that assumed additional mechanisms, such as representation of uncertainty. Overall, models that assume learning about cumulative reward provided the best account of the data.
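
The contrast between the two learning rules can be illustrated with a minimal sketch. It is not the authors' fitted model: the learning rate `alpha`, the decay parameter `decay`, the six-trial `CYCLE` schedule, and the assumption that both options of a pair are sampled equally often are all hypothetical choices made for illustration. Binary outcomes are replaced by their expected values; because both update rules are linear in the outcome, this gives the mean value estimate across stochastic runs.

```python
# Reward probabilities taken from the abstract.
REWARD_PROB = {"A": 0.65, "B": 0.35, "C": 0.75, "D": 0.25}

# Hypothetical schedule: AB trials occur twice as often as CD trials, and the
# learner is assumed to sample both options of a pair equally often.
CYCLE = ["A", "B", "C", "A", "B", "D"]


def train_delta(n_cycles=50, alpha=0.1):
    """Delta rule: move the chosen option's value toward the outcome."""
    values = {option: 0.0 for option in REWARD_PROB}
    for _ in range(n_cycles):
        for chosen in CYCLE:
            # Expected outcome stands in for a sampled 0/1 reward (assumption).
            expected_reward = REWARD_PROB[chosen]
            values[chosen] += alpha * (expected_reward - values[chosen])
    return values


def train_decay(n_cycles=50, decay=0.9):
    """Decay rule: every value decays each trial; the chosen one adds its outcome."""
    values = {option: 0.0 for option in REWARD_PROB}
    for _ in range(n_cycles):
        for chosen in CYCLE:
            for option in values:
                values[option] *= decay
            values[chosen] += REWARD_PROB[chosen]
    return values


delta_values = train_delta()
decay_values = train_decay()
print("Delta prefers:", max("AC", key=delta_values.get))
print("Decay prefers:", max("AC", key=decay_values.get))
```

Under these assumptions the Delta values approach each option's reward probability, so C ranks above A, whereas the Decay values accumulate reward and A ranks above C, which matches the direction of the predictions described in the abstract.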

SUBMITTER: Don HJ

PROVIDER: S-EPMC6814570 | biostudies-literature | 2019 Dec

REPOSITORIES: biostudies-literature

Publications

Learning reward frequency over reward probability: A tale of two learning rules.

Don, Hilary J; Otto, A Ross; Cornwall, Astin C; Davis, Tyler; Worthy, Darrell A

Cognition, 2019-08-17

PMID: 31430606

Similar Datasets

Neuronal distortions of reward probability without choice.
S-EPMC2628536 | biostudies-literature

Psychedelics Reopen the Social Reward Learning Critical Period
2023-06-14 | GSE230679 | GEO

Synaptic learning rules for sequence learning.
S-EPMC8175084 | biostudies-literature

Mining Association rules for Low-Frequency itemsets.
S-EPMC6056028 | biostudies-literature

Learning the value of information and reward over time when solving exploration-exploitation problems.
S-EPMC5717252 | biostudies-literature

Probability machines: consistent probability estimation using nonparametric learning machines.
S-EPMC3250568 | biostudies-literature

Two spatiotemporally distinct value systems shape reward-based learning in the human brain.
S-EPMC4569710 | biostudies-literature

Delay discounting, probability discounting, and interdental cleaning frequency.
S-EPMC9335449 | biostudies-literature

Tonality over a broad frequency range is linked to vocal learning in birds.
S-EPMC9470270 | biostudies-literature