Human reinforcement learning subdivides structured action spaces by learning effector-specific values.
ABSTRACT: Humans and animals are endowed with a large number of effectors. Although this enables great behavioral flexibility, it presents an equally formidable reinforcement learning problem of discovering which actions are most valuable, because of the high dimensionality of the action space. An unresolved question is how neural systems for reinforcement learning (such as the prediction error signals for action valuation associated with dopamine and the striatum) can cope with this "curse of dimensionality." We propose a reinforcement learning framework that allows learned action valuations to be decomposed into effector-specific components when appropriate to a task, and test it by studying to what extent human behavior and blood oxygen level-dependent (BOLD) activity can exploit such a decomposition in a multieffector choice task. Subjects made simultaneous decisions with their left and right hands and received separate reward feedback for each hand movement. We found that choice behavior was better described by a learning model that decomposed the values of bimanual movements into separate values for each effector, rather than a traditional model that treated the bimanual actions as unitary with a single value. A decomposition of value into effector-specific components was also observed in value-related BOLD signaling, in the form of lateralized biases in striatal correlates of prediction error and anticipatory value correlates in the intraparietal sulcus. These results suggest that the human brain can use decomposed value representations to "divide and conquer" reinforcement learning over high-dimensional action spaces.
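To make the contrast between the two learning models concrete, below is a minimal Python sketch (not the authors' published code) comparing a unitary learner, which assigns one value to each joint (left, right) bimanual action, with a decomposed learner that keeps separate effector-specific values updated by separate prediction errors. The action set, learning rate, softmax temperature, and reward probabilities are illustrative assumptions, not parameters reported in the study.

# Illustrative sketch only: unitary vs. effector-decomposed value learning
# in a bimanual choice task with separate reward feedback per hand.
import numpy as np

rng = np.random.default_rng(0)
N_ACTIONS = 2            # hypothetical movements available to each hand
ALPHA = 0.2              # assumed learning rate
BETA = 3.0               # assumed softmax inverse temperature

def softmax(q):
    e = np.exp(BETA * (q - q.max()))
    return e / e.sum()

q_left = np.zeros(N_ACTIONS)                # decomposed: value per left-hand action
q_right = np.zeros(N_ACTIONS)               # decomposed: value per right-hand action
q_joint = np.zeros((N_ACTIONS, N_ACTIONS))  # unitary: one value per joint action pair

def trial(p_left, p_right):
    """One trial: each hand's movement earns reward with probability
    p_left[a] / p_right[a]; each learner updates from its own outcomes."""
    # Decomposed learner: independent choice and prediction error per effector.
    a_l = rng.choice(N_ACTIONS, p=softmax(q_left))
    a_r = rng.choice(N_ACTIONS, p=softmax(q_right))
    r_l = float(rng.random() < p_left[a_l])
    r_r = float(rng.random() < p_right[a_r])
    q_left[a_l] += ALPHA * (r_l - q_left[a_l])       # left-hand prediction error
    q_right[a_r] += ALPHA * (r_r - q_right[a_r])     # right-hand prediction error

    # Unitary learner: the (left, right) pair is a single action with one value,
    # updated from the summed feedback, so credit is not assigned per effector.
    j = rng.choice(q_joint.size, p=softmax(q_joint.ravel()))
    u_l, u_r = divmod(int(j), N_ACTIONS)
    total = float(rng.random() < p_left[u_l]) + float(rng.random() < p_right[u_r])
    q_joint[u_l, u_r] += ALPHA * (total - q_joint[u_l, u_r])  # joint prediction error

# Example run: left hand's action 0 and right hand's action 1 are most rewarded.
for _ in range(200):
    trial(p_left=np.array([0.8, 0.2]), p_right=np.array([0.2, 0.8]))

The decomposed learner needs only 2 + 2 values here, while the unitary learner needs 2 x 2; with more effectors or movements the joint table grows multiplicatively, which is the "curse of dimensionality" the decomposition is meant to sidestep.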
SUBMITTER: Gershman SJ
PROVIDER: S-EPMC2796632 | biostudies-literature | 2009 Oct
REPOSITORIES: biostudies-literature