Dataset Information

An imperfect dopaminergic error signal can drive temporal-difference learning.

ABSTRACT: An open problem in the field of computational neuroscience is how to link synaptic plasticity to system-level learning. A promising framework in this context is temporal-difference (TD) learning. Experimental evidence that supports the hypothesis that the mammalian brain performs temporal-difference learning includes the resemblance of the phasic activity of the midbrain dopaminergic neurons to the TD error and the discovery that cortico-striatal synaptic plasticity is modulated by dopamine. However, as the phasic dopaminergic signal does not reproduce all the properties of the theoretical TD error, it is unclear whether it is capable of driving behavior adaptation in complex tasks. Here, we present a spiking temporal-difference learning model based on the actor-critic architecture. The model dynamically generates a dopaminergic signal with realistic firing rates and exploits this signal to modulate the plasticity of synapses as a third factor. The predictions of our proposed plasticity dynamics are in good agreement with experimental results with respect to dopamine, pre- and post-synaptic activity. An analytical mapping from the parameters of our proposed plasticity dynamics to those of the classical discrete-time TD algorithm reveals that the biological constraints of the dopaminergic signal entail a modified TD algorithm with self-adapting learning parameters and an adapting offset. We show that the neuronal network is able to learn a task with sparse positive rewards as fast as the corresponding classical discrete-time TD algorithm. However, the performance of the neuronal network is impaired with respect to the traditional algorithm on a task with both positive and negative rewards and breaks down entirely on a task with purely negative rewards. Our model demonstrates that the asymmetry of a realistic dopaminergic signal enables TD learning when learning is driven by positive rewards but not when driven by negative rewards.
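For readers unfamiliar with the classical discrete-time TD algorithm that the abstract uses as its baseline, the following is a minimal tabular TD(0) sketch on a toy chain task with a single reward at the end. It is not the spiking actor-critic model from the paper. The chain task, the function name `td0_chain`, and the `neg_scale` parameter are all assumptions made for illustration. In particular, `neg_scale` is a crude stand-in for an asymmetric error signal (it simply scales down negative TD errors) and does not reproduce the modified TD algorithm with self-adapting learning parameters and an adapting offset described above.

```python
def td0_chain(n_states=5, reward=1.0, alpha=0.5, gamma=0.9,
              episodes=200, neg_scale=1.0):
    """Tabular TD(0) value estimation on a deterministic left-to-right chain.

    The agent starts in state 0 and steps right until it leaves the last
    state, where it receives `reward`; every other transition gives 0.
    `neg_scale` (hypothetical, for illustration only) multiplies negative
    TD errors: 1.0 is the classical algorithm, 0.0 discards them entirely.
    """
    values = [0.0] * n_states
    for _ in range(episodes):
        for s in range(n_states):
            is_last = s == n_states - 1
            r = reward if is_last else 0.0
            v_next = 0.0 if is_last else values[s + 1]
            # Classical TD error: reward plus discounted next value,
            # minus the current value estimate.
            delta = r + gamma * v_next - values[s]
            if delta < 0:
                # Toy asymmetry: attenuate negative errors (assumption).
                delta *= neg_scale
            values[s] += alpha * delta
    return values


if __name__ == "__main__":
    print("positive reward, symmetric error :", td0_chain(reward=1.0))
    print("positive reward, no negative error:", td0_chain(reward=1.0, neg_scale=0.0))
    print("negative reward, symmetric error :", td0_chain(reward=-1.0))
    print("negative reward, no negative error:", td0_chain(reward=-1.0, neg_scale=0.0))
```

In this toy setting, suppressing negative errors leaves learning from a positive reward essentially unchanged, whereas with a purely negative reward the value estimates never move from zero. This loosely mirrors the qualitative conclusion of the abstract but should not be read as a reproduction of the paper's results.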

SUBMITTER: Potjans W

PROVIDER: S-EPMC3093351 | biostudies-literature | 2011 May

REPOSITORIES: biostudies-literature

Publications

An imperfect dopaminergic error signal can drive temporal-difference learning.

Wiebke Potjans, Markus Diesmann, Abigail Morrison

PLoS Computational Biology, 12 May 2011, issue 5

PMID: 21589888

Similar Datasets

Temporal difference error prediction signal dysregulation in cocaine dependence. | S-EPMC4023147 | biostudies-other
Climbing fibers encode a temporal-difference prediction error during cerebellar learning in mice. | S-EPMC4754078 | biostudies-literature
A gradual temporal shift of dopamine responses mirrors the progression of temporal difference error in machine learning. | S-EPMC9624460 | biostudies-literature
A dopaminergic reward prediction error signal shapes maternal behavior in mice. | S-EPMC9957971 | biostudies-literature
Does prediction error drive one-shot declarative learning? | S-EPMC5381756 | biostudies-literature
Temporal-difference reinforcement learning with distributed representations. | S-EPMC2760757 | biostudies-literature
Learning to Express Reward Prediction Error-like Dopaminergic Activity Requires Plastic Representations of Time. | S-EPMC10543312 | biostudies-literature
Dopamine reward prediction error signal codes the temporal evaluation of a perceptual decision report. | S-EPMC5715768 | biostudies-literature
Elimination of the error signal in the superior colliculus impairs saccade motor learning. | S-EPMC6156644 | biostudies-literature
Distinct temporal difference error signals in dopamine axons in three regions of the striatum in a decision-making task. | S-EPMC7771962 | biostudies-literature