Dataset Information

Temporal-difference reinforcement learning with distributed representations.

ABSTRACT: Temporal-difference (TD) algorithms have been proposed as models of reinforcement learning (RL). We examine two issues of distributed representation in these TD algorithms: distributed representations of belief and distributed discounting factors. Distributed representation of belief allows the believed state of the world to distribute across sets of equivalent states. Distributed exponential discounting factors produce hyperbolic discounting in the behavior of the agent itself. We examine these issues in the context of a TD RL model in which state-belief is distributed over a set of exponentially-discounting "micro-Agents", each of which has a separate discounting factor (gamma). Each microAgent maintains an independent hypothesis about the state of the world, and a separate value-estimate of taking actions within that hypothesized state. The overall agent thus instantiates a flexible representation of an evolving world-state. As with other TD models, the value-error (delta) signal within the model matches dopamine signals recorded from animals in standard conditioning reward-paradigms. The distributed representation of belief provides an explanation for the decrease in dopamine at the conditioned stimulus seen in overtrained animals, for the differences between trace and delay conditioning, and for transient bursts of dopamine seen at movement initiation. Because each microAgent also includes its own exponential discounting factor, the overall agent shows hyperbolic discounting, consistent with behavioral experiments.
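
The claim that distributed exponential discounting yields hyperbolic discounting can be checked numerically. The sketch below is illustrative only and is not the authors' model: it assumes the discounting factors (gamma) are spread evenly over the interval (0, 1), and the names `average_discount`, `hyperbolic` and `n_agents` are hypothetical. Under that assumption, the mean of gamma^t over the agents approaches 1/(1 + t), since the integral of gamma^t over [0, 1] equals 1/(t + 1).

```python
import numpy as np


def average_discount(delays, n_agents=10_000):
    """Mean exponential discount across agents with evenly spaced gammas.

    Assumption (not from the source): gammas sit at the midpoints of
    n_agents equal-width bins covering the interval (0, 1).
    """
    gammas = (np.arange(n_agents) + 0.5) / n_agents
    delays = np.asarray(delays, dtype=float)
    # Row i holds gamma ** delays[i] for every agent; average over agents.
    return (gammas[None, :] ** delays[:, None]).mean(axis=1)


def hyperbolic(delays):
    """Hyperbolic discount 1 / (1 + t) with unit rate constant."""
    delays = np.asarray(delays, dtype=float)
    return 1.0 / (1.0 + delays)


delays = np.array([0, 1, 2, 5, 10, 20])
avg = average_discount(delays)
hyp = hyperbolic(delays)

# Largest absolute gap between the population average and the hyperbola.
max_gap = float(np.max(np.abs(avg - hyp)))
print(max_gap)
```

Each individual agent still discounts exponentially; only the population average follows the hyperbolic curve. A different distribution of gamma would in general give a different aggregate curve.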

SUBMITTER: Kurth-Nelson Z

PROVIDER: S-EPMC2760757 | biostudies-literature | 2009 Oct

REPOSITORIES: biostudies-literature

Publications

Temporal-difference reinforcement learning with distributed representations.

Zeb Kurth-Nelson, A. David Redish

PLoS One, 2009 Oct 20 (issue 10)

PMID: 19841749

Similar Datasets

Developing PFC representations using reinforcement learning.
| S-EPMC2783795 | biostudies-literature

Contrasting temporal difference and opportunity cost reinforcement learning in an empirical money-emergence paradigm.
| S-EPMC6298096 | biostudies-literature

Emergence of belief-like representations through reinforcement learning.
| S-EPMC10513382 | biostudies-literature

Reward-predictive representations generalize across tasks in reinforcement learning.
| S-EPMC7591094 | biostudies-literature

Temporal encoding in deep reinforcement learning agents.
| S-EPMC10724179 | biostudies-literature

Computational mechanisms of distributed value representations and mixed learning strategies.
| S-EPMC8664930 | biostudies-literature

Transfer of Temporal Logic Formulas in Reinforcement Learning.
| S-EPMC6800702 | biostudies-literature

Dissociable neural representations of reinforcement and belief prediction errors underlie strategic learning.
| S-EPMC3277161 | biostudies-literature

Predictive representations can link model-based reinforcement learning to model-free mechanisms.
| S-EPMC5628940 | biostudies-literature

Autonomous robotic additive manufacturing through distributed model-free deep reinforcement learning in computational design environments.
| S-EPMC9125977 | biostudies-literature