Dataset Information

A learning theory for reward-modulated spike-timing-dependent plasticity with application to biofeedback.

ABSTRACT: Reward-modulated spike-timing-dependent plasticity (STDP) has recently emerged as a candidate for a learning rule that could explain how behaviorally relevant adaptive changes in complex networks of spiking neurons could be achieved in a self-organizing manner through local synaptic plasticity. However, the capabilities and limitations of this learning rule could so far only be tested through computer simulations. This article provides tools for an analytic treatment of reward-modulated STDP, which allows us to predict under which conditions reward-modulated STDP will achieve a desired learning effect. These analytical results imply that neurons can learn through reward-modulated STDP to classify not only spatial but also temporal firing patterns of presynaptic neurons. They also can learn to respond to specific presynaptic firing patterns with particular spike patterns. Finally, the resulting learning theory predicts that even difficult credit-assignment problems, where it is very hard to tell which synaptic weights should be modified in order to increase the global reward for the system, can be solved in a self-organizing manner through reward-modulated STDP. This yields an explanation for a fundamental experimental result on biofeedback in monkeys by Fetz and Baker. In this experiment, monkeys were rewarded for increasing the firing rate of a particular neuron in the cortex and were able to solve this extremely difficult credit-assignment problem. Our model for this experiment relies on a combination of reward-modulated STDP with variable spontaneous firing activity. Hence it also provides a possible functional explanation for trial-to-trial variability, which is characteristic of cortical networks of neurons but has no analogue in currently existing artificial computing systems. In addition, our model demonstrates that reward-modulated STDP can be applied to all synapses in a large recurrent neural network without endangering the stability of the network dynamics.

SUBMITTER: Legenstein R

PROVIDER: S-EPMC2543108 | biostudies-literature | 2008 Oct

REPOSITORIES: biostudies-literature


Publications

A learning theory for reward-modulated spike-timing-dependent plasticity with application to biofeedback.

Legenstein R, Pecevski D, Maass W

PLoS Computational Biology, 2008 Oct 10, issue 10


PMID: 18846203

Similar Datasets

Project description: A plethora of experimental studies have shown that long-term synaptic plasticity can be expressed pre- or postsynaptically depending on a range of factors such as developmental stage, synapse type, and activity patterns. The functional consequences of this diversity are not clear, although it is understood that whereas postsynaptic expression of plasticity predominantly affects synaptic response amplitude, presynaptic expression alters both synaptic response amplitude and short-term dynamics. In most models of neuronal learning, long-term synaptic plasticity is implemented as changes in connective weights. The consideration of long-term plasticity as a fixed change in amplitude corresponds more closely to post- than to presynaptic expression, which means theoretical outcomes based on this choice of implementation may have a postsynaptic bias. To explore the functional implications of the diversity of expression of long-term synaptic plasticity, we adapted a model of long-term plasticity, more specifically spike-timing-dependent plasticity (STDP), such that it was expressed either independently pre- or postsynaptically, or in a mixture of both ways. We compared pair-based standard STDP models and a biologically tuned triplet STDP model, and investigated the outcomes in a minimal setting, using two different learning schemes: in the first, inputs were triggered at different latencies, and in the second a subset of inputs were temporally correlated. We found that presynaptic changes adjusted the speed of learning, while postsynaptic expression was more efficient at regulating spike timing and frequency. When combining both expression loci, postsynaptic changes amplified the response range, while presynaptic plasticity allowed control over postsynaptic firing rates, potentially providing a form of activity homeostasis. Our findings highlight how the seemingly innocuous choice of implementing synaptic plasticity by single weight modification may unwittingly introduce a postsynaptic bias in modelling outcomes. We conclude that pre- and postsynaptically expressed plasticity are not interchangeable, but enable complementary functions.

Project description: Rhythmic activity has been associated with a wide range of cognitive processes, including the encoding of sensory information, navigation, and the transfer of information. Rhythmic activity in the brain has also been suggested to be used for multiplexing information. Multiplexing is the ability to transmit more than one signal via the same channel. Here we focus on frequency division multiplexing, in which different signals are transmitted in different frequency bands. Recent work showed that spike-timing-dependent plasticity (STDP) can facilitate the transfer of rhythmic activity downstream along the information processing pathway. However, STDP has also been known to generate strong winner-take-all-like competition between subgroups of correlated synaptic inputs. This competition between different rhythmicity channels, induced by STDP, may prevent the multiplexing of information, thus raising doubts about whether STDP is consistent with the idea of multiplexing. This study explores whether STDP can facilitate the multiplexing of information across multiple frequency channels, and if so, under what conditions. We address this question in a modelling study, investigating the STDP dynamics of two populations synapsing downstream onto the same neuron in a feed-forward manner. Each population was assumed to exhibit rhythmic activity, albeit in a different frequency band. Our theory reveals that the winner-take-all-like competition between the two populations is limited, in the sense that different rhythmic populations will not necessarily fully suppress each other. Furthermore, we found that for a wide range of parameters, the network converged to a solution in which the downstream neuron responded to both rhythms. Yet the synaptic weights themselves did not converge to a fixed point, but rather remained dynamic. These findings imply that STDP can support the multiplexing of rhythmic information, and demonstrate how functionality (multiplexing of information) can be retained in the face of continuous remodeling of all the synaptic weights. The constraints on the types of STDP rules that can support multiplexing provide a natural test for our theory.
