Dataset Information

Learning to maximize reward rate: a model based on semi-Markov decision processes.

ABSTRACT: When animals have to make a number of decisions during a limited time interval, they face a fundamental problem: how much time they should spend on each decision in order to achieve the maximum possible total outcome. Deliberating more on one decision usually yields a larger outcome for that decision but leaves less time for the others. In the framework of sequential sampling models, the question is how animals learn to set their decision threshold such that the total expected outcome achieved during a limited time is maximized. The aim of this paper is to provide a theoretical framework for answering this question. To this end, we consider an experimental design in which each trial can come from one of several possible "conditions." A condition specifies the difficulty of the trial, the reward, the penalty and so on. We show that to maximize the expected reward during a limited time, the subject should set a separate value of the decision threshold for each condition. We propose a model of learning the optimal values of the decision thresholds based on the theory of semi-Markov decision processes (SMDP). In our model, the experimental environment is modeled as an SMDP with each "condition" being a "state" and the values of the decision thresholds being the "actions" taken in those states. The problem of finding the optimal decision thresholds is then cast as the stochastic optimal control problem of taking actions in each state of the corresponding SMDP such that the average reward rate is maximized. Our model utilizes a biologically plausible learning algorithm to solve this problem. The simulation results show that at the beginning of learning the model chooses high values of the decision threshold, which lead to sub-optimal performance. With experience, however, the model learns to lower the decision thresholds until it finally finds the optimal values.
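The claim that each condition should get its own threshold can be illustrated with a small numerical sketch. This is not the learning algorithm from the paper: it assumes a drift-diffusion model with unit noise and symmetric bounds (so accuracy and mean decision time have standard closed forms), and it replaces learning with a brute-force grid search. The two conditions, their drift rates, the inter-trial interval and all function names are hypothetical values chosen for illustration.

```python
import itertools
import math


def accuracy(drift, threshold):
    # Probability of hitting the correct bound for a diffusion process
    # starting at 0 with bounds at +/- threshold and unit noise (assumed).
    return 1.0 / (1.0 + math.exp(-2.0 * drift * threshold))


def mean_decision_time(drift, threshold):
    # Mean first-passage time for the same process (drift must be nonzero).
    return (threshold / drift) * math.tanh(drift * threshold)


def reward_rate(conditions, thresholds, inter_trial_interval=1.0):
    # Average reward per unit time: expected reward per trial divided by
    # expected trial duration, both averaged over the condition mix.
    expected_reward = 0.0
    expected_time = 0.0
    for cond, a in zip(conditions, thresholds):
        p_correct = accuracy(cond["drift"], a)
        payoff = p_correct * cond["reward"] - (1.0 - p_correct) * cond["penalty"]
        expected_reward += cond["prob"] * payoff
        expected_time += cond["prob"] * (
            mean_decision_time(cond["drift"], a) + inter_trial_interval
        )
    return expected_reward / expected_time


def best_thresholds(conditions, grid, shared=False):
    # Exhaustive search over the grid; with shared=True every condition
    # is forced to use the same threshold.
    if shared:
        candidates = [(a,) * len(conditions) for a in grid]
    else:
        candidates = itertools.product(grid, repeat=len(conditions))
    return max(candidates, key=lambda ths: reward_rate(conditions, ths))


# Hypothetical task: an easy and a hard condition, equally likely.
CONDITIONS = [
    {"prob": 0.5, "drift": 2.0, "reward": 1.0, "penalty": 0.0},
    {"prob": 0.5, "drift": 0.5, "reward": 1.0, "penalty": 0.0},
]
GRID = [0.05 * k for k in range(1, 61)]

separate = best_thresholds(CONDITIONS, GRID)
shared = best_thresholds(CONDITIONS, GRID, shared=True)

print("separate thresholds:", separate, reward_rate(CONDITIONS, separate))
print("shared threshold:   ", shared, reward_rate(CONDITIONS, shared))
```

Because the shared-threshold candidates are a subset of the per-condition candidates, the per-condition optimum can never have a lower reward rate on this grid; how much higher it is depends on the assumed task parameters.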

SUBMITTER: Khodadadi A

PROVIDER: S-EPMC4033239 | biostudies-literature | 2014

REPOSITORIES: biostudies-literature

Publications

Learning to maximize reward rate: a model based on semi-Markov decision processes.

Arash Khodadadi, Pegah Fakhari, Jerome R. Busemeyer

Frontiers in Neuroscience, 2014-05-23

PMID: 24904252

Similar Datasets

Batch policy learning in average reward Markov decision processes.
| S-EPMC10072865 | biostudies-literature

Multi-Objective Markov Decision Processes for Data-Driven Decision Support.
| S-EPMC5179144 | biostudies-literature

The entropy rate of Linear Additive Markov Processes.
| S-EPMC10997120 | biostudies-literature

Effects of Depressive Symptoms, Feelings, and Interoception on Reward-Based Decision-Making: Investigation Using Reinforcement Learning Model.
| S-EPMC7464008 | biostudies-literature

biomvRhsmm: genomic segmentation with hidden semi-Markov model.
| S-EPMC4065698 | biostudies-literature

Composition of web services using Markov decision processes and dynamic programming.
| S-EPMC4385667 | biostudies-other

Using Bayesian Nonparametric Hidden Semi-Markov Models to Disentangle Affect Processes during Marital Interaction.
| S-EPMC4871360 | biostudies-literature

Dissecting nucleosome free regions by a segmental semi-Markov model.
| S-EPMC2648986 | biostudies-literature

Hierarchical Markov State Model Building to Describe Molecular Processes.
| S-EPMC7322723 | biostudies-literature

Acquisition of decision making criteria: reward rate ultimately beats accuracy.
| S-EPMC3383845 | biostudies-literature