Dataset Information

Optimizing agent behavior over long time scales by transporting value.

ABSTRACT: Humans prolifically engage in mental time travel. We dwell on past actions and experience satisfaction or regret. More than storytelling, these recollections change how we act in the future and endow us with a computationally important ability to link actions and consequences across spans of time, which helps address the problem of long-term credit assignment: the question of how to evaluate the utility of actions within a long-duration behavioral sequence. Existing approaches to credit assignment in AI cannot solve tasks with long delays between actions and consequences. Here, we introduce a paradigm where agents use recall of specific memories to credit past actions, allowing them to solve problems that are intractable for existing algorithms. This paradigm broadens the scope of problems that can be investigated in AI and offers a mechanistic account of behaviors that may inspire models in neuroscience, psychology, and behavioral economics.

SUBMITTER: Hung CC

PROVIDER: S-EPMC6864102 | biostudies-literature | 2019 Nov

REPOSITORIES: biostudies-literature

Publications

Optimizing agent behavior over long time scales by transporting value.

Hung Chia-Chun, Lillicrap Timothy, Abramson Josh, Wu Yan, Mirza Mehdi, Carnevale Federico, Ahuja Arun, Wayne Greg

Nature Communications, 2019 Nov 19, issue 1

PMID: 31745075

Similar Datasets

Collective cell migration over long time scales reveals distinct phenotypes.

Project description:Introduction:Migratory phenotypes of metastasizing tumor cells include single and collective cell migration. While migration of tumor cells is generally less cooperative than that of normal epithelial cells, our understanding of precisely how they differ in long time behavior is incomplete. Objectives:We measure in a model system how cancer progression affects collective migration on long time scales, and determine how perturbation of cell-cell adhesions, specifically reduced E-cadherin expression, affects the collective migration phenotype. Methods:Time lapse imaging of cellular sheets and particle image velocimetry (PIV) are used to quantitatively study the dynamics of cell motion over ten hours. Long time dynamics are measured via finite time Lyapunov exponents (FTLE) and changes in FTLE with time. Results:We find that non-malignant MCF10A cells are distinguished from malignant MCF10CA1a cells by both their short time (minutes) and long time (hours) dynamics. In addition, short time dynamics distinguish non-malignant E-cadherin knockdown cells from the control, but long time dynamics and increasing spatial correlations remain unchanged. Discussion:Epithelial sheet collective behavior includes long time dynamics that cannot be captured by metrics that assess cooperativity based on short time dynamics, such as instantaneous speed or directionality. The use of metrics incorporating migration data over hours instead of minutes allows us to more precisely describe how E-cadherin, a clinically relevant adhesion molecule, affects collective migration. We predict that the long time scale metrics described here will be more robust and predictive of malignant behavior than analysis of instantaneous velocity fields alone.

| S-EPMC5895103 | biostudies-literature

Evolutionary dynamics of Clostridium difficile over short and long time scales.

Project description:Clostridium difficile has rapidly emerged as the leading cause of antibiotic-associated diarrheal disease, with the transcontinental spread of various PCR ribotypes, including 001, 017, 027 and 078. However, the genetic basis for the emergence of C. difficile as a human pathogen is unclear. Whole genome sequencing was used to analyze genetic variation and virulence of a diverse collection of thirty C. difficile isolates, to determine both macro and microevolution of the species. Horizontal gene transfer and large-scale recombination of core genes have shaped the C. difficile genome over both short and long time scales. Phylogenetic analysis demonstrates C. difficile is a genetically diverse species, which has evolved within the last 1.1-85 million years. By contrast, the disease-causing isolates have arisen from multiple lineages, suggesting that virulence evolved independently in the highly epidemic lineages.

| S-EPMC2867753 | biostudies-literature

Dictyostelium myosin II mechanochemistry promotes active behavior of the cortex on long time scales.

Project description:Cell cortices rearrange dynamically to complete cytokinesis, crawl in response to chemoattractant, build tissues, and make neuronal connections. Highly enriched in the cell cortex, actin, myosin II, and actin crosslinkers facilitate cortical movements. Because cortical behavior is the consequence of nanoscale biochemical events, it is essential to probe the cortex at this level. Here, we use high-resolution laser-based particle tracking to examine how myosin II mechanochemistry and dynacortin-mediated actin crosslinking control cortex dynamics in Dictyostelium. Consistent with its low duty ratio, myosin II does not directly drive active bead motility. Instead, myosin II and dynacortin antagonistically regulate other active processes in the living cortex.

| S-EPMC1413706 | biostudies-literature

Adaptive Prediction Emerges Over Short Evolutionary Time Scales.

Project description:Adaptive prediction is a capability of diverse organisms, including microbes, to sense a cue and prepare in advance to deal with a future environmental challenge. Here, we investigated the timeframe over which adaptive prediction emerges when an organism encounters an environment with novel structure. We subjected yeast to laboratory evolution in a novel environment with repetitive, coupled exposures to a neutral chemical cue (caffeine), followed by a sublethal dose of a toxin (5-FOA), with an interspersed requirement for uracil prototrophy to counter-select mutants that gained constitutive 5-FOA resistance. We demonstrate the remarkable ability of yeast to internalize a novel environmental pattern within 50-150 generations by adaptively predicting 5-FOA stress upon sensing caffeine. We also demonstrate how novel environmental structure can be internalized by coupling two unrelated response networks, such as the response to caffeine and signaling-mediated conditional peroxisomal localization of proteins.

| S-EPMC5570091 | biostudies-literature

Optimizing treatment regimes to hinder antiviral resistance in influenza across time scales.

Project description:The large-scale use of antivirals during influenza pandemics poses a significant selection pressure for drug-resistant pathogens to emerge and spread in a population. This requires treatment strategies to minimize total infections as well as the emergence of resistance. Here we propose a mathematical model in which individuals infected with wild-type influenza, if treated, can develop de novo resistance and further spread the resistant pathogen. Our main purpose is to explore the impact of two important factors influencing treatment effectiveness: i) the relative transmissibility of the drug-resistant strain to wild-type, and ii) the frequency of de novo resistance. For the endemic scenario, we find a condition between these two parameters that indicates whether treatment regimes will be most beneficial at intermediate or more extreme values (e.g., the fraction of infected that are treated). Moreover, we present analytical expressions for effective treatment regimes and provide evidence of their applicability across a range of modeling scenarios: endemic behavior with deterministic homogeneous mixing, and single-epidemic behavior with deterministic homogeneous mixing and stochastic heterogeneous mixing. Therefore, our results provide insights for the control of drug-resistance in influenza across time scales.

| S-EPMC3612110 | biostudies-literature

Archaea dominate oxic subseafloor communities over multimillion-year time scales.

Project description:Ammonia-oxidizing archaea (AOA) dominate microbial communities throughout oxic subseafloor sediment deposited over millions of years in the North Atlantic Ocean. Rates of nitrification correlated with the abundance of these dominant AOA populations, whose metabolism is characterized by ammonia oxidation, mixotrophic utilization of organic nitrogen, deamination, and the energetically efficient chemolithoautotrophic hydroxypropionate/hydroxybutyrate carbon fixation cycle. These AOA thus have the potential to couple mixotrophic and chemolithoautotrophic metabolism via mixotrophic deamination of organic nitrogen, followed by oxidation of the regenerated ammonia for additional energy to fuel carbon fixation. This metabolic feature likely reduces energy loss and improves AOA fitness under energy-starved, oxic conditions, thereby allowing them to outcompete other taxa for millions of years.

| S-EPMC6584578 | biostudies-literature

Social complexity in bees is not sufficient to explain lack of reversions to solitary living over long time scales.

Project description:Background:The major lineages of eusocial insects, the ants, termites, stingless bees, honeybees and vespid wasps, all have ancient origins (≥ 65 mya) with no reversions to solitary behaviour. This has prompted the notion of a 'point of no return' whereby the evolutionary elaboration and integration of behavioural, genetic and morphological traits over a very long period of time leads to a situation where reversion to solitary living is no longer an evolutionary option. Results:We show that in another group of social insects, the allodapine bees, there was a single origin of sociality > 40 mya. We also provide data on the biology of a key allodapine species, Halterapis nigrinervis, showing that it is truly social. H. nigrinervis was thought to be the only allodapine that was not social, and our findings therefore indicate that there have been no losses of sociality among extant allodapine clades. Allodapine colony sizes rarely exceed 10 females per nest and all females in virtually all species are capable of nesting and reproducing independently, so these bees clearly do not fit the 'point of no return' concept. Conclusion:We argue that allodapine sociality has been maintained by ecological constraints and the benefits of alloparental care, as opposed to behavioural, genetic or morphological constraints to independent living. Allodapine brood are highly vulnerable to predation because they are progressively reared in an open nest (not in sealed brood cells), which provides potentially large benefits for alloparental care and incentives for reproductives to tolerate potential alloparents. We argue that similar vulnerabilities may also help explain the lack of reversions to solitary living in other taxa with ancient social origins.

| S-EPMC2231370 | biostudies-literature

The Genomic Landscapes of Desert Birds Form over Multiple Time Scales.

Project description:Spatial models show that genetic differentiation between populations can be explained by factors ranging from geographic distance to environmental resistance across the landscape. However, genomes exhibit a landscape of differentiation, indicating that multiple processes may mediate divergence in different portions of the genome. We tested this idea by comparing alternative geographic predictors of differentiation in ten bird species that co-occur in Sonoran and Chihuahuan Deserts of North America. Using population-level genomic data, we described the genomic landscapes across species and modeled conditions that represented historical and contemporary mechanisms. The characteristics of genomic landscapes differed across species, influenced by varying levels of population structuring and admixture between deserts, and the best-fit models contrasted between the whole genome and partitions along the genome. Both historical and contemporary mechanisms were important in explaining genetic distance, but particularly past and current environments, suggesting that genomic evolution was modulated by climate and habitat. There were also different best-fit models across genomic partitions of the data, indicating that these regions capture different evolutionary histories. These results show that the genomic landscape of differentiation can be associated with alternative geographic factors operating on different portions of the genome, which reflect how heterogeneous patterns of genetic differentiation can evolve across species and genomes.

| S-EPMC9577548 | biostudies-literature

Long time-scales in primate amygdala neurons support aversive learning.

Project description:Associative learning forms when there is a temporal relationship between a stimulus and a reinforcer, yet the inter-trial-interval (ITI), which is usually much longer than the stimulus-reinforcer-interval, contributes to learning-rate and memory strength. The neural mechanisms that enable maintenance of time between trials remain unknown, and it is unclear if the amygdala can support time scales on the order of dozens of seconds. We show that the ITI indeed modulates rate and strength of aversive-learning, and that single-units in the primate amygdala and dorsal-anterior-cingulate-cortex signal confined periods within the ITI, strengthen this coding during acquisition of aversive-associations, and diminish during extinction. Additionally, pairs of amygdala-cingulate neurons synchronize during specific periods suggesting a shared circuit that maintains the long temporal gap. The results extend the known roles of this circuit and suggest a mechanism that maintains trial-structure and temporal-contingencies for learning.

| S-EPMC6203797 | biostudies-other

Evidence for surprise minimization over value maximization in choice behavior.

Project description:Classical economic models are predicated on the idea that the ultimate aim of choice is to maximize utility or reward. In contrast, an alternative perspective highlights the fact that adaptive behavior requires agents to model their environment and minimize surprise about the states they frequent. We propose that choice behavior can be more accurately accounted for by surprise minimization compared to reward or utility maximization alone. Minimizing surprise makes a prediction at variance with expected utility models; namely, that in addition to attaining valuable states, agents attempt to maximize the entropy over outcomes and thus 'keep their options open'. We tested this prediction using a simple binary choice paradigm and show that human decision-making is better explained by surprise minimization compared to utility maximization. Furthermore, we replicated this entropy-seeking behavior in a control task with no explicit utilities. These findings highlight a limitation of purely economic motivations in explaining choice behavior and instead emphasize the importance of belief-based motivations.

| S-EPMC4643240 | biostudies-other
