
Dataset Information


Effective Online Bayesian Phylogenetics via Sequential Monte Carlo with Guided Proposals.


ABSTRACT: Modern infectious disease outbreak surveillance produces continuous streams of sequence data which require phylogenetic analysis as data arrives. Current software packages for Bayesian phylogenetic inference are unable to quickly incorporate new sequences as they become available, making them less useful for dynamically unfolding evolutionary stories. This limitation can be addressed by applying a class of Bayesian statistical inference algorithms called sequential Monte Carlo (SMC) to conduct online inference, wherein new data can be continuously incorporated to update the estimate of the posterior probability distribution. In this article, we describe and evaluate several different online phylogenetic sequential Monte Carlo (OPSMC) algorithms. We show that proposing new phylogenies with a density similar to the Bayesian prior suffers from poor performance, and we develop "guided" proposals that better match the proposal density to the posterior. Furthermore, we show that the simplest guided proposals can exhibit pathological behavior in some situations, leading to poor results, and that the situation can be resolved by heating the proposal density. The results demonstrate that relative to the widely used MCMC-based algorithm implemented in MrBayes, the total time required to compute a series of phylogenetic posteriors as sequences arrive can be significantly reduced by the use of OPSMC, without incurring a significant loss in accuracy.
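The contrast the abstract draws between prior-like, guided, and heated proposals can be illustrated with a toy importance-weighting step. The following is a minimal sketch only, not the OPSMC implementation: it assumes each particle has a small discrete set of candidate placements for a new sequence, that the prior over candidates is uniform, and that "heating" means raising the likelihood-based proposal to the power 1/T. All function names and the log-likelihood values are hypothetical.

```python
import math
import random


def heated_proposal(log_liks, temperature):
    """Probabilities q_k proportional to exp(log_lik_k / temperature).

    temperature = 1 gives a proposal that follows the likelihood exactly;
    a very large temperature approaches a uniform (prior-like) proposal.
    """
    scaled = [ll / temperature for ll in log_liks]
    m = max(scaled)  # subtract the max for numerical stability
    unnorm = [math.exp(s - m) for s in scaled]
    total = sum(unnorm)
    return [u / total for u in unnorm]


def smc_step(particle_log_liks, temperature, rng):
    """One toy update: each particle picks a candidate and gets a weight.

    Weight = prior * likelihood / proposal, with a uniform prior over the
    candidates (an assumption made for this sketch).
    Returns (chosen indices, normalized weights).
    """
    choices, log_w = [], []
    for log_liks in particle_log_liks:
        q = heated_proposal(log_liks, temperature)
        k = rng.choices(range(len(q)), weights=q)[0]
        log_prior = -math.log(len(q))
        log_w.append(log_prior + log_liks[k] - math.log(q[k]))
        choices.append(k)
    m = max(log_w)
    w = [math.exp(lw - m) for lw in log_w]
    total = sum(w)
    return choices, [x / total for x in w]


def effective_sample_size(weights):
    """ESS of normalized weights: 1 / sum(w_i^2)."""
    return 1.0 / sum(w * w for w in weights)


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical log-likelihoods: 50 particles, 4 candidate placements each.
    particles = [[rng.uniform(-20.0, 0.0) for _ in range(4)] for _ in range(50)]
    for temperature in (1.0, 4.0, 1e6):
        _, weights = smc_step(particles, temperature, rng)
        print(temperature, effective_sample_size(weights))
```

In this toy setting the effective sample size gives a rough indication of how well a proposal matches the target; it says nothing about the tree-space pathologies the abstract mentions, which depend on details not given in this record.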

SUBMITTER: Fourment M 

PROVIDER: S-EPMC5920299 | biostudies-literature | 2018 May

REPOSITORIES: biostudies-literature


Publications

Effective Online Bayesian Phylogenetics via Sequential Monte Carlo with Guided Proposals.

Mathieu Fourment, Brian C. Claywell, Vu Dinh, Connor McCoy, Frederick A. Matsen IV, Aaron E. Darling

Systematic Biology, 2018-05-01, issue 3



Similar Datasets

| S-EPMC3223366 | biostudies-literature
| S-EPMC7026014 | biostudies-literature
| S-EPMC7069631 | biostudies-literature
| S-EPMC6894579 | biostudies-literature
| S-EPMC5815633 | biostudies-literature
| S-EPMC5636270 | biostudies-literature
| S-EPMC5721689 | biostudies-other
| S-EPMC8668238 | biostudies-literature
| S-EPMC7463299 | biostudies-literature
| S-EPMC3998890 | biostudies-literature