Dataset Information

FUBAR: a fast, unconstrained Bayesian approximation for inferring selection.


ABSTRACT: Model-based analyses of natural selection often categorize sites into a relatively small number of site classes. Forcing each site to belong to one of these classes places unrealistic constraints on the distribution of selection parameters, which can result in misleading inference due to model misspecification. We present an approximate hierarchical Bayesian method using a Markov chain Monte Carlo (MCMC) routine that ensures robustness against model misspecification by averaging over a large number of predefined site classes. This leaves the distribution of selection parameters essentially unconstrained, and also allows sites experiencing positive and purifying selection to be identified orders of magnitude faster than by existing methods. We demonstrate that popular random effects likelihood methods can produce misleading results when sites assigned to the same site class experience different levels of positive or purifying selection, an unavoidable scenario when using a small number of site classes. Our Fast Unconstrained Bayesian AppRoximation (FUBAR) is unaffected by this problem, while achieving higher power than existing unconstrained (fixed effects likelihood) methods. The speed advantage of FUBAR allows us to analyze larger data sets than other methods: we illustrate this on a large influenza hemagglutinin data set (3,142 sequences). FUBAR is available as a batch file within the latest HyPhy distribution (http://www.hyphy.org), as well as on the Datamonkey web server (http://www.datamonkey.org/).
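
The idea of averaging over many predefined site classes can be illustrated with a toy sketch. The code below is not the FUBAR implementation or the HyPhy batch file. It assumes that a likelihood for each site under each grid point has already been computed elsewhere; here those likelihoods are plain numbers supplied by the caller. It then runs a simple Gibbs sampler (one kind of MCMC) over the class weights with a symmetric Dirichlet prior. The function names, the grid values, and the `concentration` parameter are all hypothetical choices made for illustration.

```python
import random


def make_grid(values):
    """One predefined site class per (alpha, beta) pair on a square grid.

    alpha: synonymous rate, beta: nonsynonymous rate (illustrative values).
    """
    return [(a, b) for a in values for b in values]


def gibbs_class_posteriors(site_likelihoods, n_classes, n_iter=200,
                           burn_in=50, concentration=1.0, seed=0):
    """Toy Gibbs sampler over mixture weights of fixed site classes.

    site_likelihoods[i][k] is the (precomputed, assumed given) likelihood
    of site i under class k. Returns, for each site, the class
    probabilities averaged over the retained MCMC iterations.
    """
    rng = random.Random(seed)
    weights = [1.0 / n_classes] * n_classes
    totals = [[0.0] * n_classes for _ in site_likelihoods]
    kept = 0
    for it in range(n_iter):
        counts = [0] * n_classes
        per_site = []
        for lik in site_likelihoods:
            # Conditional class probabilities for this site given weights.
            unnorm = [w * l for w, l in zip(weights, lik)]
            norm = sum(unnorm)
            resp = [u / norm for u in unnorm]
            per_site.append(resp)
            # Sample a class assignment for the site.
            k = rng.choices(range(n_classes), weights=resp)[0]
            counts[k] += 1
        # Draw new weights from Dirichlet(concentration + counts),
        # using normalized gamma variates.
        draws = [rng.gammavariate(concentration + c, 1.0) for c in counts]
        scale = sum(draws)
        weights = [d / scale for d in draws]
        if it >= burn_in:
            kept += 1
            for row, resp in zip(totals, per_site):
                for k in range(n_classes):
                    row[k] += resp[k]
    return [[t / kept for t in row] for row in totals]


def positive_selection_probs(posteriors, grid):
    """Posterior mass on grid points with beta > alpha, per site."""
    return [sum(p for p, (a, b) in zip(row, grid) if b > a)
            for row in posteriors]
```

Because the per-site likelihoods are fixed inputs, each MCMC iteration touches only the class weights and never recomputes a phylogenetic likelihood, which is in the spirit of the speed argument made in the abstract. A site is then summarized by summing its posterior mass over all grid points where the nonsynonymous rate exceeds the synonymous rate, rather than by assigning it to a single class.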

SUBMITTER: Murrell B 

PROVIDER: S-EPMC3670733 | biostudies-literature | 2013 May

REPOSITORIES: biostudies-literature

Publications

FUBAR: a fast, unconstrained Bayesian approximation for inferring selection.

Murrell B, Moola S, Mabona A, Weighill T, Sheward D, Kosakovsky Pond SL, Scheffler K

Molecular Biology and Evolution, 2013-02-18, issue 5

Similar Datasets

| S-EPMC4534465 | biostudies-literature
| S-EPMC6788783 | biostudies-literature
| S-EPMC7612314 | biostudies-literature
| S-EPMC9674117 | biostudies-literature
| S-EPMC5796505 | biostudies-literature
| S-EPMC2774326 | biostudies-literature
2024-09-02 | GSE255888 | GEO
| S-EPMC9850743 | biostudies-literature
| S-EPMC4843957 | biostudies-literature
| S-EPMC5552105 | biostudies-other