
Dataset Information


A counting renaissance: combining stochastic mapping and empirical Bayes to quickly detect amino acid sites under positive selection.


ABSTRACT:

Motivation

Statistical methods for comparing relative rates of synonymous and non-synonymous substitutions maintain a central role in detecting positive selection. To identify selection, researchers often estimate the ratio of these relative rates (dN/dS) at individual alignment sites. Fitting a codon substitution model that captures heterogeneity in dN/dS across sites provides a reliable way to perform such estimation, but it remains computationally prohibitive for massive datasets. By using crude estimates of the numbers of synonymous and non-synonymous substitutions at each site, counting approaches scale well to large datasets, but they fail to account for ancestral state reconstruction uncertainty and to provide site-specific dN/dS estimates.

Results

We propose a hybrid solution that borrows the computational strength of counting methods, but augments these methods with empirical Bayes modeling to produce a relatively fast and reliable method capable of estimating site-specific dN/dS values in large datasets. Importantly, our hybrid approach, set in a Bayesian framework, integrates over the posterior distribution of phylogenies and ancestral reconstructions to quantify uncertainty about site-specific dN/dS estimates. Simulations demonstrate that this method competes well with more-principled statistical procedures and, in some cases, even outperforms them. We illustrate the utility of our method using human immunodeficiency virus, feline panleukopenia and canine parvovirus evolution examples.
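The empirical Bayes smoothing of per-site counts described above can be illustrated with a small sketch. The Python below is a simplified illustration under stated assumptions, not the authors' implementation. It assumes per-site substitution counts are already available (the abstract obtains them by stochastic mapping), treats them as Poisson draws with a shared gamma prior fitted by the method of moments, and divides the smoothed counts by neutral expectations. The function names, the method-of-moments fit, the toy counts, and the values of `neutral_nonsyn` and `neutral_syn` are all hypothetical. The sketch also omits the integration over phylogenies and ancestral reconstructions.

```python
from statistics import fmean, pvariance


def shrink_counts(counts):
    """Posterior mean counts under a gamma-Poisson model.

    The gamma prior (shape = rate * mean) is fitted to all sites by the
    method of moments, so sparse per-site counts borrow strength from the
    whole alignment.
    """
    data = [float(c) for c in counts]
    mean = fmean(data)
    var = pvariance(data)
    if mean == 0:
        raise ValueError("all counts are zero; no prior can be fitted")
    if var <= mean:
        # No overdispersion detected: every site shrinks to the overall mean.
        return [mean] * len(data)
    rate = mean / (var - mean)  # gamma rate parameter
    # Posterior mean is a weighted average of the overall mean and the raw count.
    return [(rate * mean + c) / (rate + 1.0) for c in data]


def site_dnds(nonsyn, syn, neutral_nonsyn, neutral_syn):
    """Ratio of smoothed counts, each scaled by its (assumed) neutral expectation."""
    post_n = shrink_counts(nonsyn)
    post_s = shrink_counts(syn)
    return [(n / neutral_nonsyn) / (s / neutral_syn) for n, s in zip(post_n, post_s)]


# Toy per-site counts (hypothetical, not taken from the paper).
nonsyn_counts = [0, 1, 0, 2, 8, 1, 0, 3]
syn_counts = [1, 0, 6, 2, 1, 0, 5, 1]

# Hypothetical neutral expectations, shared by all sites for simplicity.
dnds = site_dnds(nonsyn_counts, syn_counts, neutral_nonsyn=3.0, neutral_syn=1.0)
for site, value in enumerate(dnds, start=1):
    print(f"site {site}: dN/dS ~ {value:.3f}")
```

Because the prior is fitted once across all sites, a site with zero observed synonymous substitutions still receives a positive smoothed count, which keeps the ratio finite. A raw counting estimate would be undefined at such a site.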

SUBMITTER: Lemey P 

PROVIDER: S-EPMC3579240 | biostudies-literature | 2012 Dec

REPOSITORIES: biostudies-literature


Publications

A counting renaissance: combining stochastic mapping and empirical Bayes to quickly detect amino acid sites under positive selection.

Philippe Lemey, Vladimir N. Minin, Filip Bielejec, Sergei L. Kosakovsky Pond, Marc A. Suchard

Bioinformatics (Oxford, England), 2012-10-12, issue 24



Similar Datasets

| S-EPMC9299287 | biostudies-literature
2024-06-05 | GSE32030 | GEO
| S-EPMC2335278 | biostudies-literature
| S-EPMC2759080 | biostudies-literature
| S-EPMC3432337 | biostudies-other
| S-EPMC3873963 | biostudies-literature
| S-EPMC4870158 | biostudies-literature
| S-EPMC6366007 | biostudies-literature
| S-EPMC6040203 | biostudies-literature
| S-EPMC3464654 | biostudies-literature