Dataset Information

Context-dependent codon partition models provide significant increases in model fit in atpB and rbcL protein-coding genes.

ABSTRACT:

Background

Accurate modelling of substitution processes in protein-coding sequences is often hampered by the computational burden of full codon models. Recently, codon partition models have been proposed as a viable alternative that mimics the substitution behaviour of codon models at a low computational cost. Such codon partition models, however, impose independent evolution of the different codon positions, which is overly restrictive from a biological point of view. Because empirical research has indicated context-dependent substitution patterns at four-fold degenerate sites, we take those indications into account in this paper.

Results

We present so-called context-dependent codon partition models to assess previous empirical claims that the evolution of four-fold degenerate sites depends strongly on the composition of their two flanking bases. To this end, we have estimated and compared various existing independent models, codon models, codon partition models and context-dependent codon partition models for the atpB and rbcL genes of the chloroplast genome, which are frequently used in plant systematics. Such context-dependent codon partition models employ a full dependency scheme for four-fold degenerate sites, whilst maintaining the independence assumption for the first and second codon positions.
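As an illustration of the sites these models treat as context-dependent, the following is a minimal sketch that lists each four-fold degenerate third codon position together with its two flanking bases. It is not the authors' code. The function name `fourfold_contexts` is hypothetical, and the sketch assumes an in-frame coding sequence, the standard amino acid assignments, and that the flanking bases are the second position of the same codon and the first position of the next codon.

```python
# First two codon bases whose third position is four-fold degenerate
# under the standard genetic code (Leu, Val, Ser, Pro, Thr, Ala, Arg, Gly).
FOURFOLD_PREFIXES = {"CT", "GT", "TC", "CC", "AC", "GC", "CG", "GG"}
BASES = set("ACGT")


def fourfold_contexts(cds):
    """Return (site_index, left_base, site_base, right_base) tuples.

    Hypothetical helper: `cds` is assumed to be an in-frame coding
    sequence. The last codon is skipped because it has no right flank.
    Codons or flanks containing gaps or ambiguity codes are skipped.
    """
    cds = cds.upper()
    contexts = []
    # Stop early enough that the base after the codon (start + 3) exists.
    for start in range(0, len(cds) - 3, 3):
        window = cds[start:start + 4]  # codon plus first base of next codon
        if not set(window) <= BASES:
            continue
        if window[:2] in FOURFOLD_PREFIXES:
            site = start + 2
            contexts.append((site, cds[site - 1], cds[site], cds[site + 1]))
    return contexts


if __name__ == "__main__":
    for ctx in fourfold_contexts("GCTGGAATGCTT"):
        print(ctx)
```

In a context-dependent codon partition model, tuples like these would determine which substitution parameters apply at each third-position site, while first and second positions keep their independent models.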

Conclusions

We show that, in both the atpB and rbcL alignments of a collection of land plants, these context-dependent codon partition models significantly improve model fit over existing codon partition models. Using Bayes factors based on thermodynamic integration, we show that in both datasets the same context-dependent codon partition model yields the largest increase in model fit compared to an independent evolutionary model. Context-dependent codon partition models hence perform closer to codon models than existing codon partition models do. Codon models remain the best-performing models, but at a drastically increased computational cost, so context-dependent codon partition models remain computationally attractive alternatives. Finally, we observe that the substitution patterns differ drastically between the two datasets, leading to the conclusion that combined analysis of these two genes using a single model may not be advisable from a context-dependent point of view.
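To make the model comparison step concrete, here is a minimal sketch of how a log Bayes factor can be obtained by thermodynamic integration. The abstract does not state which variant of thermodynamic integration was used, so this sketch assumes the power-posterior path, in which the log marginal likelihood equals the integral over beta in [0, 1] of the expected log-likelihood under the posterior with the likelihood raised to the power beta. The function names, the trapezoid quadrature, and the numbers in the usage example are all illustrative assumptions, not results from the paper.

```python
def log_marginal_likelihood(betas, mean_log_liks):
    """Trapezoid estimate of the integral of E_beta[log L] over beta.

    `betas` are the inverse temperatures (expected to span 0 to 1) and
    `mean_log_liks` are the MCMC averages of the log-likelihood sampled
    at each beta. Both inputs are assumed to come from a separate sampler.
    """
    if len(betas) != len(mean_log_liks) or len(betas) < 2:
        raise ValueError("need matching sequences with at least two points")
    pairs = sorted(zip(betas, mean_log_liks))
    total = 0.0
    for (b0, u0), (b1, u1) in zip(pairs, pairs[1:]):
        total += (b1 - b0) * (u0 + u1) / 2.0
    return total


def log_bayes_factor(betas, mean_log_liks_a, mean_log_liks_b):
    """Log Bayes factor of model A over model B on a shared beta grid."""
    return (log_marginal_likelihood(betas, mean_log_liks_a)
            - log_marginal_likelihood(betas, mean_log_liks_b))


if __name__ == "__main__":
    # Made-up values for illustration only.
    grid = [0.0, 0.5, 1.0]
    print(log_bayes_factor(grid, [-10.0, -6.0, -4.0], [-12.0, -8.0, -6.0]))
```

A positive log Bayes factor favours model A. In practice many more beta values are used than in this toy grid, since the integrand typically changes fastest near beta = 0.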

SUBMITTER: Baele G

PROVIDER: S-EPMC3126739 | biostudies-literature | 2011 May

REPOSITORIES: biostudies-literature

Publications

Context-dependent codon partition models provide significant increases in model fit in atpB and rbcL protein-coding genes.

Guy Baele, Yves Van de Peer, Stijn Vansteelandt

BMC Evolutionary Biology, 2011 May 27

PMID: 21619569

Similar Datasets

Project description:

Background: Rutaceae subfamily Rutoideae (46 genera, c. 660 species) is diverse in both rainforests and sclerophyll vegetation of Australasia. Australia and New Caledonia are centres of endemism with a number of genera and species distributed disjunctly between the two regions. Our aim was to generate a high-level molecular phylogeny for the Australasian Rutoideae and identify major clades as a framework for assessing morphological and biogeographic patterns and taxonomy.

Methodology/principal findings: Phylogenetic analyses were based on chloroplast genes, rbcL and atpB, for 108 samples (78 new here), including 38 of 46 Australasian genera. Results were integrated with those from other molecular studies to produce a supertree for Rutaceae worldwide, including 115 of 154 genera. Australasian clades are poorly matched with existing tribal classifications, and genera Philotheca and Boronia are not monophyletic. Major sclerophyll lineages in Australia belong to two separate clades, each with an early divergence between rainforest and sclerophyll taxa. Dehiscent fruits with seeds ejected at maturity (often associated with myrmecochory) are inferred as ancestral; derived states include woody capsules with winged seeds, samaras, fleshy drupes, and retention and display of seeds in dehisced fruits (the last two states adaptations to bird dispersal, with multiple origins among rainforest genera). Patterns of relationship and levels of sequence divergence in some taxa, mostly species, with bird-dispersed (Acronychia, Sarcomelicope, Halfordia and Melicope) or winged (Flindersia) seeds are consistent with recent long-distance dispersal between Australia and New Caledonia. Other deeper Australian/New Caledonian divergences, some involving ant-dispersed taxa (e.g., Neoschmidia), suggest older vicariance.

Conclusions/significance: This comprehensive molecular phylogeny of the Australasian Rutoideae gives a broad overview of the group's evolutionary and biogeographic history. Deficiencies of infrafamilial classifications of Rutoideae have long been recognised, and our results provide a basis for taxonomic revision and a necessary framework for more focused studies of genera and species.
