Dataset Information

Differences in Performance among Test Statistics for Assessing Phylogenomic Model Adequacy.


ABSTRACT: Statistical phylogenetic analyses of genomic data depend on models of nucleotide or amino acid substitution. The adequacy of these substitution models can be assessed using a number of test statistics, allowing the model to be rejected when it is found to provide a poor description of the evolutionary process. A potentially valuable use of model-adequacy test statistics is to identify when data sets are likely to produce unreliable phylogenetic estimates, but their differences in performance are rarely explored. We performed a comprehensive simulation study to identify test statistics that are sensitive to some of the most commonly cited sources of phylogenetic estimation error. Our results show that, for many test statistics, traditional thresholds for assessing model adequacy can fail to reject the model when the phylogenetic inferences are inaccurate and imprecise. This is particularly problematic when analysing loci that have few informative sites. We propose new thresholds for assessing substitution model adequacy and demonstrate their effectiveness in analyses of three phylogenomic data sets. These thresholds lead to frequent rejection of the model for loci that yield topological inferences that are imprecise and are likely to be inaccurate. We also propose the use of a summary statistic that provides a practical assessment of overall model adequacy. Our approach offers a promising means of enhancing model choice in genome-scale data sets, potentially leading to improvements in the reliability of phylogenomic inference.

SUBMITTER: Duchene DA 

PROVIDER: S-EPMC6007652 | biostudies-literature | 2018 Jun

REPOSITORIES: biostudies-literature

Publications

Differences in Performance among Test Statistics for Assessing Phylogenomic Model Adequacy.

Duchêne DA, Duchêne S, Ho SYW

Genome Biology and Evolution, 2018 Jun 1, issue 6


Similar Datasets

| S-EPMC6859642 | biostudies-literature
| S-EPMC9312427 | biostudies-literature
| S-EPMC7293575 | biostudies-literature
| S-EPMC1184045 | biostudies-literature
| S-EPMC4022267 | biostudies-literature
| S-EPMC5089055 | biostudies-literature
| S-EPMC4060759 | biostudies-literature
| S-EPMC8633771 | biostudies-literature
| S-EPMC6212419 | biostudies-literature
| S-EPMC5901117 | biostudies-literature