Phylogenetic Permulations: A Statistically Rigorous Approach to Measure Confidence in Associations in a Phylogenetic Context.
ABSTRACT: Many evolutionary comparative methods seek to identify associations between phenotypic traits or between traits and genotypes, often with the goal of inferring potential functional relationships between them. Comparative genomics methods aimed at this goal measure the association between evolutionary changes at the genetic level and traits evolving convergently across phylogenetic lineages. However, these methods have complex statistical behaviors that are influenced by nontrivial and oftentimes unknown confounding factors. Consequently, using standard statistical analyses in interpreting the outputs of these methods leads to potentially inaccurate conclusions. Here, we introduce phylogenetic permulations, a novel statistical strategy that combines phylogenetic simulations and permutations to calculate accurate, unbiased P values from phylogenetic methods. Permulations construct the null expectation for P values from a given phylogenetic method by empirically generating null phenotypes. Subsequently, empirical P values that capture the true statistical confidence given the correlation structure in the data are directly calculated based on the empirical null expectation. We examine the performance of permulation methods by analyzing both binary and continuous phenotypes, including marine, subterranean, and long-lived large-bodied mammal phenotypes. Our results reveal that permulations improve the statistical power of phylogenetic analyses and correctly calibrate statements of confidence in rejecting complex null distributions while maintaining or improving the enrichment of known functions related to the phenotype. We also find that permulations refine pathway enrichment analyses by correcting for nonindependence in gene ranks. Our results demonstrate that permulations are a powerful tool for improving statistical confidence in the conclusions of phylogenetic analysis when the parametric null is unknown.
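The idea of generating null phenotypes and computing an empirical P value can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes a continuous phenotype, a hypothetical five-species toy tree, a Brownian-motion simulation along the branches, and a Pearson correlation as the test statistic. The names TREE, simulate_brownian, permulate, and permulation_p_value are invented for this illustration, and the +1 correction in the P value is a common convention rather than something stated in the abstract.

```python
import random

# Hypothetical toy tree: (name, branch_length, children); tips have no children.
TREE = ("root", 0.0, [
    ("AB", 1.0, [("A", 1.0, []), ("B", 1.0, [])]),
    ("CD", 1.0, [("C", 1.0, []), ("D", 1.0, [])]),
    ("E", 2.0, []),
])


def simulate_brownian(node, parent_value, rng, out):
    """Simulate a trait down the tree; each branch adds Gaussian noise
    with variance equal to its length. Tip values are stored in `out`."""
    name, length, children = node
    value = parent_value + rng.gauss(0.0, length ** 0.5)
    if not children:
        out[name] = value
    for child in children:
        simulate_brownian(child, value, rng, out)
    return out


def permulate(observed, tree, rng):
    """Return one null phenotype: the observed values, reassigned to tips
    so that their rank order matches a simulated phylogenetic trait."""
    simulated = simulate_brownian(tree, 0.0, rng, {})
    tips_by_simulated_rank = sorted(simulated, key=simulated.get)
    return dict(zip(tips_by_simulated_rank, sorted(observed.values())))


def pearson(x, y):
    """Plain Pearson correlation coefficient of two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def permulation_p_value(observed, gene_rates, tree, n_perm=1000, seed=0):
    """Two-sided empirical P value of the phenotype-gene correlation
    against a null built from `n_perm` permulated phenotypes."""
    rng = random.Random(seed)
    tips = sorted(observed)
    rates = [gene_rates[t] for t in tips]
    observed_stat = pearson([observed[t] for t in tips], rates)
    extreme = 0
    for _ in range(n_perm):
        null_pheno = permulate(observed, tree, rng)
        null_stat = pearson([null_pheno[t] for t in tips], rates)
        if abs(null_stat) >= abs(observed_stat):
            extreme += 1
    # +1 in numerator and denominator keeps the estimate above zero
    # (an assumed convention, not taken from the abstract).
    return (extreme + 1) / (n_perm + 1)
```

Because each null phenotype is a permutation of the observed values arranged to follow a tree-shaped simulation, closely related species tend to keep similar values in the null. That is the sense in which the empirical null reflects the correlation structure in the data. A call such as permulation_p_value(phenotype_by_species, rate_by_species, TREE) would take two dictionaries keyed by the same tip names.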
SUBMITTER: Saputra E
PROVIDER: S-EPMC8233500 | biostudies-literature
REPOSITORIES: biostudies-literature