Dataset Information

TreeMerge: a new method for improving the scalability of species tree estimation methods.


ABSTRACT:

MOTIVATION: At RECOMB-CG 2018, we presented NJMerge and showed that it could be used within a divide-and-conquer framework to scale computationally intensive methods for species tree estimation to larger datasets. However, NJMerge has two significant limitations: it can fail to return a tree and, when used within the proposed divide-and-conquer framework, has O(n^5) running time for datasets with n species.

RESULTS: Here we present a new method called 'TreeMerge' that improves on NJMerge in two ways: it is guaranteed to return a tree, and it has dramatically faster running time within the same divide-and-conquer framework: only O(n^2) time. We use a simulation study to evaluate TreeMerge in the context of multi-locus species tree estimation with two leading methods, ASTRAL-III and RAxML. We find that the divide-and-conquer framework using TreeMerge has a minor impact on species tree accuracy, dramatically reduces running time, and enables both ASTRAL-III and RAxML to complete on datasets that they would otherwise fail on when given 64 GB of memory and a 48 h maximum running time. Thus, TreeMerge is a step toward a larger vision of enabling researchers with limited computational resources to perform large-scale species tree estimation, which we call Phylogenomics for All.

AVAILABILITY AND IMPLEMENTATION: TreeMerge is publicly available on GitHub (http://github.com/ekmolloy/treemerge).

SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
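To give a rough sense of the gap between the two running-time bounds quoted in the abstract (fifth power in n for NJMerge within the framework, quadratic in n for TreeMerge), the short Python sketch below computes the ratio of the two growth terms for a few species counts. The species counts and the function name `growth_ratio` are illustrative assumptions, not values from the paper. Big-O bounds hide constant factors, so these ratios describe how the bounds scale and are not predicted speedups.

```python
def growth_ratio(n: int) -> int:
    """Ratio of the growth terms n^5 and n^2, which simplifies to n^3.

    This ignores the constant factors hidden by big-O notation, so it
    only illustrates how quickly the two bounds diverge as n grows.
    """
    return n**5 // n**2


if __name__ == "__main__":
    # Hypothetical species counts chosen purely for illustration.
    for n in (10, 100, 1000):
        print(f"n = {n}: n^5 / n^2 = {growth_ratio(n)}")
```

The ratio grows as the cube of n, so each tenfold increase in the number of species widens the gap between the bounds by a factor of one thousand.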

SUBMITTER: Molloy EK 

PROVIDER: S-EPMC6612878 | biostudies-other | 2019 Jul

REPOSITORIES: biostudies-other

Publications

TreeMerge: a new method for improving the scalability of species tree estimation methods.

Molloy EK, Warnow T

Bioinformatics (Oxford, England), 2019-07-01, issue 14


Similar Datasets

| S-EPMC5841455 | biostudies-literature
| S-EPMC6700492 | biostudies-literature
| S-EPMC5998899 | biostudies-literature
| S-EPMC7161100 | biostudies-literature
| S-EPMC4147915 | biostudies-literature
| S-EPMC9787613 | biostudies-literature
| S-EPMC6264843 | biostudies-literature
| S-EPMC3796115 | biostudies-literature
| S-EPMC5993227 | biostudies-literature
| S-EPMC4604832 | biostudies-literature