Dataset Information

PUmPER: phylogenies updated perpetually.


ABSTRACT:

Summary

New sequence data useful for phylogenetic and evolutionary analyses continues to be added to public databases. The construction of multiple sequence alignments and inference of huge phylogenies comprising large taxonomic groups are expensive tasks, both in terms of man hours and computational resources. Therefore, maintaining comprehensive phylogenies, based on representative and up-to-date molecular sequences, is challenging. PUmPER is a framework that can perpetually construct multi-gene alignments (with PHLAWD) and phylogenetic trees (with ExaML or RAxML-Light) for a given NCBI taxonomic group. When sufficient numbers of new gene sequences for the selected taxonomic group have accumulated in GenBank, PUmPER automatically extends the alignment and infers extended phylogenetic trees by using previously inferred smaller trees as starting topologies. Using our framework, large phylogenetic trees can be perpetually updated without human intervention. Importantly, resulting phylogenies are not statistically significantly worse than trees inferred from scratch.
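The update trigger described above can be illustrated with a minimal sketch. It is not PUmPER's actual code: the names `UpdateState` and `plan_iteration`, the `min_new` threshold, and its default value are all hypothetical, and the alignment and tree-search steps (PHLAWD, ExaML, RAxML-Light) are deliberately left out. The sketch only shows the decision logic: wait until enough new sequences have accumulated, then extend the alignment and reuse the previous tree as the starting topology.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class UpdateState:
    """Hypothetical record of what the previous iteration produced."""
    aligned_ids: set = field(default_factory=set)  # sequences already aligned
    last_tree: Optional[str] = None  # e.g. a Newick string; None before the first run


def plan_iteration(state, available_ids, min_new=100):
    """Decide whether a new update iteration should start.

    Returns None while fewer than `min_new` unseen sequences are available.
    Otherwise returns the sequences to add and the tree to start from.
    `min_new` is an illustrative threshold, not a PUmPER setting.
    """
    fresh = set(available_ids) - state.aligned_ids
    if len(fresh) < min_new:
        return None
    return {"add": sorted(fresh), "starting_tree": state.last_tree}


if __name__ == "__main__":
    state = UpdateState(aligned_ids={"A", "B"}, last_tree="(A,B);")
    print(plan_iteration(state, ["A", "B", "C"], min_new=2))
    print(plan_iteration(state, ["A", "B", "C", "D"], min_new=2))
```

Starting from the previous topology rather than from scratch is what keeps each iteration cheap; the abstract reports that the resulting trees are not statistically significantly worse.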

Availability and implementation

PUmPER can run in stand-alone mode on a single server, or offload the computationally expensive phylogenetic searches to a parallel computing cluster. Source code, documentation, and tutorials are available at https://github.com/fizquierdo/perpetually-updated-trees.

Contact

Fernando.Izquierdo@h-its.org

Supplementary information

Supplementary Material is available at Bioinformatics online.

SUBMITTER: Izquierdo-Carrasco F 

PROVIDER: S-EPMC4016711 | biostudies-literature | 2014 May

REPOSITORIES: biostudies-literature

Publications

PUmPER: phylogenies updated perpetually.

Fernando Izquierdo-Carrasco, John Cazes, Stephen A. Smith, Alexandros Stamatakis

Bioinformatics (Oxford, England), 2014-01-28, issue 10

Similar Datasets

| S-EPMC3991718 | biostudies-literature
| PRJEB32949 | ENA
2022-12-31 | GSE148211 | GEO
2022-09-26 | GSE213704 | GEO
| PRJNA882085 | ENA
| PRJNA928771 | ENA
| PRJNA623506 | ENA
| PRJEB52397 | ENA
| S-EPMC6420041 | biostudies-literature
| S-EPMC4796935 | biostudies-literature