GToTree: a user-friendly workflow for phylogenomics.
ABSTRACT: SUMMARY: Genome-level evolutionary inference (i.e. phylogenomics) is becoming an increasingly essential step in many biologists' work. Accordingly, there are several tools available for the major steps in a phylogenomics workflow. But for the biologist whose main focus is not bioinformatics, much of the computational work required (such as accessing genomic data on large scales, integrating genomes from different file formats, performing required filtering and stitching different tools together) can be prohibitive. Here I introduce GToTree, a command-line tool that takes any combination of FASTA files, GenBank files and/or NCBI assembly accessions as input and outputs an alignment file, estimates of genome completeness and redundancy, and a phylogenomic tree based on a specified single-copy gene (SCG) set. Although GToTree can work with any custom hidden Markov models (HMMs), also included are 13 newly generated SCG-set HMMs for different lineages and levels of resolution, built based on searches of ~12 000 bacterial and archaeal high-quality genomes. GToTree aims to give more researchers the capability to make phylogenomic trees. AVAILABILITY AND IMPLEMENTATION: GToTree is open-source and freely available for download from: github.com/AstrobioMike/GToTree. It is implemented primarily in bash with helper scripts written in Python. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
SUBMITTER: Lee MD
PROVIDER: S-EPMC6792077 | biostudies-literature | 2019 Oct
REPOSITORIES: biostudies-literature