Dataset Information

TREE2FASTA: a flexible Perl script for batch extraction of FASTA sequences from exploratory phylogenetic trees.

ABSTRACT: The body of DNA sequence data lacking taxonomically informative sequence headers is rapidly growing in user and public databases (e.g. sequences lacking identification and contaminants). In the context of systematics studies, sorting such sequence data for taxonomic curation and/or molecular diversity characterization (e.g. crypticism) often requires the building of exploratory phylogenetic trees with reference taxa. The subsequent step of segregating DNA sequences of interest based on observed topological relationships can represent a challenging task, especially for large datasets.We have written TREE2FASTA, a Perl script that enables and expedites the sorting of FASTA-formatted sequence data from exploratory phylogenetic trees. TREE2FASTA takes advantage of the interactive, rapid point-and-click color selection and/or annotations of tree leaves in the popular Java tree-viewer FigTree to segregate groups of FASTA sequences of interest to separate files. TREE2FASTA allows for both simple and nested segregation designs to facilitate the simultaneous preparation of multiple data sets that may overlap in sequence content.

SUBMITTER: Sauvage T

PROVIDER: S-EPMC5838971 | biostudies-literature | 2018 Mar

REPOSITORIES: biostudies-literature

ACCESS DATA

Publications

TREE2FASTA: a flexible Perl script for batch extraction of FASTA sequences from exploratory phylogenetic trees.

Sauvage Thomas T Plouviez Sophie S Schmidt William E WE Fredericq Suzanne S

BMC research notes 20180305 1

<h4>Objective</h4>The body of DNA sequence data lacking taxonomically informative sequence headers is rapidly growing in user and public databases (e.g. sequences lacking identification and contaminants). In the context of systematics studies, sorting such sequence data for taxonomic curation and/or molecular diversity characterization (e.g. crypticism) often requires the building of exploratory phylogenetic trees with reference taxa. The subsequent step of segregating DNA sequences of interest ...[more]

PMID: 29506565

Similar Datasets

Project description:Phylogenetic trees illustrate evolutionary relationships between taxa or genes. Tree figures are crucial when presenting results and data, and by creating clear and effective plots, researchers can describe many kinds of evolutionary patterns. However, producing tree plots can be a time-consuming task, especially as multiple different programs are often needed to adjust and illustrate all data associated with a tree. We present TreeViewer, a new software to draw phylogenetic trees. TreeViewer is flexible, modular, and user-friendly. Plots are produced as the result of a user-defined pipeline, which can be finely customised and easily applied to different trees. Every feature of the program is documented and easily accessible, either in the online manual or within the program's interface. We show how TreeViewer can be used to produce publication-ready figures, saving time by not requiring additional graphical post-processing tools. TreeViewer is freely available for Windows, macOS, and Linux operating systems and distributed under an AGPLv3 licence from https://treeviewer.org. It has a graphical user interface (GUI), as well as a command-line interface, which is useful to work with very large trees and for automated pipelines. A detailed user manual with examples and tutorials is also available. TreeViewer is mainly aimed at users wishing to produce highly customised, publication-quality tree figures using a single GUI software tool. Compared to other GUI tools, TreeViewer offers a richer feature set and a finer degree of customisation. Compared to command-line-based tools and software libraries, TreeViewer's graphical interface is more accessible. The flexibility of TreeViewer's approach to phylogenetic tree plotting enables the program to produce a wide variety of publication-ready figures. Users are encouraged to create their own custom modules to expand the functionalities of the program. This sets the scene for an ever-expanding and ever-adapting software framework that can easily adjust to respond to new challenges.

Dataset Information

TREE2FASTA: a flexible Perl script for batch extraction of FASTA sequences from exploratory phylogenetic trees.

Publications

TREE2FASTA: a flexible Perl script for batch extraction of FASTA sequences from exploratory phylogenetic trees.

Similar Datasets

OmicsDI is part of the ELIXIR infrastructure

Tweets