Dataset Information

TreeGrafter: phylogenetic tree-based annotation of proteins with Gene Ontology terms and other annotations.


ABSTRACT:

Summary

TreeGrafter is a new software tool for annotating protein sequences using pre-annotated phylogenetic trees. Currently, the tool annotates sequences with Gene Ontology (GO) terms and with PANTHER family and subfamily assignments. The approach is generalizable to any annotations that have been made to internal nodes of a reference phylogenetic tree. For each input query protein sequence, TreeGrafter finds the best-matching homologous family in a library of pre-calculated, pre-annotated gene trees, and then grafts the sequence to the best location in that tree. It then annotates the sequence by propagating annotations from ancestral nodes in the reference tree. We show that TreeGrafter outperforms subfamily HMM scoring for correctly assigning subfamily membership, and that it produces highly specific GO term annotations based on annotated reference phylogenetic trees. This method will be further integrated into InterProScan, making it available to an even broader user community.
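
The final step described above, annotating a grafted sequence by propagating annotations from ancestral nodes, can be illustrated with a minimal sketch. This is not TreeGrafter's code: the `Node` class, the `inherited_annotations` function, the node names, and the GO term placement are all hypothetical, and the sketch assumes the simplest possible rule, namely that a query inherits every annotation found on the path from its graft point to the root. Family matching and the grafting itself are not modelled.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Node:
    """A node in a toy reference tree (hypothetical, not TreeGrafter's data model)."""
    name: str
    parent: Optional["Node"] = None
    annotations: set = field(default_factory=set)


def inherited_annotations(graft_point: Node) -> set:
    """Collect annotations from the graft point and every ancestor up to the root."""
    collected = set()
    node = graft_point
    while node is not None:
        collected |= node.annotations
        node = node.parent
    return collected


# Toy tree with illustrative annotations on internal nodes.
root = Node("AN0", annotations={"GO:0005515"})
subfamily = Node("AN1", parent=root, annotations={"GO:0004672"})
branch = Node("AN2", parent=subfamily)

# Assume the query sequence was grafted onto `branch`.
query_annotations = inherited_annotations(branch)
print(sorted(query_annotations))
```

Walking parent pointers keeps the lookup proportional to the depth of the graft point, so the reference tree never needs to be traversed in full for a single query.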

Availability and implementation

TreeGrafter is freely available on the web at https://github.com/pantherdb/TreeGrafter, including as a Docker image.

Supplementary information

Supplementary data are available at Bioinformatics online.

SUBMITTER: Tang H 

PROVIDER: S-EPMC6361231 | biostudies-literature | 2019 Feb

REPOSITORIES: biostudies-literature

Publications

TreeGrafter: phylogenetic tree-based annotation of proteins with Gene Ontology terms and other annotations.

Tang H, Finn RD, Thomas PD

Bioinformatics (Oxford, England), 2019 Feb 1, issue 3


Similar Datasets

| S-EPMC3178059 | biostudies-literature
| S-EPMC517493 | biostudies-literature
| S-EPMC3337258 | biostudies-literature
| S-EPMC2882390 | biostudies-literature
| S-EPMC5199145 | biostudies-literature
| S-EPMC168962 | biostudies-literature
| S-EPMC2686450 | biostudies-literature
| S-EPMC2652876 | biostudies-literature
| S-EPMC2655092 | biostudies-literature
| S-EPMC7536087 | biostudies-literature