Dataset Information

Hayai-Annotation Plants: an ultra-fast and comprehensive functional gene annotation system in plants.


ABSTRACT:

Summary

Hayai-Annotation Plants is a browser-based interface, implemented in R, for ultra-fast and accurate functional gene annotation of plant species. The pipeline combines sequence-similarity searches, run with USEARCH against UniProtKB (taxonomy Embryophyta), with a functional annotation step. Hayai-Annotation Plants provides five layers of annotation: i) protein name; ii) Gene Ontology (GO) terms covering the three main domains (Biological Process, Molecular Function and Cellular Component); iii) Enzyme Commission (EC) number; iv) protein existence level; and v) evidence type. It implements a new algorithm that gives priority to protein existence level when propagating GO and EC information, and it annotated the Arabidopsis thaliana representative peptide sequences (Araport11) within 5 min on a PC.
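
The idea of giving priority to protein existence level can be illustrated with a minimal Python sketch. This is not the tool's actual R implementation: the `Hit` record, the `select_annotations` function, the identity tie-break and the 40% identity cut-off are all assumptions made for illustration. The only external fact relied on is that UniProtKB protein existence levels run from 1 (evidence at protein level) to 5 (uncertain), so a lower number indicates stronger evidence.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Optional, Tuple


@dataclass(frozen=True)
class Hit:
    """One similarity-search hit joined with UniProtKB metadata (hypothetical record)."""
    query_id: str
    target_id: str
    identity: float                  # percent identity reported by the search tool
    protein_name: str
    existence_level: int             # UniProtKB PE level: 1 (strongest) .. 5 (uncertain)
    go_terms: Tuple[str, ...] = ()
    ec_number: Optional[str] = None


def select_annotations(hits: Iterable[Hit], min_identity: float = 40.0) -> Dict[str, Hit]:
    """Pick one hit per query, preferring the lowest PE level.

    Assumptions (not taken from the paper): hits below `min_identity` are
    discarded, and ties on PE level are broken by higher identity.
    """
    best: Dict[str, Hit] = {}
    for hit in hits:
        if hit.identity < min_identity:
            continue
        current = best.get(hit.query_id)
        rank = (hit.existence_level, -hit.identity)
        if current is None or rank < (current.existence_level, -current.identity):
            best[hit.query_id] = hit
    return best


# Placeholder data: the IDs, names and GO/EC values are invented for the example.
example_hits = [
    Hit("gene1", "targetA", 95.0, "Predicted protein", 4),
    Hit("gene1", "targetB", 80.0, "Well-characterised protein", 1,
        ("GO:placeholder1",), "0.0.0.0"),
    Hit("gene2", "targetC", 30.0, "Low-identity hit", 1),
]

annotations = select_annotations(example_hits)
for query_id, hit in annotations.items():
    print(query_id, hit.target_id, hit.protein_name, hit.go_terms, hit.ec_number)
```

In this sketch, `gene1` receives the GO and EC information of `targetB` even though `targetA` is more similar, because `targetB` has the stronger existence level; `gene2` stays unannotated because its only hit falls below the assumed identity cut-off.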

Availability and implementation

The software is implemented in R and runs on Macintosh and Linux systems. It is freely available at https://github.com/kdri-genomics/Hayai-Annotation-Plants under the GPLv3 license.

Supplementary information

Supplementary data are available at Bioinformatics online.

SUBMITTER: Ghelfi A 

PROVIDER: S-EPMC6821316 | biostudies-literature | 2019 Nov

REPOSITORIES: biostudies-literature

Publications

Hayai-Annotation Plants: an ultra-fast and comprehensive functional gene annotation system in plants.

Andrea Ghelfi, Kenta Shirasawa, Hideki Hirakawa, Sachiko Isobe

Bioinformatics (Oxford, England), 2019 Nov 1, issue 21


Similar Datasets

| S-EPMC8016466 | biostudies-literature
2016-12-06 | GSE60865 | GEO
| S-EPMC6041978 | biostudies-literature
| S-EPMC2703922 | biostudies-literature
| S-EPMC4099352 | biostudies-literature
2014-01-28 | E-GEOD-51621 | biostudies-arrayexpress
| S-EPMC6446249 | biostudies-literature
| S-EPMC2529462 | biostudies-literature
2014-01-28 | GSE51621 | GEO
| S-EPMC3805979 | biostudies-literature