
Dataset Information


Mammalian Annotation Database for improved annotation and functional classification of Omics datasets from less well-annotated organisms.


ABSTRACT: Next-generation sequencing technologies and the availability of an increasing number of mammalian and other genomes allow gene expression studies, particularly RNA sequencing, in many non-model organisms. However, incomplete genome annotation and incomplete assignment of genes to functional annotation databases can lead to a substantial loss of information in downstream data analysis. To overcome this, we developed the Mammalian Annotation Database tool (MAdb, https://madb.ethz.ch) to conveniently provide homologous gene information for selected mammalian species. The assignment between species is performed in three steps: (i) matching official gene symbols, (ii) using ortholog information contained in Ensembl Compara and (iii) pairwise BLAST comparisons of all transcripts. In addition, we developed a new tool (AnnOverlappeR) for the reliable assignment between National Center for Biotechnology Information (NCBI) and Ensembl gene IDs. Gene lists translated to gene IDs of a well-annotated species such as human can be used for improved functional annotation with relevant tools based on Gene Ontology and molecular pathway information. We tested MAdb on a published RNA-seq data set for the pig and showed clearly improved overrepresentation analysis results based on the assigned human homologous gene identifiers. Using MAdb yielded a similar list of human homologous genes and similar functional annotation results regardless of whether the analysis started with gene IDs from NCBI or from Ensembl. The MAdb database is accessible via a web interface and a Galaxy application.
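
The three-step assignment described in the abstract can be pictured with a minimal sketch. This is not the MAdb implementation: it assumes the three steps act as successive fallbacks (the abstract does not state this explicitly), and the function name, the lookup tables and all gene identifiers below are hypothetical placeholders standing in for symbol tables, Ensembl Compara ortholog pairs and precomputed BLAST best hits.

```python
from typing import Dict, Optional, Tuple


def assign_human_homolog(
    gene_id: str,
    symbol: Optional[str],
    human_symbols: Dict[str, str],
    compara_orthologs: Dict[str, str],
    blast_best_hits: Dict[str, str],
) -> Optional[Tuple[str, str]]:
    """Return (human_gene_id, evidence) from the first step that finds a match.

    All three lookup tables are hypothetical inputs prepared elsewhere.
    """
    # Step (i): match on the official gene symbol.
    # Case-insensitive comparison is an assumption of this sketch.
    if symbol:
        hit = human_symbols.get(symbol.upper())
        if hit is not None:
            return hit, "symbol"

    # Step (ii): ortholog pair taken from an Ensembl Compara export.
    hit = compara_orthologs.get(gene_id)
    if hit is not None:
        return hit, "compara"

    # Step (iii): best hit from pairwise BLAST comparisons of transcripts.
    hit = blast_best_hits.get(gene_id)
    if hit is not None:
        return hit, "blast"

    # No homolog could be assigned by any step.
    return None


# Toy data with made-up identifiers, for illustration only.
human_symbols = {"GENEA": "HUMAN_1"}
compara_orthologs = {"PIG_2": "HUMAN_2"}
blast_best_hits = {"PIG_3": "HUMAN_3"}

pig_genes = [
    ("PIG_1", "GeneA"),
    ("PIG_2", None),
    ("PIG_3", "LOC_UNNAMED"),
    ("PIG_4", None),
]

assignments = {
    gene_id: assign_human_homolog(
        gene_id, symbol, human_symbols, compara_orthologs, blast_best_hits
    )
    for gene_id, symbol in pig_genes
}
```

Recording which step produced each assignment keeps the evidence level visible, so weaker BLAST-only matches can be reviewed separately before functional annotation.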

SUBMITTER: Bick JT 

PROVIDER: S-EPMC6661403 | biostudies-literature | 2019 Jan

REPOSITORIES: biostudies-literature


Publications

Mammalian Annotation Database for improved annotation and functional classification of Omics datasets from less well-annotated organisms.

Bick JT, Zeng S, Robinson MD, Ulbrich SE, Bauersachs S

Database: the journal of biological databases and curation, 2019-01-01



Similar Datasets

| S-EPMC29833 | biostudies-literature
| S-EPMC5820610 | biostudies-literature
| S-EPMC7359216 | biostudies-literature
| S-EPMC6341679 | biostudies-literature
| S-EPMC9034014 | biostudies-literature
2021-08-17 | GSE182079 | GEO
2021-08-18 | GSE182128 | GEO
2021-08-15 | GSE181791 | GEO
2021-08-15 | GSE181775 | GEO
2021-08-15 | GSE181772 | GEO