Dataset Information

Using intron position conservation for homology-based gene prediction.


ABSTRACT: Annotation of protein-coding genes is very important in bioinformatics and biology and has a decisive influence on many downstream analyses. Homology-based gene prediction programs allow for transferring knowledge about protein-coding genes from an annotated organism to an organism of interest. Here, we present a homology-based gene prediction program called GeMoMa. GeMoMa utilizes the conservation of intron positions within genes to predict related genes in other organisms. We assess the performance of GeMoMa and compare it with state-of-the-art competitors on plant and animal genomes using an extended best reciprocal hit approach. We find that GeMoMa often makes more precise predictions than its competitors, yielding a substantially increased number of correct transcripts. Subsequently, we validate selected GeMoMa predictions using Sanger sequencing. Finally, we use RNA-seq data to compare the predictions of homology-based gene prediction programs, and again find that GeMoMa performs well. Hence, we conclude that exploiting intron position conservation improves homology-based gene prediction, and we make GeMoMa freely available as a command-line tool and Galaxy integration.

SUBMITTER: Keilwagen J 

PROVIDER: S-EPMC4872089 | biostudies-literature | 2016 May

REPOSITORIES: biostudies-literature

Publications

Using intron position conservation for homology-based gene prediction.

Jens Keilwagen, Michael Wenk, Jessica L. Erickson, Martin H. Schattat, Jan Grau, Frank Hartung

Nucleic Acids Research, 2016-02-17, issue 9


Similar Datasets

| S-EPMC2996940 | biostudies-literature
| S-EPMC5378888 | biostudies-literature
2021-04-07 | GSE171636 | GEO
| S-EPMC1950532 | biostudies-literature
| S-EPMC1274302 | biostudies-literature
| S-EPMC4229973 | biostudies-literature
| S-EPMC4417613 | biostudies-literature
| S-EPMC8388039 | biostudies-literature
| S-EPMC1243800 | biostudies-literature
| S-EPMC5975413 | biostudies-literature