Dataset Information

REPARATION: ribosome profiling assisted (re-)annotation of bacterial genomes.


ABSTRACT: Prokaryotic genome annotation is highly dependent on automated methods, as manual curation cannot keep up with the exponential growth of sequenced genomes. Current automated methods depend heavily on sequence composition and often underestimate the complexity of the proteome. We developed RibosomeE Profiling Assisted (re-)AnnotaTION (REPARATION), a de novo machine learning algorithm that takes advantage of experimental protein synthesis evidence from ribosome profiling (Ribo-seq) to delineate translated open reading frames (ORFs) in bacteria, independent of genome annotation (https://github.com/Biobix/REPARATION). REPARATION evaluates all possible ORFs in the genome and estimates minimum thresholds based on a growth curve model to screen for spurious ORFs. We applied REPARATION to three annotated bacterial species to obtain a more comprehensive mapping of their translation landscape in support of experimental data. In all cases, we identified hundreds of novel (small) ORFs including variants of previously annotated ORFs and >70% of all (variants of) annotated protein coding ORFs were predicted by REPARATION to be translated. Our predictions are supported by matching mass spectrometry proteomics data, sequence composition and conservation analysis. REPARATION is unique in that it makes use of experimental translation evidence to intrinsically perform a de novo ORF delineation in bacterial genomes irrespective of the sequence features linked to open reading frames.
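The abstract states that REPARATION evaluates all possible ORFs in the genome before screening them against Ribo-seq evidence. As a rough illustration of that first enumeration step only, the sketch below lists candidate ORFs in all six reading frames. It is not the REPARATION implementation: the function names, the start codon set (ATG, GTG, TTG), and the `min_codons` cutoff are assumptions chosen for the example, and the Ribo-seq scoring and growth-curve thresholding are not shown.

```python
# Illustrative sketch only; not taken from the REPARATION code base.
# Assumed (hypothetical) settings: start codons ATG/GTG/TTG, standard stops.
STARTS = {"ATG", "GTG", "TTG"}
STOPS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def find_orfs(seq, min_codons=10):
    """Yield (strand, start, end, orf_seq) for every start-to-stop ORF.

    Coordinates are 0-based, end-exclusive, and relative to the strand
    being scanned (for "-" they index into the reverse complement).
    min_codons counts codons from the start codon up to, but excluding,
    the stop codon.
    """
    seq = seq.upper()
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for frame in range(3):
            starts = []  # in-frame start codons seen since the last stop
            for i in range(frame, len(s) - 2, 3):
                codon = s[i:i + 3]
                if codon in STOPS:
                    # Every pending start defines one candidate ORF variant.
                    for st in starts:
                        if (i - st) // 3 >= min_codons:
                            yield strand, st, i + 3, s[st:i + 3]
                    starts = []
                elif codon in STARTS:
                    starts.append(i)


if __name__ == "__main__":
    demo = "CCATGAAAAAAAAATAAGG"
    for orf in find_orfs(demo, min_codons=2):
        print(orf)
```

Because every in-frame start codon upstream of a stop is reported, a single stop codon can yield several candidates; this mirrors the abstract's mention of variants of annotated ORFs, although how REPARATION itself selects among such variants is not described here.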

SUBMITTER: Ndah E 

PROVIDER: S-EPMC5714196 | biostudies-literature | 2017 Nov

REPOSITORIES: biostudies-literature

Publications

REPARATION: ribosome profiling assisted (re-)annotation of bacterial genomes.

Ndah E, Jonckheere V, Giess A, Valen E, Menschaert G, Van Damme P

Nucleic Acids Research, 2017-11-01, issue 20


Similar Datasets

2017-01-30 | GSE91066 | GEO
2017-08-09 | PXD005844 | Pride
2017-08-03 | PXD005901 | Pride
| S-EPMC7115971 | biostudies-literature
| S-EPMC6305970 | biostudies-literature
| S-EPMC3548604 | biostudies-literature
| S-EPMC3775365 | biostudies-literature
| S-EPMC6007384 | biostudies-literature
| S-EPMC4644489 | biostudies-literature
| S-EPMC4666370 | biostudies-literature