Advancing Trypanosoma brucei genome annotation through ribosome profiling and spliced leader mapping
ABSTRACT: Since the initial publication of the trypanosomatid genomes, curation has been ongoing. Here we apply ribosome profiling to Trypanosoma brucei, identifying 223 new coding regions by virtue of ribosome occupancy in the corresponding transcripts. A small number of these putative genes correspond to extra copies of previously annotated genes, but 85% are novel. The median size of these novel CDSs is small (74 aa), indicating that past annotation work has excelled at detecting large CDSs. Of the unique CDSs discovered here, over half have candidate orthologues in other trypanosomatid genomes, most of which were not yet annotated as genes. Still, approximately one-third of the new CDSs were found only in T. brucei subspecies. By combining the ribosome profiling data with RNA-seq and spliced leader mapping, we were able to definitively revise the start sites for 430 CDSs as compared to the current gene models. Such data also allowed us to use a structured approach to rule out 701 putative genes as protein-coding. Finally, the data pointed to several regions of the genome with sequence errors that altered coding region boundaries.
ORGANISM(S): Trypanosoma brucei
PROVIDER: GSE72463 | GEO | 2015/12/22
SECONDARY ACCESSION(S): PRJNA294100
REPOSITORIES: GEO