
Dataset Information


The discovery of novel protein-coding features in mouse genome based on mass spectrometry data.


ABSTRACT: Identifying protein-coding genes in eukaryotic genomes remains a challenge in the post-genome era because of their complex gene models. We applied a proteogenomics strategy to detect unannotated protein-coding regions in the mouse genome. High-accuracy tandem mass spectrometry (MS/MS) data from diverse mouse samples were generated in house on an LTQ-Orbitrap mass spectrometer. Two searchable diagnostic proteomic datasets were constructed: one containing all possible coding exon junctions, for discovering novel exon splicing events, and the other containing all putative coding exons, for discovering novel uninterrupted protein-coding regions. Altogether, 29,586 unique peptides were identified. When the peptides were aligned back to the mouse genome, the known peptides validated the translation of 4471 annotated genes, and the novel peptides defined 172 genic events. This approach can provide substantial evidence for annotating protein-coding genes in eukaryotic genomes.
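
As a rough illustration of the exon-junction database idea described in the abstract, the Python sketch below joins the flanks of every ordered exon pair, translates the joined sequence in three frames, and keeps the stop-free translations as candidate search entries. It is not the authors' pipeline: the toy exon sequences, the `flank` length, the entry naming scheme, and the rule of discarding any frame that contains a stop codon are all assumptions made for the example.

```python
from itertools import combinations, product

# Standard genetic code, built from the conventional TCAG codon ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def translate(dna):
    """Translate a DNA string from position 0, ignoring a trailing partial codon."""
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna) - 2, 3))


def junction_entries(exons, flank=30):
    """Yield (name, peptide) pairs for every ordered exon pair on one strand.

    exons: exon DNA strings in genomic order (hypothetical toy input).
    flank: nucleotides kept on each side of the junction (assumed value).
    """
    for i, j in combinations(range(len(exons)), 2):
        # Join the end of the upstream exon to the start of the downstream one.
        joined = exons[i][-flank:] + exons[j][:flank]
        for frame in range(3):
            peptide = translate(joined[frame:])
            # Assumption: frames containing a stop codon are simply discarded.
            if peptide and "*" not in peptide:
                yield f"e{i + 1}_e{j + 1}_f{frame}", peptide


# Toy exons, invented for illustration only.
toy_exons = ["ATGGCC", "AAATTT", "GGGTAA"]
entries = dict(junction_entries(toy_exons, flank=6))
for name, peptide in entries.items():
    print(f">{name}\n{peptide}")
```

A production database would also need both strands, real exon coordinates, and a policy for peptides that end at a stop codon; the sketch leaves all of these out.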

SUBMITTER: Xing XB 

PROVIDER: S-EPMC5757624 | biostudies-literature | 2011 Nov

REPOSITORIES: biostudies-literature


Publications

The discovery of novel protein-coding features in mouse genome based on mass spectrometry data.

Xing Xiao-Bin, Li Qing-Run, Sun Han, Fu Xing, Zhan Fei, Huang Xiu, Li Jing, Chen Chun-Lei, Shyr Yu, Zeng Rong, Li Yi-Xue, Xie Lu

Genomics, 2011-08-04, issue 5



Similar Datasets

| S-EPMC2812506 | biostudies-literature
| S-EPMC3083093 | biostudies-literature
2020-10-20 | GSE157610 | GEO
| S-EPMC6414543 | biostudies-literature
| S-EPMC2773710 | biostudies-literature
2007-10-27 | GSE9437 | GEO
| S-EPMC8341206 | biostudies-literature
| S-EPMC4143176 | biostudies-literature
2021-03-24 | GSE169406 | GEO
| S-EPMC7346861 | biostudies-literature