Proteomics

Dataset Information

Deep Human Proteome Sequencing Enables Global Detection of Mutations and Alternative Splicing


ABSTRACT: Mass spectrometry-based proteomics now routinely enables identification of over 10,000 human proteins from a single sample. However, proteins are typically identified by peptide sequences representing about 20% of all proteinogenic amino acids encoded in the transcriptome. Deeper protein sequencing - detection of all amino acids - is imperative for proteoform discovery and quantitative comparison. Here, we utilized six ENCODE cell lines, six proteases, and three tandem mass spectrometry (MS/MS) fragmentation methods to collect 2,491 raw MS data files. From these data we identified 17,717 protein groups with a median sequence coverage of 79.2%, confirming over eight million unique human amino acid residues. We compare our proteomics data with transcriptomics data and demonstrate how such deep proteome coverage can enable detection of over 7,000 proteoforms including 70.9 to 90.6% of all non-synonymous mutations and over 5,000 alternative splicing event junctions. Our dataset represents a valuable resource as the largest human proteome with the highest sequence coverage ever reported.
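
The sequence coverage figure quoted in the abstract is the share of a protein's amino acid residues confirmed by at least one identified peptide. The following is a minimal sketch of that calculation, not the authors' pipeline: the function name `sequence_coverage`, the toy protein sequence, and the two peptides are all hypothetical and chosen only for illustration. It also assumes peptides are matched by exact substring search, which ignores modifications and isobaric residues.

```python
def sequence_coverage(protein: str, peptides: list[str]) -> float:
    """Return the fraction of residues in `protein` covered by >= 1 peptide."""
    covered = [False] * len(protein)
    for pep in peptides:
        if not pep:
            continue  # skip empty strings, which would match everywhere
        # Mark every occurrence of the peptide, including overlapping ones.
        start = protein.find(pep)
        while start != -1:
            for i in range(start, start + len(pep)):
                covered[i] = True
            start = protein.find(pep, start + 1)
    return sum(covered) / len(protein) if protein else 0.0


# Hypothetical toy example (not taken from the dataset).
protein = "MKTAYIAKQRQISFVKSHFSRQ"
peptides = ["TAYIAK", "QISFVK"]
coverage = sequence_coverage(protein, peptides)
print(f"Sequence coverage: {coverage:.1%}")
```

In this toy case the two peptides cover 12 of the 22 residues. Using several proteases, as the study does, produces peptides with different cleavage boundaries, so more positions in the `covered` list end up marked.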

INSTRUMENT(S): Orbitrap Fusion Lumos, Orbitrap Fusion

ORGANISM(S): Homo sapiens (ncbitaxon:9606)

SUBMITTER: Joshua Coon, Juergen Cox

PROVIDER: MSV000086944 | MassIVE | Wed Feb 24 09:31:00 GMT 2021

SECONDARY ACCESSION(S): PXD024364

REPOSITORIES: MassIVE

Similar Datasets

2018-05-04 | GSE113990 | GEO
2013-05-06 | GSE46651 | GEO
2013-04-06 | GSE45556 | GEO
| PRJNA745484 | ENA
2021-04-01 | PXD023921 | Pride
2014-08-22 | GSE60588 | GEO
2021-02-26 | GSE116199 | GEO
2010-10-15 | E-GEOD-24626 | biostudies-arrayexpress
2010-10-15 | E-GEOD-24699 | biostudies-arrayexpress
2016-02-27 | GSE78700 | GEO