Proteomics

Dataset Information

Sequence coverage by multiple reads in shotgun proteomics to validate single amino acid variants


ABSTRACT: Mass spectrometry-based shotgun proteomics currently relies on matching mass spectra of peptides produced by protease digestion to amino acid sequences predicted from nucleic acid sequences. However, the method does not reliably identify every single amino acid of proteins proteome-wide. We proposed a way to interpret shotgun proteomics results, specifically in data-dependent acquisition mode, as protein sequence coverage by multiple reads, as is done in nucleic acid sequencing for calling single nucleotide variants. Multiple reads for each letter in the proteome can be provided by overlapping distinct peptides, which confirm the presence of particular amino acid residues in the overlapping stretch with a much lower false discovery rate than the conventional 1%. These overlapping distinct peptides were, first, miscleaved tryptic peptides in combination with their properly cleaved counterparts and, second, peptides generated by several proteases with different specificities, each used to digest the same specimen, with the digests analyzed separately. We illustrated this approach using publicly available multiprotease proteomic datasets and in-house data for HEK-293 cell line subproteomes obtained using the proteases trypsin, LysC and GluC. Overall proteome coverage in the example datasets, counting even a single read, was 20-30% at 5-8 thousand protein groups identified. Within this fraction, 5-7% of the whole proteome was covered at least two-fold and thus identified with increased reliability. Of 36 single amino acid variants identified in the HEK-293 cell line, seven were covered at least two-fold. Sequence coverage by multiple reads may be increased further with greater proteome depth and a larger number of proteases.
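The idea of coverage by multiple reads described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function names, the toy protein sequence, and the peptide set are hypothetical, and exact substring matching stands in for a real search-engine identification step. It counts, for each residue, how many distinct peptides overlap it, then reports the fraction of the sequence covered at least one-fold and at least two-fold.

```python
from typing import Iterable, List


def coverage_depth(protein: str, peptides: Iterable[str]) -> List[int]:
    """Return, for each residue, the number of distinct peptides covering it.

    Assumption: a peptide "covers" a residue if it matches the protein
    sequence exactly at that position (no modifications, no I/L ambiguity).
    """
    depth = [0] * len(protein)
    for pep in set(peptides):  # count each distinct peptide once
        if not pep:
            continue
        start = protein.find(pep)
        while start != -1:
            for i in range(start, start + len(pep)):
                depth[i] += 1
            start = protein.find(pep, start + 1)
    return depth


def fraction_covered(depth: List[int], min_reads: int) -> float:
    """Fraction of residues covered by at least `min_reads` distinct peptides."""
    if not depth:
        return 0.0
    return sum(d >= min_reads for d in depth) / len(depth)


# Hypothetical toy data: a properly cleaved tryptic peptide, its miscleaved
# counterpart, and one further peptide from the same protein.
protein = "MKTAYIAKQRQISFVK"
peptides = {"TAYIAK", "TAYIAKQR", "QISFVK"}

depth = coverage_depth(protein, peptides)
print(depth)
print("covered >= 1x:", fraction_covered(depth, 1))
print("covered >= 2x:", fraction_covered(depth, 2))
```

In this toy case the stretch shared by the properly cleaved and miscleaved peptides is the only region with two-fold coverage, which mirrors the kind of overlap the abstract uses to increase confidence in a residue.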

INSTRUMENT(S): Orbitrap Q Exactive HF

ORGANISM(S): Homo sapiens (ncbitaxon:9606)

SUBMITTER: Sergei Moshkovskii

PROVIDER: MSV000088536 | MassIVE | Tue Dec 07 09:18:00 GMT 2021

SECONDARY ACCESSION(S): PXD030226

REPOSITORIES: MassIVE

Publications


Mass spectrometry-based proteome analysis relies on matching the mass spectra of proteolytic peptides to amino acid sequences predicted from genomic sequences. Peptide variant identification in proteogenomic studies is often unreliable. We propose a way to interpret shotgun proteomics results, specifically in the data-dependent acquisition mode, as protein sequence coverage by multiple reads, as is done in nucleic acid sequencing for calling single nucleotide variants. Multiple re ...

Similar Datasets

2018-05-31 | PXD009447 | JPOST Repository
2013-04-02 | E-GEOD-33294 | biostudies-arrayexpress
2021-06-26 | GSE178932 | GEO
2013-11-01 | GSE41501 | GEO
| PRJNA616103 | ENA
2013-11-01 | E-GEOD-41501 | biostudies-arrayexpress
2008-02-23 | GSE4872 | GEO
2024-02-08 | GSE255117 | GEO
2020-04-14 | GSE133070 | GEO
2013-04-02 | GSE33294 | GEO