Proteomics

Dataset Information

YPIC challenge 2018: A case study in characterizing an unknown protein sample


ABSTRACT: For the YPIC challenge 2018, contestants were invited to try to decipher two unknown English questions encoded by a synthetic protein expressed in E. coli. We present how we analyzed this unknown sample using a tryptic digest, with dynamic exclusion disabled to increase the signal-to-noise ratio of the measured molecules. Subsequently, spectral clustering was used to generate high-quality consensus spectra and condense the acquired MS/MS spectral data. De novo spectrum identification was used to determine the English questions encoded by the synthetic protein, and any post-translational modifications introduced by E. coli on the synthetic protein were detected using spectral networking. Although the synthetic protein sample for the YPIC challenge 2018 is not of biological interest, the experimental and computational strategy presented here can be directly used to analyze samples for which no protein sequence information is available. All software and code to perform the bioinformatics analysis are available as open source, and a self-contained Jupyter notebook is provided to fully recreate the analysis.
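
As an illustration of why the tryptic digest yields fragments that must be reassembled, the following is a minimal sketch of an in silico tryptic digest. It assumes the common simplified rule (cleave after K or R, except when the next residue is P). The function name `tryptic_digest` and the example sequence are hypothetical and are not taken from the dataset or from the authors' notebook.

```python
import re


def tryptic_digest(sequence: str) -> list[str]:
    """Split a protein sequence into fully tryptic peptides.

    Simplified rule (an assumption of this sketch): cleave on the
    C-terminal side of K or R unless the next residue is P.
    Missed cleavages are not modeled.
    """
    # Zero-width split point: preceded by K or R, not followed by P.
    pieces = re.split(r"(?<=[KR])(?!P)", sequence.upper())
    # Drop the empty string produced when the sequence ends in K or R.
    return [p for p in pieces if p]


# Hypothetical sequence spelled only with valid amino acid letters.
example = "WHATISTHERANKANDFILE"
peptides = tryptic_digest(example)
print(peptides)
```

Each peptide covers only part of the encoded text, so de novo identifications of individual peptides still have to be ordered and joined to recover a full sentence.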

INSTRUMENT(S): Q Exactive

ORGANISM(S): Escherichia coli

SUBMITTER: Wout Bittremieux  

LAB HEAD: William Stafford Noble

PROVIDER: PXD014003 | Pride | 2019-06-11

REPOSITORIES: Pride

Publications

2018 YPIC Challenge: A Case Study in Characterizing an Unknown Protein Sample.

Pino Lindsay L, Lin Andy A, Bittremieux Wout W

Journal of Proteome Research, 2019-10-07, issue 11


For the 2018 YPIC Challenge, contestants were invited to try to decipher two unknown English questions encoded by a synthetic protein expressed in Escherichia coli. In addition to deciphering the sentence, contestants were asked to determine the three-dimensional structure and detect any post-translational modifications left by the host organism. We present our experimental and computational strategy to characterize this sample by identifying the unknown protein sequence and detecting the p ...

Similar Datasets

2019-12-06 | PXD013641 | Pride
2012-08-29 | E-GEOD-40444 | biostudies-arrayexpress
2021-05-25 | PXD009861 | Pride
2010-04-01 | E-MEXP-2593 | biostudies-arrayexpress
2018-03-19 | PXD006722 | Pride
2011-08-22 | E-GEOD-27604 | biostudies-arrayexpress
2024-10-17 | PXD052863 | Pride
2023-08-25 | PXD042416 | Pride
2002-12-22 | E-GEOD-92 | biostudies-arrayexpress
2015-03-17 | PXD000474 | Pride