Dataset Information

Statistical approach to protein quantification.

ABSTRACT: A major goal in proteomics is the comprehensive and accurate description of a proteome. This task includes not only the identification of proteins in a sample, but also the accurate quantification of their abundance. Although mass spectrometry typically provides information on peptide identity and abundance in a sample, it does not directly measure the concentration of the corresponding proteins. Specifically, most mass-spectrometry-based approaches (e.g. shotgun proteomics or selected reaction monitoring) allow one to quantify peptides using chromatographic peak intensities or spectral counting information. Ultimately, based on these measurements, one wants to infer the concentrations of the corresponding proteins. Inferring properties of the proteins based on experimental peptide evidence is often a complex problem because of the ambiguity of peptide assignments and different chemical properties of the peptides that affect the observed concentrations. We present SCAMPI, a novel generic and statistically sound framework for computing protein abundance scores based on quantified peptides. In contrast to most previous approaches, our model explicitly includes information from shared peptides to improve protein quantitation, especially in eukaryotes with many homologous sequences. The model accounts for uncertainty in the input data, leading to statistical prediction intervals for the protein scores. Furthermore, peptides with extreme abundances can be reassessed and classified as either regular data points or actual outliers. We used the proposed model with several datasets and compared its performance to that of other, previously used approaches for protein quantification in bottom-up mass spectrometry.
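To make the role of shared peptides concrete, here is a minimal sketch. It is not the SCAMPI model: it omits the uncertainty modelling, prediction intervals and outlier reassessment the abstract mentions. It assumes, purely for illustration, that each peptide's measured abundance is roughly the sum of the abundances of the proteins containing it, and solves for protein scores by ordinary least squares. The names `protein_scores` and `solve_linear`, and the toy data, are hypothetical.

```python
def solve_linear(matrix, rhs):
    """Solve matrix * x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            # Happens when proteins cannot be told apart by their peptides.
            raise ValueError("singular system: proteins not distinguishable")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        known = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - known) / aug[i][i]
    return x


def protein_scores(peptide_abundance, peptide_to_proteins):
    """Least-squares protein scores under the ASSUMED additive model:
    peptide abundance ~ sum of abundances of the proteins that contain it."""
    proteins = sorted({p for ps in peptide_to_proteins.values() for p in ps})
    index = {p: i for i, p in enumerate(proteins)}
    n = len(proteins)
    ata = [[0.0] * n for _ in range(n)]  # normal-equation matrix A^T A
    atb = [0.0] * n                      # right-hand side A^T b
    for peptide, abundance in peptide_abundance.items():
        members = [index[p] for p in peptide_to_proteins[peptide]]
        for i in members:
            atb[i] += abundance
            for j in members:
                ata[i][j] += 1.0
    return dict(zip(proteins, solve_linear(ata, atb)))


# Toy data (invented): peptide "b" is shared by both proteins.
abundance = {"a": 10.0, "b": 14.0, "c": 4.0}
mapping = {"a": ["P1"], "b": ["P1", "P2"], "c": ["P2"]}
scores = protein_scores(abundance, mapping)
```

In this toy case the shared peptide "b" contributes to both estimates instead of being discarded, which is the general idea the abstract describes for organisms with many homologous sequences.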

SUBMITTER: Gerster S

PROVIDER: S-EPMC3916661 | biostudies-literature | 2014 Feb

REPOSITORIES: biostudies-literature


Publications

Statistical approach to protein quantification.

Sarah Gerster, Taejoon Kwon, Christina Ludwig, Mariette Matondo, Christine Vogel, Edward M. Marcotte, Ruedi Aebersold, Peter Bühlmann

Molecular & Cellular Proteomics (MCP), 2013-11-19, issue 2


PMID: 24255132

Similar Datasets

Project description: BACKGROUND: According to Regulation (EU) No 619/2011, trace amounts of non-authorised genetically modified organisms (GMO) in feed are tolerated within the EU if certain prerequisites are met. Tolerable traces must not exceed the so-called 'minimum required performance limit' (MRPL), which the regulation defines as a 0.1% mass fraction per ingredient. Therefore, GMO that are not yet authorised (and some GMO whose approvals have expired) have to be quantified at very low levels following their qualitative detection in genomic DNA extracted from feed samples. As the results of quantitative analysis can have severe legal and financial consequences for producers or distributors of feed, the quantification results need to be utterly reliable. RESULTS: We developed a statistical approach to investigate the experimental measurement variability within one 96-well PCR plate. This approach visualises the frequency distribution of the zygosity-corrected relative content of genetically modified material resulting from different combinations of transgene and reference gene Cq values. One application is simulating how varying parameters affects measurement results. Parameters could be, for example, replicate numbers or baseline and threshold settings; measurement results could be, for example, the median (class) and the relative standard deviation (RSD). All calculations can be done using the built-in functions of Excel without any need for programming. The developed Excel spreadsheets are available (see section 'Availability of supporting data' for details). In most cases, the combination of four PCR replicates for each of the two DNA isolations already resulted in a relative standard deviation of 15% or less. CONCLUSIONS: The study aims to provide scientifically based suggestions for minimising measurement uncertainty, especially in, but not limited to, the field of GMO quantification at low concentration levels. Four PCR replicates for each of the two DNA isolations seem to be a reasonable minimum number to narrow down the possible spread of results.
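The combination step described above can be sketched in a few lines. This is not the authors' Excel spreadsheet. It assumes a simple delta-Cq calculation with perfect doubling per cycle (100% amplification efficiency) and a multiplicative zygosity factor; both are assumptions not stated in the description. The function names and the replicate values are hypothetical.

```python
from itertools import product
from statistics import mean, median, stdev


def relative_content_percent(cq_transgene, cq_reference, zygosity_factor=1.0):
    """Relative GM content in percent.

    ASSUMPTION: delta-Cq with perfect doubling per cycle, so the
    transgene/reference ratio is 2 ** (cq_reference - cq_transgene).
    """
    return 100.0 * zygosity_factor * 2.0 ** (cq_reference - cq_transgene)


def summarise(transgene_cqs, reference_cqs, zygosity_factor=1.0):
    """Median and relative standard deviation (RSD, in percent) over every
    combination of transgene and reference Cq replicates."""
    values = [
        relative_content_percent(t, r, zygosity_factor)
        for t, r in product(transgene_cqs, reference_cqs)
    ]
    rsd = 100.0 * stdev(values) / mean(values)
    return median(values), rsd


# Invented replicate Cq values: four transgene and four reference wells.
transgene = [35.1, 34.9, 35.3, 35.0]
reference = [25.0, 25.1, 24.9, 25.0]
med, rsd = summarise(transgene, reference)
```

Changing the number of replicates passed to `summarise` gives a rough feel for how replicate count influences the spread, which is the kind of simulation the description mentions.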

