Dataset Information

Normalization and statistical analysis of quantitative proteomics data generated by metabolic labeling.

ABSTRACT: Comparative proteomics is a powerful analytical method for learning about the responses of biological systems to changes in growth parameters. To make confident inferences about biological responses, proteomics approaches must incorporate appropriate statistical measures of quantitative data. In the present work we applied microarray-based normalization and statistical analysis (significance testing) methods to analyze quantitative proteomics data generated from the metabolic labeling of a marine bacterium (Sphingopyxis alaskensis). Quantitative data were generated for 1,172 proteins, representing 1,736 high confidence protein identifications (54% genome coverage). To test approaches for normalization, cells were grown at a single temperature, metabolically labeled with (14)N or (15)N, and combined in different ratios to give an artificially skewed data set. Inspection of ratio versus average (MA) plots determined that a fixed value median normalization was most suitable for the data. To determine an appropriate statistical method for assessing differential abundance, a -fold change approach, Student's t test, unmoderated t test, and empirical Bayes moderated t test were applied to proteomics data from cells grown at two temperatures. Inverse metabolic labeling was used with multiple technical and biological replicates, and proteomics was performed on cells that were combined based on equal optical density of cultures (providing skewed data) or on cell extracts that were combined to give equal amounts of protein (no skew). To account for arbitrarily complex experiment-specific parameters, a linear modeling approach was used to analyze the data using the limma package in R/Bioconductor. A high quality list of statistically significant differentially abundant proteins was obtained by using lowess normalization (after inspection of MA plots) and applying the empirical Bayes moderated t test. The approach also effectively controlled for the number of false discoveries and corrected for the multiple testing problem using the Storey-Tibshirani false discovery rate (Storey, J. D., and Tibshirani, R. (2003) Statistical significance for genomewide studies. Proc. Natl. Acad. Sci. U.S.A. 100, 9440-9445). The approach we have developed is generally applicable to quantitative proteomics analyses of diverse biological systems.

SUBMITTER: Ting L

PROVIDER: S-EPMC2758752 | biostudies-literature | 2009 Oct

REPOSITORIES: biostudies-literature

ACCESS DATA

Publications

Normalization and statistical analysis of quantitative proteomics data generated by metabolic labeling.

Ting Lily L Cowley Mark J MJ Hoon Seah Lay SL Guilhaus Michael M Raftery Mark J MJ Cavicchioli Ricardo R

Molecular & cellular proteomics : MCP 20090714 10

Comparative proteomics is a powerful analytical method for learning about the responses of biological systems to changes in growth parameters. To make confident inferences about biological responses, proteomics approaches must incorporate appropriate statistical measures of quantitative data. In the present work we applied microarray-based normalization and statistical analysis (significance testing) methods to analyze quantitative proteomics data generated from the metabolic labeling of a marin ...[more]

PMID: 19605365

Dataset Information

Normalization and statistical analysis of quantitative proteomics data generated by metabolic labeling.

Publications

Normalization and statistical analysis of quantitative proteomics data generated by metabolic labeling.

OmicsDI is part of the ELIXIR infrastructure

Tweets