Dataset Information

Metaprotein expression modeling for label-free quantitative proteomics.

ABSTRACT:

Background

Label-free quantitative proteomics holds a great deal of promise for the future study of both medicine and biology. However, the data generated is extremely intricate in its correlation structure, and its proper analysis is complex. There are issues with missing identifications. There are high levels of correlation between many, but not all, of the peptides derived from the same protein. Additionally, there may be systematic shifts in the sensitivity of the machine between experiments or even through time within the duration of a single experiment.

Results

We describe a hierarchical model for analyzing unbiased, label-free proteomics data which utilizes the covariance of peptide expression across samples as well as MS/MS-based identifications to group peptides, a strategy we call metaprotein expression modeling. Our metaprotein model acknowledges the possibility of misidentifications, post-translational modifications and systematic differences between samples due to changes in instrument sensitivity or differences in total protein concentration. In addition, our approach allows us to validate findings from unbiased, label-free proteomics experiments with further unbiased, label-free proteomics experiments. Finally, we demonstrate the clinical/translational utility of the model for building predictors capable of differentiating biological phenotypes as well as for validating those findings in the context of three novel cohorts of patients with Hepatitis C.
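
To make the grouping idea concrete, the following is a minimal sketch of combining MS/MS identifications with across-sample covariance to group peptides. It is not the hierarchical model described in the paper: it swaps in a simple greedy grouping on Pearson correlation. The function names, the toy intensities, and the 0.8 cutoff are all hypothetical and chosen only for illustration.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # a constant profile carries no covariance information
    return sxy / sqrt(sxx * syy)


def group_peptides(intensities, protein_of, threshold=0.8):
    """Greedily split each protein's peptides into co-varying groups.

    intensities: peptide -> log-intensities across samples (toy data here)
    protein_of:  peptide -> protein assigned by MS/MS identification
    threshold:   hypothetical correlation cutoff, not taken from the paper

    A peptide joins the first existing group of the same protein whose
    seed (first) peptide it correlates with at or above the threshold;
    otherwise it starts a new group.
    """
    groups = []  # list of (protein, [peptides]) in creation order
    for pep in sorted(intensities):
        prot = protein_of[pep]
        for g_prot, members in groups:
            seed = intensities[members[0]]
            if g_prot == prot and pearson(intensities[pep], seed) >= threshold:
                members.append(pep)
                break
        else:
            groups.append((prot, [pep]))
    return groups


# Toy example: pepC is identified as protein P1 but does not co-vary with
# pepA and pepB (as might happen with a misidentification or a modification).
intensities = {
    "pepA": [1.0, 2.0, 3.0, 4.0],
    "pepB": [2.0, 4.0, 6.0, 8.5],
    "pepC": [4.0, 1.0, 3.0, 2.0],
    "pepD": [1.0, 2.0, 3.0, 4.0],
}
protein_of = {"pepA": "P1", "pepB": "P1", "pepC": "P1", "pepD": "P2"}
groups = group_peptides(intensities, protein_of)
```

In this toy input, pepA and pepB end up in one group, pepC forms its own group within P1, and pepD stays separate because it is identified as a different protein even though its profile matches pepA. A full model would instead estimate group membership jointly with sample-level effects such as shifts in instrument sensitivity.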

Conclusions

Mass-spectrometry proteomics is quickly becoming a powerful tool for studying biological and translational questions. Making use of all of the information contained in a particular set of data will be critical to the success of those endeavors. Our proposed model represents an advance in the ability of statistical models of proteomic data to identify and utilize correlation between features. This allows validation of predictors without translation to targeted assays in addition to informing the choice of targets when it is appropriate to generate those assays.

SUBMITTER: Lucas JE

PROVIDER: S-EPMC3436780 | biostudies-literature | 2012 May

REPOSITORIES: biostudies-literature

Publications

Metaprotein expression modeling for label-free quantitative proteomics.

Lucas JE, Thompson JW, Dubois LG, McCarthy J, Tillmann H, Thompson A, Shire N, Hendrickson R, Dieguez F, Goldman P, Schwarz K, Patel K, McHutchison J, Moseley MA

BMC Bioinformatics, 2012-05-04

PMID: 22559859

Similar Datasets

Project description: Background: Although a great deal of rice proteomic research has been conducted, there are relatively few studies specifically addressing the rice grain proteome. The existing rice grain proteomic researches have focused on the identification of differentially expressed proteins or monitoring protein expression patterns during grain filling stages. Results: Proteins were extracted from rice grains 10, 20, and 30 days after flowering, as well as from fully mature grains. By merging all of the identified proteins in this study, we identified 4,172 non-redundant proteins with a wide range of molecular weights (from 5.2 kDa to 611 kDa) and pI values (from pH 2.9 to pH 12.6). A Genome Ontology category enrichment analysis for the 4,172 proteins revealed that 52 categories were enriched, including the carbohydrate metabolic process, transport, localization, lipid metabolic process, and secondary metabolic process. The relative abundances of the 1,784 reproducibly identified proteins were compared to detect 484 differentially expressed proteins during rice grain development. Clustering analysis and Genome Ontology category enrichment analysis revealed that proteins involved in the metabolic process were enriched through all stages of development, suggesting that proteome changes occurred even in the desiccation phase. Interestingly, enrichments of proteins involved in protein folding were detected in the desiccation phase and in fully mature grain. Conclusion: This is the first report conducting comprehensive identification of rice grain proteins. With a label free shotgun proteomic approach, we identified large number of rice grain proteins and compared the expression patterns of reproducibly identified proteins during rice grain development. Clustering analysis, Genome Ontology category enrichment analysis, and the analysis of composite expression profiles revealed dynamic changes of metabolisms during rice grain development. Interestingly, we detected that proteins involved in glycolysis, TCA-cycle, lipid metabolism, and proteolysis accumulated at higher levels in fully mature grain compared to grain developing stages, suggesting that the accumulation of these proteins during the desiccation stage may be associated with the preparation of proteins required in germination.
