Dataset Information

Integrated Identification and Quantification Error Probabilities for Shotgun Proteomics.


ABSTRACT: Protein quantification by label-free shotgun proteomics experiments is plagued by a multitude of error sources. Typical pipelines for identifying differential proteins use intermediate filters to control the error rate. However, they often ignore certain error sources and, moreover, regard filtered lists as completely correct in subsequent steps. These two indiscretions can easily lead to a loss of control of the false discovery rate (FDR). We propose a probabilistic graphical model, Triqler, that propagates error information through all steps, employing distributions in favor of point estimates, most notably for missing value imputation. The model outputs posterior probabilities for fold changes between treatment groups, highlighting uncertainty rather than hiding it. We analyzed 3 engineered data sets and achieved FDR control and high sensitivity, even for truly absent proteins. In a bladder cancer clinical data set we discovered 35 proteins at 5% FDR, whereas the original study discovered 1 and MaxQuant/Perseus 4 proteins at this threshold. Compellingly, these 35 proteins showed enrichment for functional annotation terms, whereas the top ranked proteins reported by MaxQuant/Perseus showed no enrichment. The model executes in minutes and is freely available at https://pypi.org/project/triqler/.
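The link between posterior probabilities and an FDR threshold, as used in the abstract, can be illustrated with a minimal sketch. It assumes each protein has a posterior error probability (PEP), that is, the probability that calling it differential is wrong, and it relies on the general relationship that the expected FDR of a ranked list is the mean PEP of its entries. This is not Triqler's code or API; the function names `fdr_from_peps` and `count_discoveries` are hypothetical and exist only for this illustration.

```python
def fdr_from_peps(peps):
    """Return, for each item, the estimated FDR of the list that ends at that item.

    Items are ranked by ascending posterior error probability (PEP); the FDR
    estimate at each rank is the running mean of the PEPs up to that rank.
    Hypothetical helper for illustration, not part of Triqler.
    """
    order = sorted(range(len(peps)), key=lambda i: peps[i])
    fdr = [0.0] * len(peps)
    running_sum = 0.0
    for rank, idx in enumerate(order, start=1):
        running_sum += peps[idx]
        fdr[idx] = running_sum / rank  # mean PEP of the top `rank` items
    return fdr


def count_discoveries(peps, threshold=0.05):
    """Count items whose list-level FDR estimate is at or below `threshold`."""
    return sum(1 for q in fdr_from_peps(peps) if q <= threshold)


if __name__ == "__main__":
    # Made-up PEPs for four proteins, purely to exercise the functions.
    example_peps = [0.01, 0.5, 0.02, 0.09]
    print(fdr_from_peps(example_peps))
    print(count_discoveries(example_peps, threshold=0.05))
```

Because the PEPs are sorted before averaging, the running mean never decreases, so thresholding it yields a contiguous top-ranked list.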

SUBMITTER: The M 

PROVIDER: S-EPMC6398204 | biostudies-literature | 2019 Mar

REPOSITORIES: biostudies-literature

Publications

Integrated Identification and Quantification Error Probabilities for Shotgun Proteomics.

The Matthew M, Käll Lukas L

Molecular & cellular proteomics : MCP 20181127 3

Similar Datasets

S-EPMC4261935 | biostudies-other
S-EPMC5096980 | biostudies-literature
S-EPMC2352161 | biostudies-other
S-EPMC5722229 | biostudies-literature
S-EPMC2736651 | biostudies-literature
S-EPMC2711935 | biostudies-literature
S-EPMC3515694 | biostudies-literature
S-EPMC7319958 | biostudies-literature
S-EPMC2710313 | biostudies-other
S-EPMC5767123 | biostudies-literature