Dataset Information

A non-parametric cutout index for robust evaluation of identified proteins.


ABSTRACT: This paper proposes a novel, automated method for evaluating sets of proteins identified using mass spectrometry. The remaining peptide-spectrum match score distributions of protein sets are compared to an empirical absent peptide-spectrum match score distribution, and a Bayesian non-parametric method reminiscent of the Dirichlet process is presented to accurately perform this comparison. Thus, for a given protein set, the process computes the likelihood that the proteins identified are correctly identified. First, the method is used to evaluate protein sets chosen using different protein-level false discovery rate (FDR) thresholds, assigning each protein set a likelihood. The protein set assigned the highest likelihood is used to choose a non-arbitrary protein-level FDR threshold. Because the method can be used to evaluate any protein identification strategy (and is not limited to mere comparisons of different FDR thresholds), we subsequently use the method to compare and evaluate multiple simple methods for merging peptide evidence over replicate experiments. The general statistical approach can be applied to other types of data (e.g. RNA sequencing) and generalizes to multivariate problems.
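
To make the comparison described in the abstract concrete, the following is a minimal sketch, not the authors' implementation. It assumes scores are binned into a histogram and swaps in a plain Dirichlet-multinomial marginal likelihood, with the prior centred on the empirical absent-score histogram, in place of the paper's Dirichlet-process-like method. The function names, bin edges, `concentration`, and `pseudocount` are all hypothetical choices made for illustration.

```python
import math


def histogram(scores, edges):
    """Count scores per bin; bins are [edges[i], edges[i+1]), last bin closed."""
    counts = [0] * (len(edges) - 1)
    for s in scores:
        for i in range(len(counts)):
            is_last = i == len(counts) - 1
            if edges[i] <= s < edges[i + 1] or (is_last and s == edges[-1]):
                counts[i] += 1
                break
    return counts


def dirichlet_multinomial_loglik(remaining, absent, edges,
                                 concentration=10.0, pseudocount=1.0):
    """Log marginal likelihood of the remaining scores (as an ordered sample)
    under a Dirichlet prior centred on the absent-score histogram.

    Assumption: this is an illustrative stand-in, not the published method.
    """
    absent_counts = histogram(absent, edges)
    # Smooth the absent histogram so that no bin has zero prior mass.
    total = sum(absent_counts) + pseudocount * len(absent_counts)
    alpha = [concentration * (c + pseudocount) / total for c in absent_counts]

    n = histogram(remaining, edges)
    loglik = math.lgamma(sum(alpha)) - math.lgamma(sum(alpha) + sum(n))
    for n_k, a_k in zip(n, alpha):
        loglik += math.lgamma(n_k + a_k) - math.lgamma(a_k)
    return loglik


# Hypothetical toy scores in [0, 1]; real PSM scores depend on the search engine.
edges = [0.0, 0.25, 0.5, 0.75, 1.0]
absent = [0.1, 0.2, 0.15, 0.3, 0.25, 0.4, 0.35, 0.2, 0.1, 0.5]
resembles_absent = [0.1, 0.2, 0.3, 0.4]
shifted_high = [0.8, 0.9, 0.85, 0.95]

ll_similar = dirichlet_multinomial_loglik(resembles_absent, absent, edges)
ll_shifted = dirichlet_multinomial_loglik(shifted_high, absent, edges)
print(ll_similar > ll_shifted)
```

In this toy setup, remaining scores that resemble the absent distribution receive a higher likelihood than an equally sized sample shifted toward high scores, which is the intuition behind preferring the protein set whose leftover matches look like matches to absent peptides. Comparing sets that leave different numbers of remaining matches needs more care than this sketch provides.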

SUBMITTER: Serang O 

PROVIDER: S-EPMC3591671 | biostudies-literature | 2013 Mar

REPOSITORIES: biostudies-literature

Publications

A non-parametric cutout index for robust evaluation of identified proteins.

Serang O, Paulo J, Steen H, Steen JA

Molecular & cellular proteomics : MCP, 2013-01-04, issue 3



Similar Datasets

S-EPMC9068750 | biostudies-literature
S-EPMC8414689 | biostudies-literature
S-EPMC5963472 | biostudies-literature
S-EPMC8218388 | biostudies-literature
S-EPMC5865707 | biostudies-literature
S-EPMC6346534 | biostudies-literature
S-EPMC8117519 | biostudies-literature
S-EPMC8617258 | biostudies-literature
S-EPMC7116477 | biostudies-literature
S-EPMC8294536 | biostudies-literature