Considerations for peptide and protein error rate control in large-scale targeted DIA analyses
ABSTRACT: Liquid chromatography coupled to tandem mass spectrometry has become the main method for high-throughput identification and quantification of peptides and the inferred proteins. Discovery proteomics commonly employs data-dependent acquisition in combination with spectrum-centric analysis. The accumulation of data generated from thousands of samples by this method has approached saturation coverage of different proteomes. Recently, as a result of technological advances, methods based on data acquisition strategies compatible with peptide-centric scoring have reached similar proteome coverage in individual runs, as well as similar scalability. This is exemplified by SWATH-MS, which combines data-independent acquisition (DIA) with targeted data extraction of groups of transitions uniquely detecting a peptide. As the data matrices generated by these experiments continue to grow with respect to both the number of peptides identified per sample and the number of samples analyzed per study, challenges for error rate control have emerged. Here, we discuss the adaptation of statistical concepts developed for discovery proteomics based on spectrum-centric scoring to large-scale DIA experiments analyzed with peptide-centric scoring strategies, and provide some guidance on their application. We propose that, in order to increase the quality and reproducibility of published proteomic results, well-established confidence criteria should be reported at each level as we progress from spectral evidence to identified or detected peptides and inferred proteins. These confidence criteria should be applied equally to proteomic analyses based on spectrum-centric and peptide-centric scoring strategies.
INSTRUMENT(S): TripleTOF 5600
ORGANISM(S): Homo sapiens (human)
TISSUE(S): Permanent Cell Line Cell, Cell Culture
SUBMITTER: Isabell Bludau
LAB HEAD: Ruedi Aebersold
PROVIDER: PXD004884 | Pride | 2017-08-03
REPOSITORIES: pride