Proteomics

Dataset Information


A Bothrops jararaca venom subproteome to benchmark a spectral clustering algorithm by controlling the selection bias


ABSTRACT: The clustering of mass spectra is a critical component of many proteomics applications. Cluster validation is just as important, having evolved side by side with the clustering algorithms themselves. In this work, we build on Rieder et al.'s cluster validation framework and discuss the problem of selection bias in cluster validation measures; we introduce an assessment measure that is biased toward the number of peptide ion species; we introduce a cluster assessment framework for proteomics and demonstrate its importance by evaluating the performance of 8 clustering algorithms on 7 proteomics datasets; and we discuss the tradeoff between assessment measures. Finally, the validation methods presented here can be of broad applicability beyond the clustering of mass spectra. This PRIDE entry describes in detail the sample preparation, LC-MS/MS analysis, and protein identification for one of the proteomics datasets used in this work (> 10 kDa Bothrops jararaca snake venom proteome).
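To make the idea of a cluster assessment measure concrete, the following is a minimal sketch of one generic pair-counting measure, the Rand index, which scores how often a predicted clustering and a reference labeling agree on whether two spectra belong together. It is not the measure introduced in this work, and the function name `rand_index`, the peptide labels, and the cluster ids are hypothetical toy values chosen only for illustration.

```python
from itertools import combinations


def rand_index(predicted, reference):
    """Fraction of spectrum pairs on which two partitions agree.

    A pair counts as an agreement when both partitions place the two
    spectra in the same cluster, or both place them in different clusters.
    """
    if len(predicted) != len(reference):
        raise ValueError("both partitions must label the same spectra")
    pairs = list(combinations(range(len(predicted)), 2))
    if not pairs:
        # Fewer than two spectra: there is nothing to disagree on.
        return 1.0
    agreements = 0
    for i, j in pairs:
        same_in_predicted = predicted[i] == predicted[j]
        same_in_reference = reference[i] == reference[j]
        if same_in_predicted == same_in_reference:
            agreements += 1
    return agreements / len(pairs)


# Hypothetical toy data: six spectra, labeled by the peptide they came from
# (reference) and by the cluster id an algorithm assigned (predicted).
reference = ["pepA", "pepA", "pepA", "pepB", "pepB", "pepC"]
predicted = [0, 0, 1, 1, 1, 2]

score = rand_index(predicted, reference)
print(f"Rand index: {score:.3f}")
```

A measure like this rewards both correct merges and correct splits, so it can respond differently to the number of clusters than other measures do; that sensitivity is the kind of behavior the abstract refers to when it discusses selection bias and the tradeoff between assessment measures.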

INSTRUMENT(S): Q Exactive

ORGANISM(S): Bothrops jararaca (jararaca)

SUBMITTER: Richard Hemmi Valente  

LAB HEAD: Richard Hemmi Valente

PROVIDER: PXD022124 | Pride | 2022-03-16

REPOSITORIES: Pride


Publications

Leveraging the partition selection bias to achieve a high-quality clustering of mass spectra.

Silva ARF, Lima DB, Kurt LU, Dupré M, Chamot-Rooke J, Santos MDM, Nicolau CA, Valente RH, Barbosa VC, Carvalho PC

Journal of Proteomics, 2021-06-02


In proteomics, the identification of peptides from mass spectral data can be mathematically described as the partitioning of mass spectra into clusters (i.e., groups of spectra derived from the same peptide). The way partitions are validated is just as important, having evolved side by side with the clustering algorithms themselves and given rise to many partition assessment measures. An assessment measure is said to have a selection bias if, and only if, the probability that a randomly chosen p ...

Similar Datasets

2017-05-16 | PXD005523 | Pride
2016-09-28 | PXD004186 | Pride
2010-04-12 | E-GEOD-17580 | biostudies-arrayexpress
2012-08-21 | E-GEOD-40231 | biostudies-arrayexpress
2014-04-07 | E-GEOD-56396 | biostudies-arrayexpress
2021-01-13 | PXD021561 | Pride
2012-08-21 | GSE40231 | GEO
2021-05-15 | PXD022696 | Pride
2018-11-13 | GSE118587 | GEO
2016-04-06 | E-GEOD-79978 | biostudies-arrayexpress