Dataset Information

A Markov Chain Monte Carlo Method for Estimating the Statistical Significance of Proteoform Identifications by Top-Down Mass Spectrometry.


ABSTRACT: Top-down mass spectrometry is capable of identifying whole proteoform sequences with multiple post-translational modifications because it generates tandem mass spectra directly from intact proteoforms. Many software tools, such as ProSightPC, MSPathFinder, and TopMG, have been proposed for identifying proteoforms with modifications. In these tools, various methods are employed to estimate the statistical significance of identifications. However, most existing methods are designed for proteoform identifications without modifications, and accurately estimating the statistical significance of proteoform identifications with modifications remains a challenge. Here we propose TopMCMC, a method that combines a Markov chain random walk algorithm and a greedy algorithm for assigning statistical significance to matches between spectra and protein sequences with variable modifications. Experimental results showed that TopMCMC achieved high accuracy in estimating E-values and false discovery rates of identifications in top-down mass spectrometry. Coupled with TopMG, TopMCMC identified more spectra than the generating function method from an MCF-7 top-down mass spectrometry data set.

SUBMITTER: Kou Q 

PROVIDER: S-EPMC6484843 | biostudies-literature | 2019 Mar

REPOSITORIES: biostudies-literature

Publications

A Markov Chain Monte Carlo Method for Estimating the Statistical Significance of Proteoform Identifications by Top-Down Mass Spectrometry.

Qiang Kou, Zhe Wang, Rachele A. Lubeckyj, Si Wu, Liangliang Sun, Xiaowen Liu

Journal of Proteome Research, 2019-01-28, issue 3


Similar Datasets

S-EPMC5807004 | biostudies-literature
S-EPMC7490796 | biostudies-literature
S-EPMC7224357 | biostudies-literature
S-EPMC5354282 | biostudies-literature
S-EPMC5075424 | biostudies-literature
S-EPMC4578810 | biostudies-literature
S-EPMC8272262 | biostudies-literature
S-EPMC3464018 | biostudies-literature
S-EPMC6894579 | biostudies-literature
S-EPMC3845170 | biostudies-other