
Dataset Information

Protein and gene model inference based on statistical modeling in k-partite graphs.


ABSTRACT: One of the major goals of proteomics is the comprehensive and accurate description of a proteome. Shotgun proteomics, the method of choice for the analysis of complex protein mixtures, requires that experimentally observed peptides are mapped back to the proteins they were derived from. This process is also known as protein inference. We present Markovian Inference of Proteins and Gene Models (MIPGEM), a statistical model based on clearly stated assumptions to address the problem of protein and gene model inference for shotgun proteomics data. In particular, we are dealing with dependencies among peptides and proteins using a Markovian assumption on k-partite graphs. We are also addressing the problems of shared peptides and ambiguous proteins by scoring the encoding gene models. Empirical results on two control datasets with synthetic mixtures of proteins and on complex protein samples of Saccharomyces cerevisiae, Drosophila melanogaster, and Arabidopsis thaliana suggest that the results with MIPGEM are competitive with existing tools for protein inference.

SUBMITTER: Gerster S 

PROVIDER: S-EPMC2901486 | biostudies-literature | 2010 Jul

REPOSITORIES: biostudies-literature


Publications

Protein and gene model inference based on statistical modeling in k-partite graphs.

Sarah Gerster, Ermir Qeli, Christian H. Ahrens, Peter Bühlmann

Proceedings of the National Academy of Sciences of the United States of America, 2010-06-18, issue 27



Similar Datasets

| S-EPMC6619254 | biostudies-literature
| S-EPMC3564037 | biostudies-other
| S-EPMC6922442 | biostudies-literature
| S-EPMC3247861 | biostudies-other
| S-EPMC5642352 | biostudies-literature
| S-EPMC6659630 | biostudies-literature
| S-EPMC8936483 | biostudies-literature
2023-12-18 | GSE250148 | GEO
| S-EPMC3152790 | biostudies-literature
| S-EPMC3307026 | biostudies-literature