Dataset Information

A two-step database search method improves sensitivity in peptide sequence matches for metaproteomics and proteogenomics studies.


ABSTRACT: Large databases (>10^6 sequences) used in metaproteomic and proteogenomic studies present challenges in matching peptide sequences to MS/MS data using database-search programs. Most notably, strict filtering to avoid false-positive matches leads to more false negatives, thus constraining the number of peptide matches. To address this challenge, we developed a two-step method wherein matches derived from a primary search against a large database were used to create a smaller subset database. The second search was performed against a target-decoy version of this subset database merged with a host database. High-confidence peptide sequence matches were then used to infer protein identities. Applying our two-step method for both metaproteomic and proteogenomic analysis resulted in twice the number of high-confidence peptide sequence matches in each case, as compared to the conventional one-step method. The two-step method captured almost all of the same peptides matched by the one-step method, with a majority of the additional matches being false negatives from the one-step method. Furthermore, the two-step method improved results regardless of the database search program used. Our results show that our two-step method maximizes the peptide matching sensitivity for applications requiring large databases, which is especially valuable for proteogenomics and metaproteomics studies.
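
The database-construction part of the two-step workflow described in the abstract can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the function names, the in-memory dict representation (accession to sequence), the "REV_" decoy prefix, and the use of reversed sequences as decoys are all assumptions made for the example. The abstract also does not state whether decoys are generated before or after merging with the host database; the sketch assumes they are generated from the merged set. The searches themselves would be run by an external database-search program and are not shown.

```python
from typing import Dict, Iterable


def build_subset_database(
    large_db: Dict[str, str], matched_accessions: Iterable[str]
) -> Dict[str, str]:
    """Keep only the proteins that received matches in the first-step search."""
    matched = set(matched_accessions)
    return {acc: seq for acc, seq in large_db.items() if acc in matched}


def add_reversed_decoys(
    db: Dict[str, str], decoy_prefix: str = "REV_"
) -> Dict[str, str]:
    """Return targets plus one reversed-sequence decoy per target.

    Reversal and the prefix are assumptions; other decoy schemes exist.
    """
    combined = dict(db)
    for acc, seq in db.items():
        combined[decoy_prefix + acc] = seq[::-1]
    return combined


def build_second_step_database(
    large_db: Dict[str, str],
    first_pass_accessions: Iterable[str],
    host_db: Dict[str, str],
) -> Dict[str, str]:
    """Subset the large database, merge with the host database, add decoys."""
    subset = build_subset_database(large_db, first_pass_accessions)
    merged = {**host_db, **subset}  # assumed merge order: host, then subset
    return add_reversed_decoys(merged)


if __name__ == "__main__":
    # Toy data, hypothetical accessions and sequences.
    large_db = {"M1": "PEPTIDEK", "M2": "AAAK", "M3": "GGGR"}
    host_db = {"H1": "MKLV"}
    second_db = build_second_step_database(large_db, ["M1", "M3"], host_db)
    for accession in sorted(second_db):
        print(accession, second_db[accession])
```

The second-step database is much smaller than the original, which is what allows the same false-discovery threshold to admit more true matches in the second search.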

SUBMITTER: Jagtap P 

PROVIDER: S-EPMC3633484 | biostudies-literature | 2013 Apr

REPOSITORIES: biostudies-literature

Publications

A two-step database search method improves sensitivity in peptide sequence matches for metaproteomics and proteogenomics studies.

Pratik Jagtap, Jill Goslinga, Joel A. Kooren, Thomas McGowan, Matthew S. Wroblewski, Sean L. Seymour, Timothy J. Griffin

Proteomics, 2013-03-15, issue 8


Similar Datasets

| S-EPMC6231400 | biostudies-literature
| S-EPMC4986259 | biostudies-literature
| S-EPMC6192206 | biostudies-literature
| S-EPMC3198583 | biostudies-literature
2019-07-12 | PXD014582 | iProX
| S-EPMC7767584 | biostudies-literature
2018-02-09 | PXD007587 | Pride
| S-EPMC3676282 | biostudies-literature
| S-EPMC3087343 | biostudies-literature
| S-EPMC7275042 | biostudies-literature