Dataset Information

Comparison of false-discovery rates of various decoy databases.


ABSTRACT:

Background

The target-decoy strategy effectively estimates the false-discovery rate (FDR) by creating a decoy database with a size identical to that of the target database. Decoy databases are created by various methods, such as the reverse, pseudo-reverse, shuffle, pseudo-shuffle, and de Bruijn methods. The FDR is sometimes over- or under-estimated depending on which decoy database is used, because the ratios of redundant peptides in the target databases differ; that is, the numbers of unique (non-redundant) peptides in the target and decoy databases differ.
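The decoy construction methods named above can be illustrated with a minimal Python sketch. It is not the authors' code: the function names, the simplified tryptic digestion rule (cleave after every K or R, ignoring the proline exception and missed cleavages), and the toy sequences are all assumptions made for illustration, and the de Bruijn method is omitted. The sketch also counts unique peptides, the quantity the abstract says can differ between target and decoy databases.

```python
import random
import re


def digest(protein):
    """Simplified tryptic digest (assumption): cleave after every K or R."""
    return re.findall(r"[^KR]*[KR]|[^KR]+$", protein)


def reverse(protein):
    """Reverse decoy: reverse the whole protein sequence."""
    return protein[::-1]


def pseudo_reverse(protein):
    """Pseudo-reverse decoy: reverse each peptide, keeping its last residue in place."""
    return "".join(p[:-1][::-1] + p[-1] for p in digest(protein))


def shuffle(protein, rng):
    """Shuffle decoy: randomly permute the whole protein sequence."""
    residues = list(protein)
    rng.shuffle(residues)
    return "".join(residues)


def pseudo_shuffle(protein, rng):
    """Pseudo-shuffle decoy: permute each peptide, keeping its last residue in place."""
    parts = []
    for p in digest(protein):
        body = list(p[:-1])
        rng.shuffle(body)
        parts.append("".join(body) + p[-1])
    return "".join(parts)


def unique_peptides(proteins):
    """Set of distinct peptides produced by digesting every protein."""
    return {pep for protein in proteins for pep in digest(protein)}


# Toy target database (hypothetical sequences); the peptide MAGK occurs twice.
target = ["MAGKLSDR", "MAGKTTWR"]
rng = random.Random(0)  # fixed seed so the shuffled decoys are reproducible

decoys = {
    "reverse": [reverse(p) for p in target],
    "pseudo-reverse": [pseudo_reverse(p) for p in target],
    "shuffle": [shuffle(p, rng) for p in target],
    "pseudo-shuffle": [pseudo_shuffle(p, rng) for p in target],
}

print("target unique peptides:", len(unique_peptides(target)))
for name, db in decoys.items():
    print(name, "unique peptides:", len(unique_peptides(db)))
```

In this sketch, pseudo-reversal is deterministic, so a peptide repeated in the target maps to the same decoy peptide each time and the redundancy carries over. Shuffling is random, so repeated target peptides can yield different decoy peptides, which is one plausible way the unique-peptide counts of the two databases drift apart.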

Results

We used two protein databases (the UniProt Saccharomyces cerevisiae protein database and the UniProt human protein database) to compare the FDRs of various decoy databases. When the ratio of redundant peptides in the target database is low, the FDR is not overestimated by any decoy construction method. However, when the ratio of redundant peptides in the target database is high, the FDR is overestimated if the (pseudo) shuffle decoy database is used. Additionally, the human and S. cerevisiae six-frame translation databases, which are large databases, showed outcomes similar to those from the UniProt human protein database.

Conclusion

When (pseudo) shuffle decoy databases are used, the FDR must be estimated using the correction factor proposed by Elias and Gygi or the one proposed by Kim et al.
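The abstract does not give the form of either correction factor, so the following Python sketch only shows where such a factor would enter a target-decoy FDR estimate. The function name `estimate_fdr`, the ratio-based factor (unique target peptides divided by unique decoy peptides), and all numbers are hypothetical assumptions for illustration; they should not be read as the formula of Elias and Gygi or of Kim et al.

```python
def estimate_fdr(n_target_hits, n_decoy_hits, correction=1.0):
    """Target-decoy FDR estimate with an optional multiplicative correction.

    correction=1.0 gives the plain ratio of decoy hits to target hits.
    The meaning of `correction` here is an assumption, not a published formula.
    """
    if n_target_hits == 0:
        return 0.0
    return min(1.0, correction * n_decoy_hits / n_target_hits)


# Hypothetical counts of hits above a score threshold.
target_hits, decoy_hits = 1000, 10

# Hypothetical database sizes: the decoy database has more unique peptides.
unique_target, unique_decoy = 80_000, 100_000
factor = unique_target / unique_decoy  # assumed size-ratio correction

print("uncorrected FDR:", estimate_fdr(target_hits, decoy_hits))
print("corrected FDR:  ", estimate_fdr(target_hits, decoy_hits, factor))
```

Under this assumption, a decoy database with more unique peptides than the target produces a factor below 1, which scales down an FDR that would otherwise be overestimated.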

SUBMITTER: Lee S 

PROVIDER: S-EPMC8449453 | biostudies-literature |

REPOSITORIES: biostudies-literature

Similar Datasets

| S-EPMC6708216 | biostudies-literature
| S-EPMC5379932 | biostudies-literature
| S-EPMC6252074 | biostudies-literature
| S-EPMC3220955 | biostudies-literature
| S-EPMC3820438 | biostudies-literature
| S-EPMC6919216 | biostudies-literature
| S-EPMC4711769 | biostudies-literature
| S-EPMC7773488 | biostudies-literature
| S-EPMC1334678 | biostudies-literature