Dataset Information

Predicting co-complexed protein pairs using genomic and proteomic data integration.


ABSTRACT:

Background

Identifying all protein-protein interactions in an organism is a major objective of proteomics. A related goal is to know which protein pairs are present in the same protein complex. High-throughput methods such as yeast two-hybrid (Y2H) and affinity purification coupled with mass spectrometry (APMS) have been used to detect interacting proteins on a genomic scale. However, both Y2H and APMS methods have substantial false-positive rates. Aside from high-throughput interaction screens, other gene- or protein-pair characteristics may also be informative about physical interaction. It is therefore desirable to integrate multiple datasets and exploit their differing predictive value to predict co-complexed relationships more accurately.

Results

Using a supervised machine learning approach, the probabilistic decision tree, we integrated high-throughput protein interaction datasets with other gene- and protein-pair characteristics to predict co-complexed pairs (CCPs) of proteins. Our predictions proved more sensitive and specific than predictions based on Y2H or APMS methods alone or in combination. Among the top predictions not annotated as CCPs in our reference set (obtained from the MIPS complex catalogue), a significant fraction were found to physically interact according to a separate database (YPD, the Yeast Proteome Database); the remaining predictions may represent previously unknown CCPs.
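
As a rough illustration of the kind of integration described above, the following minimal sketch grows a small decision tree over binary evidence features and stores, at each leaf, the fraction of reference co-complexed pairs as a probability. It is not the authors' implementation: the feature names (`y2h`, `apms`, `coexpressed`), the toy training pairs, the Gini splitting criterion, and the `min_leaf` setting are all assumptions chosen for illustration.

```python
from collections import namedtuple

# Hypothetical toy record: binary evidence for one protein pair plus a
# reference label (1 = co-complexed in the reference set, 0 = not).
Pair = namedtuple("Pair", ["y2h", "apms", "coexpressed", "label"])


def gini(pairs):
    """Gini impurity of the labels in a group of pairs."""
    if not pairs:
        return 0.0
    p = sum(x.label for x in pairs) / len(pairs)
    return 2 * p * (1 - p)


def build_tree(pairs, features, min_leaf=2):
    """Split recursively on the feature with the lowest weighted impurity.

    Each leaf stores the fraction of positive pairs it contains, which
    serves as the predicted probability of being co-complexed.
    """
    prob = sum(x.label for x in pairs) / len(pairs)
    best = None
    for f in features:
        yes = [x for x in pairs if getattr(x, f)]
        no = [x for x in pairs if not getattr(x, f)]
        if len(yes) < min_leaf or len(no) < min_leaf:
            continue  # avoid leaves with too few pairs to estimate a rate
        score = (len(yes) * gini(yes) + len(no) * gini(no)) / len(pairs)
        if best is None or score < best[0]:
            best = (score, f, yes, no)
    if best is None or best[0] >= gini(pairs):
        return {"prob": prob}  # no split reduces impurity: make a leaf
    _, f, yes, no = best
    rest = [g for g in features if g != f]
    return {
        "feature": f,
        "yes": build_tree(yes, rest, min_leaf),
        "no": build_tree(no, rest, min_leaf),
    }


def predict_proba(tree, pair):
    """Walk from the root to a leaf and return the leaf probability."""
    while "feature" in tree:
        tree = tree["yes"] if getattr(pair, tree["feature"]) else tree["no"]
    return tree["prob"]


def sensitivity_specificity(tree, pairs, threshold=0.5):
    """Call a pair positive when its probability reaches the threshold."""
    tp = fp = tn = fn = 0
    for x in pairs:
        called = predict_proba(tree, x) >= threshold
        if called and x.label:
            tp += 1
        elif called:
            fp += 1
        elif x.label:
            fn += 1
        else:
            tn += 1
    return tp / (tp + fn), tn / (tn + fp)


# Invented training pairs, for illustration only.
train = [
    Pair(1, 1, 1, 1), Pair(1, 1, 1, 1), Pair(0, 1, 1, 1), Pair(0, 1, 1, 1),
    Pair(0, 1, 0, 0), Pair(0, 1, 0, 1),
    Pair(1, 0, 0, 0), Pair(1, 0, 1, 0),
    Pair(0, 0, 0, 0), Pair(0, 0, 0, 0), Pair(0, 0, 1, 0), Pair(0, 0, 0, 0),
]

tree = build_tree(train, ["y2h", "apms", "coexpressed"])
sens, spec = sensitivity_specificity(tree, train)
print(tree["feature"], predict_proba(tree, Pair(0, 1, 0, None)), sens, spec)
```

Because each leaf returns a probability and not a hard class, candidate pairs can be ranked by score, and the threshold can be varied to trade sensitivity against specificity. In practice these measures would be estimated on held-out pairs, not on the training set as in this toy example.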

Conclusions

We demonstrated that the probabilistic decision tree approach can be successfully used to predict co-complexed protein (CCP) pairs from other characteristics. Our top-scoring CCP predictions provide testable hypotheses for experimental validation.

SUBMITTER: Zhang LV 

PROVIDER: S-EPMC419405 | biostudies-literature | 2004 Apr

REPOSITORIES: biostudies-literature

Publications

Predicting co-complexed protein pairs using genomic and proteomic data integration.

Zhang LV, Wong SL, King OD, Roth FP

BMC Bioinformatics, 2004 Apr 16


Similar Datasets

| S-EPMC3145649 | biostudies-literature
| S-EPMC6889397 | biostudies-literature
| S-EPMC4833864 | biostudies-other
| S-EPMC4360886 | biostudies-literature
2008-10-15 | GSE13029 | GEO
| S-EPMC3359238 | biostudies-literature
| S-EPMC6507230 | biostudies-literature
| S-EPMC6834616 | biostudies-literature
| S-EPMC2719670 | biostudies-literature
| S-EPMC3364912 | biostudies-literature