Dataset Information

Bias tradeoffs in the creation and analysis of protein-protein interaction networks.

ABSTRACT: Networks constructed from aggregated protein-protein interaction data are commonplace in biology. But the studies these data are derived from were conducted with their own hypotheses and foci. Focusing on data from budding yeast present in BioGRID, we determine that many of the downstream signals present in network data are significantly impacted by biases in the original data. We determine the degree to which selection bias in favor of biologically interesting bait proteins goes down with study size, while we also find that promiscuity in prey contributes more substantially in larger studies. We analyze interaction studies over time with respect to data in the Gene Ontology and find that reproducibly observed interactions are less likely to favor multifunctional proteins. We find that strong alignment between co-expression and protein-protein interaction data occurs only for extreme co-expression values, and use these data to suggest candidates for targets likely to reveal novel biology in follow-up studies.
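
The two quantities the abstract relates to bias, study size and prey promiscuity, can be illustrated with a minimal sketch. This is not the authors' method: the record layout (study, bait, prey), the function names, and the toy values are all assumptions made for illustration, and no real BioGRID data is used.

```python
from collections import defaultdict

# Hypothetical toy records: (study_id, bait, prey). Not real BioGRID data.
records = [
    ("study_A", "P1", "P2"),
    ("study_A", "P1", "P3"),
    ("study_B", "P4", "P3"),
    ("study_B", "P5", "P3"),
    ("study_B", "P5", "P6"),
]


def study_sizes(rows):
    """Count the interactions reported by each study."""
    sizes = defaultdict(int)
    for study, _bait, _prey in rows:
        sizes[study] += 1
    return dict(sizes)


def prey_promiscuity(rows):
    """Count the distinct baits each prey was recovered with (an assumed proxy)."""
    baits_by_prey = defaultdict(set)
    for _study, bait, prey in rows:
        baits_by_prey[prey].add(bait)
    return {prey: len(baits) for prey, baits in baits_by_prey.items()}


sizes = study_sizes(records)
promiscuity = prey_promiscuity(records)
```

Counting distinct baits per prey is only one possible proxy for promiscuity; the article may define it differently.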

Biological significance

Protein-protein interaction data find particularly heavy use in the interpretation of disease-causal variants. In principle, network data allow researchers to find novel commonalities among candidate genes. In this study, we detail several of the most salient biases contributing to aggregated protein-protein interaction databases. We find strong evidence for the role of selection and laboratory biases. Many of these effects contribute to the commonalities researchers find for disease genes. In order for characterization of disease genes and their interactions to not simply be an artifact of researcher preference, it is imperative to identify data biases explicitly. Based on this, we also suggest ways to move forward in producing candidates less influenced by prior knowledge. This article is part of a Special Issue entitled: Can Proteomics Fill the Gap Between Genomics and Phenotypes?

SUBMITTER: Gillis J

PROVIDER: S-EPMC3972268 | biostudies-literature | 2014 Apr

REPOSITORIES: biostudies-literature

Publications

Bias tradeoffs in the creation and analysis of protein-protein interaction networks.

Jesse Gillis, Sara Ballouz, Paul Pavlidis

Journal of Proteomics, 2014-01-27

PMID: 24480284

Similar Datasets

Removing bias against membrane proteins in interaction networks. | S-EPMC3213014 | biostudies-literature

Preeclampsia: a bioinformatics approach through protein-protein interaction networks analysis. | S-EPMC3483240 | biostudies-literature

Signing protein-protein interaction networks. | S-EPMC12781092 | biostudies-literature

Comparative analysis of protein-protein interaction networks in metastatic breast cancer. | S-EPMC8769308 | biostudies-literature

Computational analysis of protein interaction networks for infectious diseases. | S-EPMC7110031 | biostudies-literature

SEC-TMT facilitates quantitative differential analysis of protein interaction networks. | S-EPMC9882152 | biostudies-literature

Graph theory and stability analysis of protein complex interaction networks. | S-EPMC8687277 | biostudies-literature

A Systems Chemoproteomic Analysis of Acyl-CoA/Protein Interaction Networks. | S-EPMC8237707 | biostudies-literature

Controllability in protein interaction networks. | S-EPMC4024882 | biostudies-literature

Modulating protein-protein interaction networks in protein homeostasis. | S-EPMC6609442 | biostudies-literature