Dataset Information

Topsy-Turvy: integrating a global view into sequence-based PPI prediction.


ABSTRACT:

Summary

Computational methods to predict protein-protein interaction (PPI) typically segregate into sequence-based 'bottom-up' methods that infer properties from the characteristics of the individual protein sequences, or global 'top-down' methods that infer properties from the pattern of already known PPIs in the species of interest. However, a way to incorporate top-down insights into sequence-based bottom-up PPI prediction methods has been elusive. We thus introduce Topsy-Turvy, a method that newly synthesizes both views in a sequence-based, multi-scale, deep-learning model for PPI prediction. While Topsy-Turvy makes predictions using only sequence data, during the training phase it takes a transfer-learning approach by incorporating patterns from both global and molecular-level views of protein interaction. In a cross-species context, we show it achieves state-of-the-art performance, offering the ability to perform genome-scale, interpretable PPI prediction for non-model organisms with no existing experimental PPI data. In species with available experimental PPI data, we further present a Topsy-Turvy hybrid (TT-Hybrid) model which integrates Topsy-Turvy with a purely network-based model for link prediction that provides information about species-specific network rewiring. TT-Hybrid makes accurate predictions for both well- and sparsely-characterized proteins, outperforming both its constituent components and other state-of-the-art PPI prediction methods. Furthermore, running Topsy-Turvy and TT-Hybrid screens is feasible for whole genomes, and thus these methods scale to settings where other methods (e.g. AlphaFold-Multimer) might be infeasible. The generalizability, accuracy and genome-level scalability of Topsy-Turvy and TT-Hybrid unlock a more comprehensive map of protein interaction and organization in both model and non-model organisms.

Availability and implementation

https://topsyturvy.csail.mit.edu.

Supplementary information

Supplementary data are available at Bioinformatics online.

SUBMITTER: Singh R 

PROVIDER: S-EPMC9235477 | biostudies-literature |

REPOSITORIES: biostudies-literature

Similar Datasets

| S-EPMC4243118 | biostudies-other
| S-EPMC6486542 | biostudies-literature
| S-EPMC7490824 | biostudies-literature
| S-EPMC8712557 | biostudies-literature
| S-EPMC6752863 | biostudies-literature
| S-EPMC5892885 | biostudies-literature
| S-EPMC6339728 | biostudies-literature
| S-EPMC9309777 | biostudies-literature
| S-EPMC8490152 | biostudies-literature
| S-EPMC9897180 | biostudies-literature