
Dataset Information


AdaBoost based multi-instance transfer learning for predicting proteome-wide interactions between Salmonella and human proteins.


ABSTRACT: Pathogen-host protein-protein interactions (PPIs) play an important role in revealing the underlying pathogenesis of viruses and bacteria. The need to rapidly map proteome-wide pathogen-host interactomes opens avenues for, and imposes burdens on, computational modeling. For Salmonella typhimurium, only 62 interactions with human proteins have been reported to date, and computational modeling based on such a small training set is prone to overfitting. In this work, we propose a multi-instance transfer learning method to reconstruct the proteome-wide Salmonella-human PPI networks, wherein the training data are augmented by homolog knowledge transfer in the form of independent homolog instances. We use AdaBoost instance reweighting to counteract the noise from homolog instances, and design three experimental settings to validate the assumption that the homolog instances are effective in addressing the problems of data scarcity and data unavailability. The experimental results show that the proposed method outperforms existing models, and some predictions are validated by findings from recent literature. Lastly, we conduct gene ontology based clustering analysis of the predicted networks to provide insights into the pathogenesis of Salmonella.
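As a rough illustration of the instance reweighting mentioned in the abstract, the sketch below performs one round of the standard discrete AdaBoost weight update on a toy set of labelled instances. It is not the authors' procedure: the abstract does not say how target and homolog instances are weighted, so the function name `adaboost_reweight`, the toy labels, and the split into "target" and "homolog" instances are all assumptions made for illustration.

```python
import math


def adaboost_reweight(weights, y_true, y_pred, eps=1e-10):
    """One round of the standard discrete AdaBoost weight update.

    Labels and predictions are +1 / -1. Returns (alpha, new_weights),
    where new_weights sum to 1. This is textbook AdaBoost, not
    necessarily the exact scheme used in the paper.
    """
    # Normalise the incoming weights so they form a distribution.
    total = sum(weights)
    w = [wi / total for wi in weights]

    # Weighted error of the current weak learner, clipped away from 0 and 1
    # so that the logarithm below stays finite.
    err = sum(wi for wi, t, p in zip(w, y_true, y_pred) if t != p)
    err = min(max(err, eps), 1.0 - eps)

    # Vote strength of the weak learner.
    alpha = 0.5 * math.log((1.0 - err) / err)

    # Correctly classified instances (t * p = +1) shrink,
    # misclassified instances (t * p = -1) grow.
    updated = [wi * math.exp(-alpha * t * p) for wi, t, p in zip(w, y_true, y_pred)]
    z = sum(updated)
    return alpha, [u / z for u in updated]


# Hypothetical toy data: the first two entries stand for target instances,
# the last two for homolog instances; the final one is misclassified.
y_true = [1, -1, 1, -1]
y_pred = [1, -1, 1, 1]
alpha, new_w = adaboost_reweight([0.25, 0.25, 0.25, 0.25], y_true, y_pred)
```

In this plain form the update raises the weight of misclassified instances. Transfer-learning variants often treat misclassified auxiliary instances differently, so the full paper is the reference for how the homolog instances are actually handled.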

SUBMITTER: Mei S 

PROVIDER: S-EPMC4212833 | biostudies-literature | 2014

REPOSITORIES: biostudies-literature


Publications

AdaBoost based multi-instance transfer learning for predicting proteome-wide interactions between Salmonella and human proteins.

Mei Suyu, Zhu Hao

PLoS ONE, 2014-10-17, issue 10



Similar Datasets

| S-EPMC4436452 | biostudies-literature
2020-07-08 | GSE137436 | GEO
| S-EPMC5463197 | biostudies-other
| S-EPMC6203325 | biostudies-literature
2020-07-08 | GSE137435 | GEO
| S-EPMC8756178 | biostudies-literature
2020-07-08 | GSE137437 | GEO
| S-EPMC3951281 | biostudies-literature
| S-EPMC4648438 | biostudies-literature