Dataset Information

Integrating statistical predictions and experimental verifications for enhancing protein-chemical interaction predictions in virtual screening.

ABSTRACT: Predictions of interactions between target proteins and potential leads are of great benefit in the drug discovery process. We present a comprehensively applicable statistical prediction method for interactions between any proteins and chemical compounds, which requires only protein sequence data and chemical structure data and utilizes the statistical learning method of support vector machines (SVMs). Because such comprehensive predictions can involve many false positives, we propose two approaches for reducing them: (i) efficient use of multiple statistical prediction models in the framework of a two-layer SVM and (ii) reasonable design of the negative data used to construct the statistical prediction models. In the two-layer SVM, the outputs of the first-layer SVM models, which are constructed with different negative samples and reflect different aspects of the classification, are used as inputs to the second-layer SVM. To design negative data that produce fewer false positive predictions, we iteratively construct SVM models (classification boundaries) from positive and tentative negative samples and select additional negative sample candidates according to pre-determined rules. Moreover, to fully utilize the advantages of statistical learning methods, we propose a strategy for effectively feeding experimental results back into the computational predictions, with consideration of the biological effects of interest. We show the usefulness of our approach by predicting potential ligands binding to human androgen receptors from more than 19 million chemical compounds and verifying these predictions by in vitro binding. We then use this experimental validation as feedback to enhance subsequent computational predictions, and experimentally validate those predictions again. This efficient procedure, which iterates in silico prediction and in vitro or in vivo experimental verification with sufficient feedback, enabled us to identify novel ligand candidates that were distant from known ligands in the chemical space.
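The two-layer SVM idea described in the abstract can be illustrated with a small toy sketch. This is not the authors' implementation: the 2-D feature vectors, the two negative sets, the hyperparameters and all function names below are assumptions made for illustration, and a simple linear SVM trained by subgradient descent stands in for whatever SVM configuration the paper actually used. Two first-layer models are trained on the same positives but different negatives, and their decision values become the inputs of a second-layer model.

```python
# Toy sketch of a two-layer SVM. All data, names and hyperparameters are
# illustrative assumptions, not taken from the paper.

def train_linear_svm(X, y, lam=0.01, eta=0.01, epochs=1000):
    """Fit a linear SVM (hinge loss + L2 penalty) by subgradient descent."""
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            margin = yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b)
            # The L2 penalty shrinks the weights on every step.
            w = [wj * (1.0 - eta * lam) for wj in w]
            if margin < 1.0:  # hinge loss is active for this sample
                w = [wj + eta * yi * xj for wj, xj in zip(w, xi)]
                b += eta * yi
    return w, b

def decision_value(model, x):
    """Signed score of x under a (weights, bias) model."""
    w, b = model
    return sum(wj * xj for wj, xj in zip(w, x)) + b

# Hypothetical feature vectors: one positive set, two different negative sets.
positives = [(2.0, 2.0), (3.0, 3.0)]
negatives_a = [(-2.0, 2.0), (-3.0, 3.0)]
negatives_b = [(2.0, -2.0), (3.0, -3.0)]

def fit_first_layer(negatives):
    """Train one first-layer model on the positives and one negative set."""
    X = positives + negatives
    y = [1] * len(positives) + [-1] * len(negatives)
    return train_linear_svm(X, y)

first_layer = [fit_first_layer(negatives_a), fit_first_layer(negatives_b)]

def first_layer_outputs(x):
    """Decision values of all first-layer models, used as second-layer input."""
    return [decision_value(model, x) for model in first_layer]

# Second layer: trained on first-layer outputs for all samples.
all_X = positives + negatives_a + negatives_b
all_y = [1] * len(positives) + [-1] * (len(negatives_a) + len(negatives_b))
second_layer = train_linear_svm([first_layer_outputs(x) for x in all_X], all_y)

def predict(x):
    """Return +1 (predicted interaction) or -1 (predicted non-interaction)."""
    return 1 if decision_value(second_layer, first_layer_outputs(x)) > 0 else -1

print([predict(x) for x in all_X])
```

In this toy setting each first-layer model separates the positives from only one kind of negative, so neither is sufficient alone; the second layer combines both scores and accepts a sample only when both first-layer models agree. The iterative negative-sample selection and the experimental feedback loop described in the abstract are not covered by this sketch.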

SUBMITTER: Nagamine N

PROVIDER: S-EPMC2685987 | biostudies-literature | 2009 Jun

REPOSITORIES: biostudies-literature

Publications

Integrating statistical predictions and experimental verifications for enhancing protein-chemical interaction predictions in virtual screening.

Nagamine N, Shirakawa T, Minato Y, Torii K, Kobayashi H, Imoto M, Sakakibara Y

PLoS Computational Biology, 2009 Jun 5, issue 6

PMID: 19503826

Similar Datasets

Comprehensive predictions of target proteins based on protein-chemical interaction using virtual screening and experimental verifications. | S-EPMC3471015 | biostudies-literature

A Mechanistic Framework for Integrating Chemical Structure and High-Throughput Screening Results to Improve Toxicity Predictions. | S-EPMC9910356 | biostudies-literature

Integrating DNA-encoded chemical libraries with virtual combinatorial library screening: Optimizing a PARP10 inhibitor. | S-EPMC7530011 | biostudies-literature

Integrating EEG and EMG data: a novel statistical pipeline for investigating brain-muscle interaction in experimental neuroarchaeology. | S-EPMC12170762 | biostudies-literature

Characterization of Soybean Protein Isolate-Food Polyphenol Interaction via Virtual Screening and Experimental Studies. | S-EPMC8625844 | biostudies-literature

Discovery of Non-Covalent Inhibitors for SARS-CoV-2 PLpro: Integrating Virtual Screening, Synthesis, and Experimental Validation. | S-EPMC11647681 | biostudies-literature

Descriptor-augmented machine learning for enzyme-chemical interaction predictions. | S-EPMC10915406 | biostudies-literature

Statistical analysis of EGFR structures' performance in virtual screening. | S-EPMC4749411 | biostudies-literature

Flavone Cocrystals: A Comprehensive Approach Integrating Experimental and Virtual Methods. | S-EPMC11099919 | biostudies-literature

Expanding the fragrance chemical space for virtual screening. | S-EPMC4037718 | biostudies-literature