Dataset Information

Effects of protein interaction data integration, representation and reliability on the use of network properties for drug target prediction.

ABSTRACT:

Background

Previous studies have noted that drug targets appear to be associated with higher-degree or higher-centrality proteins in interaction networks. These studies explicitly or tacitly make choices of different source databases, data integration strategies, representation of proteins and complexes, and data reliability assumptions. Here we examined how the use of different data integration and representation techniques, or different notions of reliability, may affect the efficacy of degree and centrality as features in drug target prediction.

Results

Fifty percent of drug targets have a degree of less than nine, and ninety-five percent have a degree of less than ninety. We found that drug targets are over-represented in higher-degree bins; this relationship is seen only for the consolidated interactome and does not depend on n-ary interaction data or its representation. Degree acts as a weak predictive feature for drug-target status, and using more reliable subsets of the data does not increase this performance. However, performance does increase if only cancer-related drug targets are considered. We also note that a protein's membership in pathway records can act as a predictive feature that is better than degree, and that high centrality may be an indicator of a drug that is more likely to be withdrawn.

Conclusions

These results show that protein interaction data integration and cleaning are important considerations when incorporating network properties as predictive features for drug-target status. The provided scripts and data sets offer a starting point for further studies and cross-comparison of methods.
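
As an illustration of the kind of analysis the abstract describes, the following is a minimal sketch that computes protein degree from a binary interaction list, the fraction of drug targets per degree bin, and a rank-based AUC for degree as a predictor of drug-target status. It is not the authors' provided scripts: the function names, the bin boundaries, and the toy edge list and target set are all hypothetical, chosen only to make the example self-contained.

```python
from collections import defaultdict


def degrees(edges):
    """Return {protein: degree} for an undirected edge list.

    Self-loops are skipped and duplicate pairs are counted once.
    """
    neighbours = defaultdict(set)
    for a, b in edges:
        if a == b:
            continue
        neighbours[a].add(b)
        neighbours[b].add(a)
    return {p: len(n) for p, n in neighbours.items()}


def target_fraction_by_bin(degree, targets, bin_edges):
    """Fraction of proteins that are drug targets in each degree bin [lo, hi)."""
    fractions = {}
    for lo, hi in zip(bin_edges, bin_edges[1:]):
        members = [p for p, d in degree.items() if lo <= d < hi]
        if members:  # skip empty bins rather than divide by zero
            fractions[(lo, hi)] = sum(p in targets for p in members) / len(members)
    return fractions


def degree_auc(degree, targets):
    """Probability that a random target has a higher degree than a random
    non-target, counting ties as 0.5 (the Mann-Whitney form of ROC AUC)."""
    pos = [d for p, d in degree.items() if p in targets]
    neg = [d for p, d in degree.items() if p not in targets]
    if not pos or not neg:
        raise ValueError("need at least one target and one non-target")
    wins = sum((x > y) + 0.5 * (x == y) for x in pos for y in neg)
    return wins / (len(pos) * len(neg))


# Toy data (hypothetical): six proteins, two of them labelled as drug targets.
edges = [("A", "B"), ("A", "C"), ("A", "D"), ("B", "C"), ("D", "E"), ("E", "F")]
targets = {"A", "E"}

deg = degrees(edges)
fractions = target_fraction_by_bin(deg, targets, [1, 2, 3, 4])
auc = degree_auc(deg, targets)
print(fractions)
print(auc)
```

An AUC near 0.5 would correspond to degree carrying no predictive signal; on real data, the same functions could be run separately on each interactome variant (for example a single source database versus a consolidated one) to compare how the choice of network changes the result.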

SUBMITTER: Mora A

PROVIDER: S-EPMC3534413 | biostudies-literature | 2012 Nov

REPOSITORIES: biostudies-literature

Publications

Effects of protein interaction data integration, representation and reliability on the use of network properties for drug target prediction.

Antonio Mora, Ian M. Donaldson

BMC Bioinformatics, 2012-11-12

PMID: 23146171

Similar Datasets

Project description:

Background: Drug repositioning, the strategy of unveiling novel targets of existing drugs could reduce costs and accelerate the pace of drug development. To elucidate the novel molecular mechanism of known drugs, considering the long time and high cost of experimental determination, the efficient and feasible computational methods to predict the potential associations between drugs and targets are of great aid.

Methods: A novel calculation model for drug-target interaction (DTI) prediction based on network representation learning and convolutional neural networks, called DLDTI, was generated. The proposed approach simultaneously fused the topology of complex networks and diverse information from heterogeneous data sources, and coped with the noisy, incomplete, and high-dimensional nature of large-scale biological data by learning the low-dimensional and rich depth features of drugs and proteins. The low-dimensional feature vectors were used to train DLDTI to obtain the optimal mapping space and to infer new DTIs by ranking candidates according to their proximity to the optimal mapping space. More specifically, based on the results from the DLDTI, we experimentally validated the predicted targets of tetramethylpyrazine (TMPZ) on atherosclerosis progression in vivo.

Results: The experimental results showed that the DLDTI model achieved promising performance under fivefold cross-validations with AUC values of 0.9172, which was higher than the methods using different classifiers or different feature combination methods mentioned in this paper. For the validation study of TMPZ on atherosclerosis, a total of 288 targets were identified and 190 of them were involved in platelet activation. The pathway analysis indicated signaling pathways, namely PI3K/Akt, cAMP and calcium pathways might be the potential targets. Effects and molecular mechanism of TMPZ on atherosclerosis were experimentally confirmed in animal models.

Conclusions: DLDTI model can serve as a useful tool to provide promising DTI candidates for experimental validation. Based on the predicted results of DLDTI model, we found TMPZ could attenuate atherosclerosis by inhibiting signal transductions in platelets. The source code and datasets explored in this work are available at https://github.com/CUMTzackGit/DLDTI.
