Dataset Information

Feature optimization in high dimensional chemical space: statistical and data mining solutions.


ABSTRACT: OBJECTIVES: The primary goal of this experiment is to prioritize the molecular descriptors that control the activity of active molecules, and thereby reduce the dimensionality produced during the virtual screening process. It also aims to: (1) develop a methodology for sampling large datasets and for statistically verifying the sampling process, and (2) apply a screening filter to detect molecules with polypharmacological or promiscuous activity. RESULTS: Sampling from a large dataset was carried out and verified by applying a Z-test. Molecular descriptors were prioritized using principal component analysis (PCA) by eliminating the least influential ones. Applying PCA reduced the number of dimensions to one-twelfth of the original. The statistical parameter values of the virtual screening model improved significantly, which in turn gave better screening results. The screened results were further improved by applying the Eli Lilly MedChem rules filter, which removed molecules with polypharmacological or promiscuous activity. It was also shown that similarities in the activity of compounds were due to molecular descriptors, a relationship that was not apparent in prima facie structural studies.
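
The abstract states that the sampling was verified with a Z-test but does not give the details. As a rough illustration only, the sketch below shows one common form of such a check: a two-sided one-sample Z-test that compares the mean of a single descriptor in the sample with its mean in the full dataset. The descriptor values, the dataset and sample sizes, the 5% threshold and the function name `z_test_sample_mean` are all assumptions made for this example and are not taken from the publication. The sketch also ignores the finite-population correction.

```python
import math
import random
from statistics import NormalDist, fmean, pstdev


def z_test_sample_mean(sample, population):
    """Two-sided one-sample Z-test of the sample mean against the population mean."""
    mu = fmean(population)
    sigma = pstdev(population)  # population standard deviation, treated as known
    n = len(sample)
    z = (fmean(sample) - mu) / (sigma / math.sqrt(n))
    p_value = 2.0 * (1.0 - NormalDist().cdf(abs(z)))
    return z, p_value


# Hypothetical data: one descriptor (for example molecular weight) for 10,000 molecules.
rng = random.Random(0)
population = [rng.gauss(350.0, 60.0) for _ in range(10_000)]
sample = rng.sample(population, 500)  # simple random sample without replacement

z, p = z_test_sample_mean(sample, population)
# Assumed decision rule: keep the sample if the test does not reject equality of means.
representative = p >= 0.05
print(f"z = {z:.3f}, p = {p:.3f}, representative at the 5% level: {representative}")
```

In practice a check like this would presumably be repeated for each descriptor of interest, and a sample that fails it would be redrawn.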

SUBMITTER: K R J 

PROVIDER: S-EPMC6044099 | biostudies-literature | 2018 Jul

REPOSITORIES: biostudies-literature

Publications

Feature optimization in high dimensional chemical space: statistical and data mining solutions.

K R Jinuraj, M Rakhila, M Dhanalakshmi, R Sajeev, Gad Akshata, K Jayan, P Muhammed Iqbal, Manuel Andrew Titus, U C Abdul Jaleel

BMC Research Notes, 2018-07-13, 1


Similar Datasets

S-EPMC10119907 | biostudies-literature
S-EPMC4507289 | biostudies-literature
S-EPMC5544778 | biostudies-literature
S-EPMC2099500 | biostudies-literature
S-EPMC3445441 | biostudies-literature
S-EPMC7397300 | biostudies-literature
S-EPMC4248814 | biostudies-literature
S-EPMC5738110 | biostudies-literature
S-EPMC3796884 | biostudies-literature
S-EPMC3577111 | biostudies-literature