
Dataset Information


Data set modelability by QSAR.


ABSTRACT: We introduce a simple MODelability Index (MODI) that estimates the feasibility of obtaining predictive QSAR models (correct classification rate above 0.7) for a binary data set of bioactive compounds. MODI is defined as an activity class-weighted ratio of the number of nearest-neighbor pairs of compounds with the same activity class versus the total number of pairs. The MODI values were calculated for more than 100 data sets, and the threshold of 0.65 was found to separate the nonmodelable and modelable data sets.
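The definition above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: it reads "activity class-weighted ratio" as the average, over activity classes, of the fraction of compounds whose first nearest neighbor belongs to the same class. The function names `modi` and `is_modelable`, the use of Euclidean distance on a numeric descriptor matrix, and treating a value exactly at 0.65 as modelable are all assumptions made for the example.

```python
import numpy as np


def modi(X, y):
    """Sketch of a MODelability Index for a labeled data set.

    X: (n_compounds, n_descriptors) numeric descriptor matrix (assumed).
    y: (n_compounds,) activity class labels.
    Assumption: Euclidean distance defines the nearest neighbor.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)

    # Pairwise squared Euclidean distances between all compounds.
    diff = X[:, None, :] - X[None, :, :]
    dist = np.einsum("ijk,ijk->ij", diff, diff)

    # A compound must not be counted as its own nearest neighbor.
    np.fill_diagonal(dist, np.inf)
    nearest = dist.argmin(axis=1)

    # True where the nearest neighbor shares the compound's class.
    same_class = y[nearest] == y

    # Average the per-class fractions so each class has equal weight.
    per_class = [same_class[y == c].mean() for c in np.unique(y)]
    return float(np.mean(per_class))


def is_modelable(modi_value, threshold=0.65):
    # 0.65 is the threshold quoted in the abstract; ">=" is an assumption.
    return modi_value >= threshold


# Toy example: two well-separated clusters, one per activity class.
X_toy = [[0.0], [0.1], [0.2], [5.0], [5.1], [5.2]]
y_toy = [0, 0, 0, 1, 1, 1]
print(modi(X_toy, y_toy))  # 1.0: every nearest neighbor shares its class
```

Averaging per class rather than over all compounds keeps a large majority class from dominating the index on imbalanced data sets. The dense pairwise distance matrix is only practical for small data sets; a nearest-neighbor index would be the usual substitute for larger ones.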

SUBMITTER: Golbraikh A 

PROVIDER: S-EPMC3984298 | biostudies-other | 2014 Jan

REPOSITORIES: biostudies-other


Publications

Data set modelability by QSAR.

Alexander Golbraikh, Eugene Muratov, Denis Fourches, Alexander Tropsha

Journal of Chemical Information and Modeling, 2014-01-08, 1



Similar Datasets

| S-ECPF-GEOD-60184 | biostudies-other
| S-EPMC3985743 | biostudies-other
| S-EPMC8402290 | biostudies-literature
| PRJEB31782 | ENA
| S-EPMC3683469 | biostudies-literature
| S-EPMC6400260 | biostudies-literature
| S-EPMC2860497 | biostudies-literature
| S-EPMC7886009 | biostudies-literature
| S-EPMC4510372 | biostudies-literature
| S-EPMC4664738 | biostudies-literature