
Dataset Information


Classifiers and their Metrics Quantified.


ABSTRACT: Molecular modeling frequently constructs classification models for the prediction of two-class entities, such as compound bio(in)activity, chemical property (non)existence, protein (non)interaction, and so forth. The models are evaluated using well-known metrics such as accuracy or true positive rate. However, these frequently used metrics, when applied to retrospective and/or artificially generated prediction datasets, can overestimate true performance in actual prospective experiments. Here, we systematically consider metric value surface generation as a consequence of data balance, and propose the computation of an inverse cumulative distribution function taken over a metric surface. The proposed distribution analysis can aid in the selection of metrics when formulating study design. In addition to theoretical analyses, a practical example in chemogenomic virtual screening highlights the care required in metric selection and interpretation.
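
To make the idea of a metric surface concrete, the following is a minimal sketch and not the author's implementation. It assumes the surface is accuracy evaluated over a grid of (true positive rate, true negative rate) pairs for a fixed fraction of positives, and it reads "inverse cumulative distribution" as the fraction of the surface at or above a threshold; the paper's exact definitions may differ. The names `accuracy_surface`, `inverse_cdf`, `pos_fraction`, and `steps` are hypothetical.

```python
import numpy as np


def accuracy_surface(pos_fraction, steps=101):
    """Accuracy on a grid of (TPR, TNR) pairs for a given fraction of positives.

    Assumption: accuracy = p * TPR + (1 - p) * TNR, where p is the fraction
    of positive examples in the dataset (the "data balance").
    """
    rates = np.linspace(0.0, 1.0, steps)
    tpr, tnr = np.meshgrid(rates, rates, indexing="ij")
    return pos_fraction * tpr + (1.0 - pos_fraction) * tnr


def inverse_cdf(surface, thresholds):
    """Fraction of surface points whose metric value is >= each threshold.

    Assumption: this complementary-CDF reading stands in for the paper's
    "inverse cumulative distribution function".
    """
    values = np.asarray(surface).ravel()
    return np.array([(values >= t).mean() for t in thresholds])


if __name__ == "__main__":
    thresholds = np.array([0.5, 0.7, 0.9])
    for pos_fraction in (0.5, 0.1):
        surface = accuracy_surface(pos_fraction)
        print(pos_fraction, inverse_cdf(surface, thresholds))
```

Comparing the printed fractions for a balanced (0.5) and an imbalanced (0.1) dataset shows how much of the surface clears a given accuracy threshold under each balance, which is the kind of comparison the abstract suggests using when choosing a metric.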

SUBMITTER: Brown JB 

PROVIDER: S-EPMC5838539 | biostudies-literature | 2018 Jan

REPOSITORIES: biostudies-literature


Publications

Classifiers and their Metrics Quantified.

Brown JB

Molecular Informatics, 2018-01-23, 1-2



Similar Datasets

| S-EPMC6855565 | biostudies-literature
| S-EPMC2745680 | biostudies-literature
| S-EPMC7042662 | biostudies-literature
| S-EPMC3867158 | biostudies-literature
| S-EPMC4172566 | biostudies-literature
2023-09-29 | GSE244325 | GEO
2005-05-09 | GSE2468 | GEO
2010-06-10 | E-GEOD-2468 | biostudies-arrayexpress
| S-EPMC8677763 | biostudies-literature
2023-05-10 | GSE215175 | GEO