Unknown

Dataset Information

0

Selective voting in convex-hull ensembles improves classification accuracy.


ABSTRACT: Classification algorithms can be used to predict risks and responses of patients based on genomic and other high-dimensional data. While there is optimism for using these algorithms to improve the treatment of diseases, they have yet to demonstrate sufficient predictive ability for routine clinical practice. They generally classify all patients according to the same criteria, under an implicit assumption of population homogeneity. The objective here is to allow for population heterogeneity, possibly unrecognized, in order to increase classification accuracy and further the goal of tailoring therapies on an individualized basis.A new selective-voting algorithm is developed in the context of a classifier ensemble of two-dimensional convex hulls of positive and negative training samples. Individual classifiers in the ensemble are allowed to vote on test samples only if those samples are located within or behind pruned convex hulls of training samples that define the classifiers.Validation of the new algorithm's increased accuracy is carried out using two publicly available datasets having cancer as the outcome variable and expression levels of thousands of genes as predictors. Selective voting leads to statistically significant increases in accuracy from 86.0% to 89.8% (p<0.001) and 63.2% to 67.8% (p<0.003) compared to the original algorithm.Selective voting by members of convex-hull classifier ensembles significantly increases classification accuracy compared to one-size-fits-all approaches.

SUBMITTER: Kodell RL 

PROVIDER: S-EPMC3666100 | biostudies-literature | 2012 Mar

REPOSITORIES: biostudies-literature

altmetric image

Publications

Selective voting in convex-hull ensembles improves classification accuracy.

Kodell Ralph L RL   Zhang Chuanlei C   Siegel Eric R ER   Nagarajan Radhakrishnan R  

Artificial intelligence in medicine 20111106 3


<h4>Objective</h4>Classification algorithms can be used to predict risks and responses of patients based on genomic and other high-dimensional data. While there is optimism for using these algorithms to improve the treatment of diseases, they have yet to demonstrate sufficient predictive ability for routine clinical practice. They generally classify all patients according to the same criteria, under an implicit assumption of population homogeneity. The objective here is to allow for population h  ...[more]

Similar Datasets

| S-EPMC8242930 | biostudies-literature
| S-EPMC6789115 | biostudies-literature
| S-EPMC5213839 | biostudies-literature
| S-EPMC3950251 | biostudies-literature
| S-EPMC8040629 | biostudies-literature
| S-EPMC5767225 | biostudies-literature
| S-EPMC2525726 | biostudies-literature