
Dataset Information


Multiclass cancer classification based on gene expression comparison.


ABSTRACT: As the complexity and heterogeneity of cancer are being increasingly appreciated through genomic analyses, microarray-based cancer classification comprising multiple discriminatory molecular markers is an emerging trend. Such multiclass classification problems pose new methodological and computational challenges for developing novel and effective statistical approaches. In this paper, we introduce a new approach for classifying multiple disease states associated with cancer based on gene expression profiles. Our method focuses on detecting small sets of genes in which the relative comparison of their expression values leads to class discrimination. For an m-class problem, the classification rule typically depends on a small number of m-gene sets, which provide transparent decision boundaries and allow for potential biological interpretations. We first test our approach on seven common gene expression datasets and compare it with popular classification methods including support vector machines and random forests. We then consider an extremely large cohort of leukemia patients to further assess its effectiveness. In both experiments, our method yields results comparable to, or even better than, those of benchmark classifiers. In addition, we demonstrate that our approach can integrate pathway analysis of gene expression to provide accurate and biologically meaningful classification.
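The idea of classifying by relative comparison within a small gene set can be illustrated with a minimal Python sketch. The abstract does not spell out the decision rule, so the rule used here is an assumption: each of the m genes in a set is paired with one class, and a sample is assigned to the class whose gene has the highest expression. The function names (`predict`, `score`, `best_gene_set`), the exhaustive search, and the toy data are all hypothetical and are not taken from the paper.

```python
from itertools import permutations


def predict(sample, gene_set):
    """Return the class index whose paired gene is most highly expressed.

    Assumed rule (not confirmed by the abstract): gene_set[k] is paired
    with class k, and the largest expression value within the set wins.
    """
    values = [sample[g] for g in gene_set]
    return values.index(max(values))


def score(samples, labels, gene_set):
    """Fraction of samples that gene_set classifies correctly."""
    hits = sum(predict(s, gene_set) == y for s, y in zip(samples, labels))
    return hits / len(labels)


def best_gene_set(samples, labels, n_classes):
    """Exhaustively search ordered m-gene sets (only feasible on toy data)."""
    n_genes = len(samples[0])
    return max(
        permutations(range(n_genes), n_classes),
        key=lambda gene_set: score(samples, labels, gene_set),
    )


# Hypothetical toy data: 6 samples, 4 genes, 3 classes.
samples = [
    [9, 1, 2, 5],
    [8, 2, 1, 9],
    [1, 9, 2, 10],
    [2, 8, 3, 5],
    [1, 2, 9, 5],
    [3, 1, 8, 5],
]
labels = [0, 0, 1, 1, 2, 2]

best = best_gene_set(samples, labels, n_classes=3)
accuracy = score(samples, labels, best)
new_prediction = predict([2, 1, 7, 3], best)
```

Because the rule depends only on which gene in the set is largest, the decision boundary is easy to read off, which is the kind of transparency the abstract refers to. A realistic implementation would need a far more efficient search than this brute-force enumeration.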

SUBMITTER: Yang S 

PROVIDER: S-EPMC4775275 | biostudies-literature | 2014 Aug

REPOSITORIES: biostudies-literature


Publications

Multiclass cancer classification based on gene expression comparison.

Yang Sitan, Naiman Daniel Q

Statistical Applications in Genetics and Molecular Biology. 2014-08-01; 4



Similar Datasets

| S-EPMC5947894 | biostudies-literature
| S-EPMC8796360 | biostudies-literature
| S-EPMC7710761 | biostudies-literature
| S-EPMC6372182 | biostudies-literature
| S-EPMC1184049 | biostudies-literature
| S-EPMC3042183 | biostudies-literature
| S-EPMC3347893 | biostudies-literature
| S-EPMC6999883 | biostudies-literature
| S-EPMC7206293 | biostudies-literature
| S-EPMC6798571 | biostudies-literature