Dataset Information

Stepwise classification of cancer samples using clinical and molecular data.

ABSTRACT:

Background

Combining clinical and molecular data types may improve the prediction accuracy of a classifier. However, there is currently a shortage of effective and efficient statistical and bioinformatic tools for true integrative data analysis. Existing integrative classifiers have two main disadvantages. First, coarse combination may cause subtle contributions of one data type to be overshadowed by more obvious contributions of the other. Second, the need to measure both data types for all patients may be both impractical and cost-inefficient.

Results

We introduce a novel classification method, a stepwise classifier, which takes advantage of the distinct classification power of clinical data and high-dimensional molecular data. We apply classification algorithms to the two data types independently, starting with the traditional clinical risk factors. We turn to the relatively expensive molecular data only when the uncertainty of the prediction from clinical data exceeds a predefined limit. Experimental results show that our approach is adaptive: the proportion of samples that need to be re-classified using molecular data depends on how much we expect the predictive accuracy to increase when re-classifying those samples.
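The two-stage decision rule described in the Results can be illustrated with a minimal Python sketch. This is not the stepwiseCM API: the function name `stepwise_classify`, its parameters, and the uncertainty measure `min(p, 1 - p)` are assumptions made for illustration only, since the abstract does not state how uncertainty is quantified. The sketch assumes a binary outcome and that a clinical classifier has already produced a probability of class 1 for each sample.

```python
from typing import Callable, List, Sequence, Tuple


def stepwise_classify(
    clinical_probs: Sequence[float],
    molecular_predict: Callable[[int], int],
    max_uncertainty: float = 0.3,
) -> Tuple[List[int], List[int]]:
    """Classify samples from clinical data first, then molecular data if needed.

    clinical_probs[i] is the clinical classifier's probability that sample i
    belongs to class 1 (hypothetical input). molecular_predict(i) returns the
    molecular classifier's label for sample i and is called only for samples
    that are referred to the second stage.
    """
    labels: List[int] = []
    referred: List[int] = []
    for i, p in enumerate(clinical_probs):
        # Assumed uncertainty measure: 0 for a certain call, 0.5 for a coin flip.
        uncertainty = min(p, 1.0 - p)
        if uncertainty > max_uncertainty:
            # Clinical prediction is too uncertain: use the molecular test.
            referred.append(i)
            labels.append(molecular_predict(i))
        else:
            # Clinical prediction is confident enough: no molecular test needed.
            labels.append(1 if p >= 0.5 else 0)
    return labels, referred


# Toy data (invented for illustration, not from the paper).
clinical_probs = [0.05, 0.45, 0.90, 0.60]
molecular_labels = {1: 1, 3: 0}  # molecular results for samples 1 and 3 only
labels, referred = stepwise_classify(
    clinical_probs, molecular_labels.__getitem__, max_uncertainty=0.3
)
print(labels, referred)
```

In this toy run only samples 1 and 3 are referred to the molecular stage. Lowering `max_uncertainty` refers more samples and raising it refers fewer, which is one way to picture the trade-off between cost and expected gain in accuracy.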

Conclusions

Our method renders a more cost-efficient classifier that is at least as good as, and sometimes better than, one based on clinical or molecular data alone. Hence our approach is not just a classifier that minimizes a particular loss function. Instead, it aims to be cost-efficient by avoiding molecular tests for a potentially large subgroup of individuals; moreover, for these individuals a test result would be quickly available, which may reduce waiting times for diagnosis and hence lower the patients' distress. Stepwise classification is implemented in the R package stepwiseCM, which is available at the Bioconductor website.

SUBMITTER: Obulkasim A

PROVIDER: S-EPMC3221726 | biostudies-literature | 2011 Oct

REPOSITORIES: biostudies-literature


Publications

Stepwise classification of cancer samples using clinical and molecular data.

Askar Obulkasim, Gerrit A. Meijer, Mark A. van de Wiel

BMC Bioinformatics, 2011 Oct 28

PMID: 22034839

Similar Datasets

Project description: Endometriosis, an estrogen-dependent, progesterone-resistant, inflammatory disorder, affects 10% of reproductive-age women. It is diagnosed and staged at surgery, resulting in an 11-year latency from symptom onset to diagnosis, underscoring the need for less invasive, less expensive approaches. Since the uterine lining (endometrium) in women with endometriosis has altered molecular profiles, we tested whether molecular classification of this tissue can distinguish and stage disease. We developed classifiers using genomic data from n=148 archived endometrial samples from women with endometriosis or without endometriosis (normal controls or with other common uterine/pelvic pathologies) across the menstrual cycle and evaluated their performance on independent sample sets. Classifiers were trained separately on samples in specific hormonal milieu, using margin tree classification, and accuracies were scored on independent validation samples. Classification of samples from women with endometriosis or no endometriosis involved two binary decisions, each based on expression of specific genes. These first distinguished presence or absence of uterine/pelvic pathology and then no endometriosis from endometriosis, with the latter further classified according to severity (minimal/mild or moderate/severe). Best performing classifiers identified endometriosis with 90-100% accuracy, were cycle phase-specific or independent, and utilized relatively few genes to determine disease and severity. Differential gene expression and pathway analyses revealed immune activation, altered steroid and thyroid hormone signaling/metabolism and growth factor signaling in endometrium of women with endometriosis. Similar findings were observed with other disorders versus controls. Thus, classifier analysis of genomic data from endometrium can detect and stage pelvic endometriosis with high accuracy, dependent or independent of hormonal milieu. We propose that limited classifier candidate-genes are of high value in developing diagnostics and identifying therapeutic targets. Discovery of endometrial molecular differences in the presence of endometriosis and other uterine/pelvic pathologies raises the broader biological question of their impact on the steroid hormone response and normal functions of this tissue.
