Dataset Information

Module-based outcome prediction using breast cancer compendia.

ABSTRACT:

Background

Large collections of microarray datasets (compendia) and knowledge about the grouping of genes into pathways (gene sets) are typically not exploited when training predictors of disease outcome. Both resources can be useful: a compendium increases the number of samples, while gene sets reduce the size of the feature space. This should be favorable from a machine learning perspective and result in more robust predictors.

Methodology

We extracted modules of regulated genes from gene sets and compendia. Through supervised analysis, we constructed predictors that employ modules predictive of breast cancer outcome. To validate these predictors, we applied them to independent data from the same institution (intra-dataset) and from other institutions (inter-dataset).
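As a rough illustration of the module-based idea, the sketch below collapses per-gene expression values into one score per gene set and then classifies a sample with a nearest-centroid rule. It is not the authors' method: the averaging step, the nearest-centroid classifier, the gene names (`G1` to `G4`), the module names, and all expression values are assumptions chosen only to make the example small and runnable.

```python
from statistics import mean

def module_scores(expression, gene_sets):
    """Collapse one sample's per-gene expression into one score per module.

    expression: dict mapping gene name -> expression value.
    gene_sets:  dict mapping module name -> list of gene names.
    Assumption: a module score is the mean expression of its measured genes.
    """
    scores = {}
    for name, genes in gene_sets.items():
        values = [expression[g] for g in genes if g in expression]
        if values:  # skip modules with no measured genes
            scores[name] = mean(values)
    return scores

def train_centroids(samples, labels):
    """Average the module scores of the training samples in each outcome class."""
    centroids = {}
    for label in set(labels):
        members = [s for s, l in zip(samples, labels) if l == label]
        centroids[label] = {m: mean(s[m] for s in members) for m in members[0]}
    return centroids

def predict(centroids, sample):
    """Return the class whose centroid is closest in squared Euclidean distance."""
    def distance(centroid):
        return sum((sample[m] - centroid[m]) ** 2 for m in centroid)
    return min(centroids, key=lambda label: distance(centroids[label]))

# Hypothetical gene sets and expression profiles (illustrative values only).
gene_sets = {"cell_cycle": ["G1", "G2"], "glycolysis": ["G3", "G4"]}
poor_profile = {"G1": 2.0, "G2": 3.0, "G3": 1.0, "G4": 1.0}
good_profile = {"G1": 0.0, "G2": 1.0, "G3": 0.0, "G4": 0.0}

training = [module_scores(p, gene_sets) for p in (poor_profile, good_profile)]
centroids = train_centroids(training, ["poor", "good"])

# An independent "validation" sample, scored with the same gene sets.
validation = module_scores({"G1": 2.0, "G2": 2.0, "G3": 1.0, "G4": 0.0}, gene_sets)
predicted = predict(centroids, validation)
```

The point of the sketch is the change of feature space: four gene-level features become two module-level features before any classifier sees the data, which is the reduction the Background section argues for.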

Conclusions

We show that modules derived from single breast cancer datasets perform better on the validation data than gene-based predictors. We also show a trend between compendium specificity and predictive performance: modules derived from a single breast cancer dataset or from a breast cancer-specific compendium perform better than those derived from a human cancer compendium. Additionally, the module-based predictor provides much richer insight into the underlying biology. Frequently selected gene sets are associated with processes such as cell cycle, E2F regulation, DNA damage response, proteasome and glycolysis. We analyzed two modules, related to the cell cycle and the OCT1 transcription factor, respectively. Individually, each of these modules provides a significant separation into survival subgroups on both the training data and the independent validation data.
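One common way to quantify a "separation in survival subgroups" is a two-group log-rank test. The abstract does not say which test the authors used, so the sketch below is an assumption, and the follow-up times and event indicators are invented for illustration.

```python
import math

def logrank_test(times_a, events_a, times_b, events_b):
    """Two-group log-rank test.

    times_*:  follow-up time per patient.
    events_*: 1 if the event was observed, 0 if the patient was censored.
    Returns (chi-square statistic with 1 degree of freedom, p-value).
    """
    data = [(t, e, 0) for t, e in zip(times_a, events_a)]
    data += [(t, e, 1) for t, e in zip(times_b, events_b)]
    event_times = sorted({t for t, e, _ in data if e})

    observed_a = expected_a = variance = 0.0
    for t in event_times:
        at_risk = [(ti, e, g) for ti, e, g in data if ti >= t]
        n = len(at_risk)                                   # patients at risk
        n_a = sum(1 for _, _, g in at_risk if g == 0)      # of which in group A
        d = sum(1 for ti, e, _ in at_risk if ti == t and e)            # events at t
        d_a = sum(1 for ti, e, g in at_risk if ti == t and e and g == 0)
        observed_a += d_a
        expected_a += d * n_a / n
        if n > 1:
            # Hypergeometric variance of the event count in group A.
            variance += d * (n_a / n) * (1 - n_a / n) * (n - d) / (n - 1)

    if variance == 0:
        return 0.0, 1.0
    chi2 = (observed_a - expected_a) ** 2 / variance
    # Survival function of a chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value

# Hypothetical subgroups: every patient in group A has the event before any in B.
chi2, p_value = logrank_test([1, 2, 3], [1, 1, 1], [4, 5, 6], [1, 1, 1])
```

In a module-based setting, the two groups would be the patients a single module assigns to the poor-outcome and good-outcome subgroups, tested separately on the training and validation data.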

SUBMITTER: van Vliet MH

PROVIDER: S-EPMC2002511 | biostudies-literature | 2007 Oct

REPOSITORIES: biostudies-literature

Publications

Module-based outcome prediction using breast cancer compendia.

Martin H van Vliet, Christiaan N Klijn, Lodewyk F A Wessels, Marcel J T Reinders

PLoS ONE, 2007-10-17, issue 10

PMID: 17940611

Similar Datasets

FERAL: network-based classifier with application to breast cancer outcome prediction.
| S-EPMC4765883 | biostudies-other

Gene expression-based, individualized outcome prediction for autism
2015-04-17 | E-GEOD-67979 | biostudies-arrayexpress

Gene expression-based, individualized outcome prediction for surgically treated pancreatic cancer patients
2024-11-02 | GSE246880 | GEO

Gene expression-based, individualized outcome prediction for surgically treated lung cancer patients
2007-01-20 | GSE4716 | GEO

Gene silenced-based, individualized cancer progression outcome prediction for shATMIN stable transfectants
2019-05-25 | GSE81866 | GEO

Gene expression-based, individualized cancer progression outcome prediction for S61 stable transfectants
2016-04-26 | GSE80641 | GEO

Gene expression-based, individualized outcome prediction for autism
2015-04-17 | GSE67979 | GEO

Identifying gene function and module connections by the integration of multispecies expression compendia.
| S-EPMC6886503 | biostudies-literature

A critical evaluation of network and pathway-based classifiers for outcome prediction in breast cancer.
| S-EPMC3338754 | biostudies-literature

Gene set-based module discovery in the breast cancer transcriptome.
| S-EPMC2674431 | biostudies-literature