
Dataset Information


A three-gene model to robustly identify breast cancer molecular subtypes.


ABSTRACT: BACKGROUND: Single sample predictors (SSPs) and subtype classification models (SCMs) are gene expression-based classifiers used to identify the four primary molecular subtypes of breast cancer (basal-like, HER2-enriched, luminal A, and luminal B). SSPs use hierarchical clustering, followed by nearest centroid classification, based on large sets of tumor-intrinsic genes. SCMs use a mixture of Gaussian distributions based on sets of genes with expression specifically correlated with three key breast cancer genes (estrogen receptor [ER], HER2, and aurora kinase A [AURKA]). The aim of this study was to compare the robustness, classification concordance, and prognostic value of these classifiers with those of a simplified three-gene SCM in a large compendium of microarray datasets. METHODS: Thirty-six publicly available breast cancer datasets (n = 5715) were subjected to molecular subtyping using five published classifiers (three SSPs and two SCMs) and SCMGENE, the new three-gene (ER, HER2, and AURKA) SCM. We used the prediction strength statistic to estimate robustness of the classification models, defined as the capacity of a classifier to assign the same tumors to the same subtypes independently of the dataset used to fit it. We used Cohen's κ and Cramér's V coefficients to assess concordance between the subtype classifiers and association with clinical variables, respectively. We used Kaplan-Meier survival curves and cross-validated partial likelihood to compare prognostic value of the resulting classifications. All statistical tests were two-sided. RESULTS: SCMs were statistically significantly more robust than SSPs, with SCMGENE being the most robust because of its simplicity. SCMGENE was statistically significantly concordant with published SCMs (κ = 0.65-0.70) and SSPs (κ = 0.34-0.59), statistically significantly associated with ER (V = 0.64), HER2 (V = 0.52) status, and histological grade (V = 0.55), and yielded similar strong prognostic value.
CONCLUSION: Our results suggest that adequate classification of the major and clinically relevant molecular subtypes of breast cancer can be robustly achieved with quantitative measurements of three key genes.
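
The two agreement statistics named in the METHODS can be illustrated with a small sketch. The abstract does not say how the authors computed them, so the code below is only an assumption-laden illustration using the standard textbook definitions (no bias correction, no significance test); the function names and the toy subtype labels are hypothetical and are not taken from the study.

```python
from collections import Counter
from math import sqrt


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa: chance-corrected agreement between two classifiers
    that labelled the same samples (standard definition, unweighted)."""
    if len(labels_a) != len(labels_b) or len(labels_a) == 0:
        raise ValueError("label sequences must be non-empty and of equal length")
    n = len(labels_a)
    # Observed agreement: fraction of samples given the same label by both.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Expected agreement under independence, from the marginal label counts.
    count_a, count_b = Counter(labels_a), Counter(labels_b)
    expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    if expected == 1.0:
        return 1.0  # degenerate case: both classifiers use one single label
    return (observed - expected) / (1.0 - expected)


def cramers_v(labels_a, labels_b):
    """Cramér's V: strength of association between two categorical
    variables, from the chi-squared statistic of their contingency table."""
    if len(labels_a) != len(labels_b) or len(labels_a) == 0:
        raise ValueError("label sequences must be non-empty and of equal length")
    n = len(labels_a)
    joint = Counter(zip(labels_a, labels_b))
    rows, cols = Counter(labels_a), Counter(labels_b)
    chi2 = 0.0
    for r in rows:
        for c in cols:
            expected = rows[r] * cols[c] / n
            chi2 += (joint[(r, c)] - expected) ** 2 / expected
    k = min(len(rows), len(cols)) - 1
    if k == 0:
        return 0.0  # one variable is constant, so no association is measurable
    return sqrt(chi2 / (n * k))


if __name__ == "__main__":
    # Hypothetical subtype calls from two classifiers on six tumors.
    calls_1 = ["basal", "basal", "lumA", "lumA", "lumB", "her2"]
    calls_2 = ["basal", "basal", "lumA", "lumB", "lumB", "her2"]
    print("kappa:", cohen_kappa(calls_1, calls_2))
    print("V:", cramers_v(calls_1, calls_2))
```

Here κ compares two subtype classifiers on the same tumors, while V would be applied to a classifier's calls against a clinical variable such as ER status; both helpers accept any sequences of hashable labels.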

SUBMITTER: Haibe-Kains B 

PROVIDER: S-EPMC3283537 | biostudies-literature | 2012 Feb

REPOSITORIES: biostudies-literature


Publications

A three-gene model to robustly identify breast cancer molecular subtypes.

Haibe-Kains B, Desmedt C, Loi S, Culhane AC, Bontempi G, Quackenbush J, Sotiriou C

Journal of the National Cancer Institute, 2012-01-18, issue 4



Similar Datasets

S-EPMC3413822 | biostudies-other
S-EPMC1489944 | biostudies-literature
S-EPMC9277110 | biostudies-literature
S-EPMC3615534 | biostudies-literature
S-EPMC3286810 | biostudies-other
S-ECPF-SMDB-3829 | biostudies-other
S-ECPF-SMDB-3828 | biostudies-other
S-EPMC8236363 | biostudies-literature
S-EPMC5699328 | biostudies-literature
S-EPMC7259379 | biostudies-literature