Dataset Information

Iteratively refining breast cancer intrinsic subtypes in the METABRIC dataset.


ABSTRACT:

BACKGROUND: Multi-gene lists and single sample predictor models are currently used to reduce the multidimensional complexity of breast cancers and to identify intrinsic subtypes. The perceived inability of some models to cope with high-dimensional data, however, limits the accurate characterisation of these subtypes. Towards the development of robust strategies, we designed an iterative approach to consistently discriminate intrinsic subtypes and improve class prediction in the METABRIC dataset.

FINDINGS: In this study, we employed the CM1 score to identify the most discriminative probes for each group, and an ensemble learning technique to assess the ability of these probes to assign subtype labels using 24 different classifiers. Our analysis comprises an iterative computation of these methods and statistical measures performed on a set of over 2000 samples. The refined labels assigned using this iterative approach proved to be more consistent, and in better agreement with clinicopathological markers and patients' overall survival, than those originally provided by the PAM50 method.

CONCLUSIONS: The assignment of intrinsic subtypes has a significant impact on translational research for both understanding and managing breast cancer. The refined labelling therefore provides more accurate and reliable information by improving the source of fundamental science prior to clinical applications in medicine.
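The two building blocks named in the findings, a per-subtype probe ranking and an ensemble vote over classifier predictions, can be illustrated with a minimal sketch. This is not the authors' implementation. The record does not state the CM1 formula, so the form used here (difference of class means divided by one plus the range of the out-of-class values) is an assumption to be checked against the paper. The function names, the ranking by absolute score, and the toy values are likewise hypothetical.

```python
from collections import Counter
from statistics import mean


def cm1_score(in_class, out_class):
    """Assumed CM1 form: difference of class means divided by
    (1 + range of the out-of-class values)."""
    return (mean(in_class) - mean(out_class)) / (1 + max(out_class) - min(out_class))


def rank_probes(expression, labels, subtype):
    """Rank probes by |CM1| for one subtype versus all other samples.

    expression: dict mapping probe id -> list of values, one per sample.
    labels: list of subtype labels aligned with the sample order.
    Ranking by absolute value is an illustrative choice.
    """
    scores = {}
    for probe, values in expression.items():
        inside = [v for v, lab in zip(values, labels) if lab == subtype]
        outside = [v for v, lab in zip(values, labels) if lab != subtype]
        scores[probe] = cm1_score(inside, outside)
    return sorted(scores, key=lambda p: abs(scores[p]), reverse=True)


def majority_label(votes):
    """Return the label predicted by most classifiers for one sample.
    Ties go to the label that appears first in the vote list."""
    return Counter(votes).most_common(1)[0][0]


# Toy data (hypothetical values, four samples, two probes).
toy_expression = {
    "probe_a": [5.0, 6.0, 1.0, 2.0],
    "probe_b": [3.0, 3.5, 3.0, 3.2],
}
toy_labels = ["LumA", "LumA", "Basal", "Basal"]

top_probes = rank_probes(toy_expression, toy_labels, "LumA")
refined = majority_label(["LumA", "LumA", "Basal"])
```

In an iterative scheme of the kind the abstract describes, the labels produced by the vote would feed back into the next round of probe ranking until the assignments stop changing. The stopping rule and the choice of classifiers are left open here because the record does not specify them.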

SUBMITTER: Milioli HH 

PROVIDER: S-EPMC4712506 | biostudies-literature | 2016

REPOSITORIES: biostudies-literature

Publications

Iteratively refining breast cancer intrinsic subtypes in the METABRIC dataset.

Heloisa H Milioli, Renato Vimieiro, Inna Tishchenko, Carlos Riveros, Regina Berretta, Pablo Moscato

BioData Mining, 2016-01-13


Similar Datasets

| S-ECPF-GEOD-37145 | biostudies-other
2014-01-01 | E-GEOD-37145 | biostudies-arrayexpress
| S-EPMC4425383 | biostudies-literature
| S-EPMC4488510 | biostudies-literature
| S-EPMC2667820 | biostudies-literature
2014-01-01 | GSE37145 | GEO
| S-EPMC3073183 | biostudies-literature
| S-EPMC8216504 | biostudies-literature
| S-EPMC5822682 | biostudies-literature
| S-EPMC4350320 | biostudies-literature