Dataset Information

Microarray-based RNA profiling of breast cancer: batch effect removal improves cross-platform consistency.


ABSTRACT: Microarray analysis is a powerful technique used extensively for gene expression analysis. Different technologies are available, but lack of standardization makes it challenging to compare and integrate data. Furthermore, batch-related biases within datasets are common but often not addressed. We have analyzed the same 234 breast cancers on two different microarray platforms. One dataset contained known batch-effects associated with the fabrication procedure used. The aim was to assess the significance of correcting for systematic batch-effects when integrating data from different platforms. We here demonstrate the importance of detecting batch-effects and how tools, such as ComBat, can be used to successfully overcome such systematic variations in order to unmask essential biological signals. Batch adjustment was found to be particularly valuable in the detection of subtler differences in gene expression. Furthermore, our results show that proper adjustment is essential for integration of gene expression data obtained from multiple sources. We show that high-variance genes are highly reproducibly expressed across platforms, making them particularly well suited as biomarkers and for building gene signatures, exemplified by prediction of estrogen-receptor status and molecular subtypes. In conclusion, the study emphasizes the importance of utilizing proper batch adjustment methods when integrating data across different batches and platforms.
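
As a rough illustration of the kind of systematic variation the abstract refers to, the sketch below applies a plain per-gene location and scale adjustment within each batch. It is not ComBat and not the authors' pipeline: ComBat additionally uses empirical Bayes shrinkage of the batch parameters and can take biological covariates into account, neither of which is done here. The function name `center_scale_by_batch`, the genes-by-samples layout, and the toy data are assumptions made only for this example.

```python
import numpy as np


def center_scale_by_batch(expr, batches):
    """Per-gene location/scale adjustment within each batch.

    expr    : array of shape (n_genes, n_samples), e.g. log-scale expression
    batches : sequence of n_samples batch labels (each batch needs >= 2 samples)

    Simplified stand-in for batch adjustment (assumption: no covariates,
    no empirical Bayes shrinkage as in ComBat).
    """
    expr = np.asarray(expr, dtype=float)
    batches = np.asarray(batches)
    labels = np.unique(batches)

    # Per-gene targets: overall mean and the average within-batch std.
    grand_mean = expr.mean(axis=1, keepdims=True)
    batch_sds = [expr[:, batches == b].std(axis=1, ddof=1, keepdims=True)
                 for b in labels]
    target_sd = np.mean(batch_sds, axis=0)

    adjusted = np.empty_like(expr)
    for b, sd in zip(labels, batch_sds):
        cols = batches == b
        block = expr[:, cols]
        mu = block.mean(axis=1, keepdims=True)
        safe_sd = np.where(sd == 0, 1.0, sd)  # avoid division by zero
        # Standardize within the batch, then map back to the shared scale.
        adjusted[:, cols] = (block - mu) / safe_sd * target_sd + grand_mean
    return adjusted


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    toy = rng.normal(size=(5, 8))
    toy[:, 4:] += 3.0  # simulated batch offset in the second batch
    labels = ["a"] * 4 + ["b"] * 4
    fixed = center_scale_by_batch(toy, labels)
    # Difference of batch means per gene; close to zero after adjustment.
    print(fixed[:, :4].mean(axis=1) - fixed[:, 4:].mean(axis=1))
```

Note that an adjustment this simple would also remove genuine biological differences if they happen to be confounded with batch, which is one reason model-based tools such as ComBat are generally preferred in practice.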

SUBMITTER: Larsen MJ 

PROVIDER: S-EPMC4101981 | biostudies-literature | 2014

REPOSITORIES: biostudies-literature

Publications

Microarray-based RNA profiling of breast cancer: batch effect removal improves cross-platform consistency.

Larsen MJ, Thomassen M, Tan Q, Sørensen KP, Kruse TA

BioMed Research International, 2014-07-03


Similar Datasets

2015-06-01 | GSE54275 | GEO
| S-EPMC1312314 | biostudies-literature
| S-EPMC2397413 | biostudies-literature
| S-EPMC419626 | biostudies-literature
| S-EPMC4322577 | biostudies-literature
| S-EPMC4736986 | biostudies-literature
| S-EPMC3226184 | biostudies-literature
| S-EPMC10981711 | biostudies-literature
| S-EPMC4001710 | biostudies-literature
2005-04-21 | GSE2458 | GEO