Dataset Information

Direct integration of intensity-level data from Affymetrix and Illumina microarrays improves statistical power for robust reanalysis.


ABSTRACT:

Background

Affymetrix GeneChips and Illumina BeadArrays are the most widely used commercial single-channel gene expression microarrays. Public data repositories are an extremely valuable resource, providing array-derived gene expression measurements from many thousands of experiments. Unfortunately, many of these studies are underpowered, and it is desirable to improve power by combining data from more than one study. We sought to determine whether platform-specific bias precludes direct integration of probe intensity signals for combined reanalysis.

Results

Using Affymetrix and Illumina data from the microarray quality control project, from our own clinical samples, and from additional publicly available datasets, we evaluated several approaches to directly integrate intensity-level expression data from the two platforms. After mapping probe sequences to Ensembl genes, we demonstrate that ComBat and cross-platform normalisation (XPN) significantly outperform mean-centering and distance-weighted discrimination (DWD) in terms of minimising inter-platform variance. In particular, we observed that DWD, a popular method used in a number of previous studies, removed systematic bias at the expense of genuine biological variability, potentially removing legitimate biological differences from integrated datasets.
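
For orientation, mean-centering, the simplest of the four approaches compared above, can be sketched in a few lines. This is an illustrative sketch only and not the authors' implementation: the function name, the gene labels, the platform labels, and the toy log2 intensities are all hypothetical, and it assumes probes have already been mapped to a common gene identifier. It subtracts each gene's per-platform mean so that both platforms are centred on zero for every gene.

```python
from statistics import mean


def mean_center_by_platform(expr, platforms):
    """Subtract each gene's per-platform mean from its samples.

    expr: dict mapping a gene identifier to a list of log2 intensities,
          one value per sample (hypothetical input layout).
    platforms: list of platform labels, one per sample, same order as the values.
    """
    labels = sorted(set(platforms))
    corrected = {}
    for gene, values in expr.items():
        # Mean expression of this gene within each platform
        platform_mean = {
            p: mean(v for v, lab in zip(values, platforms) if lab == p)
            for p in labels
        }
        # Centre every sample on its own platform's mean for this gene
        corrected[gene] = [
            v - platform_mean[lab] for v, lab in zip(values, platforms)
        ]
    return corrected


# Toy data: two genes, two samples per platform (values are made up)
expr = {
    "GENE_A": [8.0, 10.0, 5.0, 7.0],
    "GENE_B": [6.0, 6.0, 9.0, 11.0],
}
platforms = ["affy", "affy", "illumina", "illumina"]

centered = mean_center_by_platform(expr, platforms)
print(centered)
```

Because the correction only shifts each platform's mean, it cannot adjust platform-specific differences in variance, which is one plausible reason a model-based method such as ComBat could do better on inter-platform variance.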

Conclusion

Normalised and batch-corrected intensity-level data from Affymetrix and Illumina microarrays can be directly combined to generate biologically meaningful results with improved statistical power for robust, integrated reanalysis.

SUBMITTER: Turnbull AK 

PROVIDER: S-EPMC3443058 | biostudies-literature | 2012 Aug

REPOSITORIES: biostudies-literature

Publications

Direct integration of intensity-level data from Affymetrix and Illumina microarrays improves statistical power for robust reanalysis.

Arran K Turnbull, Robert R Kitchen, Alexey A Larionov, Lorna Renshaw, J Michael Dixon, Andrew H Sims

BMC Medical Genomics, 2012 Aug 21


Similar Datasets

| S-EPMC1891274 | biostudies-literature
| S-EPMC3220870 | biostudies-literature
| S-EPMC4890042 | biostudies-literature
| S-EPMC4518889 | biostudies-literature
| S-EPMC1557755 | biostudies-literature
| S-EPMC126253 | biostudies-literature
| S-EPMC2453111 | biostudies-literature
| S-EPMC165600 | biostudies-literature
| S-EPMC1929139 | biostudies-literature
| S-EPMC6834440 | biostudies-literature