Dataset Information

The effect of oligonucleotide microarray data pre-processing on the analysis of patient-cohort studies.

ABSTRACT:

Background

Intensity values measured by Affymetrix microarrays have to be both normalized, to allow comparison of different microarrays by removing non-biological variation, and summarized, to generate the final probe set expression values. Various pre-processing techniques, such as dChip, GCRMA, RMA and MAS, have been developed for this purpose. This study assesses the effect of applying different pre-processing methods on the results of analyses of large Affymetrix datasets. By focusing on practical applications of microarray-based research, this study provides insight into the relevance of pre-processing procedures to biology-oriented researchers.

Results

Using two publicly available datasets, i.e., gene-expression data of 285 patients with Acute Myeloid Leukemia (AML, Affymetrix HG-U133A GeneChip) and 42 samples of tumor tissue of the embryonal central nervous system (CNS, Affymetrix HuGeneFL GeneChip), we tested the effect of the four pre-processing strategies mentioned above on (1) expression level measurements, (2) detection of differential expression, (3) cluster analysis and (4) classification of samples. For the AML dataset, the effect of pre-processing is in most cases relatively small compared to other choices made in an analysis, but pre-processing has a more profound effect on the outcome for the CNS dataset. Analyses on individual probe sets, such as testing for differential expression, are affected most; supervised, multivariate analyses such as classification are far less sensitive to pre-processing.

Conclusion

Using two experimental datasets, we show that the choice of pre-processing method has a relatively minor influence on the final analysis outcome of large microarray studies, whereas it can have important effects on the results of a smaller study. The data source (platform, tissue homogeneity, RNA quality) is potentially of greater importance than the choice of pre-processing method.

SUBMITTER: Verhaak RG

PROVIDER: S-EPMC1481623 | biostudies-literature | 2006 Mar

REPOSITORIES: biostudies-literature

Publications

The effect of oligonucleotide microarray data pre-processing on the analysis of patient-cohort studies.

Roel G W Verhaak, Frank J T Staal, Peter J M Valk, Bob Lowenberg, Marcel J T Reinders, Dick de Ridder

BMC Bioinformatics, 2006-03-02

PMID: 16512908

Similar Datasets

A fully scalable online pre-processing algorithm for short oligonucleotide microarray atlases.
| S-EPMC3664815 | biostudies-literature

protGear: A protein microarray data pre-processing suite.
| S-EPMC8114118 | biostudies-literature

Profound effect of normalization on detection of differentially expressed genes in oligonucleotide microarray data analysis.
| S-EPMC126238 | biostudies-literature

Filtering genes to improve sensitivity in oligonucleotide microarray data analysis.
| S-EPMC2018638 | biostudies-literature

Correction of scaling mismatches in oligonucleotide microarray data.
| S-EPMC1508160 | biostudies-literature

A pre-processing pipeline to quantify, visualize, and reduce technical variation in protein microarray studies.
| S-EPMC11849410 | biostudies-literature

Pro-MAP: a robust pipeline for the pre-processing of single channel protein microarray data.
| S-EPMC9733281 | biostudies-literature

Inverse Langmuir method for oligonucleotide microarray analysis.
| S-EPMC2661052 | biostudies-literature

DBNorm: normalizing high-density oligonucleotide microarray data based on distributions.
| S-EPMC5706403 | biostudies-literature

Combined analysis of oligonucleotide microarray data from transgenic and knockout mice identifies direct SREBP target genes.
| S-EPMC218707 | biostudies-literature