Dataset Information

Module-based prediction approach for robust inter-study predictions in microarray data.

ABSTRACT: Traditional genomic prediction models based on individual genes suffer from low reproducibility across microarray studies because they are not robust to expression measurement noise or to gene missingness when genes are matched across platforms. It is common that some of the genes in a prediction model established in a training study cannot be matched to a test study because a different platform was used. The failure of inter-study predictions has severely hindered the clinical applications of microarrays. To overcome the drawbacks of traditional gene-based prediction (GBP) models, we propose a module-based prediction (MBP) strategy via unsupervised gene clustering. K-means clustering is used to group genes sharing similar expression profiles into gene modules, and small modules are merged into their nearest neighbors. A conventional univariate or multivariate feature selection procedure is applied, and a representative gene from each selected module is identified to construct the final prediction model. As a result, the prediction model is portable to any test study as long as some of the genes in each module are present in the test study. We demonstrate that K-means cluster sizes generally follow a multinomial distribution and that the failure probability of inter-study prediction due to missing genes is diminished by merging small clusters into their nearest neighbors. Through simulation and applications to real datasets in inter-study predictions, we show that the proposed MBP provides slightly improved accuracy while being considerably more robust than traditional GBP. Availability: http://www.biostat.pitt.edu/bioinfo/ Contact: ctseng@pitt.edu Supplementary data are available at Bioinformatics online.
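The merging step and its effect on missing-gene failures can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the toy expression profiles, the use of squared Euclidean distance between module centroids as the "nearest neighbor" criterion, and the assumption that each gene is missing from the test platform independently with the same probability are all assumptions made here for illustration. Under that independence assumption, a module of size s is entirely missing with probability p^s, so a model fails if at least one of its modules is entirely missing.

```python
def centroid(profiles):
    """Mean expression profile of a list of equal-length profiles."""
    n = len(profiles)
    return [sum(col) / n for col in zip(*profiles)]


def squared_distance(a, b):
    """Squared Euclidean distance between two profiles."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def merge_small_modules(modules, profiles, min_size):
    """Merge every module smaller than min_size into its nearest module.

    modules:  dict mapping module id -> list of gene names
    profiles: dict mapping gene name -> list of expression values
    Nearness is measured between module centroids (an assumption).
    """
    modules = {k: list(v) for k, v in modules.items()}  # work on a copy
    while len(modules) > 1:
        small = [k for k, genes in modules.items() if len(genes) < min_size]
        if not small:
            break
        src = min(small, key=lambda k: len(modules[k]))  # smallest first
        c_src = centroid([profiles[g] for g in modules[src]])
        dst = min(
            (k for k in modules if k != src),
            key=lambda k: squared_distance(
                c_src, centroid([profiles[g] for g in modules[k]])
            ),
        )
        modules[dst].extend(modules.pop(src))
    return modules


def failure_probability(module_sizes, p_missing):
    """P(at least one module has all of its genes missing), assuming
    independent missingness with probability p_missing per gene."""
    p_all_usable = 1.0
    for size in module_sizes:
        p_all_usable *= 1.0 - p_missing ** size
    return 1.0 - p_all_usable


# Hypothetical toy data: six genes measured in three samples.
profiles = {
    "g1": [1.0, 2.0, 3.0], "g2": [1.1, 2.1, 2.9], "g3": [0.9, 1.9, 3.1],
    "g4": [3.0, 2.0, 1.0], "g5": [3.1, 2.1, 0.9], "g6": [2.8, 1.9, 1.2],
}
modules = {"A": ["g1", "g2", "g3"], "B": ["g4", "g5"], "C": ["g6"]}

merged = merge_small_modules(modules, profiles, min_size=2)
before = failure_probability([len(g) for g in modules.values()], 0.3)
after = failure_probability([len(g) for g in merged.values()], 0.3)
```

In this toy case the singleton module C is absorbed by module B, and `after` is smaller than `before`, because a singleton module fails whenever its one gene is missing while a larger module fails only if all of its genes are missing.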

SUBMITTER: Mi Z

PROVIDER: S-EPMC2951088 | biostudies-other | 2010 Oct

REPOSITORIES: biostudies-other

Publications

Module-based prediction approach for robust inter-study predictions in microarray data.

Zhibao Mi, Kui Shen, Nan Song, Chunrong Cheng, Chi Song, Naftali Kaminski, George C. Tseng

Bioinformatics (Oxford, England), 2010-08-17, issue 20

PMID: 20719761

Similar Datasets

A study of inter-lab and inter-platform agreement of DNA microarray data.
| S-EPMC1142313 | biostudies-literature

Microarray-based cancer prediction using soft computing approach.
| S-EPMC2730177 | biostudies-literature

PETModule: a motif module based approach for enhancer target gene prediction.
| S-EPMC4951774 | biostudies-literature

The dChip survival analysis module for microarray data.
| S-EPMC3068974 | biostudies-literature

rapmad: Robust analysis of peptide microarray data.
| S-EPMC3174949 | biostudies-literature

Ratio adjustment and calibration scheme for gene-wise normalization to enhance microarray inter-study prediction.
| S-EPMC2732320 | biostudies-literature

Robust singular value decomposition analysis of microarray data.
| S-EPMC263735 | biostudies-literature

An effective method for network module extraction from microarray data.
| S-EPMC3426802 | biostudies-literature

Robust microarray meta-analysis identifies differentially expressed genes for clinical prediction.
| S-EPMC3539384 | biostudies-literature

Individualized markers optimize class prediction of microarray data.
| S-EPMC1569876 | biostudies-other