Dataset Information

A statistical framework for Illumina DNA methylation arrays.


ABSTRACT:

Motivation

The Illumina BeadArray is a popular platform for profiling DNA methylation, an important epigenetic event associated with gene silencing and chromosomal instability. However, as a quality control step, current approaches exclude probes and samples from subsequent analysis using an arbitrary detection P-value cutoff, which results in missing observations and information loss. It is desirable to have an approach that uses all of the data while accounting for the differing quality of individual observations.

Results

We first investigate the sources of bias in Illumina Methylation BeadArray data and propose a statistical framework for removing them, based on several positive control samples. We then introduce LumiWCluster, a weighted model-based clustering method for Illumina BeadArray data that systematically weights each observation according to its detection P-value and so avoids discarding subsets of the data. LumiWCluster allows for the discovery of distinct methylation patterns and the automatic selection of informative CpG loci. We demonstrate the advantages of LumiWCluster on two publicly available Illumina GoldenGate Methylation datasets (ovarian cancer and hepatocellular carcinoma).
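
To illustrate the general idea of weighting observations by detection P-value instead of discarding them, here is a minimal Python sketch. It is not the LumiWCluster algorithm, whose likelihood and weighting scheme are not given in this record. It fits a one-dimensional Gaussian mixture by EM, with each observation's contribution scaled by a weight. The function name `weighted_gmm_1d`, the mapping `weight = 1 - detection P-value`, and the toy beta values are all assumptions made for illustration.

```python
import math


def _responsibilities(x, mu, var, pi):
    """E-step: posterior probability of each component for each value."""
    resp = []
    for v in x:
        dens = [p * math.exp(-(v - m) ** 2 / (2 * s)) / math.sqrt(2 * math.pi * s)
                for m, s, p in zip(mu, var, pi)]
        total = sum(dens) or 1e-300  # guard against numerical underflow
        resp.append([d / total for d in dens])
    return resp


def weighted_gmm_1d(x, w, k=2, n_iter=50):
    """Fit a k-component 1-D Gaussian mixture by EM, maximising the
    weighted log-likelihood sum_i w[i] * log(sum_c pi[c] * N(x[i]; mu[c], var[c])).
    """
    n = len(x)
    xs = sorted(x)
    # Deterministic start: means at evenly spaced quantiles, pooled variance.
    mu = [xs[int((c + 0.5) * n / k)] for c in range(k)]
    mean_all = sum(x) / n
    var_all = sum((v - mean_all) ** 2 for v in x) / n
    var = [max(var_all, 1e-6)] * k
    pi = [1.0 / k] * k
    w_total = sum(w)

    for _ in range(n_iter):
        resp = _responsibilities(x, mu, var, pi)
        # M-step: every sufficient statistic is scaled by the weight w[i].
        for c in range(k):
            mass = sum(w[i] * resp[i][c] for i in range(n))
            if mass <= 0:
                continue  # leave an empty component unchanged
            mu[c] = sum(w[i] * resp[i][c] * x[i] for i in range(n)) / mass
            var[c] = max(
                sum(w[i] * resp[i][c] * (x[i] - mu[c]) ** 2 for i in range(n)) / mass,
                1e-6,  # variance floor to avoid degenerate components
            )
            pi[c] = mass / w_total

    # Hard cluster labels from the responsibilities under the final parameters.
    resp = _responsibilities(x, mu, var, pi)
    labels = [max(range(k), key=lambda c: r[c]) for r in resp]
    return mu, var, pi, labels


# Toy methylation beta values (hypothetical): four low, four high, one unreliable.
beta = [0.08, 0.10, 0.12, 0.11, 0.79, 0.82, 0.80, 0.81, 0.35]
detection_p = [0.001] * 8 + [0.9]
# Assumed weight mapping: unreliable observations count less but are retained.
weights = [1.0 - p for p in detection_p]

mu_w, var_w, pi_w, labels_w = weighted_gmm_1d(beta, weights)
mu_u, var_u, pi_u, labels_u = weighted_gmm_1d(beta, [1.0] * len(beta))
print("weighted means:  ", [round(m, 3) for m in mu_w])
print("unweighted means:", [round(m, 3) for m in mu_u])
```

In this toy example the unreliable observation still takes part in the fit, but it pulls the low-methylation cluster mean less than it does in the unweighted fit. A hard detection P-value cutoff would instead drop it entirely.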

Availability

The R package LumiWCluster can be downloaded from http://www.unc.edu/~pfkuan/LumiWCluster.

SUBMITTER: Kuan PF 

PROVIDER: S-EPMC3025715 | biostudies-literature | 2010 Nov

REPOSITORIES: biostudies-literature

Publications

A statistical framework for Illumina DNA methylation arrays.

Kuan PF, Wang S, Zhou X, Chu H

Bioinformatics (Oxford, England), 2010-09-29, issue 22



Similar Datasets

| S-EPMC6791701 | biostudies-literature
| S-EPMC10208159 | biostudies-literature
| S-EPMC4176427 | biostudies-literature
| S-EPMC6518823 | biostudies-literature
| S-EPMC5907140 | biostudies-literature
| S-EPMC6498745 | biostudies-literature
| S-EPMC8078668 | biostudies-literature
| S-EPMC7057447 | biostudies-literature
2016-07-03 | E-GEOD-52635 | biostudies-arrayexpress
| S-EPMC6954393 | biostudies-literature