Dataset Information

A statistical method for excluding non-variable CpG sites in high-throughput DNA methylation profiling.


ABSTRACT: High-throughput DNA methylation arrays are likely to accelerate the pace of methylation biomarker discovery for a wide variety of diseases. A potential problem with a standard set of probes measuring the methylation status of CpG sites across the whole genome is that many sites may not show inter-individual methylation variation among the biosamples for the disease outcome being studied. Inclusion of these so-called "non-variable sites" will increase the risk of false discoveries and reduce statistical power to detect biologically relevant methylation markers. We propose a method to estimate the proportion of non-variable CpG sites and eliminate those sites from further analyses. Our method is illustrated using data obtained by hybridizing DNA extracted from the peripheral blood mononuclear cells of 311 samples to an array assaying 1505 CpG sites. Results showed that a large proportion of the CpG sites did not show inter-individual variation in methylation. Our method resulted in a substantial improvement in association signals between methylation sites and outcome variables while controlling the false discovery rate at the same level.
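The effect described in the abstract (dropping non-variable sites before testing improves association signals at the same false discovery rate) can be illustrated with a small sketch. This is not the authors' estimator: it swaps in a plain standard-deviation cutoff for their method, followed by the standard Benjamini-Hochberg procedure. The function names, the `min_sd` cutoff of 0.05, and the toy methylation values and p-values are all hypothetical and chosen only for illustration.

```python
from statistics import pstdev


def variable_site_indices(beta, min_sd=0.05):
    """Indices of sites whose across-sample SD exceeds min_sd.

    beta is a list of rows, one row of methylation values per CpG site.
    min_sd is a hypothetical cutoff, not a value taken from the paper.
    """
    return [i for i, row in enumerate(beta) if pstdev(row) > min_sd]


def benjamini_hochberg(pvalues, alpha=0.05):
    """Benjamini-Hochberg step-up: True where a test is rejected at FDR alpha."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    # Find the largest rank k such that p_(k) <= (k / m) * alpha.
    k_max = 0
    for rank, idx in enumerate(order, start=1):
        if pvalues[idx] <= rank / m * alpha:
            k_max = rank
    rejected = [False] * m
    for idx in order[:k_max]:
        rejected[idx] = True
    return rejected


# Toy data (made up): 4 sites x 4 samples, with one association p-value per site.
beta = [
    [0.10, 0.80, 0.45, 0.60],  # varies across individuals
    [0.02, 0.02, 0.03, 0.02],  # essentially constant
    [0.97, 0.98, 0.97, 0.98],  # essentially constant
    [0.30, 0.55, 0.20, 0.70],  # varies across individuals
]
pvalues = [0.02, 0.60, 0.75, 0.04]

keep = variable_site_indices(beta)
rejected_all = benjamini_hochberg(pvalues)
rejected_kept = benjamini_hochberg([pvalues[i] for i in keep])

print("sites kept:", keep)
print("discoveries without filtering:", sum(rejected_all))
print("discoveries after filtering:", sum(rejected_kept))
```

In this toy case the two variable sites are not declared significant when all four sites are tested, but both are once the two near-constant sites are removed, because the per-test threshold depends on the number of tests. The filter here looks only at across-sample spread and never at the outcome variable.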

SUBMITTER: Meng H 

PROVIDER: S-EPMC2876131 | biostudies-literature | 2010 May

REPOSITORIES: biostudies-literature

Publications

A statistical method for excluding non-variable CpG sites in high-throughput DNA methylation profiling.

Hailong Meng, Andrew R Joyce, Daniel E Adkins, Priyadarshi Basu, Yankai Jia, Guoya Li, Tapas K Sengupta, Barbara K Zedler, E Lenn Murrelle, Edwin J C G van den Oord

BMC Bioinformatics, 2010-05-05

Similar Datasets

| S-EPMC2613930 | biostudies-literature
| S-EPMC9715628 | biostudies-literature
| S-EPMC4914085 | biostudies-literature
| S-EPMC5836973 | biostudies-literature
| S-EPMC1415217 | biostudies-literature
| S-EPMC3293833 | biostudies-other
| S-EPMC9734657 | biostudies-literature
| S-EPMC5004144 | biostudies-literature
| S-EPMC3936770 | biostudies-literature
| S-EPMC4736656 | biostudies-literature