Dataset Information

Incorporating genetic networks into case-control association studies with high-dimensional DNA methylation data.


ABSTRACT:

Background

In human genetic association studies with high-dimensional gene expression data, it is well known that statistical selection methods that use prior biological network knowledge, such as genetic pathways and signaling pathways, can outperform methods that ignore genetic network structure in terms of true positive selection. In recent epigenetic research on case-control association studies, a relatively large number of statistical methods have been proposed to identify cancer-related CpG sites and their corresponding genes from high-dimensional DNA methylation array data. However, most existing methods are not designed to use genetic network information, even though the methylation levels of genes that are linked in a genetic network tend to be highly correlated.

Results

We propose a new approach that combines data dimension reduction techniques with network-based regularization to identify outcome-related genes in the analysis of high-dimensional DNA methylation data. In simulation studies, we demonstrated that the proposed approach outperforms other statistical methods that do not utilize genetic network information in terms of true positive selection. We also applied it to the 450K DNA methylation array data of the four breast invasive carcinoma subtypes from The Cancer Genome Atlas (TCGA) project.

Conclusions

The proposed variable selection approach can utilize prior biological network information for the analysis of high-dimensional DNA methylation array data. It first captures gene-level signals from multiple CpG sites using a data dimension reduction technique and then performs network-based regularization based on biological network graph information. It can select potentially cancer-related genes and genetic pathways that were missed by existing methods.
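
The two-step idea described above (gene-level summaries first, network-based regularization second) can be illustrated with a minimal sketch. The abstract does not say which dimension reduction technique or which penalty is used, so the choices here are assumptions made only for illustration: the first principal component as the gene-level summary, a sign convention that aligns each score with the gene's mean CpG level, and logistic regression with an L1 plus graph-Laplacian penalty fitted by proximal gradient descent. All function names, parameter values, and the simulated data are hypothetical and are not taken from the publication.

```python
import numpy as np


def gene_level_scores(methylation, cpg_to_gene, n_genes):
    """Step 1 (assumed): summarize each gene's CpG sites by the first principal component.

    Assumes every gene index in range(n_genes) has at least one CpG column.
    """
    scores = np.zeros((methylation.shape[0], n_genes))
    for g in range(n_genes):
        block = methylation[:, cpg_to_gene == g]
        block = block - block.mean(axis=0)            # center each CpG site
        u, s, _ = np.linalg.svd(block, full_matrices=False)
        pc1 = u[:, 0] * s[0]                          # PC1 scores for this gene
        # PC signs are arbitrary; assumed convention: align with the mean CpG level.
        if pc1 @ block.mean(axis=1) < 0:
            pc1 = -pc1
        scores[:, g] = pc1
    return scores


def normalized_laplacian(adjacency):
    """Graph Laplacian L = I - D^{-1/2} A D^{-1/2}; isolated genes get a zero row."""
    deg = adjacency.sum(axis=1)
    inv_sqrt = np.zeros_like(deg, dtype=float)
    inv_sqrt[deg > 0] = 1.0 / np.sqrt(deg[deg > 0])
    identity = np.diag((deg > 0).astype(float))
    return identity - inv_sqrt[:, None] * adjacency * inv_sqrt[None, :]


def fit_network_logistic(x, y, laplacian, lam=0.1, alpha=0.5, n_iter=500):
    """Step 2 (assumed): logistic loss + lam*alpha*|b|_1 + lam*(1-alpha)/2 * b'Lb."""
    n, p = x.shape
    x = (x - x.mean(axis=0)) / x.std(axis=0)          # standardize gene scores
    xa = np.hstack([np.ones((n, 1)), x])
    # Step size from an upper bound on the Lipschitz constant of the smooth part.
    lipschitz = (np.linalg.norm(xa, 2) ** 2 / (4 * n)
                 + lam * (1 - alpha) * np.linalg.norm(laplacian, 2))
    step = 1.0 / lipschitz
    intercept, beta = 0.0, np.zeros(p)
    for _ in range(n_iter):
        prob = 1.0 / (1.0 + np.exp(-(intercept + x @ beta)))
        grad = x.T @ (prob - y) / n + lam * (1 - alpha) * (laplacian @ beta)
        intercept -= step * np.mean(prob - y)         # intercept is not penalized
        z = beta - step * grad
        # Soft-thresholding is the proximal step for the L1 part of the penalty.
        beta = np.sign(z) * np.maximum(np.abs(z) - step * lam * alpha, 0.0)
    return intercept, beta


# Simulated toy data (hypothetical): 6 genes with 3 CpG sites each;
# genes 0 and 1 are linked in the network and both drive case-control status.
rng = np.random.default_rng(0)
n_samples, n_genes = 200, 6
cpg_to_gene = np.repeat(np.arange(n_genes), 3)
latent = rng.normal(size=(n_samples, n_genes))
methylation = latent[:, cpg_to_gene] + 0.3 * rng.normal(size=(n_samples, 18))
y = (latent[:, 0] + latent[:, 1] + 0.5 * rng.normal(size=n_samples) > 0).astype(float)

adjacency = np.zeros((n_genes, n_genes))
adjacency[0, 1] = adjacency[1, 0] = 1.0
adjacency[2, 3] = adjacency[3, 2] = 1.0

scores = gene_level_scores(methylation, cpg_to_gene, n_genes)
laplacian = normalized_laplacian(adjacency)
intercept, beta = fit_network_logistic(scores, y, laplacian)
print("selected genes:", np.flatnonzero(beta))
```

In this sketch the Laplacian term shrinks the coefficients of linked genes toward each other, while the L1 term sets the coefficients of unrelated genes to zero; the balance between the two is controlled by the hypothetical `alpha` parameter.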

SUBMITTER: Kim K 

PROVIDER: S-EPMC6805595 | biostudies-literature | 2019 Oct

REPOSITORIES: biostudies-literature

Publications

Incorporating genetic networks into case-control association studies with high-dimensional DNA methylation data.

Kim Kipoong, Sun Hokeun

BMC Bioinformatics, 2019-10-22, issue 1


Similar Datasets

| S-EPMC3348559 | biostudies-literature
| S-EPMC9477535 | biostudies-literature
| S-EPMC8620394 | biostudies-literature
| S-EPMC1866832 | biostudies-literature
| S-EPMC3025519 | biostudies-literature
| S-EPMC3133924 | biostudies-literature
| S-EPMC3260479 | biostudies-literature
| S-EPMC4058572 | biostudies-literature
| S-EPMC2483716 | biostudies-literature
| S-EPMC5870885 | biostudies-literature