Dataset Information

Critical evaluation of linear regression models for cell-subtype specific methylation signal from mixed blood cell DNA.


ABSTRACT: Epigenome-wide association studies (EWASs) seek to identify DNA methylation sites associated with clinical outcomes. Differences in observed methylation between specific cell-subtypes are often of interest; however, available samples often comprise a mixture of cells. To date, cell-subtype specific estimates have been obtained from mixed-cell DNA data using linear regression models, but the accuracy of such estimates has not been critically assessed. We evaluated linear regression performance for cell-subtype specific methylation estimation using a 450K methylation array dataset of both mixed-cell and cell-subtype sorted samples from six healthy males. CpGs associated with each cell-subtype were first identified using t-tests between groups of cell-subtype sorted samples. Reduced panels of reliably accurate CpGs were then identified from mixed-cell samples using an accuracy heuristic (D). Performance was assessed by comparing cell-subtype specific estimates from mixed-cell samples with the corresponding cell-sorted means, using the mean absolute error (MAE) and the coefficient of determination (R2). At the cell-subtype level, methylation levels at 3272 CpGs could be estimated to within a MAE of 5% of the expected value. The cell-subtypes with the highest accuracy were CD56+ NK (R2 = 0.56) and CD8+ T (R2 = 0.48), where 23% of sites were accurately estimated. Hierarchical clustering and pathway enrichment analysis confirmed the biological relevance of the panels. Our results suggest that linear regression for cell-subtype specific methylation estimation is accurate only for some cell-subtypes at a small fraction of cell-associated sites, but may be applicable to EWASs of disease traits with a blood-based pathology. Although sample size was a limitation in this study, we suggest that alternative statistical methods will provide the greatest performance improvements.
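The two accuracy measures named in the abstract, MAE and R2, can be sketched as follows. This is an illustrative sketch only, not the authors' code: the function names, the example beta values, and the reading of "5%" as 0.05 on the methylation beta scale are all assumptions, and R2 is taken here as 1 - SS_res/SS_tot with the cell-sorted means as the reference, which may differ from the definition used in the paper.

```python
from statistics import fmean


def mean_absolute_error(expected, estimated):
    """Mean of |estimate - expected| across CpG sites."""
    if len(expected) != len(estimated) or not expected:
        raise ValueError("inputs must be non-empty and of equal length")
    return fmean(abs(e - o) for o, e in zip(expected, estimated))


def r_squared(expected, estimated):
    """Coefficient of determination, computed as 1 - SS_res / SS_tot.

    `expected` plays the role of the cell-sorted means (the reference),
    `estimated` the regression-based estimates from mixed-cell samples.
    """
    if len(expected) != len(estimated) or not expected:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_expected = fmean(expected)
    ss_res = sum((o - e) ** 2 for o, e in zip(expected, estimated))
    ss_tot = sum((o - mean_expected) ** 2 for o in expected)
    if ss_tot == 0:
        raise ValueError("reference values have zero variance")
    return 1.0 - ss_res / ss_tot


# Hypothetical beta values for four CpGs in one cell-subtype
# (made up for illustration, not taken from the dataset).
sorted_means = [0.10, 0.40, 0.70, 0.90]      # cell-sorted reference
mixed_estimates = [0.12, 0.37, 0.74, 0.85]   # estimates from mixed cells

mae = mean_absolute_error(sorted_means, mixed_estimates)
r2 = r_squared(sorted_means, mixed_estimates)

# Assumed threshold: "5%" read as 0.05 on the beta scale.
within_threshold = mae <= 0.05

print(f"MAE = {mae:.3f}, R2 = {r2:.3f}, within 5%: {within_threshold}")
```

With real data, `sorted_means` would hold the per-CpG means from the cell-sorted samples and `mixed_estimates` the corresponding regression estimates for the same cell-subtype.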

SUBMITTER: Kennedy DW 

PROVIDER: S-EPMC6301777 | biostudies-literature | 2018

REPOSITORIES: biostudies-literature

Publications

Critical evaluation of linear regression models for cell-subtype specific methylation signal from mixed blood cell DNA.

Kennedy DW, White NM, Benton MC, Fox A, Scott RJ, Griffiths LR, Mengersen K, Lea RA

PLoS One, 2018-12-20, issue 12


Similar Datasets

| S-EPMC5667718 | biostudies-literature
| S-EPMC10022788 | biostudies-literature
| S-EPMC2665800 | biostudies-literature
| S-EPMC7419003 | biostudies-literature
| S-EPMC4143715 | biostudies-literature
| S-EPMC4026175 | biostudies-other
| S-EPMC3903419 | biostudies-literature
| S-EPMC4818520 | biostudies-literature
| S-EPMC2883299 | biostudies-literature
| S-EPMC5217786 | biostudies-literature