Accurate age prediction from blood using a small set of DNA methylation sites and a cohort-based machine learning algorithm
ABSTRACT: Chronological age prediction from DNA methylation sheds light on human aging, indicates poor health and predicts lifespan. Previous studies developed methylation clocks based on linear regression models on methylation array data. While accurate, these models are limited to fixed-rate changes in methylation levels with age. Moreover, the high cost of methylation arrays, compared to targeted-PCR sequencing, hinders widespread use of such predictors. We present an AI-based alternative termed GP-age, which uses a non-parametric approach based on Gaussian Process Regression on a large cohort of ~12K blood methylomes. Given a new blood sample, its methylation levels are compared to those of the cohort samples, which are then weighted to predict the age of the query sample. Using only 30 CpG sites, our approach outperforms state-of-the-art methylation clocks that use hundreds of sites, with a median error of 2.1 years (on held-out data). Our model was also applied to sequencing-based data, yielding highly accurate predictions. Overall, we provide an accessible alternative to current array-based methylation clocks, with future applications in aging research, forensic profiling, and monitoring epigenetic processes in transplantation medicine and cancer.
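The cohort-weighting idea described in the abstract can be illustrated with a minimal Gaussian Process Regression sketch. This is not the GP-age implementation: the RBF kernel, the hyperparameters (`length_scale`, `noise`), the function names, and the five-sample, three-site toy cohort are all assumptions chosen only to show how a query sample's similarity to cohort samples turns into per-sample weights on their ages.

```python
import math

def rbf_kernel(x, y, length_scale):
    """Similarity between two methylation profiles (assumed RBF kernel)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-0.5 * sq_dist / length_scale ** 2)

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x

def gp_predict_age(cohort_meth, cohort_ages, query_meth,
                   length_scale=0.2, noise=1e-4):
    """GP posterior mean: cohort mean age plus a weighted sum of
    each cohort sample's deviation from that mean."""
    n = len(cohort_ages)
    mean_age = sum(cohort_ages) / n
    # Cohort-vs-cohort kernel matrix with a noise term on the diagonal.
    K = [[rbf_kernel(cohort_meth[i], cohort_meth[j], length_scale)
          + (noise if i == j else 0.0) for j in range(n)] for i in range(n)]
    # Similarity of the query sample to every cohort sample.
    k_star = [rbf_kernel(query_meth, x, length_scale) for x in cohort_meth]
    # K is symmetric, so these are the weights (K + noise*I)^-1 k_star.
    weights = solve(K, k_star)
    pred = mean_age + sum(w * (a - mean_age)
                          for w, a in zip(weights, cohort_ages))
    return pred, weights

# Hypothetical toy cohort: 5 samples x 3 CpG sites (methylation fractions).
cohort_meth = [[0.10, 0.80, 0.30], [0.25, 0.65, 0.40], [0.40, 0.50, 0.50],
               [0.55, 0.35, 0.60], [0.70, 0.20, 0.70]]
cohort_ages = [20, 35, 50, 65, 80]

pred, weights = gp_predict_age(cohort_meth, cohort_ages, [0.325, 0.575, 0.45])
print(f"predicted age: {pred:.1f}")
print("cohort weights:", [round(w, 3) for w in weights])
```

Because the prediction is built from kernel similarities rather than a fixed slope per site, the relationship between methylation and age does not have to be linear, which is the limitation of linear-regression clocks that the abstract points out.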
ORGANISM(S): Homo sapiens
PROVIDER: GSE207605 | GEO | 2023/01/20
REPOSITORIES: GEO