Dataset Information

Non-linear machine learning models incorporating SNPs and PRS improve polygenic prediction in diverse human populations.


ABSTRACT: Polygenic risk scores (PRS) are commonly used to quantify inherited susceptibility to a trait, yet they fail to account for non-linear and interaction effects between single nucleotide polymorphisms (SNPs). We address this with a machine learning approach, validated on nine complex phenotypes in a multi-ancestry population. We use an ensemble method of SNP selection followed by gradient boosted trees (XGBoost) to allow for non-linearities and interaction effects. We compare our results to the standard linear PRS model developed using PRSice, LDpred2, and lassosum2. Including a PRS as a feature in an XGBoost model increases the percentage of variance explained, relative to the standard linear PRS model, by 22% for height, 27% for HDL cholesterol, 43% for body mass index, 50% for sleep duration, 58% for systolic blood pressure, 64% for total cholesterol, 66% for triglycerides, 77% for LDL cholesterol, and 100% for diastolic blood pressure. Models trained on the multi-ancestry population perform similarly to models trained on specific racial/ethnic groups and are consistently superior to the standard linear PRS models. This work demonstrates an effective method to account for non-linearities and interaction effects in genetics-based prediction models.
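
The relative increases quoted in the abstract compare the variance explained by two models on the same phenotype. The following is a minimal sketch of that arithmetic in plain Python, not the authors' evaluation code. The function names `variance_explained` and `relative_increase` and the numeric values are assumptions chosen for illustration. The abstract reports only relative gains, so the absolute values below are hypothetical.

```python
from statistics import fmean


def variance_explained(y_true, y_pred):
    """Fraction of variance in y_true explained by y_pred (1 - SSE / SST)."""
    mean_y = fmean(y_true)
    ss_total = sum((y - mean_y) ** 2 for y in y_true)
    ss_resid = sum((y - p) ** 2 for y, p in zip(y_true, y_pred))
    return 1.0 - ss_resid / ss_total


def relative_increase(pve_baseline, pve_model):
    """Gain of pve_model over pve_baseline, as a fraction of the baseline."""
    if pve_baseline <= 0:
        raise ValueError("baseline variance explained must be positive")
    return (pve_model - pve_baseline) / pve_baseline


# Hypothetical values for illustration only; NOT taken from the paper.
pve_linear_prs = 0.10     # assumed: linear PRS explains 10% of trait variance
pve_xgb_with_prs = 0.122  # assumed: XGBoost with PRS as a feature explains 12.2%

gain = relative_increase(pve_linear_prs, pve_xgb_with_prs)
print(f"relative increase: {gain:.0%}")
```

Under this definition, a relative increase of 100%, as reported for diastolic blood pressure, means the variance explained doubles relative to the linear PRS baseline. It says nothing about the absolute level of either model.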

SUBMITTER: Elgart M 

PROVIDER: S-EPMC9395509 | biostudies-literature | 2022 Aug

REPOSITORIES: biostudies-literature

Similar Datasets

S-EPMC5726434 | biostudies-literature
S-EPMC9117455 | biostudies-literature
S-EPMC9351615 | biostudies-literature
S-EPMC9458667 | biostudies-literature
S-EPMC8372543 | biostudies-literature
S-EPMC5627249 | biostudies-literature
S-EPMC11346299 | biostudies-literature
S-EPMC11326295 | biostudies-literature
S-EPMC11323637 | biostudies-literature
S-EPMC11532986 | biostudies-literature