Dataset Information

Incorporating functional priors improves polygenic prediction accuracy in UK Biobank and 23andMe data sets.

ABSTRACT: Polygenic risk prediction is a widely investigated topic because of its promising clinical applications. Genetic variants in functional regions of the genome are enriched for complex trait heritability. Here, we introduce a method for polygenic prediction, LDpred-funct, that leverages trait-specific functional priors to increase prediction accuracy. We fit priors using the recently developed baseline-LD model, including coding, conserved, regulatory, and LD-related annotations. We analytically estimate posterior mean causal effect sizes and then use cross-validation to regularize these estimates, improving prediction accuracy for sparse architectures. We applied LDpred-funct to predict 21 highly heritable traits in the UK Biobank (avg N = 373 K as training data). LDpred-funct attained a +4.6% relative improvement in average prediction accuracy (avg prediction R² = 0.144; highest R² = 0.413 for height) compared to SBayesR (the best method that does not incorporate functional information). For height, meta-analyzing training data from UK Biobank and 23andMe cohorts (N = 1107 K) increased prediction R² to 0.431. Our results show that incorporating functional priors improves polygenic prediction accuracy, consistent with the functional architecture of complex traits.
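
As a rough illustration of what "posterior mean causal effect sizes" under per-variant priors can look like, the following sketch applies textbook normal-normal shrinkage to marginal effect estimates. It is not the LDpred-funct implementation: it assumes independent variants (no LD), standardized genotypes and phenotypes (so the sampling variance of each estimate is about 1/N), and omits the cross-validation step. The function name `posterior_mean_no_ld` and all input values are hypothetical.

```python
def posterior_mean_no_ld(beta_hat, prior_var, n):
    """Shrink marginal effect estimates toward zero.

    Assumes beta_j ~ N(0, prior_var_j) and beta_hat_j | beta_j ~ N(beta_j, 1/n),
    with variants treated as independent (a simplifying assumption).
    """
    sampling_var = 1.0 / n
    return [v / (v + sampling_var) * b for b, v in zip(beta_hat, prior_var)]


# Hypothetical inputs: two variants with the same marginal estimate but
# different prior variances (e.g. one in a functionally enriched annotation).
beta_hat = [0.010, 0.010, -0.004]
prior_var = [1e-5, 1e-7, 1e-6]
post = posterior_mean_no_ld(beta_hat, prior_var, n=373_000)
```

In this toy setting, a variant with a larger prior variance retains more of its marginal estimate, which is the intuition behind letting functional annotations inform the prior.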

SUBMITTER: Marquez-Luna C

PROVIDER: S-EPMC8523709 | biostudies-literature | 2021 Oct

REPOSITORIES: biostudies-literature

Publications

Incorporating functional priors improves polygenic prediction accuracy in UK Biobank and 23andMe data sets.

Carla Márquez-Luna, Steven Gazal, Po-Ru Loh, Samuel S Kim, Nicholas Furlotte, Adam Auton, Alkes L Price

Nature Communications, 2021-10-18, issue 1

PMID: 34663819

Similar Datasets

Polygenic Risk Score Is Associated With Intraocular Pressure and Improves Glaucoma Prediction in the UK Biobank Cohort.
| S-EPMC6450641 | biostudies-literature

Incorporating kernelized multi-omics data improves the accuracy of genomic prediction.
| S-EPMC9490992 | biostudies-literature

Incorporating methylation genome information improves prediction accuracy for drug treatment responses.
| S-EPMC6157255 | biostudies-literature

Polygenic prediction via Bayesian regression and continuous shrinkage priors.
| S-EPMC6467998 | biostudies-literature

Incorporating European GWAS findings improve polygenic risk prediction accuracy of breast cancer among East Asians.
| S-EPMC8372543 | biostudies-literature

Polygenic prediction of major depressive disorder and related traits in African ancestries UK Biobank participants.
| S-EPMC11649553 | biostudies-literature

Integrative polygenic risk score improves the prediction accuracy of complex traits and diseases.
| S-EPMC9980241 | biostudies-literature

External Validation of Risk Prediction Models Incorporating Common Genetic Variants for Incident Colorectal Cancer Using UK Biobank.
| S-EPMC7610623 | biostudies-literature

Development of a Polygenic Risk Score for Metabolic Dysfunction-Associated Steatotic Liver Disease Prediction in UK Biobank.
| S-EPMC11765347 | biostudies-literature

Accurate and Scalable Construction of Polygenic Scores in Large Biobank Data Sets.
| S-EPMC7212266 | biostudies-literature