Dataset Information

Profiling and Leveraging Relatedness in a Precision Medicine Cohort of 92,455 Exomes.

ABSTRACT: Large-scale human genetics studies are ascertaining increasing proportions of populations as they continue growing in both number and scale. As a result, the amount of cryptic relatedness within these study cohorts is growing rapidly and has significant implications on downstream analyses. We demonstrate this growth empirically among the first 92,455 exomes from the DiscovEHR cohort and, via a custom simulation framework we developed called SimProgeny, show that these measures are in line with expectations given the underlying population and ascertainment approach. For example, within DiscovEHR we identified ~66,000 close (first- and second-degree) relationships, involving 55.6% of study participants. Our simulation results project that >70% of the cohort will be involved in these close relationships, given that DiscovEHR scales to 250,000 recruited individuals. We reconstructed 12,574 pedigrees by using these relationships (including 2,192 nuclear families) and leveraged them for multiple applications. The pedigrees substantially improved the phasing accuracy of 20,947 rare, deleterious compound heterozygous mutations. Reconstructed nuclear families were critical for identifying 3,415 de novo mutations in ~1,783 genes. Finally, we demonstrate the segregation of known and suspected disease-causing mutations, including a tandem duplication that occurs in LDLR and causes familial hypercholesterolemia, through reconstructed pedigrees. In summary, this work highlights the prevalence of cryptic relatedness expected among large healthcare population-genomic studies and demonstrates several analyses that are uniquely enabled by large amounts of cryptic relatedness.
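
The abstract reports the share of participants involved in close relationships and the grouping of those relationships into pedigrees. As a rough illustration only, the sketch below groups pairwise relationships into family networks (connected components) and computes the fraction of a cohort involved. It is not the authors' pedigree reconstruction method, which the abstract does not describe; the function name `group_into_families`, the sample IDs, and the cohort size are all hypothetical.

```python
from collections import defaultdict


def group_into_families(pairs):
    """Group individuals into family networks, i.e. connected components of
    the graph whose edges are pairwise close relationships (assumed input)."""
    parent = {}

    def find(x):
        # Union-find lookup with path halving.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in pairs:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a  # merge the two networks

    families = defaultdict(set)
    for person in list(parent):
        families[find(person)].add(person)
    return sorted(sorted(members) for members in families.values())


# Hypothetical toy data: three inferred relationships in a cohort of 8 samples.
pairs = [("S1", "S2"), ("S2", "S3"), ("S4", "S5")]
families = group_into_families(pairs)

cohort_size = 8  # assumed total number of sequenced participants
involved = sum(len(members) for members in families)
fraction_involved = involved / cohort_size

print(families)
print(f"{fraction_involved:.1%} of the toy cohort has a close relative")
```

Connected components only say who belongs to the same family network; resolving the actual pedigree structure (who is parent, sibling, or grandparent of whom) requires relationship degrees and additional information that this sketch does not model.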

SUBMITTER: Staples J

PROVIDER: S-EPMC5986700 | biostudies-literature | 2018 May

REPOSITORIES: biostudies-literature

Publications

Profiling and Leveraging Relatedness in a Precision Medicine Cohort of 92,455 Exomes.

Staples J, Maxwell EK, Gosalia N, Gonzaga-Jauregui C, Snyder C, Hawes A, Penn J, Ulloa R, Bai X, Lopez AE, Van Hout CV, O'Dushlaine C, Teslovich TM, McCarthy SE, Balasubramanian S, Kirchner HL, Leader JB, Murray MF, Ledbetter DH, Shuldiner AR, Yancoupolos GD, Dewey FE, Carey DJ, Overton JD, Baras A, Habegger L, Reid JG

American Journal of Human Genetics, 2018-05-01, issue 5

PMID: 29727688

Similar Datasets

Project description:

Background: The COVID-19 pandemic has highlighted the urgency of addressing an epidemic of obesity and associated inflammatory illnesses. Previous studies have demonstrated that interactions between single-nucleotide polymorphisms (SNPs) and lifestyle interventions such as food and exercise may vary metabolic outcomes, contributing to obesity. However, there is a paucity of research relating outcomes from digital therapeutics to the inclusion of genetic data in care interventions.

Objective: This study aims to describe and model the weight loss of participants enrolled in a precision digital weight loss program informed by the machine learning analysis of their data, including genomic data. It was hypothesized that weight loss models would exhibit a better fit when incorporating genomic data versus demographic and engagement variables alone.

Methods: A cohort of 393 participants enrolled in Digbi Health's personalized digital care program for 120 days was analyzed retrospectively. The care protocol used participant data to inform precision coaching by mobile app and personal coach. Linear regression models were fit of weight loss (pounds lost and percentage lost) as a function of demographic and behavioral engagement variables. Genomic-enhanced models were built by adding 197 SNPs from participant genomic data as predictors and refitted using Lasso regression on SNPs for variable selection. Success or failure logistic regression models were also fit with and without genomic data.

Results: Overall, 72.0% (n=283) of the 393 participants in this cohort lost weight, whereas 17.3% (n=68) maintained stable weight. A total of 142 participants lost 5% bodyweight within 120 days. Models described the impact of demographic and clinical factors, behavioral engagement, and genomic risk on weight loss. Incorporating genomic predictors improved the mean squared error of weight loss models (pounds lost and percent) from 70 to 60 and 16 to 13, respectively. The logistic model improved the pseudo R² value from 0.193 to 0.285. Gender, engagement, and specific SNPs were significantly associated with weight loss. SNPs within genes involved in metabolic pathways processing food and regulating fat storage were associated with weight loss in this cohort: rs17300539_G (insulin resistance and monounsaturated fat metabolism), rs2016520_C (BMI, waist circumference, and cholesterol metabolism), and rs4074995_A (calcium-potassium transport and serum calcium levels). The models described greater average weight loss for participants with more risk alleles. Notably, coaching for dietary modification was personalized to these genetic risks.

Conclusions: Including genomic information when modeling outcomes of a digital precision weight loss program greatly enhanced the model accuracy. Interpretable weight loss models indicated the efficacy of coaching informed by participants' genomic risk, accompanied by active engagement of participants in their own success. Although large-scale validation is needed, our study preliminarily supports precision dietary interventions for weight loss using genetic risk, with digitally delivered recommendations alongside health coaching to improve intervention efficacy.
