Comprehensive Allele Genotyping in Critical Pharmacogenes Reduces Residual Clinical Risk in Diverse Populations.
ABSTRACT: Genomic-guided pharmaceutical prescribing is increasingly recognized as an important clinical application of genetics. Accurate genotyping of pharmacogenomic (PGx) genes can be difficult, owing to their complex genetic architecture involving combinations of single-nucleotide polymorphisms and structural variation. Here, we introduce the Helix PGx database, an open-source star allele, genotype, and resulting metabolic phenotype frequency database for CYP2C9, CYP2C19, CYP2D6, and CYP4F2, based on short-read sequencing of >86,000 unrelated individuals enrolled in the Helix DNA Discovery Project. The database is annotated using a pipeline that is clinically validated against a broad range of alleles and designed to call CYP2D6 structural variants with high (98%) accuracy. We find that CYP2D6 has greater allelic diversity than the other genes, manifested in both a long tail of low-frequency star alleles and a disproportionate fraction (36%) of all novel predicted loss-of-function variants identified. Across genes, we observe that many alleles that are rare (<0.1% frequency) in the overall cohort have a 10-fold higher frequency in one or more subgroups with non-European genetic ancestry. Extending these PGx genotypes to predicted metabolic phenotypes, we demonstrate that >90% of the cohort harbors a high-risk variant in one of the four pharmacogenes. Based on the recorded prescriptions for >30,000 individuals in the Healthy Nevada Project, combined with predicted PGx metabolic phenotypes, we anticipate that standard-of-care screening of these four pharmacogenes could impact nearly half of the general population.
Project description:We compared whole exome sequencing (WES, n = 176 patients) and whole genome sequencing (WGS, n = 68) and clinical genotyping (DMET array-based approach) for interrogating 13 genes with Clinical Pharmacogenetics Implementation Consortium (CPIC) guidelines. We focused on 127 CPIC important variants: 103 single nucleotide variations (SNV), 21 insertion/deletions (Indel), HLA-B alleles, and two CYP2D6 structural variations. WES and WGS provided interrogation of nonoverlapping sets of 115 SNV/Indels with call rate >98%. Among 68 loci interrogated by both WES and DMET, 64 loci (94.1%, confidence interval [CI]: 85.6-98.4%) showed no discrepant genotyping calls. Among 66 loci interrogated by both WGS and DMET, 63 loci (95.5%, CI: 87.2-99.0%) showed no discrepant genotyping calls. In conclusion, even without optimization to interrogate pharmacogenetic variants, WES and WGS displayed potential to provide reliable interrogation of most pharmacogenes and further validation of genome sequencing in a clinical lab setting is warranted.
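The concordance intervals quoted above (64 of 68 loci, and 63 of 66 loci) can be approximated with an exact binomial interval. The abstract does not say which interval method was used, so the Clopper-Pearson method below is an assumption, and the function names are illustrative only. This is a minimal standard-library sketch, not the authors' analysis code.

```python
from math import comb


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def _bisect_decreasing(f, lo=0.0, hi=1.0, iters=100):
    """Find the root of a function that decreases from positive to negative on [lo, hi]."""
    for _ in range(iters):
        mid = (lo + hi) / 2
        if f(mid) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def clopper_pearson(k, n, alpha=0.05):
    """Exact (Clopper-Pearson) two-sided interval for k successes in n trials.

    Assumption: the source does not name its CI method; this is one common choice.
    """
    # Lower bound: the p at which P(X >= k) equals alpha / 2.
    if k == 0:
        lower = 0.0
    else:
        lower = _bisect_decreasing(
            lambda p: alpha / 2 - (1 - binom_cdf(k - 1, n, p))
        )
    # Upper bound: the p at which P(X <= k) equals alpha / 2.
    if k == n:
        upper = 1.0
    else:
        upper = _bisect_decreasing(lambda p: binom_cdf(k, n, p) - alpha / 2)
    return lower, upper


if __name__ == "__main__":
    for label, k, n in [("WES vs DMET", 64, 68), ("WGS vs DMET", 63, 66)]:
        lo, hi = clopper_pearson(k, n)
        print(f"{label}: {k}/{n} = {k / n:.1%}, 95% CI {lo:.1%} to {hi:.1%}")
```

Under this assumption the computed bounds should land close to the reported 85.6-98.4% and 87.2-99.0% ranges. Other interval methods (for example Wilson) would give slightly different limits.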
Project description:Genotyping-by-Sequencing (GBS) is an excellent tool for characterising genetic variation between plant genomes. To date, its use has been reported only for genotyping of single individuals. However, there are many applications where resolving allele frequencies within populations on a genome-wide scale would be very powerful; examples include the breeding of outbreeding species, varietal protection in outbreeding species, and monitoring changes in population allele frequencies. This motivated us to test the potential of GBS to evaluate allele frequencies within populations. Perennial ryegrass is an outbreeding species, and breeding programs are based upon selection on populations. We tested two restriction enzymes for their efficiency in complexity reduction of the perennial ryegrass genome. The resulting profiles have been termed Genome Wide Allele Frequency Fingerprints (GWAFFs), and we have shown how these fingerprints can be used to distinguish between plant populations. Even at current costs and throughput, using sequencing to directly evaluate populations on a genome-wide scale is viable. GWAFFs should find many applications, from varietal development in outbreeding species right through to playing a role in protecting plant breeders' rights.
Project description:We have developed a publicly accessible database (ALFRED, the ALlele FREquency Database) that catalogues allele frequency data for a wide range of population samples and DNA polymorphisms. This database is web-accessible through our laboratory (Kidd Lab) Web site: http://info.med.yale.edu/genetics/kkidd. ALFRED currently contains data on 60 populations and 156 genetic systems including single nucleotide polymorphisms (SNPs), short tandem repeat polymorphisms (STRPs), variable number of tandem repeats (VNTRs) and insertion-deletion polymorphisms. While data are not available for all population-DNA polymorphism combinations, over 2000 allele frequency tables have been entered. Our database is designed (i) to address our specific research requirements as well as broader scientific objectives; (ii) to allow researchers and interested educators to easily navigate and retrieve data of interest to them; and (iii) to integrate links to other related public databases such as dbSNP, GenBank and PubMed.
Project description:Pharmacogenetic (PGx) testing has not been well adopted in current clinical practice. The phenotypic distribution of clinically relevant pharmacogenes remains to be fully characterized in large population cohorts. In addition, no study has explored actionable PGx alleles in the East Asian population at a large scale. This study comprehensively analyzed 14 actionable pharmacogene diplotypes and phenotypes in 172,854 Taiwanese Han individuals by using their genotype data. Furthermore, we analyzed data from electronic medical records to investigate the effect of the actionable phenotypes on the individuals. The PGx phenotype frequencies were comparable between our cohort and the East Asian population. Overall, 99.9% of the individuals harbored at least one actionable PGx phenotype, and 29% of them have been prescribed a drug to which they may exhibit an atypical response. Our findings can facilitate the clinical application of PGx testing and the optimization of treatment and dosage individually.
Project description:DNA methylation is known to be the most stable epigenetic modification and has been extensively studied in relation to cell differentiation, development, X chromosome inactivation and disease. Allele-specific DNA methylation (ASM) is a well-established mechanism for genomic imprinting and regulates imprinted gene expression. Previous studies have confirmed that certain special regions with ASM are susceptible and closely related to human carcinogenesis and plant development. In addition, recent studies have proven ASM to be an effective tumour marker. However, research on the functions of ASM in diseases and development is still extremely scarce. Here, we collected 4400 BS-Seq datasets and 1598 corresponding RNA-Seq datasets from 47 species, including human and mouse, to establish a comprehensive ASM database. We obtained the data on DNA methylation level, ASM and allele-specific expressed genes (ASEGs) and further analysed the ASM/ASEG distribution patterns of these species. In-depth ASM distribution analysis and differential methylation analysis conducted in nine cancer types showed results consistent with the reported changes in ASM in key tumour genes and revealed several potential ASM tumour-related genes. Finally, integrating these results, we constructed the first well-resourced and comprehensive ASM database for 47 species (ASMdb, www.dna-asmdb.com).
Project description:This dataset contains DNase-seq data and CTCF ChIP-seq data for 6 lymphoblastoid cell lines. There are 3 cell lines from a YRI trio and 3 lines from a CEU trio (HapMap GM19238, GM19239, GM19240, GM12891, GM12892, GM12878). For data usage terms and conditions, please refer to http://www.genome.gov/27528022 and http://www.genome.gov/Pages/Research/ENCODE/ENCODEDataReleasePolicyFinal2008.pdf DNase-seq and ChIP-seq data from each of the 6 cell lines.
Project description:The implementation of pharmacogenetics (PGx) is now one of the main milestones of precision medicine, with the aim of achieving safer and more effective therapies. Nevertheless, the implementation of PGx diagnostics is extremely slow and unequal worldwide, in part due to a lack of ethnic PGx information. We analysed genetic data from 3006 Spanish individuals obtained by different high-throughput (HT) techniques. Allele frequencies were determined in our population for the 21 main actionable PGx genes associated with therapeutic changes. We found that 98% of the Spanish population harbours at least one allele associated with a therapeutic change and, thus, a therapeutic change would be needed for a mean of 3.31 of the 64 associated drugs. We also identified 326 putative deleterious variants not previously related to PGx in 18 of the 21 main PGx genes evaluated, and a total of 7122 putative deleterious variants across the 1045 PGx genes described. Additionally, we compared the main HT diagnostic techniques, revealing that, after whole genome sequencing, genotyping with the PGx HT array is the most suitable solution for PGx diagnostics. Finally, all this information was integrated in the Collaborative Spanish Variant Server to be available to and updated by the scientific community.
Project description:ALFRED (the ALlele FREquency Database) is designed to store and disseminate frequencies of alleles at human polymorphic sites for multiple populations, primarily for the population genetics and molecular anthropology communities. Currently ALFRED has information on over 180 polymorphic sites for more than 70 populations. Since our initial release of the database we have focussed on increasing the quantity and quality of data, making reciprocal links between ALFRED and other related databases, and providing useful tools to make the data more comprehensible to the end user. ALFRED is accessible from the Kidd Lab home page (http://info.med.yale.edu/genetics/kkidd/) or from ALFRED directly (http://alfred.med.yale.edu/alfred/index.asp).