Project description:Genome-wide association studies have identified over 150 loci that increase prostate cancer risk. However, few causal variants and their regulatory mechanisms have been characterized. In this study, we used our previously developed single-nucleotide polymorphisms sequencing (SNPs-seq) technology to test allele-dependent protein binding at 903 SNP sites covering 28 genomic regions. All selected SNPs showed significant cis-association with at least one nearby gene. After preparing nuclear extract from the LNCaP cell line, we first incubated the extract with a dsDNA oligo pool to allow protein-DNA binding. We then sequenced the protein-bound oligos. SNPs-seq analysis showed protein-binding differences (>1.5-fold) between reference and variant alleles in 380 (42%) of 903 SNPs with androgen treatment and 403 (45%) of 903 SNPs without treatment. From these significant SNPs, we performed a database search and narrowed the set down to 74 promising SNPs. To validate this initial finding, we performed electrophoretic mobility shift assays for two SNPs (rs12246440 and rs7077275) at the CTBP2 locus and one SNP (rs113082846) at the NCOA4 locus. All three SNPs showed allele-dependent protein-binding differences that were consistent with the SNPs-seq results. Finally, clinical association analysis of the two candidate genes showed that CTBP2 was upregulated, while NCOA4 was downregulated, in prostate cancer (p < 0.02). Lower expression of CTBP2 was associated with poor recurrence-free survival in prostate cancer. Combining our experimental data with bioinformatic tools provides a strategy for identifying candidate functional elements at prostate cancer susceptibility loci to help guide subsequent laboratory studies.
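The >1.5-fold criterion and the reported percentages can be illustrated with a minimal sketch. The abstract does not say how read counts were normalized, so the symmetric ratio, the pseudocount, and the SNP names and counts below are assumptions for illustration only, not the authors' pipeline.

```python
def allelic_fold_change(ref_count, alt_count, pseudocount=1.0):
    """Symmetric fold change (always >= 1) between two allele read counts.

    The pseudocount is an assumption that avoids division by zero.
    """
    ref = ref_count + pseudocount
    alt = alt_count + pseudocount
    return max(ref, alt) / min(ref, alt)


def is_differential(ref_count, alt_count, threshold=1.5):
    """True when the alleles differ by more than `threshold`-fold."""
    return allelic_fold_change(ref_count, alt_count) > threshold


def percent(hits, total):
    """Whole-number percentage, as reported in the abstract."""
    return round(100 * hits / total)


# Hypothetical (reference, variant) read counts for three made-up SNPs.
counts = {"snp_a": (300, 150), "snp_b": (120, 110), "snp_c": (40, 95)}
flagged = sorted(
    name for name, (ref, alt) in counts.items() if is_differential(ref, alt)
)

print(flagged)
print(percent(380, 903), percent(403, 903))
```

The last line reproduces the rounded proportions quoted above for the androgen-treated and untreated conditions.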
Project description:Background: SLE is a systemic autoimmune disease with a large number of common risk gene variants, but several rare gene variants can cause monogenic SLE. The relationship between common and rare variants in SLE is unclear. We therefore investigated the occurrence of rare deleterious variants in patients with childhood-onset SLE (cSLE) and adult-onset SLE (aSLE) and compared the frequency of these variants with their individual SLE polygenic risk score (PRS). Materials and methods: Targeted sequencing of 1832 gene regions, including coding regions of 31 genes associated with monogenic SLE, was performed in 958 patients with SLE and 1026 healthy individuals. A total of 116 patients with SLE had disease onset before the age of 18 (cSLE). An SLE common variant PRS was created from 37 SLE genome-wide association study single nucleotide variants (SNVs). Results: Rare coding deleterious SNVs (RD SNVs) were observed in 23 of the monogenic SLE-associated genes. Six per cent of patients with cSLE, compared with 3.2% of controls and 4.6% of patients with aSLE, carried rare deleterious alleles. In cSLE, RD SNVs were observed in the C1S, DDX58, IFIH1, IKZF1, RNASEH2A and C8A genes. A PRS analysis showed that patients with cSLE with any of these gene variants had a similar average PRS as control individuals. Conclusion: RD SNVs were observed in a small proportion of cSLE and carriers of these RD SNVs had a PRS similar to healthy individuals, suggesting the importance of rare coding heterozygous variants in driving disease risk in a subset of children with SLE.
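A polygenic risk score of the kind described above is commonly computed as a weighted sum of risk-allele dosages. The sketch below assumes log odds ratios as weights, which is a widespread convention but is not stated in the abstract. The three odds ratios and the genotypes are hypothetical; the study used 37 SNVs.

```python
import math


def polygenic_risk_score(dosages, weights):
    """Weighted sum of risk-allele dosages (0, 1 or 2 copies per variant)."""
    if len(dosages) != len(weights):
        raise ValueError("dosages and weights must have the same length")
    return sum(d * w for d, w in zip(dosages, weights))


# Hypothetical odds ratios for three variants; weights are their natural logs.
odds_ratios = [1.2, 1.5, 1.1]
weights = [math.log(oratio) for oratio in odds_ratios]

low_burden = [0, 1, 0]   # hypothetical individual with few common risk alleles
high_burden = [2, 2, 1]  # hypothetical individual with many common risk alleles

print(polygenic_risk_score(low_burden, weights))
print(polygenic_risk_score(high_burden, weights))
```

Under this convention, an individual whose disease is driven by a rare deleterious variant could still have a low score, which is the pattern the abstract reports for cSLE carriers.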
Project description:Background: Colorectal cancer has a significant impact on individuals and healthcare systems. Many genes have been identified that influence its pathogenesis. However, the genetic basis of mucinous tumor histology, an aggressive subtype of colorectal cancer, is currently not well understood. This study aimed to identify common and rare genetic variations that are associated with the mucinous tumor phenotype. Methods: Genome-wide single nucleotide polymorphism (SNP) data were investigated in a colorectal cancer patient cohort (n = 505). Association analyses were performed for 729,373 common SNPs and 275,645 rare SNPs. Common SNP association analysis was performed using univariable and multivariable logistic regression under different genetic models. Rare-variant association analysis was performed using a multi-marker test. Results: No associations reached the traditional genome-wide significance threshold. However, promising genetic associations were identified. The identified common SNPs significantly improved the discriminatory accuracy of the model for mucinous tumor phenotype. Specifically, the area under the receiver operating characteristic curve increased from 0.703 (95% CI: 0.634-0.773) to 0.916 (95% CI: 0.873-0.960) when considering the most significant SNPs. Additionally, the rare variant analysis identified a number of genetic regions that potentially contain causal rare variants associated with the mucinous tumor phenotype. Conclusions: This is the first study applying both common and rare variant analyses to identify genetic associations with mucinous tumor phenotype using genome-wide genotype data. Our results suggest novel associations with mucinous tumors. Once confirmed, these results will not only help us understand the biological basis of mucinous histology, but may also help develop targeted treatment options for mucinous tumors.
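The discriminatory accuracy quoted above is the area under the ROC curve. It equals the probability that a randomly chosen case receives a higher model score than a randomly chosen non-case. A minimal sketch of that pairwise definition follows; the labels and scores are invented, and this is not the study's model.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (case, control) pairs ranked correctly.

    Ties between a case score and a control score count as half a win.
    """
    cases = [s for lab, s in zip(labels, scores) if lab == 1]
    controls = [s for lab, s in zip(labels, scores) if lab == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one control")
    wins = sum(
        1.0 if c > k else 0.5 if c == k else 0.0
        for c in cases
        for k in controls
    )
    return wins / (len(cases) * len(controls))


# Hypothetical predicted probabilities (1 = mucinous, 0 = non-mucinous).
labels = [1, 1, 0, 0]
scores = [0.9, 0.4, 0.5, 0.1]
print(roc_auc(labels, scores))
```

The pairwise form is quadratic in sample size, which is acceptable for a cohort of a few hundred patients; rank-based formulas give the same value more efficiently.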
Project description:Genome-wide association (GWA) studies have identified hundreds of common (minor allele frequency ≥5%) single nucleotide polymorphisms (SNPs) associated with phenotypic traits or diseases, yet causal variants accounting for the association signals have rarely been determined. The question then arises whether a GWA signal represents an "indirect association" as a proxy of a strongly correlated causal variant with similar frequency, or a "synthetic association" of one or more rarer causal variants in linkage disequilibrium (D' ≈ 1, but r² not large); answering the question generally requires extensive resequencing and association analysis. Instead, we propose to test statistically whether a quantitative trait (QT) association of an SNP represents a synthetic association by inspecting the QT distribution at each genotype, without requiring the causal variant(s) to be known. We devised two test statistics and assessed their power by mathematical analysis and simulation. Testing the heterogeneity of variance was powerful when low-frequency causal alleles are linked mostly to one SNP allele, while testing the skewness performed better when the causal alleles are linked evenly to either of the SNP alleles. By testing a statistic combining these two in 5000 individuals, we could detect synthetic association of a GWA signal when causal alleles sum up to 3% in frequency. Such a signal only partially explains the heritability contributed by the whole locus. The proposed test is useful for designing fine mapping after the association of common SNPs has been studied exhaustively; we can prioritize which GWA signals and which individuals to resequence, and identify the causal variants efficiently.
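The two quantities the proposed tests inspect, the trait variance and the trait skewness within each genotype group, can be computed as follows. This sketch only produces the descriptive ingredients; it does not reproduce the authors' test statistics or their combination, and the genotypes and trait values are invented.

```python
from statistics import mean, pvariance


def skewness(values):
    """Population skewness: third central moment over variance ** 1.5."""
    m = mean(values)
    var = pvariance(values, m)
    if var == 0:
        return 0.0
    third_moment = mean([(v - m) ** 3 for v in values])
    return third_moment / var ** 1.5


def summarize_by_genotype(genotypes, trait):
    """Per-genotype sample size, variance and skewness of a quantitative trait."""
    groups = {}
    for genotype, value in zip(genotypes, trait):
        groups.setdefault(genotype, []).append(value)
    return {
        genotype: {
            "n": len(values),
            "variance": pvariance(values),
            "skewness": skewness(values),
        }
        for genotype, values in sorted(groups.items())
    }


# Hypothetical data: one carrier of a rare large-effect allele (trait = 3.0)
# sits on the genotype-1 background and inflates both variance and skewness.
genotypes = [0, 0, 0, 0, 1, 1, 1, 1]
trait = [1.0, 1.1, 0.9, 1.0, 1.0, 1.0, 1.0, 3.0]
summary = summarize_by_genotype(genotypes, trait)
for genotype, stats in summary.items():
    print(genotype, stats)
```

In the toy data the outlying value stands in for a rare causal allele linked to one SNP allele, the scenario in which the abstract reports the variance-heterogeneity test to be powerful.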
Project description:The specific causes of prostate cancer are not known. However, multiple etiologic factors, including genetic profile, metabolism of steroid hormones, nutrition, chronic inflammation, family history of prostate cancer, and environmental exposures are thought to play significant roles. Variations in exposure to these risk factors may explain interindividual differences in prostate cancer risk. However, regardless of the precise mechanism(s), a robust DNA repair capacity may mitigate any risks conferred by mutations from these risk factors. Numerous single nucleotide polymorphisms (SNPs) in DNA repair genes have been found, and studies of these SNPs and prostate cancer risk are critical to understanding the response of prostate cells to DNA damage. A few SNPs in DNA repair genes are associated with significantly increased risk of prostate cancer; however, in most cases, the effects are moderate and often depend upon interactions among the risk alleles of several genes in a pathway or with other environmental risk factors. This report reviews the published epidemiologic literature on the association of SNPs in genes involved in DNA repair pathways and prostate cancer risk.
Project description:We report targeted sequencing of 63 known prostate cancer risk regions in a multi-ancestry study of 9,237 men and use the data to explore the contribution of low-frequency variation to disease risk. We show that SNPs with minor allele frequencies (MAFs) of 0.1-1% explain a substantial fraction of prostate cancer risk in men of African ancestry. We estimate that these SNPs account for 0.12 (standard error (s.e.) = 0.05) of variance in risk (∼42% of the variance contributed by SNPs with MAF of 0.1-50%). This contribution is much larger than the fraction of neutral variation due to SNPs in this class, implying that natural selection has driven down the frequency of many prostate cancer risk alleles; we estimate the coupling between selection and allelic effects at 0.48 (95% confidence interval [0.19, 0.78]) under the Eyre-Walker model. Our results indicate that rare variants make a disproportionate contribution to genetic risk for prostate cancer and suggest the possibility that rare variants may also have an outsize effect on other common traits.
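The two figures above (0.12 of variance, about 42% of the variance contributed by SNPs with MAF of 0.1-50%) imply a total for the full frequency range. The back-of-envelope calculation below uses the rounded values from the abstract, so the result is approximate and is not a figure reported by the authors.

```python
def implied_total_variance(class_variance, class_share):
    """Total variance implied when one class explains `class_share` of it."""
    if not 0 < class_share <= 1:
        raise ValueError("class_share must be in (0, 1]")
    return class_variance / class_share


low_freq = 0.12  # variance explained by SNPs with MAF 0.1-1% (from the abstract)
share = 0.42     # their approximate share of the MAF 0.1-50% total (from the abstract)

total = implied_total_variance(low_freq, share)
common = total - low_freq  # implied contribution of SNPs with MAF above 1%
print(round(total, 2), round(common, 2))
```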
Project description:Next-generation sequencing studies are dependent on a high-quality reference genome for single nucleotide variant (SNV) calling. Although the two most recent builds of the human genome are widely used, position information is typically not directly comparable between them. Re-alignment gives the most accurate position information, but this procedure is often computationally expensive, and therefore, tools such as liftOver and CrossMap are used to convert data from one build to another. However, the positions of converted SNVs do not always match SNVs derived from aligned data, and in some instances, SNVs are known to change chromosome when converted. This is a significant problem when compiling sequencing resources or comparing results across studies. Here, we describe a novel algorithm to identify positions that are unstable when converting between human genome reference builds. These positions are detected independently of the conversion tools and are determined by the chain files, which provide a mapping of contiguous positions from one build to another. We also provide the list of unstable positions for converting between the two most commonly used builds, GRCh37 and GRCh38. Pre-excluding SNVs at these positions, prior to conversion, results in SNVs that are stable to conversion. This simple procedure gives the same final list of stable SNVs as applying the algorithm and subsequently removing variants at unstable positions. This work highlights the care that must be taken when converting SNVs between genome builds and provides a simple method for ensuring higher confidence converted data. Unstable positions and algorithm code are available at https://github.com/cathaloruaidh/genomeBuildConversion.
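The pre-exclusion step described above amounts to a set lookup on (chromosome, position) before any conversion tool is run. The sketch below assumes the unstable positions are already loaded in memory as tuples; the file format of the published lists is not described here, and the coordinates and alleles shown are invented.

```python
def exclude_unstable(snvs, unstable_positions):
    """Drop SNVs whose (chromosome, position) is listed as unstable.

    `snvs` holds (chromosome, position, ref, alt) tuples in the source build.
    """
    unstable = set(unstable_positions)
    return [snv for snv in snvs if (snv[0], snv[1]) not in unstable]


# Hypothetical source-build SNVs and a hypothetical unstable-position list.
snvs = [
    ("chr1", 1000, "A", "G"),
    ("chr1", 2000, "C", "T"),
    ("chr2", 1000, "G", "A"),
]
unstable_positions = [("chr1", 2000)]

stable = exclude_unstable(snvs, unstable_positions)
print(stable)
```

Matching on both chromosome and position matters: in the example, position 1000 on chr2 is retained even though the same coordinate appears on chr1. The filtered list would then be passed to liftOver or CrossMap as usual.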
Project description:Prostate cancer is one of the most frequently diagnosed malignancies in developed countries, and approximately 248,530 new cases of prostate cancer are likely to be diagnosed in the United States in 2021. During the late 1990s and 2000s, the prostate cancer-related death rate decreased by 4% per year on average because of advancements in prostate-specific antigen (PSA) testing. However, the inability of PSA to distinguish specifically between benign and malignant forms of cancer is a major concern in the management of prostate cancer. Alongside other risk factors in the pathogenesis of prostate cancer, recent advances in molecular genetics suggest that genetic heredity plays a crucial role in prostate carcinogenesis. Heritability is estimated at approximately 60%, and more than 100 well-recognized single-nucleotide polymorphisms (SNPs) have been found to be associated with prostate cancer and constitute a major risk factor in its development. Recent findings revealed that individual SNPs have a low to moderate effect on the progression of prostate cancer, compared with a strong effect when SNPs act in combination. In this review, we critically analyze the role of SNPs and associated genes in the development of prostate cancer and their implications in diagnostics and therapeutics. A better understanding of the role of SNPs in prostate cancer susceptibility may improve risk prediction, enhance fine-mapping, and furnish new insights into the underlying pathophysiology of prostate cancer.
Project description:Functional characterization of disease-causing variants at risk loci has been a significant challenge. Here we report a high-throughput single-nucleotide polymorphisms sequencing (SNPs-seq) technology to simultaneously screen hundreds to thousands of SNPs for their allele-dependent protein-binding differences. This technology takes advantage of the higher retention rate of protein-bound DNA oligos in a protein purification column to quantitatively sequence these SNP-containing oligos. We apply this technology to test prostate cancer-risk loci and observe differential allelic protein binding in a significant number of selected SNPs. We also test a unique application of self-transcribing active regulatory region sequencing (STARR-seq) in characterizing allele-dependent transcriptional regulation and provide detailed functional analysis at two risk loci (RGS17 and ASCL2). Together, we introduce a powerful high-throughput pipeline for large-scale screening of functional SNPs at disease risk loci.