Project description:Background: Of the 108 schizophrenia (SZ) risk-loci discovered through genome-wide association studies (GWAS), 96 do not alter the sequence of any protein. Evidence linking non-coding risk-SNPs and genes may be established using expression quantitative trait loci (eQTL). However, other approaches, such as allelic expression quantitative trait loci (aeQTL), may also be of use. Methods: We applied both eQTL and aeQTL analysis to a biobank of deeply sequenced RNA from 680 dorso-lateral pre-frontal cortex (DLPFC) samples. For each of 340 genes proximal to the SZ risk-SNPs, we asked how much SNP genotype affected total expression (eQTL), as well as how much the expression ratio between the two alleles differed from 1:1 as a consequence of the risk-SNP genotype (aeQTL). Results: We analyzed overlap with comparable eQTL findings: 16 of the 30 risk-SNPs known to have gene-level eQTL also had gene-level aeQTL effects. Six of 21 risk-SNPs with known splice-eQTL had exon-aeQTL effects. Twelve novel potential risk genes were identified with the aeQTL approach, while 55 tested SNP-pairs were found as eQTL but not aeQTL. Of the 108 tested loci, we found at least one associated gene for 21 of the risk-SNPs using gene-level aeQTL, and for an additional 18 risk-SNPs using exon-level aeQTL. Conclusion: Our results suggest that the aeQTL strategy complements the eQTL approach to susceptibility gene identification.
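The question of whether the expression ratio between two alleles differs from 1:1 can be illustrated with a minimal sketch. It assumes pooled allele-specific read counts at one transcribed heterozygous SNP and uses an exact two-sided binomial test against a 0.5 allele fraction. The function name `allelic_ratio_test` and the read counts are hypothetical, and this is not the aeQTL model used in the study above, which relates the ratio to risk-SNP genotype. Real RNA-seq counts are usually overdispersed, so a plain binomial test tends to be anti-conservative.

```python
from math import comb


def allelic_ratio_test(ref_reads: int, alt_reads: int) -> tuple[float, float]:
    """Return (reference-allele fraction, exact two-sided binomial p-value vs 1:1).

    Hypothetical helper for illustration only.
    """
    n = ref_reads + alt_reads
    if n == 0:
        raise ValueError("no allele-informative reads")
    # Under a 1:1 ratio, an outcome with k reference reads has probability C(n, k) / 2**n.
    observed = comb(n, ref_reads)
    # Two-sided p-value: total probability of outcomes no more likely than the observed one.
    tail = sum(comb(n, k) for k in range(n + 1) if comb(n, k) <= observed)
    return ref_reads / n, tail / 2**n


if __name__ == "__main__":
    # Made-up counts: 61 reference reads and 39 alternative reads.
    fraction, p_value = allelic_ratio_test(61, 39)
    print(f"reference fraction = {fraction:.2f}, p = {p_value:.4f}")
```

Integer arithmetic keeps the p-value exact until the final division. For an actual analysis, a beta-binomial or mixed model that accounts for overdispersion and per-sample genotype would be more appropriate.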
Project description:Type 2 diabetes (T2D)-associated SNPs are more likely to be expression quantitative trait loci (eQTLs). The allelic expression imbalance (AEI) analysis is the measure of relative expression between two allelic transcripts and is the most sensitive measurement to detect cis-regulatory effects. We performed AEI screening to detect cis-regulators for genes expressed in transformed lymphocytes of 190 Caucasian (CA) and African American (AA) subjects to identify functional variants for T2D susceptibility in the chromosome 1q21-24 region of linkage. Among transcribed SNPs studied in 115 genes, significant AEI (P < 0.001) occurred in 28 and 30 genes in CA and AA subjects, respectively. Analysis of the effect of selected AEI-SNPs (≥10% mean AEI) on total gene expression further established the cis-eQTLs in thioesterase superfamily member-4 (THEM4) (rs13320, P = 0.027), and IGSF8 (rs1131891, P = 0.02). Examination of published genome-wide association data identified significant associations (P < 0.01) of three AEI-SNPs with T2D in the DIAGRAM-v3 dataset. Six AEI single nucleotide polymorphisms, including rs13320 (P = 1.35E-04) in THEM4, were associated with glucose homeostasis traits in the MAGIC dataset. Evaluation of AEI-SNPs for association with glucose homeostasis traits in 611 nondiabetic subjects showed lower AIRG (P = 0.005) in those with TT/TC genotype for rs13320. THEM4 expression in adipose was higher (P = 0.005) in subjects carrying the T allele; in vitro analysis with luciferase construct confirmed the higher expression of the T allele. Resequencing of THEM4 exons in 192 CA subjects revealed four coding nonsynonymous variants, but did not explain transmission of T2D in 718 subjects from 67 Caucasian pedigrees. Our study indicates the role of a cis-regulatory SNP in THEM4 that may influence T2D predisposition by modulating glucose homeostasis.
Project description:Although over 60 single nucleotide polymorphisms (SNPs) have been identified by meta-analysis of genome-wide association studies for type-2 diabetes (T2D) among individuals of European descent, much of the genetic variation remains unexplained. There are likely many more SNPs that contribute to variation in T2D risk, some of which may lie in the regions surrounding established SNPs--a phenomenon often referred to as allelic heterogeneity. Here, we use the summary statistics from the DIAGRAM consortium meta-analysis of T2D genome-wide association studies along with linkage disequilibrium patterns inferred from a large reference sample to identify novel SNPs associated with T2D surrounding each of the previously established risk loci. We then examine the extent to which the use of these additional SNPs improves prediction of T2D risk in an independent validation dataset. Our results suggest that multiple SNPs at each of 3 loci contribute to T2D susceptibility (TCF7L2, CDKN2A/B, and KCNQ1; p<5×10(-8)). Using a less stringent threshold (p<5×10(-4)), we identify 34 additional loci with multiple associated SNPs. The addition of these SNPs slightly improves T2D prediction compared to the use of only the respective lead SNPs, when assessed using an independent validation cohort. Our findings suggest that some currently established T2D risk loci likely harbor multiple polymorphisms which contribute independently and collectively to T2D risk. This opens a promising avenue for improving prediction of T2D, and for a better understanding of the genetic architecture of T2D.
Project description:Most autosomal genes are thought to be expressed from both alleles, with some notable exceptions, including imprinted genes and genes showing random monoallelic expression (RME). The extent and nature of RME has been the subject of debate. Here we investigate the expression of several candidate RME genes in F1 hybrid mouse cells before and after differentiation, to define how they become persistently, monoallelically expressed. Clonal monoallelic expression is not present in embryonic stem cells, but we observe high frequencies of monoallelism in neuronal progenitor cells by assessing expression status in more than 200 clones. We uncover unforeseen modes of allelic expression that appear to be gene-specific and epigenetically regulated. This non-canonical allelic regulation has important implications for development and disease, including autosomal dominant disorders and opens up therapeutic perspectives.
Project description:A significant proportion of the variation between individuals in gene expression levels is genetic, and it is likely that these differences correlate with phenotypic differences or with risk of disease. Cis-acting polymorphisms are important in determining interindividual differences in gene expression that lead to allelic expression imbalance, which is the unequal expression of homologous alleles in individuals heterozygous for such a polymorphism. This expression imbalance can be detected using a transcribed polymorphism, and, once it is established, the next step is to identify the polymorphisms that are responsible for or predictive of allelic expression levels. We present an expectation-maximization algorithm for such analyses, providing a formal statistical framework to test whether a candidate polymorphism is associated with allelic expression differences.
Project description:Transcriptional bursts render substantial biological noise in cellular transcriptomes. Here, we investigated the theoretical extent of allelic expression resulting from transcriptional bursting and how it compared to the amount of biallelic, monoallelic and allele-biased expression observed in single-cell RNA-sequencing (scRNA-seq) data. We found that transcriptional bursting can explain the allelic expression patterns observed in single cells, including the frequent observations of autosomal monoallelic gene expression. Importantly, we identified that the burst frequency largely determined the fraction of cells with monoallelic expression, whereas the burst size had little effect on monoallelic observations. The high consistency between the bursting model predictions and scRNA-seq observations made it possible to assess the heterogeneity of a group of cells as their deviation in allelic observations from the expected. Finally, both burst frequency and size contributed to allelic imbalance observations and reinforced that studies of allelic imbalance can be confounded by the inherent noise in transcriptional bursting. Altogether, we demonstrate that allele-level transcriptional bursting renders widespread, although predictable, amounts of monoallelic and biallelic expression in single cells and cell populations.
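The dependence of monoallelic observations on burst frequency can be explored with a toy simulation. This is a sketch under strong assumptions, not the model from the study above: each allele fires a Poisson-distributed number of bursts per cell, every burst yields at least one transcript (one plus a Poisson draw, so the mean equals the burst size), and detection is perfect. The names `sample_poisson` and `monoallelic_fraction` and all parameter values are hypothetical.

```python
import math
import random


def sample_poisson(rng: random.Random, lam: float) -> int:
    """Draw a Poisson variate with Knuth's multiplication method (fine for small rates)."""
    threshold = math.exp(-lam)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def monoallelic_fraction(burst_frequency: float, burst_size: float,
                         n_cells: int = 5000, seed: int = 0) -> float:
    """Fraction of expressing cells in which exactly one of the two alleles is detected."""
    if burst_size < 1:
        raise ValueError("burst_size must be at least 1 in this toy model")
    rng = random.Random(seed)
    mono = bi = 0
    for _ in range(n_cells):
        expressed_alleles = 0
        for _allele in range(2):
            bursts = sample_poisson(rng, burst_frequency)
            # Each burst contributes 1 + Poisson(burst_size - 1) transcripts.
            transcripts = sum(1 + sample_poisson(rng, burst_size - 1) for _ in range(bursts))
            expressed_alleles += transcripts > 0
        if expressed_alleles == 1:
            mono += 1
        elif expressed_alleles == 2:
            bi += 1
    expressing = mono + bi
    return mono / expressing if expressing else float("nan")


if __name__ == "__main__":
    for freq in (0.2, 1.0, 3.0):
        print(f"burst frequency {freq}: monoallelic fraction {monoallelic_fraction(freq, 5.0):.2f}")
```

In this simplified setting an allele is silent only when it fires no bursts, so burst size changes the magnitude of allelic imbalance but not whether a cell is called monoallelic. Adding a detection-dropout step would be needed to study the small burst-size effect reported for real scRNA-seq data.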
Project description:Genome-wide association studies (GWAS) have identified numerous loci associated with various complex traits for which the underlying susceptibility gene(s) remain unknown. In a GWAS for high-density lipoprotein-cholesterol (HDL-C) level, one strongly associated locus contains at least two biologically compelling candidates, methylmalonic aciduria cblB type (MMAB) and mevalonate kinase (MVK). To detect evidence of cis-acting regulation at this locus, we measured relative allelic expression of transcribed SNPs in five genes using human hepatocyte samples heterozygous for the transcribed SNP. If an HDL-C-associated SNP allele differentially regulates mRNA level in cis, samples heterozygous both for a transcribed SNP and an HDL-C-associated SNP should display allelic expression imbalance (AEI) of the transcribed SNP. We designed statistical tests to detect AEI in a comprehensive set of linkage disequilibrium (LD) scenarios between the transcribed SNP and an HDL-C-associated SNP (rs7298565) in phase-unknown samples. We observed significant AEI of 22% in MMAB (P = 1.4×10(-13), transcribed SNP rs11067231), and the allele associated with lower HDL-C level was associated with greater MMAB transcript level. The same rs7298565 allele was also associated with higher MMAB mRNA level (P = 0.0081) and higher MMAB protein level (P = 0.0020). In contrast, MVK, UBE3B, KCTD10 and ACACB did not show significant AEI (P ≥ 0.05). These data suggest MMAB is the most likely gene influencing HDL-C levels at this locus and demonstrate that measuring AEI at loci containing more than one candidate gene can prioritize genes for functional studies.
Project description:Simple Summary: Osteosarcoma (OS) is a highly heterogeneous cancer, making the identification of genetic driving factors difficult. Genetic factors, such as heritable mutations of Rb1 and TP53, are associated with an increased risk of OS. We previously generated pigs carrying a mutated TP53 gene, which develop OS at high frequency. RNA sequencing and allelic expression imbalance analysis identified an amplification of YAP1 involved in p53-dependent OS progression. The inactivation of YAP1 inhibits proliferation, migration, and invasion, and leads to the silencing of TP63 and reconstruction of p16 expression in p53-deficient porcine OS cells. This study confirms the importance of the p53/YAP1 network in cancer. Abstract: Osteosarcoma (OS) is a primary bone malignancy that mainly occurs during adolescent growth, suggesting that bone growth plays an important role in the aetiology of the disease. Genetic factors, such as heritable mutations of Rb1 and TP53, are associated with an increased risk of OS. Identifying driver mutations for OS has been challenging due to the complexity of bone growth-related pathways and the extensive intra-tumoral heterogeneity of this cancer. We previously generated pigs carrying a mutated TP53 gene, which develop OS at high frequency. RNA sequencing and allele expression imbalance (AEI) analysis of OS and matched healthy control samples revealed a highly significant AEI (p = 2.14×10(-39)) for SNPs in the BIRC3-YAP1 locus on pig chromosome 9. Analysis of copy number variation showed that YAP1 amplification is associated with the AEI and the progression of OS. Accordingly, the inactivation of YAP1 inhibits proliferation, migration, and invasion, and leads to the silencing of TP63 and reconstruction of p16 expression in p53-deficient porcine OS cells. Increased p16 mRNA expression correlated with lower methylation of its promoter. Altogether, our study provides molecular evidence for the role of YAP1 amplification in the progression of p53-dependent OS.
Project description:Clonal-level random allelic expression imbalance and random monoallelic expression provide cellular heterogeneity within tissues by modulating allelic dosage. Although such expression patterns have been observed in multiple cell types, little is known about when in development these stochastic allelic choices are made. We examine allelic expression patterns in human neural progenitor cells before and after epigenetic reprogramming to induced pluripotency, observing that loci previously characterized by random allelic expression imbalance (0.63% of expressed genes) are generally reset to a biallelic state in induced pluripotent stem cells (iPSCs). We subsequently neuralized the iPSCs and profiled isolated clonal neural stem cells, observing that significant random allelic expression imbalance is reestablished at 0.65% of expressed genes, including novel loci not found to show allelic expression imbalance in the original parental neural progenitor cells. Allelic expression imbalance was associated with altered DNA methylation across promoter regulatory regions, with clones characterized by skewed allelic expression being hypermethylated compared to their biallelic sister clones. Our results suggest that random allelic expression imbalance is established during lineage commitment and is associated with increased DNA methylation at the gene promoter.
Project description:Allelic expression imbalance (AEI), quantified by the relative expression of two alleles of a gene in a diploid organism, can help explain phenotypic variations among individuals. Traditional methods detect AEI using bulk RNA sequencing (RNA-seq) data, a data type that averages out cell-to-cell heterogeneity in gene expression across cell types. Since the patterns of AEI may vary across different cell types, it is desirable to study AEI in a cell-type-specific manner. Although this can be achieved by single-cell RNA sequencing (scRNA-seq), it requires full-length transcripts to be sequenced in single cells from a large number of individuals, and such data are still cost-prohibitive to generate. To overcome this limitation and utilize the vast amount of existing disease-relevant bulk tissue RNA-seq data, we developed BSCET, which enables the characterization of cell-type-specific AEI in bulk RNA-seq data by integrating cell type composition information inferred from a small set of scRNA-seq samples, possibly obtained from an external dataset. By modeling covariate effects, BSCET can also detect genes whose cell-type-specific AEI is associated with clinical factors. Through extensive benchmark evaluations, we show that BSCET correctly detected genes with cell-type-specific AEI and differential AEI between healthy and diseased samples using bulk RNA-seq data. BSCET also uncovered cell-type-specific AEIs that were missed in bulk data analysis when the directions of AEI are opposite in different cell types. We further applied BSCET to two pancreatic islet bulk RNA-seq datasets, and detected genes showing cell-type-specific AEI that are related to the progression of type 2 diabetes. Since bulk RNA-seq data are easily accessible, BSCET provides a convenient tool to integrate information from scRNA-seq data to gain insight into AEI with cell type resolution. Results from such analysis will advance our understanding of cell type contributions in human diseases.