Project description:Somatic mutations in cancer genomes often show allelic imbalance (AI) of mutation abundance between the genome and transcriptome, but there is not yet a systematic understanding of AI. In this study, we performed large-scale DNA and RNA AI analyses of >100,000 somatic mutations in >2,000 cancer specimens across five tumor types using the exome and transcriptome sequencing data of the Cancer Genome Atlas consortium. First, AI analysis of nonsense mutations and frameshift indels revealed that nonsense-mediated decay is typical in cancer genomes, and we identified a relationship between the extent of AI and the location of mutations in addition to the well-recognized 50-nt rule. Second, AI at splice site mutations may reflect the extent of intron retention and is frequently observed in known tumor suppressor genes. Third, for missense mutations, we observed that mutations frequently subject to AI are enriched in cancer-related genes, especially those involved in apoptosis and the extracellular matrix, and in C:G > A:T transversions. Our results suggest that mutations in known cancer-related genes and their transcripts are subject to different levels of transcriptional or posttranscriptional regulation compared to wildtype alleles, which may add a regulatory layer to the functions of cancer-relevant genes.
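As a rough illustration of the DNA-versus-RNA comparison described above, the sketch below computes the variant allele fraction (VAF) of one mutation in exome and transcriptome reads and tests whether the two differ. The abstract does not state which statistical test the study used; Fisher's exact test is an assumption here, and the function names `fisher_exact_two_sided` and `dna_rna_imbalance` are hypothetical.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of seeing x in the top-left cell.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum the probabilities of all tables at least as extreme as the observed one.
    total = sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-9))
    return min(1.0, total)


def dna_rna_imbalance(dna_alt, dna_ref, rna_alt, rna_ref):
    """Return (DNA VAF, RNA VAF, p-value) for one somatic mutation.

    Assumption: read counts supporting the mutant (alt) and wildtype (ref)
    alleles are already available for both data types.
    """
    dna_vaf = dna_alt / (dna_alt + dna_ref)
    rna_vaf = rna_alt / (rna_alt + rna_ref)
    p_value = fisher_exact_two_sided(dna_alt, dna_ref, rna_alt, rna_ref)
    return dna_vaf, rna_vaf, p_value


if __name__ == "__main__":
    # A mutant allele well represented in DNA but depleted in RNA, the pattern
    # one would expect for a transcript degraded by nonsense-mediated decay.
    print(dna_rna_imbalance(dna_alt=20, dna_ref=20, rna_alt=2, rna_ref=38))
```

An RNA VAF well below the DNA VAF with a small p-value would flag the mutation as showing AI against the mutant allele; the opposite direction would indicate preferential expression of the mutant allele.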
Project description:The extent to which heritable genetic variants can affect tumor development has yet to be fully elucidated. Tumor selection of single nucleotide polymorphism (SNP) risk alleles, a phenomenon called preferential allelic imbalance (PAI), has been demonstrated in some cancer types. We developed a novel application of digital PCR termed Somatic Mutation Allelic Ratio Test using Droplet Digital PCR (SMART-ddPCR) for accurate assessment of tumor PAI, and have applied this method to test the hypothesis that heritable SNPs associated with childhood acute lymphoblastic leukemia (ALL) may demonstrate tumor PAI. These SNPs are located at CDKN2A (rs3731217) and IKZF1 (rs4132601), genes frequently lost in ALL, and at CEBPE (rs2239633), ARID5B (rs7089424), PIP4K2A (rs10764338), and GATA3 (rs3824662), genes located on chromosomes gained in high-hyperdiploid ALL. We established thresholds of AI using constitutional DNA from SNP heterozygotes, and subsequently measured allelic copy number in tumor DNA from 19-142 heterozygote samples per SNP locus. We did not find significant tumor PAI at these loci, though CDKN2A and IKZF1 SNPs showed a trend towards preferential selection of the risk allele (p = 0.17 and p = 0.23, respectively). Using a genomic copy number control ddPCR assay, we investigated somatic copy number alterations (SCNA) underlying AI at CDKN2A and IKZF1, revealing a complex range of alterations including homozygous and hemizygous deletions and copy-neutral loss of heterozygosity, with varying degrees of clonality. Copy number estimates from ddPCR showed high agreement with those from multiplex ligation-dependent probe amplification (MLPA) assays. We demonstrate that SMART-ddPCR is a highly accurate method for investigation of tumor PAI and for assessment of the somatic alterations underlying AI. 
Furthermore, analysis of publicly available data from The Cancer Genome Atlas identified 16 recurrent SCNA loci that contain heritable cancer risk SNPs associated with a matching tumor type, and which represent candidate PAI regions warranting further investigation.
Project description:Recent advances in single cell technology have enabled dissection of cellular heterogeneity in great detail. However, analysis of single cell DNA sequencing data remains challenging due to bias and artifacts that arise during DNA extraction and whole-genome amplification, including allelic imbalance and dropout. Here, we present a framework for statistical estimation of allele-specific amplification imbalance at any given position in single cell whole-genome sequencing data by utilizing the allele frequencies of heterozygous single nucleotide polymorphisms in the neighborhood. The resulting allelic imbalance profile is critical for determining whether the variant allele fraction of an observed mutation is consistent with the expected fraction for a true variant. This method, implemented in SCAN-SNV (Single Cell ANalysis of SNVs), substantially improves the identification of somatic variants in single cells. Our allele balance framework is broadly applicable to genotype analysis of any variant type in any data that might exhibit allelic imbalance.
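The idea of borrowing information from nearby heterozygous SNPs can be sketched as follows. This is not the SCAN-SNV model itself, whose details the abstract does not give; it swaps in a simple Gaussian-kernel weighted average as an illustration. The function names, the `bandwidth` parameter, and the assumption that SNP read counts are phased to a common haplotype are all hypothetical.

```python
from math import exp


def local_allele_balance(position, het_snps, bandwidth=10_000.0):
    """Estimate the fraction of reads from haplotype 1 at `position`.

    het_snps: iterable of (snp_position, haplotype1_reads, total_reads).
    Assumption: SNPs are phased, so haplotype1_reads always counts the
    same parental haplotype. Nearby SNPs get more weight than distant ones.
    """
    weighted_hap1 = weighted_total = 0.0
    for snp_pos, hap1_reads, total_reads in het_snps:
        weight = exp(-0.5 * ((snp_pos - position) / bandwidth) ** 2)
        weighted_hap1 += weight * hap1_reads
        weighted_total += weight * total_reads
    if weighted_total == 0:
        raise ValueError("no informative heterozygous SNPs near this position")
    return weighted_hap1 / weighted_total


def vaf_deviation(alt_reads, total_reads, balance):
    """Distance between the observed VAF and the nearest expected VAF.

    A true heterozygous variant sits on one of the two haplotypes, so its
    expected VAF is either `balance` or `1 - balance`.
    """
    vaf = alt_reads / total_reads
    return min(abs(vaf - balance), abs(vaf - (1.0 - balance)))


if __name__ == "__main__":
    snps = [(1_000, 8, 10), (2_000, 16, 20)]
    balance = local_allele_balance(1_500, snps)
    print(balance, vaf_deviation(2, 10, balance))
```

A candidate mutation whose VAF lies far from both expected values is more likely an amplification or sequencing artifact than a true variant; a full method would replace the raw deviation with a proper likelihood that accounts for read depth.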
Project description:Background: Common single-nucleotide polymorphisms (SNPs) in ten chromosomal loci have been shown to predispose to colorectal cancer (CRC) in genome-wide association studies. A plausible biological mechanism of CRC susceptibility associated with genetic variation has so far only been proposed for three loci, each pointing to variants that affect gene expression through distant regulatory elements. In this study, we aimed to gain insight into the molecular basis of seven low-penetrance CRC loci tagged by rs4779584 at 15q13, rs10795668 at 10p14, rs3802842 at 11q23, rs4444235 at 14q22, rs9929218 at 16q22, rs10411210 at 19q13, and rs961253 at 20p12. Methods: Possible somatic gain of the risk allele or loss of the protective allele was studied by analyzing allelic imbalance in tumour and corresponding normal tissue samples of heterozygous patients. Functional variants were searched for in in silico predicted enhancer elements located inside the CRC-associated linkage-disequilibrium regions. Results: No allelic imbalance targeting the SNPs was observed at any of the seven loci. Altogether, 12 SNPs that were predicted to disrupt potential transcription factor binding sequences were genotyped in the same population-based case-control series in which the seven tagging SNPs were originally genotyped. None showed association with CRC. Conclusions: The results of the allelic imbalance analysis suggest that the seven CRC risk variants are not somatically selected for in neoplastic progression. The bioinformatic approach was unable to pinpoint cancer-causing variants at any of the seven loci. While it is possible that many of the predisposition loci for CRC are involved in control of gene expression by targeting transcription factor binding sites, other possibilities, such as regulatory RNAs, should also be considered.
Project description:Background: Somatic mutations in the Janus kinase 2 gene (JAK2) occur in many myeloproliferative neoplasms, but the molecular pathogenesis of myeloproliferative neoplasms with nonmutated JAK2 is obscure, and the diagnosis of these neoplasms remains a challenge. Methods: We performed exome sequencing of samples obtained from 151 patients with myeloproliferative neoplasms. The mutation status of the gene encoding calreticulin (CALR) was assessed in an additional 1345 hematologic cancers, 1517 other cancers, and 550 controls. We established phylogenetic trees using hematopoietic colonies. We assessed calreticulin subcellular localization using immunofluorescence and flow cytometry. Results: Exome sequencing identified 1498 mutations in 151 patients, with medians of 6.5, 6.5, and 13.0 mutations per patient in samples of polycythemia vera, essential thrombocythemia, and myelofibrosis, respectively. Somatic CALR mutations were found in 70 to 84% of samples of myeloproliferative neoplasms with nonmutated JAK2, in 8% of myelodysplasia samples, in occasional samples of other myeloid cancers, and in none of the other cancers. A total of 148 CALR mutations were identified with 19 distinct variants. Mutations were located in exon 9 and generated a +1 base-pair frameshift, which would result in a mutant protein with a novel C-terminal. Mutant calreticulin was observed in the endoplasmic reticulum without increased cell-surface or Golgi accumulation. Patients with myeloproliferative neoplasms carrying CALR mutations presented with higher platelet counts and lower hemoglobin levels than patients with mutated JAK2. Mutation of CALR was detected in hematopoietic stem and progenitor cells.
Clonal analyses showed CALR mutations in the earliest phylogenetic node, a finding consistent with its role as an initiating mutation in some patients. Conclusions: Somatic mutations in the endoplasmic reticulum chaperone CALR were found in a majority of patients with myeloproliferative neoplasms with nonmutated JAK2. (Funded by the Kay Kendall Leukaemia Fund and others.)
Project description:Point mutations in cancer have been extensively studied, but chromosomal gains and losses have been more challenging to interpret due to their unspecific nature. Here we examine the high-resolution allelic imbalance (AI) landscape in 1699 colorectal cancers, 256 of which have been whole-genome sequenced (WGSed). The imbalances pinpoint 38 genes as plausible AI targets based on previous knowledge. Unbiased CRISPR-Cas9 knockout and activation screens identified a total of 79 genes within AI peaks that regulate cell growth. Genetic and functional data implicate loss of TP53 as a sufficient driver of AI. The WGS highlights an influence of copy number aberrations on the rate of detected somatic point mutations. Importantly, the data reveal several associations between AI target genes, suggesting a role for a network of lineage-determining transcription factors in colorectal tumorigenesis. Overall, the results unravel the contribution of AI in colorectal cancer and provide a plausible explanation for why so few genes are commonly affected by point mutations in cancers.
Project description:Buccal epithelial cells harbor an MPN-associated CALR mutation in a patient with CALR-mutant essential thrombocytosis, Ph+ CML, and no germ line CALR mutation.
Project description:Mutations in cardiac myosin binding protein C (MYBPC3) represent the most frequent cause of familial hypertrophic cardiomyopathy (HCM), making up approximately 50% of identified HCM mutations. MYBPC3 is distinct among the sarcomere genes associated with HCM in that truncating mutations make up the vast majority, whereas nontruncating mutations predominate in other sarcomere genes. Several studies using myocardial tissue from HCM patients have found reduced abundance of wild-type MYBPC3 compared to control hearts, suggesting haploinsufficiency of full-length MYBPC3. Further, decreased mutant versus wild-type mRNA and a lack of truncated mutant MYBPC3 protein have been demonstrated, highlighting the presence of allelic imbalance. In this review, we will begin by introducing allelic imbalance and haploinsufficiency, highlighting the broad role each plays within the spectrum of human disease. We will subsequently focus on the roles allelic imbalance and haploinsufficiency play within MYBPC3-linked HCM. Finally, we will explore the implications of these findings for future directions of HCM research. An improved understanding of allelic imbalance and haploinsufficiency may help us better understand genotype-phenotype relationships in HCM and develop novel targeted therapies, providing exciting future research opportunities.
Project description:Most autosomal genes are thought to be expressed from both alleles, with some notable exceptions, including imprinted genes and genes showing random monoallelic expression (RME). The extent and nature of RME have been the subject of debate. Here we investigate the expression of several candidate RME genes in F1 hybrid mouse cells before and after differentiation, to define how they become persistently, monoallelically expressed. Clonal monoallelic expression is not present in embryonic stem cells, but we observe high frequencies of monoallelism in neuronal progenitor cells by assessing expression status in more than 200 clones. We uncover unforeseen modes of allelic expression that appear to be gene-specific and epigenetically regulated. This non-canonical allelic regulation has important implications for development and disease, including autosomal dominant disorders, and opens up therapeutic perspectives.