Project description:This series was performed to study changes in gene expression upon diploidization of the KBM7 cancer (CML) cell line. The line can exist either as a clone with 24 chromosomes (nearly haploid) or as a clone with 48 chromosomes (nearly diploid). This experiment suggests that gene expression patterns are largely ploidy-independent. Single-cell-derived clones of the KBM7 cell line were grown. The series includes 7 haploid and 10 diploid clones, each with 2 independent total RNA extractions and microarray runs. Peripheral blood mononuclear cell (PBMC) samples are also included for comparison.
Project description:This series was performed to study changes in gene expression upon diploidization of the KBM7 cancer (CML) cell line. The line can exist either as a clone with 24 chromosomes (nearly haploid) or as a clone with 48 chromosomes (nearly diploid). This experiment suggests that gene expression patterns are largely ploidy-independent.
Project description:Background: DNA in the nucleus of a living cell carries out its functions in the context of a complex, three-dimensional chromatin architecture. Several recently developed methods, each an extension of the chromatin conformation capture (3C) assay, have enabled the genome-wide profiling of chromatin contacts between pairs of loci in yeast, fruit fly, human and mouse. Especially in complex eukaryotes, data generated by these methods, coupled with other genome-wide datasets, demonstrated that non-random chromatin folding correlates strongly with cellular processes such as gene expression and DNA replication. Here we describe a novel assay to map genome-wide chromatin contacts, tethered multiple 3C (TM3C), that involves a simple protocol of restriction enzyme digestion and religation of fragments upon agarose gel beads followed by deep DNA paired-end sequencing. In addition to identifying contacts between pairs of loci, TM3C enables identification of contacts among more than two loci simultaneously. Results: We use TM3C to assay the genome architectures of two human cell lines: KBM7, a near-haploid chronic leukemia cell line, and NHEK, a normal diploid human epidermal keratinocyte cell line. We confirm that the contact frequency maps produced by TM3C exhibit features characteristic of existing genome architecture datasets, including the expected scaling of contact probabilities with genomic distance, as well as a low noise-to-signal ratio between inter- and intrachromosomal contacts. We also confirm that TM3C captures several known cell type-specific contacts, ploidy shifts and translocations, such as Ph+ formation in KBM7. Furthermore, we develop a two-phase mapping strategy that separately maps chimeric subsequences within a single read, allowing us to identify contacts involving three or four loci simultaneously, potentially corresponding to combinatorial regulation events. 
This mapping strategy also greatly increases the number of distinct binary contacts identified and, therefore, the coverage obtained for a fixed number of mapped reads. We confirm a subset of the triplet contacts involving the IGF2-H19 imprinting control region (ICR) using PCR analysis for KBM7 cells. Assaying the genome architecture of a near-haploid cell line allows us to create 3D models of a human cell line without averaging signal from two homologous copies of a chromosome. Our 3D models of KBM7 show clustering of small chromosomes with each other and large chromosomes with each other, consistent with previous studies of the genome architectures of other human cell lines. Conclusion: TM3C is a simple protocol for ascertaining genome architecture and can be used to identify simultaneous contacts among three or four loci. Application of TM3C to a near-haploid human cell line revealed large-scale features of chromosomal organization and complex chromatin loops that may play a role in regulating reciprocal expression of the IGF2 and H19 genes. Analysis of the spatial organization of two human cell lines (KBM7, a near-haploid chronic leukemia cell line, and NHEK, a normal diploid human epidermal keratinocyte cell line) using tethered multiple 3C (TM3C), a novel and simple protocol for ascertaining genome architecture which can be used to identify simultaneous contacts among three or four loci in addition to binary contacts that can be identified using traditional chromosome conformation capture coupled with next generation sequencing (Hi-C).
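One way to see why mapping chimeric subsequences raises the number of distinct binary contacts: a single read that touches k loci implies k(k-1)/2 unordered pairs. The following is a minimal sketch of that counting argument only, not the authors' mapping pipeline; the function name `pairwise_contacts` and the example loci are hypothetical.

```python
from itertools import combinations


def pairwise_contacts(loci):
    """Return every unordered pair of distinct loci implied by one multi-locus contact.

    `loci` is any iterable of hashable, sortable locus labels, for example
    (chromosome, bin_index) tuples. Repeated loci are collapsed first.
    """
    unique = sorted(set(loci))
    return list(combinations(unique, 2))


# Hypothetical loci: (chromosome, bin index). Not taken from the dataset.
triplet = [("chr11", 20), ("chr11", 21), ("chr11", 25)]
quadruplet = triplet + [("chr2", 7)]

for contact in (triplet, quadruplet):
    pairs = pairwise_contacts(contact)
    print(f"{len(contact)} loci -> {len(pairs)} binary contacts")
```

A triplet therefore contributes three binary contacts and a quadruplet six, which is the sense in which coverage grows for a fixed number of mapped reads.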
Project description:Background: DNA in the nucleus of a living cell carries out its functions in the context of a complex, three-dimensional chromatin architecture. Several recently developed methods, each an extension of the chromatin conformation capture (3C) assay, have enabled the genome-wide profiling of chromatin contacts between pairs of loci in yeast, fruit fly, human and mouse. Especially in complex eukaryotes, data generated by these methods, coupled with other genome-wide datasets, demonstrated that non-random chromatin folding correlates strongly with cellular processes such as gene expression and DNA replication. Here we describe a novel assay to map genome-wide chromatin contacts, tethered multiple 3C (TM3C), that involves a simple protocol of restriction enzyme digestion and religation of fragments upon agarose gel beads followed by deep DNA paired-end sequencing. In addition to identifying contacts between pairs of loci, TM3C enables identification of contacts among more than two loci simultaneously. Results: We use TM3C to assay the genome architectures of two human cell lines: KBM7, a near-haploid chronic leukemia cell line, and NHEK, a normal diploid human epidermal keratinocyte cell line. We confirm that the contact frequency maps produced by TM3C exhibit features characteristic of existing genome architecture datasets, including the expected scaling of contact probabilities with genomic distance, as well as a low noise-to-signal ratio between inter- and intrachromosomal contacts. We also confirm that TM3C captures several known cell type-specific contacts, ploidy shifts and translocations, such as Ph+ formation in KBM7. Furthermore, we develop a two-phase mapping strategy that separately maps chimeric subsequences within a single read, allowing us to identify contacts involving three or four loci simultaneously, potentially corresponding to combinatorial regulation events. 
This mapping strategy also greatly increases the number of distinct binary contacts identified and, therefore, the coverage obtained for a fixed number of mapped reads. We confirm a subset of the triplet contacts involving the IGF2-H19 imprinting control region (ICR) using PCR analysis for KBM7 cells. Assaying the genome architecture of a near-haploid cell line allows us to create 3D models of a human cell line without averaging signal from two homologous copies of a chromosome. Our 3D models of KBM7 show clustering of small chromosomes with each other and large chromosomes with each other, consistent with previous studies of the genome architectures of other human cell lines. Conclusion: TM3C is a simple protocol for ascertaining genome architecture and can be used to identify simultaneous contacts among three or four loci. Application of TM3C to a near-haploid human cell line revealed large-scale features of chromosomal organization and complex chromatin loops that may play a role in regulating reciprocal expression of the IGF2 and H19 genes.
Project description:Analysis of copy number variation in evolved haploid, diploid, and tetraploid strains. All experimental samples were compared to the same reference strain, S288C. The samples include the progenitor strains for the haploid, diploid, and tetraploid evolution experiments, and single-colony isolates (clones) from the evolving populations at given time points. Evolved clones were analyzed at generation 250 unless the name is followed by gen35, gen55 or gen500, in which case those generations were analyzed.
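The naming rule for evolved clones can be applied mechanically when annotating samples. The sketch below assumes the generation tag appears as a suffix of the sample name; the example names and the function `generation_of` are hypothetical, and progenitor strains (which are not evolved clones) would need to be handled separately.

```python
import re

DEFAULT_GENERATION = 250  # stated default for evolved clones
# Only the three generation tags named in the description are recognized.
_GEN_SUFFIX = re.compile(r"gen(35|55|500)$")


def generation_of(sample_name):
    """Return the generation at which an evolved clone was analyzed."""
    match = _GEN_SUFFIX.search(sample_name.strip())
    return int(match.group(1)) if match else DEFAULT_GENERATION


# Hypothetical sample names, for illustration only.
for name in ("2N_clone3", "4N_clone1_gen500", "1N_clone7 gen35"):
    print(name, "->", generation_of(name))
```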
Project description:Kynureninase is a member of a large family of catalytically diverse but structurally homologous pyridoxal 5'-phosphate (PLP) dependent enzymes known as the aspartate aminotransferase superfamily or alpha-family. The Homo sapiens and other eukaryotic constitutive kynureninases preferentially catalyze the hydrolytic cleavage of 3-hydroxy-L-kynurenine to produce 3-hydroxyanthranilate and L-alanine, while L-kynurenine is the substrate of many prokaryotic inducible kynureninases. The human enzyme was cloned with an N-terminal hexahistidine tag, expressed, and purified from a bacterial expression system using Ni metal ion affinity chromatography. Kinetic characterization of the recombinant enzyme reveals classic Michaelis-Menten behavior, with a Km of 28.3 ± 1.9 µM and a specific activity of 1.75 µmol min⁻¹ mg⁻¹ for 3-hydroxy-DL-kynurenine. Crystals of recombinant kynureninase that diffracted to 2.0 Å were obtained, and the atomic structure of the PLP-bound holoenzyme was determined by molecular replacement using the Pseudomonas fluorescens kynureninase structure (PDB entry 1qz9) as the phasing model. A structural superposition with the P. fluorescens kynureninase revealed that these two structures resemble the "open" and "closed" conformations of aspartate aminotransferase. The comparison illustrates the dynamic nature of these proteins' small domains and reveals a role for Arg-434 similar to its role in other AAT alpha-family members. Docking of 3-hydroxy-L-kynurenine into the human kynureninase active site suggests that Asn-333 and His-102 are involved in substrate binding and molecular discrimination between inducible and constitutive kynureninase substrates.
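To make the reported Michaelis-Menten behavior concrete, the sketch below evaluates the standard rate law v/Vmax = [S] / (Km + [S]) using the Km quoted above. It is an illustration of the textbook equation, not a reanalysis of the authors' data; the function name `fractional_rate` and the chosen substrate concentrations are assumptions, and no Vmax is assumed, so only the fraction of the maximal rate is computed.

```python
KM_UM = 28.3  # Km from the description, in micromolar


def fractional_rate(substrate_um, km_um=KM_UM):
    """Michaelis-Menten fraction of maximal rate: v/Vmax = [S] / (Km + [S])."""
    if substrate_um < 0:
        raise ValueError("substrate concentration must be non-negative")
    return substrate_um / (km_um + substrate_um)


# Illustrative substrate concentrations (micromolar), chosen arbitrarily.
for s in (KM_UM, 100.0, 10 * KM_UM):
    print(f"[S] = {s:6.1f} uM -> v/Vmax = {fractional_rate(s):.2f}")
```

By definition the enzyme runs at half its maximal rate when the substrate concentration equals Km, and a tenfold excess over Km brings it to roughly 91% of maximum.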
Project description:Many thousands of long non-coding (lnc) RNAs are mapped in the human genome. Time-consuming studies using reverse genetic approaches, by post-transcriptional knock-down or genetic modification of the locus, have demonstrated diverse biological functions for a few of these transcripts. The Human Gene Trap Mutant Collection in haploid KBM7 cells is a ready-to-use tool for studying protein-coding gene function. As lncRNAs show remarkable differences in RNA biology compared to protein-coding genes, it is unclear whether this gene trap collection is useful for functional analysis of lncRNAs. Here we use the uncharacterized LOC100288798 lncRNA as a model to answer this question. Using public RNA-seq data we show that LOC100288798 is ubiquitously expressed, but inefficiently spliced. The minor spliced LOC100288798 isoforms are exported to the cytoplasm, whereas the major unspliced isoform is nuclear localized. This shows that LOC100288798 RNA biology differs markedly from that of typical mRNAs. De novo assembly from RNA-seq data suggests that LOC100288798 extends 289kb beyond its annotated 3' end and overlaps the downstream SLC38A4 gene. Three cell lines with independent gene trap insertions in LOC100288798 were available from the KBM7 gene trap collection. RT-qPCR and RNA-seq confirmed successful lncRNA truncation and its extended length. Expression analysis from RNA-seq data shows significant deregulation of 41 protein-coding genes upon LOC100288798 truncation. Our data show that gene trap collections in human haploid cell lines are useful tools to study lncRNAs, and identify the previously uncharacterized LOC100288798 as a potential gene regulator. We cultured and processed 8 KBM7 cell lines in one batch.
These cell lines were: two wild-type KBM7 cell lines (WT2 and WT3); two monoclonal KBM7 cell lines with gene trap cassette insertions outside the body of LOC100288798 (C1 and C2); two independently obtained KBM7 clones with a gene trap cassette insertion 3kb downstream of the LOC100288798 transcriptional start site (TSS) (3kb1 and 3kb2); and one independently obtained KBM7 clone with a gene trap cassette insertion 100kb downstream of the LOC100288798 TSS, replicated twice at the thawing step (100kb1 and 100kb2). We isolated total RNA from all 8 cell lines, applied DNase I treatment and ribosomal RNA depletion, and then prepared strand-specific RNA-seq libraries, which were pooled in equal molarities and sequenced on an Illumina HiSeq 2000 (the 8 pooled samples were sequenced on 2 lanes). We performed 50bp single-end RNA-seq. We used these 8 samples (4 untreated: WT2, WT3, C1, C2; and 4 treated: 3kb1, 3kb2, 100kb1, 100kb2) to analyze genome-wide gene deregulation associated with LOC100288798 lncRNA truncation.
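The two-group design described above can be written down as a small sample sheet before any differential expression analysis. This is a minimal sketch using only the sample names and group labels given in the description; the dictionary layout and the helper `group_samples` are assumptions, not part of the authors' workflow.

```python
# Sample name -> condition, as listed in the description.
SAMPLES = {
    "WT2": "untreated",
    "WT3": "untreated",
    "C1": "untreated",
    "C2": "untreated",
    "3kb1": "treated",
    "3kb2": "treated",
    "100kb1": "treated",  # same clone as 100kb2, split at the thawing step
    "100kb2": "treated",
}


def group_samples(samples):
    """Invert a name -> condition mapping into condition -> list of names."""
    groups = {}
    for name, condition in samples.items():
        groups.setdefault(condition, []).append(name)
    return groups


for condition, names in group_samples(SAMPLES).items():
    print(f"{condition}: {len(names)} samples ({', '.join(names)})")
```

Because 100kb1 and 100kb2 derive from a single clone, a downstream analysis may want to treat them differently from the independently obtained 3kb1 and 3kb2 clones.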