Project description: We report the application of Chromosome Conformation Capture Carbon-copy (5C) to a 4.5 Mb stretch of the mouse X chromosome encompassing the X inactivation center locus. We uncover a series of discrete 200 kb-1 Mb topologically associating domains (TADs). These align with several domain-wide epigenomic features as well as co-regulated gene clusters. 5C analysis in EED and G9A mutants reveals that this segmental organisation in TADs does not rely on the underlying H3K27me3 or H3K9me2 blocks. Deletion of a boundary between two TADs leads to ectopic chromosomal contacts between them. Analysis of mESCs, mNPCs and MEFs suggests that the positioning of TADs on the chromosome is stable during cell differentiation, though their internal organisation changes. Comparison of male (XY) and female (XX) differentiated cells highlights that the long-range chromosomal contacts within TADs are dampened on the inactive X compared to the active X. 5C oligonucleotides were designed around HindIII restriction sites following an alternative scheme
Project description: The Center for Eukaryotic Structural Genomics (CESG) is a "specialized" or "technology development" center supported by the Protein Structure Initiative (PSI). CESG's mission is to develop improved methods for the high-throughput solution of structures from eukaryotic proteins, with a very strong weighting toward human proteins of biomedical relevance. During the first three years of PSI-2, CESG selected targets representing 601 proteins from Homo sapiens, 33 from mouse, 10 from rat, 139 from Galdieria sulphuraria, 35 from Arabidopsis thaliana, 96 from Cyanidioschyzon merolae, 80 from Plasmodium falciparum, 24 from yeast, and about 25 from other eukaryotes. Notably, 30% of all structures of human proteins solved by the PSI Centers were determined at CESG. Whereas eukaryotic proteins generally are considered to be much more challenging targets than prokaryotic proteins, the technology now in place at CESG yields success rates that are comparable to those of the large production centers that work primarily on prokaryotic proteins. We describe here the technological innovations that underlie CESG's platforms for bioinformatics and laboratory information management, target selection, protein production, and structure determination by X-ray crystallography or NMR spectroscopy.
Project description: We report the application of Chromosome Conformation Capture Carbon-copy (5C) to a 4.5 Mb stretch of the mouse X chromosome encompassing the X inactivation center locus. We uncover a series of discrete 200 kb-1 Mb topologically associating domains (TADs). These align with several domain-wide epigenomic features as well as co-regulated gene clusters. 5C analysis in EED and G9A mutants reveals that this segmental organisation in TADs does not rely on the underlying H3K27me3 or H3K9me2 blocks. Deletion of a boundary between two TADs leads to ectopic chromosomal contacts between them. Analysis of mESCs, mNPCs and MEFs suggests that the positioning of TADs on the chromosome is stable during cell differentiation, though their internal organisation changes. Comparison of male (XY) and female (XX) differentiated cells highlights that the long-range chromosomal contacts within TADs are dampened on the inactive X compared to the active X.
Project description: Comparison of DNA sequences from different species is a fundamental method for identifying functional elements in genomes. Here, we describe the VISTA family of tools created to assist biologists in carrying out this task. Our first VISTA server at http://www-gsd.lbl.gov/vista/ was launched in the summer of 2000 and was designed to align long genomic sequences and visualize these alignments with associated functional annotations. Currently, the VISTA site includes multiple comparative genomics tools and provides users with rich capabilities to browse pre-computed whole-genome alignments of large vertebrate genomes and other groups of organisms with VISTA Browser, to submit their own sequences of interest to several VISTA servers for various types of comparative analysis, and to obtain detailed comparative analysis results for a set of cardiovascular genes. We illustrate the capabilities of the VISTA site by analyzing a 180 kb interval on human chromosome 5 that encodes the kinesin family member 3A (KIF3A) protein.
Project description: Many disciplines, from human genetics and oncology to plant breeding, microbiology and virology, commonly face the challenge of analyzing rapidly increasing numbers of genomes. In the case of Homo sapiens, the number of sequenced genomes will approach hundreds of thousands in the next few years. Simply scaling up established bioinformatics pipelines will not be sufficient for leveraging the full potential of such rich genomic data sets. Instead, novel, qualitatively different computational methods and paradigms are needed. We will witness the rapid extension of computational pan-genomics, a new sub-area of research in computational biology. In this article, we generalize existing definitions and understand a pan-genome as any collection of genomic sequences to be analyzed jointly or to be used as a reference. We examine already available approaches to construct and use pan-genomes, discuss the potential benefits of future technologies and methodologies, and review open challenges from the vantage point of the above-mentioned biological disciplines. As a prominent example of a computational paradigm shift, we particularly highlight the transition from the representation of reference genomes as strings to representations as graphs. We outline how this and other challenges from different application domains translate into common computational problems, point out relevant bioinformatics techniques and identify open problems in computer science. With this review, we aim to increase awareness that a joint approach to computational pan-genomics can help address many of the problems currently faced in various domains.