Project description:Assay for transposase-accessible chromatin using sequencing (ATAC-seq) is an efficient and precise method for revealing chromatin accessibility across the genome. Most current ATAC-seq tools follow chromatin immunoprecipitation sequencing (ChIP-seq) strategies that do not consider ATAC-seq-specific properties. To incorporate ATAC-seq-specific quality control and the underlying biology of chromatin accessibility, we developed a bioinformatics software package named ATACgraph for analyzing and visualizing ATAC-seq data. ATACgraph profiles accessible chromatin regions and provides ATAC-seq-specific information including definitions of nucleosome-free regions (NFRs) and nucleosome-occupied regions. ATACgraph also allows identification of differentially accessible regions between two ATAC-seq datasets. ATACgraph integrates a Docker image with the Galaxy platform to provide an intuitive user experience via a graphical interface. Without tedious installation processes on a local machine or cloud, users can analyze data through activated websites using pre-designed workflows or customized pipelines composed of ATACgraph modules. Overall, ATACgraph is an effective ATAC-seq tool that lets biologists with minimal bioinformatics knowledge analyze chromatin accessibility. ATACgraph can be run on any ATAC-seq data and is not limited to specific genomes. As validation, we demonstrated ATACgraph on the human genome to showcase its functions for ATAC-seq interpretation. This software is publicly accessible and can be downloaded at https://github.com/RitataLU/ATACgraph.
Project description:Nucleosomal DNA is thought to be generally inaccessible to DNA-binding factors, such as micrococcal nuclease (MNase). Here, we digest Drosophila chromatin with high and low concentrations of MNase to reveal two distinct nucleosome types: MNase-sensitive and MNase-resistant. MNase-resistant nucleosomes assemble on sequences depleted of A/T and enriched in G/C-containing dinucleotides, whereas MNase-sensitive nucleosomes form on A/T-rich sequences found at transcription start and termination sites, enhancers and DNase I hypersensitive sites. Estimates of nucleosome formation energies indicate that MNase-sensitive nucleosomes tend to be less stable than MNase-resistant ones. Strikingly, a decrease in cell growth temperature of about 10°C makes MNase-sensitive nucleosomes less accessible, suggesting that observed variations in MNase sensitivity are related to either thermal fluctuations of chromatin fibers or the activity of enzymatic machinery. In the vicinity of active genes and DNase I hypersensitive sites nucleosomes are organized into periodic arrays, likely due to 'phasing' off potential barriers formed by DNA-bound factors or by nucleosomes anchored to their positions through external interactions. The latter idea is substantiated by our biophysical model of nucleosome positioning and energetics, which predicts that nucleosomes immediately downstream of transcription start sites are anchored and recapitulates nucleosome phasing at active genes significantly better than sequence-dependent models.
Project description:Metamorphosis is a widely studied post-embryonic process in which many tissues undergo dramatic modifications to adapt to the new adult lifestyle. Flatfishes represent a good example of metamorphosis in teleost fishes. During metamorphosis of flatfish, organ regression and neoformation occur, with one of the most notable changes being the migration of one of the eyes to the other side of the body. In order to create a useful and reliable tool to advance the molecular study of metamorphosis in flatfish, we generated a chromatin accessibility atlas as well as gene expression profiles across four developmental stages ranging from a phylotypic to a post-metamorphic stage. We identified 29,019 differentially accessible chromatin regions and 3,253 differentially expressed genes. We found stage-specific regulatory regions and gene expression profiles, supporting the quality of the results. Our work provides highly reproducible data for further studies to elucidate the regulatory elements that ensure successful metamorphosis in flatfish species.
Project description:Many genetic variants influence complex traits by modulating gene expression, thus altering the abundance of one or multiple proteins. Here we introduce a powerful strategy that integrates gene expression measurements with summary association statistics from large-scale genome-wide association studies (GWAS) to identify genes whose cis-regulated expression is associated with complex traits. We leverage expression imputation from genetic data to perform a transcriptome-wide association study (TWAS) to identify significant expression-trait associations. We applied our approaches to expression data from blood and adipose tissue measured in ∼3,000 individuals overall. We imputed gene expression into GWAS data from over 900,000 phenotype measurements to identify 69 new genes significantly associated with obesity-related traits (BMI, lipids and height). Many of these genes are associated with relevant phenotypes in the Hybrid Mouse Diversity Panel. Our results showcase the power of integrating genotype, gene expression and phenotype to gain insights into the genetic basis of complex traits.
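The idea of imputing expression into GWAS summary statistics can be illustrated with a small sketch. The abstract does not give the test statistic, so the form below is an assumption: a commonly used summary-based statistic that combines per-SNP GWAS z-scores with cis-SNP expression weights and an LD (SNP correlation) matrix, z = w·z_gwas / sqrt(wᵀ Σ w). The function names `twas_z` and `two_sided_p` and the toy numbers are hypothetical and are not taken from the study.

```python
import math


def twas_z(weights, gwas_z, ld):
    """Summary-based expression-trait z-score (assumed form, see lead-in).

    weights: per-SNP expression prediction weights for one gene
    gwas_z:  per-SNP GWAS z-scores for the trait
    ld:      SNP-by-SNP correlation (LD) matrix as a list of rows
    """
    n = len(weights)
    if len(gwas_z) != n or len(ld) != n or any(len(row) != n for row in ld):
        raise ValueError("weights, gwas_z and ld must have matching dimensions")
    # Numerator: weighted sum of GWAS z-scores.
    numerator = sum(w * z for w, z in zip(weights, gwas_z))
    # Denominator: variance of that weighted sum under the null, w^T * LD * w.
    variance = sum(
        weights[i] * ld[i][j] * weights[j] for i in range(n) for j in range(n)
    )
    if variance <= 0:
        raise ValueError("w^T * LD * w must be positive")
    return numerator / math.sqrt(variance)


def two_sided_p(z):
    """Two-sided p-value of a z-score under a standard normal null."""
    return math.erfc(abs(z) / math.sqrt(2))


# Toy example with two cis-SNPs (made-up values, for illustration only).
weights = [0.5, 0.5]
gwas_z = [2.0, 2.0]
independent_ld = [[1.0, 0.0], [0.0, 1.0]]
perfect_ld = [[1.0, 1.0], [1.0, 1.0]]

z_independent = twas_z(weights, gwas_z, independent_ld)
z_perfect = twas_z(weights, gwas_z, perfect_ld)
```

In this toy case the two SNPs carry independent evidence when they are uncorrelated, so the combined z-score is larger than either input; under perfect LD they carry the same signal and the combined z-score equals the single-SNP value.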
Project description:Diffuse large B-cell lymphoma (DLBCL) is an aggressive cancer originating from mature B-cells. Prognosis is strongly associated with molecular subgroup, although the driver mutations that distinguish the two main subgroups remain poorly defined. Through an integrative analysis of whole genomes, exomes, and transcriptomes, we have uncovered genes and non-coding loci that are commonly mutated in DLBCL. Our analysis has identified novel cis-regulatory sites, and implicates recurrent mutations in the 3' UTR of NFKBIZ as a novel mechanism of oncogene deregulation and NF-κB pathway activation in the activated B-cell (ABC) subgroup. Small amplifications associated with over-expression of FCGR2B (the Fcγ receptor protein IIB), primarily in the germinal centre B-cell (GCB) subgroup, correlate with poor patient outcomes suggestive of a novel oncogene. These results expand the list of subgroup driver mutations that may facilitate implementation of improved diagnostic assays and could offer new avenues for the development of targeted therapeutics.
Project description:Genome-wide chromatin state underlies gene expression potential and cellular function. Epigenetic features and nucleosome positioning contribute to the accessibility of DNA, but widespread regulators of chromatin state are largely unknown. Our study investigates how coordination of ANP32E and H2A.Z contributes to genome-wide chromatin state in mouse fibroblasts. We define H2A.Z as a universal chromatin accessibility factor, and demonstrate that ANP32E antagonizes H2A.Z accumulation to restrict chromatin accessibility genome-wide. In the absence of ANP32E, H2A.Z accumulates at promoters in a hierarchical manner. H2A.Z initially localizes downstream of the transcription start site, and if H2A.Z is already present downstream, additional H2A.Z accumulates upstream. This hierarchical H2A.Z accumulation coincides with improved nucleosome positioning, heightened transcription factor binding, and increased expression of neighboring genes. Thus, ANP32E dramatically influences genome-wide chromatin accessibility through subtle refinement of H2A.Z patterns, providing a means to reprogram chromatin state and to hone gene expression levels.
Project description:Nucleosome organization is important for chromatin compaction and accessibility. Profiling nucleosome positioning genome-wide in single cells provides critical information to understand the cell-to-cell heterogeneity of chromatin states within a cell population. This protocol describes single-cell micrococcal nuclease sequencing (scMNase-seq), a method for detecting genome-wide nucleosome positioning and chromatin accessibility simultaneously from a small number of cells or single cells. To generate scMNase-seq libraries, single cells are isolated by FACS sorting, lysed and digested by MNase. DNA is purified, end-repaired and ligated to Y-shaped adaptors. Following PCR amplification with indexing primers, the subnucleosome-sized (fragments with a length of ≤80 bp) and mononucleosome-sized (fragments with a length between 140 and 180 bp) DNA fragments are recovered and sequenced on Illumina HiSeq platforms. On average, 0.5-1 million unique mapped reads are obtained for each single cell. The mononucleosome-sized DNA fragments precisely define genome-wide nucleosome positions in single cells, while the subnucleosome-sized DNA fragments provide information on chromatin accessibility. Library preparation of scMNase-seq takes only 2 d, requires only standard molecular biology techniques and does not require sophisticated laboratory equipment. Processing of high-throughput sequencing data requires basic bioinformatics skills and uses publicly available bioinformatics software.
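The two fragment size classes named in the protocol can be expressed as a simple length filter. The sketch below uses only the thresholds stated above (≤80 bp for subnucleosome-sized fragments, 140 to 180 bp for mononucleosome-sized fragments); the function names, the "other" label and the example lengths are hypothetical, and a real pipeline would read fragment lengths from aligned paired-end reads rather than from a list.

```python
from collections import Counter

# Thresholds as stated in the protocol description.
SUBNUCLEOSOME_MAX_BP = 80
MONONUCLEOSOME_MIN_BP = 140
MONONUCLEOSOME_MAX_BP = 180


def classify_fragment(length_bp):
    """Assign a fragment to a size class based on its length in base pairs."""
    if length_bp <= 0:
        raise ValueError("fragment length must be positive")
    if length_bp <= SUBNUCLEOSOME_MAX_BP:
        return "subnucleosome"  # informative for chromatin accessibility
    if MONONUCLEOSOME_MIN_BP <= length_bp <= MONONUCLEOSOME_MAX_BP:
        return "mononucleosome"  # informative for nucleosome positions
    return "other"  # outside both size windows; label is an assumption


def summarize_fragments(lengths_bp):
    """Count how many fragments fall into each size class."""
    return Counter(classify_fragment(length) for length in lengths_bp)


# Made-up fragment lengths, chosen to exercise the boundaries.
example_lengths = [45, 80, 81, 139, 140, 147, 180, 181, 320]
counts = summarize_fragments(example_lengths)
```

Keeping the thresholds as named constants makes it easy to adapt the filter if a different size window is preferred.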
Project description:To better understand genome regulation, it is important to uncover the role of transcription factors in the process of chromatin structure establishment and maintenance. Here we present a data-driven approach to systematically characterise transcription factors that are relevant for this process. Our method uses a linear mixed modelling approach to combine datasets of transcription factor binding motif enrichments in open chromatin and gene expression across the same set of cell lines. Applying this approach to the ENCODE dataset, we confirm known transcription factors and implicate numerous novel ones that play a role in the establishment or maintenance of open chromatin. In particular, our approach rediscovers many factors that have been annotated as pioneer factors.
Project description:Despite advances in treatment, 30% of diffuse large B-cell lymphoma (DLBCL) cases are refractory or relapse after chemoimmunotherapy. Currently, the relationship between angiogenesis and angiomiRs in DLBCL is unknown. We classified 84 DLBCL cases according to stromal signatures, evaluated the expression of pro- and antiangiomiRs in paraffin-embedded DLBCL tissues and correlated them with microvascular density (MVD). 40% of cases were classified as stromal-1, 50% as stromal-2 and 10% were not classified. We observed increased expression of proangiomiRs Let-7f, miR-17, miR-18a, miR-19b, miR-126, miR-130a, miR-210, miR-296 and miR-378 in 14%, 57%, 30%, 45%, 12%, 12%, 56%, 58% and 48% of the cases, respectively. Among antiangiomiRs we found decreased expression of miR-16, miR-20b, miR-92a, miR-221 and miR-328 in 27%, 71%, 2%, 44% and 11% of the cases, respectively. We found an association between increased expression of proangiomiRs miR-126 and miR-130a and antiangiomiR miR-328 and the non-GCB subtype. We found higher levels of the antiangiomiRs miR-16, miR-221 and miR-328 in patients with low MVD and stromal-1 signature. IPI and CD34 were confirmed to have an independent impact on survival in the study group. None of the above angiomiRs showed significance as a biomarker in an independent cohort of serum samples from patients and controls. In conclusion, we confirmed an association between antiangiomiRs miR-16, miR-221 and miR-328 and stromal-1 signature. Four angiomiRs emerged as potential therapeutic targets: proangiomiRs miR-17, miR-210 and miR-296 and antiangiomiR miR-20b. Although the four microRNAs seem to be important in DLBCL pathogenesis, they were not predictive of DLBCL onset or relapse in the independent serum cohort.
Project description:Diffuse large B cell lymphoma (DLBCL) is the most common lymphoma subtype and is clinically aggressive. To identify genetic susceptibility loci for DLBCL, we conducted a meta-analysis of 3 new genome-wide association studies (GWAS) and 1 previous scan, totaling 3,857 cases and 7,666 controls of European ancestry, with additional genotyping of 9 promising SNPs in 1,359 cases and 4,557 controls. In our multi-stage analysis, five independent SNPs in four loci achieved genome-wide significance, marked by rs116446171 at 6p25.3 (EXOC2; P = 2.33 × 10^-21), rs2523607 at 6p21.33 (HLA-B; P = 2.40 × 10^-10), rs79480871 at 2p23.3 (NCOA1; P = 4.23 × 10^-8) and two independent SNPs, rs13255292 and rs4733601, at 8q24.21 (PVT1; P = 9.98 × 10^-13 and 3.63 × 10^-11, respectively). These data provide substantial new evidence for genetic susceptibility to this B cell malignancy and point to pathways involved in immune recognition and immune function in the pathogenesis of DLBCL.
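As a rough illustration of how per-study GWAS results can be combined, the sketch below implements a fixed-effect, inverse-variance-weighted meta-analysis for a single SNP. The abstract does not state which meta-analysis model was used, so this choice is an assumption, as is the conventional genome-wide significance threshold of 5 × 10^-8. The function names and the per-study effect sizes are hypothetical; only the p-value passed to the final check is taken from the abstract.

```python
import math

# Conventional genome-wide significance threshold (assumed, not stated above).
GENOME_WIDE_ALPHA = 5e-8


def fixed_effect_meta(betas, standard_errors):
    """Inverse-variance-weighted fixed-effect meta-analysis for one SNP.

    betas:           per-study effect estimates (e.g. log odds ratios)
    standard_errors: per-study standard errors of those estimates
    Returns (pooled_beta, pooled_se, two_sided_p).
    """
    if len(betas) != len(standard_errors) or not betas:
        raise ValueError("need one standard error per effect estimate")
    if any(se <= 0 for se in standard_errors):
        raise ValueError("standard errors must be positive")
    # Each study is weighted by the inverse of its variance.
    weights = [1.0 / se ** 2 for se in standard_errors]
    total_weight = sum(weights)
    pooled_beta = sum(w * b for w, b in zip(weights, betas)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    z = pooled_beta / pooled_se
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal p-value
    return pooled_beta, pooled_se, p_value


def is_genome_wide_significant(p_value, alpha=GENOME_WIDE_ALPHA):
    """Check a p-value against the genome-wide threshold."""
    return p_value < alpha


# Made-up effect sizes for one SNP across two studies (illustration only).
pooled_beta, pooled_se, pooled_p = fixed_effect_meta([0.2, 0.2], [0.1, 0.1])

# The weakest reported signal, P = 4.23e-8 at NCOA1, passes the assumed threshold.
ncoa1_passes = is_genome_wide_significant(4.23e-8)
```

Pooling two equally precise studies with the same effect leaves the estimate unchanged and shrinks its standard error by a factor of sqrt(2), which is why combining scans increases power to reach genome-wide significance.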