Project description:Sequence overlap between two genes is common across all genomes, with viruses having particularly high proportions of these gene overlaps. The natural biological function and effects on fitness of gene overlaps are not fully understood, and their effects on gene cluster and genome-level refactoring are unknown. The model bacteriophage φX174 genome displays complex sequence architecture in which ~26% of nucleotides are involved in encoding more than one gene. In this study we use an engineered φX174 phage containing a genome with all gene overlaps removed.
Here we measured the proteomes of the synthetically engineered and wild-type φX174 phages over the course of infection. We find that almost half of all phage proteins (5/11) have abnormal expression profiles after genome modularisation.
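The "~26% of nucleotides encoding more than one gene" figure is a coverage statistic, and a minimal sketch of how such a number could be computed is given here. The function name `overlap_fraction` and the gene coordinates are hypothetical toy values, not the φX174 annotation, and the sketch assumes a linear coordinate system with half-open intervals (a circular genome would need wrap-around handling).

```python
from typing import List, Tuple


def overlap_fraction(genome_length: int, genes: List[Tuple[int, int]]) -> float:
    """Return the fraction of positions covered by two or more genes.

    Intervals are 0-based and half-open: [start, end).
    """
    coverage = [0] * genome_length
    for start, end in genes:
        for pos in range(start, end):
            coverage[pos] += 1
    # A position is "shared" when at least two gene intervals cover it.
    shared = sum(1 for depth in coverage if depth > 1)
    return shared / genome_length


# Toy example (hypothetical coordinates): three genes on a 100 nt genome.
toy_genes = [(0, 40), (30, 70), (60, 100)]
toy_fraction = overlap_fraction(100, toy_genes)
```

In this toy case the genes share positions 30-39 and 60-69, so 20 of 100 positions encode more than one gene.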
Project description:Inteins are self-splicing protein elements found in viruses and all three domains of life. How the DNA encoding these selfish elements spreads within and between genomes is poorly understood, particularly in eukaryotes where inteins are scarce. Here we show that the nuclear genomes of three strains of Anaeramoeba encode between 45 and 103 inteins, in stark contrast to four found in the most intein-rich eukaryote described previously. The Anaeramoeba inteins reside in a wide range of proteins, only some of which correspond to intein-containing proteins in other eukaryotes, prokaryotes and viruses. Our data also suggest that viruses have contributed to the spread of inteins in Anaeramoeba and the colonization of new alleles. The persistence of Anaeramoeba inteins might be partly explained by intragenomic movement of intein-encoding regions from gene to gene. Our intein dataset greatly expands the spectrum of intein-containing proteins and provides insights into the evolution of inteins in eukaryotes.
Project description:The experiment investigates the effect of BPV-1 on the microRNA profile in a control cell line and a BPV-1 in vitro transformed cell line. A subset of microRNAs was found to be differentially expressed in cells containing the BPV-1 genome compared to control cells.
Project description:Computer algorithms are often used to identify tRNA genes in newly sequenced genomes, but these predictions can be challenging. Not only are there structural variations and extremely limited sequence conservation among genes, but vertebrate genomes tend to have highly reiterated short interspersed sequences (SINEs) that originally derived from tRNA genes or tRNA-like transcription units. We have employed two programs, tRNAscan-SE and ARAGORN, to predict the tRNA genes in the mouse nuclear genome, resulting in quite diverse but overlapping predicted gene sets. From these, we removed known SINE repeats and sorted the genes into predicted families and single-copy genes. In particular, four families of intron-containing tRNA genes were predicted, with introns in positions and structures analogous to the well-characterized intron-containing tRNA genes in yeast. In this work we focus on verifying the expression of the intron-containing tRNA gene families, as well as the other 30 tRNA gene families. Keywords: tRNA, direct label
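The filtering and sorting steps described above (compare two prediction sets, drop known SINE repeats, split into families and single-copy genes) can be sketched roughly as follows. This is only an illustration under stated assumptions: the function `summarize_predictions`, the locus tuples, and the grouping key (amino acid, anticodon) are hypothetical, the data are toy values rather than real tRNAscan-SE or ARAGORN output, and loci are matched by exact coordinates, whereas real predictions would likely need a tolerance for boundary differences.

```python
from collections import defaultdict


def summarize_predictions(pred_a, pred_b, sine_loci):
    """Compare two tRNA prediction sets after removing SINE-derived loci.

    pred_a, pred_b: dict mapping locus (chrom, start, end) -> (amino_acid, anticodon)
    sine_loci: set of loci annotated as SINE repeats (assumed to be known up front)
    """
    loci_a = set(pred_a) - sine_loci
    loci_b = set(pred_b) - sine_loci
    both = loci_a & loci_b
    only_a = loci_a - loci_b
    only_b = loci_b - loci_a

    # Where both programs call the same locus, keep the annotation from pred_a.
    merged = {**pred_b, **pred_a}
    groups = defaultdict(list)
    for locus in sorted(loci_a | loci_b):
        groups[merged[locus]].append(locus)

    # Groups with several members are treated as families, the rest as single-copy.
    families = {key: loci for key, loci in groups.items() if len(loci) > 1}
    singles = {key: loci for key, loci in groups.items() if len(loci) == 1}
    return both, only_a, only_b, families, singles


# Toy input (hypothetical loci and annotations).
program_a = {
    ("chr1", 100, 172): ("Tyr", "GTA"),
    ("chr1", 900, 972): ("Tyr", "GTA"),
    ("chr2", 50, 122): ("Leu", "CAA"),
    ("chr3", 10, 82): ("Ala", "AGC"),  # flagged as a SINE below
}
program_b = {
    ("chr1", 100, 172): ("Tyr", "GTA"),
    ("chr4", 5, 77): ("Ile", "TAT"),
}
known_sines = {("chr3", 10, 82)}

both, only_a, only_b, families, singles = summarize_predictions(
    program_a, program_b, known_sines
)
```

With this toy input, one locus is called by both programs, the SINE-flagged locus is discarded, and the two Tyr-GTA loci form the only multi-member family.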
Project description:5-Methylcytosine (5-mC) is an important DNA modification found in eukaryotes that impacts gene regulation and disease pathogenesis. Recently, 5-hydroxymethylcytosine (5-hmC), another form of DNA modification, has been identified in substantial amounts in certain mammalian cell types; however, its roles as well as its distribution in mammalian genomes are unknown. Here we present a selective chemical labeling method for 5-hmC that uses T4 bacteriophage β-glucosyltransferase to transfer an engineered glucose moiety containing an azide group onto the hydroxyl group of 5-hmC; the azide can in turn be chemically coupled to a biotin group for detection, affinity enrichment, and sequencing of 5-hmC in mammalian genomes. Using this highly effective method, we demonstrate that 5-hmC is present in human cell lines beyond those previously recognized. We also find a gene expression level-dependent enrichment of intragenic 5-hmC in mouse cerebellum and an age-dependent acquisition of this modification in specific gene bodies linked to neurodegenerative disorders. Identification of 5-hmC-enriched genomic regions in mouse cerebellum.
Project description:We have developed a method for mapping unmethylated sites in the human genome based on the resistance of TspRI-digested ends to exoIII nuclease degradation. Digestion with TspRI and the methylation-sensitive restriction endonuclease HpaII, followed by exoIII and single-strand DNA nuclease, allows the removal of DNA fragments containing unmethylated HpaII sites. We then use array CGH to map the sequences depleted by this procedure in human genomes derived from five human tissues, a primary breast tumor and two breast tumor cell lines. Analysis of methylation patterns of the normal tissue genomes indicates that the hypomethylated sites are enriched in the 5’ end of widely expressed genes, including the promoter, first exon and first intron. In contrast, genomes of the MCF-7 and MDA-MB-231 cell lines show extensive hypomethylation in the intragenic and intergenic regions, whereas the primary tumor exhibits a pattern intermediate between normal tissue and the cell lines. A striking characteristic of tumor genomes is the presence of megabase-sized hypomethylated zones. These hypomethylated zones are associated with large genes, fragile sites, evolutionary breakpoints, chromosomal rearrangement breakpoints, tumor suppressor genes, and with regions containing tissue-specific gene clusters or with gene-poor regions containing novel tissue-specific genes. Bisulfite sequencing analysis shows a novel mosaic methylation pattern, with alternating methylated and unmethylated zones, in human histone gene clusters on chromosome 6. Correlation with microarray analysis shows that genes with hypomethylated sequence 2 kb up- or downstream of the transcription start site are highly expressed, whereas genes with extensive intragenic and 3’ UTR hypomethylation are silenced. The method described herein can be used for large-scale screening of changes in methylation pattern in the genome of interest. Keywords: Genome-Wide Mapping of Hypomethylated Sites in Human Genomes
Project description:5-Hydroxymethyluracil (5hmU) is a thymine modification existing in the genomes of a number of living organisms. The post-replicative formation of 5hmU occurs via hydroxylation of thymine, which can be mediated by the ten-eleven translocation (TET) dioxygenases in mammalian genomes and by J-binding proteins (JBPs) in protozoan genomes. In addition, 5hmU can also be generated through oxidation of thymine by reactive oxygen species or from deamination of 5hmC by activation-induced cytidine deaminase (AID) or APOBEC family enzymes. While the biological roles of 5hmU have not been fully explored, identifying its genomic location will assist in elucidating its functions. Herein, we report a method of enzyme-mediated bioorthogonal labeling to selectively enrich genomic regions containing 5hmU. 5hmU DNA kinase (5hmUDK) was utilized to selectively install an azide group or alkynyl group onto the hydroxyl group of 5hmU, followed by incorporation of a biotin linker through click chemistry and capture of 5hmU-containing DNA fragments via streptavidin pull-down. The enriched fragments were subjected to deep sequencing to map the location of 5hmU. With this established enzyme-mediated bioorthogonal labeling strategy, we achieved the genome-wide mapping of 5hmU in Trypanosoma brucei (T. brucei) genomes. The method described here will allow for a better understanding of the functional roles and dynamics of 5hmU in genomes.
Project description:Prochlorococcus genomes harbor a new type of mobile genetic element named tycheposons. To study the effects of environmental stress on gene expression and on the induction of tycheposons, we subjected cultures of Prochlorococcus strain MIT0604, which contains 7 such elements, to treatments with mitomycin C and UV stress.