Project description:5-hydroxymethylcytosine (5-hmC), a derivative of 5-methylcytosine (5-mC), is abundant in the brain for unknown reasons. We mapped the genomic distribution of 5-hmC and 5-mC in human and mouse tissues using glucosylation of 5-hmC coupled with restriction enzyme digestion and interrogation on microarrays. We detected 5-hmC enrichment in genes with synapse-related functions in the brain. We also identified significant, tissue-specific differential distributions of these DNA modifications at the exon-intron boundary in both human and mouse. This boundary change was mainly due to 5-hmC in the brain, but due to 5-mC in non-neural contexts. This pattern was replicated in multiple independent datasets, and the brain-specific change in 5-hmC was validated using single-molecule sequencing. Moreover, in the brain, constitutive exons contained higher levels of 5-hmC relative to alternatively spliced exons. Our study suggests a novel role for 5-hmC in RNA splicing and synaptic function in the brain.
Project description:The 5-methylcytosine (5-mC) derivative 5-hydroxymethylcytosine (5-hmC) is abundant in the brain for unknown reasons. Here we characterize the genomic distribution of 5-hmC and 5-mC in human and mouse tissues. We assayed 5-hmC by using glucosylation coupled with restriction-enzyme digestion and microarray analysis. We detected 5-hmC enrichment in genes with synapse-related functions in both human and mouse brain. We also identified substantial tissue-specific differential distributions of these DNA modifications at the exon-intron boundary in human and mouse. This boundary change was mainly due to 5-hmC in the brain but due to 5-mC in non-neural contexts. This pattern was replicated in multiple independent data sets and with single-molecule sequencing. Moreover, in human frontal cortex, constitutive exons contained higher levels of 5-hmC relative to alternatively spliced exons. Our study suggests a new role for 5-hmC in RNA splicing and synaptic function in the brain.
Project description:5-hydroxymethylcytosine (5-hmC), a derivative of 5-methylcytosine (5-mC), is abundant in the brain for unknown reasons. We mapped the genomic distribution of 5-hmC and 5-mC in human and mouse tissues using glucosylation of 5-hmC coupled with restriction enzyme digestion and interrogation on microarrays. We detected 5-hmC enrichment in genes with synapse-related functions in the brain. We also identified significant, tissue-specific differential distributions of these DNA modifications at the exon-intron boundary in both human and mouse. This boundary change was mainly due to 5-hmC in the brain, but due to 5-mC in non-neural contexts. This pattern was replicated in multiple independent datasets, and the brain-specific change in 5-hmC was validated using single-molecule sequencing. Moreover, in the brain, constitutive exons contained higher levels of 5-hmC relative to alternatively spliced exons. Our study suggests a novel role for 5-hmC in RNA splicing and synaptic function in the brain. The β-glucosyltransferase (BGT) enzyme transfers a glucose molecule specifically to the hydroxymethyl group of 5-hmC, rendering it resistant to digestion by the methylation-insensitive MspI enzyme at the ChmCGG target site; 5-hmC is thus detected by differential resistance to MspI digestion with and without glucosylation of genomic DNA (gDNA). HpaII, which targets the same site (CCGG), cannot cut CmCGG or ChmCGG, so the difference between HpaII and MspI digestion conceptually measures 5-mC and 5-hmC combined. Subtraction of 5-hmC from the HpaII-based estimate therefore measures 5-mC.
These estimates were measured on respective Affymetrix whole-genome tiling arrays (2.0R).
MspI - APRIL_fc1.ch02_coverage.bed
Undigested DNA - APRIL_fc1.ch01_coverage.bed
GluMspI - APRIL_fc1.ch04_coverage_NONTARGET.bed
GluMspI - APRIL_fc1.ch04_coverage.bed
MspI - APRIL_fc1.ch02_coverage_NONTARGET.bed
Undigested DNA - APRIL_fc1.ch01_coverage_NONTARGET.bed
MspI - MARCH_fc1.ch02_coverage_NONTARGET.bed
Undigested DNA - MARCH_fc1.ch01_coverage_NONTARGET.bed
GluMspI - MARCH_fc1.ch04_coverage_NONTARGET.bed
GluMspI - MARCH_fc1.ch04_coverage.bed
MspI - MARCH_fc1.ch02_coverage.bed
Undigested DNA - MARCH_fc1.ch01_coverage.bed
MspI - MAY_fc1.ch02_coverage.bed
GluMspI - MAY_fc1.ch04_coverage_NONTARGET.bed
GluMspI - MAY_fc1.ch04_coverage.bed
Undigested DNA - MAY_fc1.ch01_coverage.bed
MspI - MAY_fc1.ch02_coverage_NONTARGET.bed
Undigested DNA - MAY_fc1.ch01_coverage_NONTARGET.bed
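The subtraction logic described in this record (glucosylation-dependent MspI resistance for 5-hmC, and the HpaII versus MspI difference for 5-mC plus 5-hmC) can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the function name, its arguments, and the normalization to an undigested control are assumptions made for illustration, and the input numbers are invented.

```python
def estimate_modifications(undigested, mspi, glu_mspi, hpaii):
    """Estimate 5-hmC and 5-mC fractions at one CCGG site.

    All four arguments are hypothetical signal values (e.g. probe
    intensity or read coverage) assumed to be proportional to the
    amount of DNA left uncut by each treatment.
    """
    if undigested <= 0:
        raise ValueError("undigested control signal must be positive")

    # Fraction of molecules that resisted each digestion (assumed
    # normalization to the undigested control).
    resist_mspi = mspi / undigested
    resist_glu_mspi = glu_mspi / undigested
    resist_hpaii = hpaii / undigested

    # 5-hmC: extra MspI resistance gained after glucosylation.
    hmc = resist_glu_mspi - resist_mspi
    # HpaII is blocked by both modifications, MspI by neither
    # (without glucosylation), so the difference is 5-mC + 5-hmC.
    total_modified = resist_hpaii - resist_mspi
    # 5-mC: total modification minus the 5-hmC estimate.
    mc = total_modified - hmc
    return hmc, mc


# Invented example values, chosen only to show the arithmetic.
print(estimate_modifications(undigested=128, mspi=8, glu_mspi=40, hpaii=88))
# prints (0.25, 0.375)
```

Real array or coverage data would need background correction and replicate handling before such a subtraction is meaningful; the sketch only shows the order of the subtractions.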
Project description:Gene duplication plays key roles in organismal evolution. Duplicate genes, if they survive, tend to diverge in regulatory and coding regions. Divergences in coding regions, especially those that can change the function of the gene, can be caused by amino acid-altering substitutions and/or alterations in exon-intron structure. Much has been learned about the mode, tempo, and consequences of nucleotide substitutions, yet relatively little is known about structural divergences. In this study, by analyzing 612 pairs of sibling paralogs from seven representative gene families and 300 pairs of one-to-one orthologs from different species, we investigated the occurrence and relative importance of structural divergences during the evolution of duplicate and nonduplicate genes. We found that structural divergences have been very prevalent in duplicate genes and, in many cases, have led to the generation of functionally distinct paralogs. Comparisons of the genomic sequences of these genes further indicated that the differences in exon-intron structure were actually accomplished by three main types of mechanisms (exon/intron gain/loss, exonization/pseudoexonization, and insertion/deletion), each of which contributed differently to structural divergence. Like nucleotide substitutions, insertion/deletion and exonization/pseudoexonization occurred more or less randomly, with the number of observable mutational events per gene pair being largely proportional to evolutionary time. Notably, however, compared with paralogs with similar evolutionary times, orthologs have accumulated significantly fewer structural changes, whereas the amounts of amino acid replacements accumulated did not show clear differences. This finding suggests that structural divergences have played a more important role during the evolution of duplicate than nonduplicate genes.
Project description:Precise identification of correct exon-intron boundaries is a prerequisite for analyzing the location and structure of genes. The existing framework of genomic signals delineating exons and introns in a genomic segment seems insufficient, predominantly because of poor sequence consensus and the limitations of training on available experimental data sets. We present here a novel concept for characterizing exon-intron boundaries in genomic segments on the basis of structural and energetic properties. We analyzed boundary junctions on both sides of all the exons (328,368) of protein-coding genes from the human genome (GENCODE database) using 28 structural and three energy parameters. Study of sequence conservation at these sites shows very poor consensus. It is observed that DNA adopts a unique structural and energy state at the boundary junctions. Also, signals are somewhat different for housekeeping and tissue-specific genes. Clustering of the 31 parameters into four derived vectors gives some additional insights into the physical mechanisms involved in this biological process. Sites of structural and energy signals correlate well with the positions playing important roles in pre-mRNA splicing.
Project description:Background: The origin and importance of exon-intron architecture is one of the remaining mysteries of gene evolution. Several studies have investigated the variation of intron length, GC content, ordinal position in a gene, and divergence. However, few studies have examined the structural variation of exons and introns together. Results: We investigated the length, GC content, ordinal position, and divergence of both exons and introns in 13 eukaryotic genomes, representing plants and animals. Our analyses revealed that three basic patterns of exon-intron variation were present in nearly all analyzed genomes (P < 0.001 in most cases): an ordinal reduction of length and divergence in both exons and introns; a co-variation between an exon and its flanking introns in length, GC content, and divergence; and a decrease of average exon (or intron) length, GC content, and divergence as the total number of exons in a gene increased. In addition, we observed that shorter introns had either low or high GC content, whereas the GC content of long introns was intermediate. Conclusion: Although the factors contributing to these patterns have not been identified, our results provide three important clues: common factor(s) exist and may shape both exons and introns; the ordinal reduction patterns may reflect a time-ordered evolution; and the larger first and last exons may be required for splicing. These clues provide a framework for elucidating mechanisms involved in the organization of eukaryotic genomes and particularly in building exon-intron structures.
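As a rough illustration of the quantities this record describes (exon length and GC content as a function of ordinal position in a gene), the following Python sketch averages both per position. The function names and the toy sequences are hypothetical and are not taken from the study; real analyses would read annotated exons from a genome annotation rather than hand-written strings.

```python
from collections import defaultdict


def gc_content(seq):
    """Return the fraction of G and C bases in a sequence (0.0 if empty)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def summarize_by_ordinal(genes):
    """Average exon length and GC content per ordinal position.

    `genes` is a list of genes, each given as a list of exon
    sequences in transcript order (a hypothetical input layout).
    Returns {ordinal: (mean_length, mean_gc)} with 1-based ordinals.
    """
    lengths = defaultdict(list)
    gcs = defaultdict(list)
    for exons in genes:
        for ordinal, seq in enumerate(exons, start=1):
            lengths[ordinal].append(len(seq))
            gcs[ordinal].append(gc_content(seq))
    return {
        ordinal: (
            sum(lengths[ordinal]) / len(lengths[ordinal]),
            sum(gcs[ordinal]) / len(gcs[ordinal]),
        )
        for ordinal in sorted(lengths)
    }


# Toy input: two invented genes with two exons each.
toy_genes = [["ATGCGC", "ATAT"], ["GGCC", "AT"]]
for ordinal, (mean_len, mean_gc) in summarize_by_ordinal(toy_genes).items():
    print(ordinal, mean_len, round(mean_gc, 3))
```

The same summary could be applied to introns by passing intron sequences instead of exon sequences.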
Project description:The regulation of metazoan gene expression occurs in part by pre-mRNA splicing into mature RNAs. Signals affecting the efficiency and specificity with which introns are removed have not been completely elucidated. Splicing likely occurs cotranscriptionally, with chromatin structure playing a key regulatory role. We calculated DNA encoded nucleosome occupancy likelihood (NOL) scores at the boundaries between introns and exons across five metazoan species. We found that (i) NOL scores reveal a sequence-based feature at the introns on both sides of the intron-exon boundary; (ii) this feature is not part of any recognizable consensus sequence; (iii) this feature is conserved throughout metazoa; (iv) this feature is enriched in genes sharing similar functions: ATPase activity, ATP binding, helicase activity, and motor activity; (v) genes with these functions exhibit different genomic characteristics; (vi) in vivo nucleosome positioning data confirm ontological enrichment at this feature; and (vii) genes with this feature exhibit unique dinucleotide distributions at the intron-exon boundary. The NOL scores point toward a physical property of DNA that may play a role in the mechanism of pre-mRNA splicing. These results provide a foundation for identification of a new set of regulatory DNA elements involved in splicing regulation.