Project description:Crassulacean acid metabolism (CAM) is a water-use efficient adaptation of photosynthesis that has evolved independently many times in diverse lineages of flowering plants. We hypothesize that convergent evolution of protein sequence and temporal gene expression underpins the independent emergences of CAM from C3 photosynthesis. To test this hypothesis, we generated a de novo genome assembly and genome-wide transcript expression data for Kalanchoe fedtschenkoi, an obligate CAM species within the core eudicots with a relatively small genome (~260 Mb). Our comparative analyses identified signatures of convergence in protein sequence and re-scheduling of diel transcript expression of genes involved in nocturnal CO2 fixation, stomatal movement, heat tolerance, circadian clock and carbohydrate metabolism in K. fedtschenkoi and other CAM species in comparison with non-CAM species. These findings provide new insights into molecular convergence and building blocks of CAM and will facilitate CAM-into-C3 photosynthesis engineering to enhance water-use efficiency in crops.
Project description:BACKGROUND: Manchurian walnut (Juglans mandshurica Maxim.) is a tree with multiple industrial uses and medicinal properties in the Juglandaceae family (walnuts and hickories). J. mandshurica produces juglone, which is a toxic allelopathic agent and has potential utilization value. Furthermore, the seed of J. mandshurica is rich in various unsaturated fatty acids and has high nutritive value. FINDINGS: Here, we present a high-quality chromosome-scale reference genome assembly and annotation for J. mandshurica (n = 16) with a contig N50 of 21.4 Mb by combining PacBio high-fidelity reads with high-throughput chromosome conformation capture data. The assembled genome has an estimated sequence size of 548.7 Mb and consists of 657 contigs, 623 scaffolds and 40,453 protein-coding genes. In total, 60.99% of the assembled genome consists of repetitive sequences. Sixteen super-scaffolds corresponding to the 16 chromosomes were assembled, with a scaffold N50 length of 33.7 Mb and a BUSCO complete gene percentage of 98.3%. J. mandshurica displays a close sequence relationship with Juglans cathayensis, with a divergence time of 13.8 million years ago. Combining the high-quality genome, transcriptome and metabolomics data, we constructed a gene-to-metabolite network and identified 566 core and conserved differentially expressed genes, which may be involved in juglone biosynthesis. Five CYP450 genes were found that may contribute to juglone accumulation. NAC, bZip, NF-YA and NF-YC are positively correlated with the juglone content. Some candidate regulators (e.g., FUS3, ABI3, LEC2 and WRI1 transcription factors) involved in the regulation of lipid biosynthesis were also identified. CONCLUSIONS: Our genomic data provide new insights into the evolution of the walnut genome and create a new platform for accelerating molecular breeding and improving the comprehensive utilization of these economically important tree species.
Project description:Sequence overlap between two genes is common across all genomes, with viruses having particularly high proportions of these gene overlaps. The natural biological function and effects on fitness of gene overlaps are not fully understood, and their effects on gene cluster and genome-level refactoring are unknown. The model bacteriophage φX174 genome displays complex sequence architecture in which ~26% of nucleotides are involved in encoding more than one gene. In this study we use an engineered φX174 phage containing a genome with all gene overlaps removed.
Here we measured the proteomes of the synthetically engineered and wild-type φX174 over the course of infection. We find that almost half of all phage proteins (5/11) have abnormal expression profiles after genome modularisation.
Project description:The ideal genome sequence for medical interpretation is complete and diploid, capturing the full spectrum of genetic variation. Toward this end, there has been progress in the discovery of single nucleotide polymorphisms (SNPs) and small (<10 bp) insertions/deletions (indels), but annotation of larger structural variation (SV), including copy number variation (CNV), has been less comprehensive, even with available diploid sequence assemblies. We applied a multi-step sequence and microarray-based analysis to identify numerous previously unknown SVs within the first genome sequence reported from an individual.
Project description:ChIP-seq and input sequence data used in the development and evaluation of the BEADS normalization method. Examination of ChIP and input sequence reads across the worm genome.
Project description:The sequence determinants of chromatin bivalency remain unclear. We analysed sequence determinants of chromatin bivalency genome-wide in several mammalian species and performed a series of transgenic experiments in mouse ES cells. Genome-wide mapping of H3K27me3 in rat ES cells and ChIP-seq with anti-Ezh2 antibody in transgenic mouse ES cells