Project description:To determine the optimal RNA-Seq approach for analysing an animal host together with its bacterial symbionts, we compared the transcriptome bias, depth, and coverage achieved by two different mRNA capture and sequencing strategies applied to the holobiont of the marine demosponge Amphimedon queenslandica, for which genomes of the animal host and its three most abundant bacterial symbionts are available.
Project description:The naked mole-rat (NMR; Heterocephalus glaber) has recently gained considerable attention in the scientific community for its potential to yield novel insights in medicine, biochemistry, and evolution. NMRs exhibit adaptations that include protracted fertility, cancer resistance, eusociality, and tolerance of anoxia. This suite of adaptations is not found in other rodent species, suggesting that interrogating conserved and accelerated regions of the NMR genome will identify regions fundamental to these adaptations. However, the current NMR genome assembly has limitations that make it challenging to study structural variation, heterozygosity, and non-coding adaptations. We present a complete diploid naked mole-rat genome assembly that integrates long-read and 10X linked-read genome sequencing of a male NMR and its parents with Hi-C sequencing of the NMR hypothalamus (N=2). Reads were identified as maternal, paternal, or ambiguous (TrioCanu). We then polished the genomes with Flye, Racon, and Medaka. Assemblies were then scaffolded using the following tools in order: Scaff10X, Salsa2, 3d-DNA, Minimap2 alignment between assemblies, and the Juicebox Assembly Tools. We then subjected the assemblies to another round of polishing, including short-read polishing with Freebayes. We assembled the NMR mitochondrial genome with mitoVGP. Y chromosome contigs were identified by aligning male and female 10X linked reads to the paternal genome and finding male-biased contigs not present in the maternal genome. Contigs were assembled with publicly available male NMR fibroblast Hi-C-seq data (SRR820318). The sex chromosome haplotypes were merged so that both assemblies contain a high-quality X and Y chromosome. Finally, the assemblies were evaluated with Quast, BUSCO, and Merqury, all of which indicated high base-pair quality and contiguity for both assemblies. The assembly will next be annotated by Ensembl using public RNA-seq data from multiple tissues (SRP061363). This assembly will provide a high-quality resource to the NMR and comparative genomics communities.
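The Y chromosome step described above (finding contigs with male-biased read support) can be illustrated with a minimal sketch. It assumes per-contig mean read depths have already been computed for a male and a female sample; the function name, the dictionary inputs, and both thresholds are hypothetical choices for illustration and are not taken from the project description.

```python
from statistics import median


def male_biased_contigs(male_depth, female_depth, max_ratio=0.1, min_male_depth=1.0):
    """Flag contigs whose normalized female/male depth ratio is at most max_ratio.

    male_depth, female_depth: dicts mapping contig name -> mean read depth.
    max_ratio and min_male_depth are illustrative thresholds (assumptions).
    """
    # Normalize each sample by its median contig depth so that differences
    # in sequencing effort between the two samples cancel out.
    male_med = median(male_depth.values())
    female_med = median(female_depth.values())
    if male_med <= 0 or female_med <= 0:
        raise ValueError("median depth must be positive in both samples")

    flagged = []
    for contig, m in male_depth.items():
        if m < min_male_depth:
            continue  # too little male coverage to judge this contig
        m_norm = m / male_med
        f_norm = female_depth.get(contig, 0.0) / female_med
        if f_norm / m_norm <= max_ratio:
            flagged.append(contig)
    return flagged


if __name__ == "__main__":
    male = {"c1": 30, "c2": 30, "c3": 15, "c4": 30}
    female = {"c1": 40, "c2": 40, "c3": 0.5, "c4": 40}
    print(male_biased_contigs(male, female))
```

In this toy input, only the contig with roughly half the typical male depth and almost no female depth is flagged, which is the pattern expected for Y-linked sequence.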
Project description:We used microarray analysis to investigate whole-genome transcriptome dynamics of the marine cyanobacterium Prochlorococcus sp. strain MED4 and the T7-like podovirus P-SSP7 over a time course spanning the 8-hour latent period of lytic infection prior to cell lysis. Manuscript Summary: Interactions between bacterial hosts and their viruses (phages) lead to reciprocal genome evolution through a dynamic co-evolutionary process [1-5]. Phage-mediated transfer of host genes, often located in genome islands, has had a major impact on microbial evolution [1, 4, 6]. Furthermore, phage genomes have clearly been shaped by the acquisition of genes from their hosts [2, 3, 5]. Here we investigate whole-genome expression of a host and phage, the marine cyanobacterium Prochlorococcus and a T7-like cyanophage, during lytic infection to gain insight into these co-evolutionary processes. While most of the phage genome was linearly transcribed over the course of infection, four phage-encoded bacterial metabolism genes were part of the same expression cluster, even though they are physically separated on the genome. These genes, encoding photosystem II D1 (psbA), high-light inducible protein (hli), transaldolase (talC), and ribonucleotide reductase (nrd), are transcribed together with phage DNA replication genes and appear to make up a functional unit involved in the energy and deoxynucleotide production needed for phage replication in resource-poor oceans. Also unique to this system was the upregulation of numerous host genes during infection. These may be host stress response genes, genes induced by the phage, or both. Many of these host genes are located in genome islands and have homologues in cyanophage genomes. We hypothesize that phage have evolved to utilize upregulated host genes, leading to their stable incorporation into phage genomes and their subsequent transfer back to hosts in genome islands. Thus, activation of host genes during infection may be directing the co-evolution of gene content in both host and phage genomes. Keywords: time course, viral infection, marine cyanobacteria, podovirus, bacteriophage, stress response
Project description:Vertebrates have genomes that are highly methylated at CpG positions, while most invertebrates have sparsely methylated genomes. Hypermethylation is therefore considered a major innovation that shaped the genome and the regulatory roles of DNA methylation in vertebrates. However, here we report that the marine sponge Amphimedon queenslandica, which belongs to one of the earliest-branching animal lineages, has evolved a hypermethylated genome with remarkable similarities to that of a vertebrate. Despite major differences in genome size and architecture, the independent acquisition of hypermethylation reveals common distribution patterns and repercussions for genome regulation in both lineages. Genome-wide depletion of CpGs is counterbalanced by CpG enrichment at unmethylated promoters, mirroring CpG islands. Furthermore, a subset of CpG-bearing transcription factor motifs is enriched at unmethylated Amphimedon promoters. We find that the animal-specific transcription factor NRF has conserved its methyl-sensitivity over 700 million years, indicating an ancient cross-talk between transcription factors and DNA methylation. Finally, the sponge shows vertebrate-like levels of 5-hydroxymethylcytosine, the oxidative derivative of cytosine methylation involved in active demethylation. Hydroxymethylation is concentrated in regions that are enriched for transcription factor motifs and show developmentally dynamic demethylation. Together, these findings push back the links between DNA methylation and its regulatory roles to the early steps of animal evolution. Thus, the Amphimedon methylome challenges prior hypotheses about the origins of vertebrate genome hypermethylation and its implications for regulatory complexity.
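CpG depletion and enrichment of the kind described above are commonly quantified with the observed/expected CpG ratio, (number of CG dinucleotides × sequence length) / (number of C × number of G). The sketch below computes that ratio for a single sequence; it illustrates the general metric under that assumption and is not necessarily the measure used in this study. The function name is hypothetical.

```python
def cpg_obs_exp(seq):
    """Return the observed/expected CpG ratio of a DNA sequence.

    Uses the common definition (CpG count * length) / (C count * G count).
    Returns 0.0 when the sequence contains no C or no G.
    """
    seq = seq.upper()
    c_count = seq.count("C")
    g_count = seq.count("G")
    if c_count == 0 or g_count == 0:
        return 0.0
    # "CG" cannot overlap itself, so str.count finds every occurrence.
    cpg_count = seq.count("CG")
    return cpg_count * len(seq) / (c_count * g_count)


if __name__ == "__main__":
    for s in ("CGCG", "CCGG", "AATT"):
        print(s, cpg_obs_exp(s))
```

Values well below 1 indicate CpG depletion, as expected in bulk methylated sequence, while values near or above 1 are typical of CpG-island-like regions such as unmethylated promoters.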
Project description:Definition of functional regulatory regions in the vast non-coding fractions of mammalian genomes remains a daunting challenge that underscores our limited understanding of mammalian gene regulation. Genome sequencing of many mammals has recently been exp
Project description:Genetic diversity in plants is remarkably high. Recent whole genome sequencing (WGS) of 67 rice accessions recovered 10,872 novel genes. Comparing genetic architecture among divergent populations, or between crops and their wild relatives, is essential for identifying the functional components that determine crucial traits. However, many major crops have gigabase-scale genomes, which are not well suited to WGS. Existing cost-effective sequencing approaches, including re-sequencing, exome sequencing, and restriction enzyme-based methods, all have difficulty obtaining long novel genomic sequences from highly divergent populations with large genomes. This study presents a reference-independent core genome targeted sequencing approach, CGT-seq, which uses epigenomic information from both active and repressive epigenetic marks to guide assembly of the core genome, composed mainly of promoter and intragenic regions. The method was relatively easy to implement and displayed high accuracy, sensitivity, and specificity in capturing the core genome of bread wheat: CGT-seq reads covered 95% of intragenic regions and 89% of promoter regions in wheat. We further demonstrated in rice that CGT-seq captured hundreds of novel genes and regulatory sequences from a previously unsequenced ecotype. By specifically enriching and sequencing regions within and near genes, CGT-seq offers a time- and resource-effective approach to profiling functionally relevant regions in sequenced and non-sequenced populations with large genomes.