Project description: The naked mole-rat (NMR; Heterocephalus glaber) has recently gained considerable attention in the scientific community for its potential to yield novel insights in medicine, biochemistry, and evolution. NMRs exhibit unusual adaptations that include protracted fertility, cancer resistance, eusociality, and tolerance of anoxia. This suite of adaptations is not found in other rodent species, suggesting that interrogating conserved and accelerated regions of the NMR genome will identify regions fundamental to these adaptations. However, the current NMR genome assembly has limitations that make studying structural variation, heterozygosity, and non-coding adaptations challenging. We present a complete diploid naked mole-rat genome assembly produced by integrating long-read and 10X linked-read genome sequencing of a male NMR and its parents with Hi-C sequencing of NMR hypothalamus (N=2). Reads were classified as maternal, paternal, or ambiguous (TrioCanu). We then polished the genomes with Flye, Racon, and Medaka. Assemblies were scaffolded using the following tools in order: Scaff10X, Salsa2, 3d-DNA, Minimap2 alignment between assemblies, and the Juicebox Assembly Tools. We then subjected the assemblies to another round of polishing, including short-read polishing with Freebayes. We assembled the NMR mitochondrial genome with mitoVGP. Y chromosome contigs were identified by aligning male and female 10X linked reads to the paternal genome and finding male-biased contigs not present in the maternal genome. These contigs were assembled using publicly available male NMR fibroblast Hi-C sequencing data (SRR820318). The sex chromosome haplotypes were merged so that both assemblies contain a high-quality X and Y chromosome. Finally, the assemblies were evaluated with Quast, BUSCO, and Merqury, all of which indicated high base-pair quality and contiguity for both assemblies.
The assembly will next be annotated by Ensembl using public RNA-seq data from multiple tissues (SRP061363). This assembly will provide a high-quality resource to the NMR and comparative genomics communities.
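The male-biased contig step described above can be illustrated with a minimal sketch. It assumes that per-contig mean read depths for the male and female 10X libraries have already been computed from alignments to the paternal assembly. The function name, the thresholds, and the median normalisation are hypothetical choices for illustration, not the project's actual procedure, and the additional check against the maternal assembly is not shown.

```python
from statistics import median
from typing import Dict, List


def male_biased_contigs(
    male_depth: Dict[str, float],
    female_depth: Dict[str, float],
    max_female_to_male_ratio: float = 0.1,  # hypothetical threshold
    min_male_norm_depth: float = 0.2,       # hypothetical threshold
) -> List[str]:
    """Return contigs covered by male reads but almost not by female reads.

    Depths are divided by each library's median contig depth so that
    libraries sequenced to different depths can be compared.
    """
    male_median = median(male_depth.values())
    female_median = median(female_depth.values())
    if male_median <= 0 or female_median <= 0:
        raise ValueError("median depth must be positive for normalisation")

    candidates = []
    for contig, depth in male_depth.items():
        male_norm = depth / male_median
        # A contig with no female alignments counts as zero depth.
        female_norm = female_depth.get(contig, 0.0) / female_median
        if male_norm < min_male_norm_depth:
            continue  # too little male coverage to make a call
        if female_norm / male_norm <= max_female_to_male_ratio:
            candidates.append(contig)
    return sorted(candidates)


if __name__ == "__main__":
    # Toy depths, invented purely to exercise the function.
    male = {"ctg1": 30.0, "ctg2": 31.0, "ctg3": 15.0, "ctg4": 14.0}
    female = {"ctg1": 30.0, "ctg2": 29.0, "ctg3": 30.0, "ctg4": 0.5}
    print(male_biased_contigs(male, female))
```

In the toy data, ctg3 has roughly half the male depth but full female depth, the pattern expected of an X-linked contig, so it is not reported. Only ctg4, which female reads barely cover, passes the filter.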

Project description: Aim: We aim to compare current (MeDIP-seq), new (Illumina Infinium 450K BeadChip) and future (PacBio) methods for whole genome DNA methylation analysis. As interest in determining disease methylation profiles increases, the scope, advantages and limitations of these methods require assessment. There are key questions to answer and specific challenges to overcome. For example, how much detail or resolution is sufficient to identify regions of differential methylation and regions of biological or medical significance within a sample? How much coverage of the genome is required for accurate methylation analysis? Is it important to confirm which regions of the genome are unmethylated in addition to focusing on those that are methylated? Loss of methylation may be of equal importance within the cell, since it may also contribute to disease pathogenesis. A comparison across methods (affinity enrichment, bisulphite-conversion based, direct sequencing of methyl-cytosine) and technology platforms (Illumina HiSeq, PacBio, Illumina Infinium BeadChip) will enable us to determine the strengths and weaknesses of each method. We propose to compare four methods using two DNA samples from the Coriell Institute for Cell Repository to assess both current and future capabilities for whole genome methylation analysis in parallel: A) MeDIP-seq using Illumina HiSeq, B) Illumina Infinium HumanMethylation 450K BeadChip, and C) whole genome methylation sequencing using PacBio. Existing single molecule deep bisulphite sequencing data, generated previously from these same samples at the WTSI for targeted regions (30-40 genes) on the human X chromosome, will be used to assess the performance of each method.
The methods selected for this study will generate data covering a range of resolutions, from a whole genome scan to array (target defined) resolution and up to single base pair, single molecule resolution, the highest level of detail possible with methods currently available.
Samples: DNA from sibling pair GM01240 (female) and GM01240 (male).
Requirements: Both samples will be analysed using:
A. MeDIP-seq using Illumina HiSeq (one HiSeq lane, 75bp paired end, per sample)
B. Illumina Infinium HumanMethylation 450K BeadChip
We expect that one HiSeq lane per sample will give potentially unnecessarily high coverage. However, we do not have a multiplexing procedure in place for MeDIP. Our requirements for PacBio sequencing have been discussed with, and will be supported by, the Sequencing Technology Development group.
