Project description:Custom exon aCGH analysis of copy number across the genomes of 16 horse breeds. Two-condition experiment: all breed samples were compared to a single Thoroughbred reference, and the reference was then compared to Twilight (DNA from the horse used for the reference genome assembly).
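The two-condition design above (each breed sample hybridized against one Thoroughbred reference) is commonly summarized as a per-probe log2 ratio. The following is a minimal sketch of that idea, not the analysis used for this project: the probe intensities are invented, and the gain/loss thresholds of ±0.3 are an assumption chosen only for illustration.

```python
import math


def log2_ratios(test, reference):
    """Return per-probe log2(test / reference) intensity ratios."""
    return [math.log2(t / r) for t, r in zip(test, reference)]


def call_copy_number(ratio, gain=0.3, loss=-0.3):
    """Classify one log2 ratio; the thresholds are hypothetical."""
    if ratio >= gain:
        return "gain"
    if ratio <= loss:
        return "loss"
    return "neutral"


# Hypothetical probe intensities (not data from the project).
breed = [1200.0, 2400.0, 600.0, 1150.0]
thoroughbred = [1200.0, 1200.0, 1200.0, 1200.0]

ratios = log2_ratios(breed, thoroughbred)
calls = [call_copy_number(r) for r in ratios]
print(calls)
```

In a design like this, the reference-versus-Twilight hybridization could be scored the same way to relate calls back to the reference genome assembly.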
Project description:The naked mole-rat (NMR; Heterocephalus glaber) has recently gained considerable attention in the scientific community for its potential to yield novel insights in medicine, biochemistry, and evolution. NMRs exhibit unique adaptations that include protracted fertility, cancer resistance, eusociality, and tolerance of anoxia. This suite of adaptations is not found in other rodent species, suggesting that interrogating conserved and accelerated regions in the NMR genome will identify regions fundamental to these adaptations. However, the current NMR genome assembly has limitations that make studying structural variation, heterozygosity, and non-coding adaptations challenging. We present a complete diploid naked mole-rat genome assembly produced by integrating long-read and 10X linked-read genome sequencing of a male NMR and its parents with Hi-C sequencing of the NMR hypothalamus (N=2). Reads were classified as maternal, paternal, or ambiguous (TrioCanu). We then polished the genomes with Flye, Racon, and Medaka. Assemblies were then scaffolded using the following tools in order: Scaff10X, Salsa2, 3d-DNA, Minimap2 alignment between assemblies, and the Juicebox Assembly Tools. We then subjected the assemblies to another round of polishing, including short-read polishing with Freebayes. We assembled the NMR mitochondrial genome with mitoVGP. Y chromosome contigs were identified by aligning male and female 10X linked reads to the paternal genome and finding male-biased contigs not present in the maternal genome. Contigs were assembled with publicly available male NMR fibroblast Hi-C-seq data (SRR820318). The sex chromosome haplotypes were merged so that both assemblies have a high-quality X and Y chromosome. Finally, assemblies were evaluated with Quast, BUSCO, and Merqury, all of which reported high base-pair quality and contiguity for both assemblies.
The assembly will next be annotated by Ensembl using public RNA-seq data from multiple tissues (SRP061363). Together, this assembly will provide a high-quality resource to the NMR and comparative genomics communities.
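The description identifies Y chromosome contigs as those with male-biased read coverage. One way to picture that step is a per-contig male-to-female depth ratio, sketched below. This is an illustration under stated assumptions, not the project's pipeline: the function name, the depth values, the pseudocount, and the 4x ratio cutoff are all hypothetical, and the additional check that a contig is absent from the maternal genome is not modeled.

```python
def male_biased_contigs(male_depth, female_depth, min_ratio=4.0, pseudocount=1.0):
    """Return contigs whose male/female depth ratio is at least min_ratio.

    Depths are assumed to be already normalized for library size.
    The pseudocount avoids division by zero for contigs with no female reads.
    """
    flagged = []
    for contig, m in male_depth.items():
        f = female_depth.get(contig, 0.0)
        ratio = (m + pseudocount) / (f + pseudocount)
        if ratio >= min_ratio:
            flagged.append(contig)
    return sorted(flagged)


# Hypothetical mean depths per contig (not data from the project).
male = {"ctg1": 30.0, "ctg2": 15.0, "ctg3": 14.0}    # autosome-, X-, Y-like
female = {"ctg1": 31.0, "ctg2": 30.0, "ctg3": 0.5}

candidates = male_biased_contigs(male, female)
print(candidates)
```

With these made-up values only the contig that is nearly absent from the female reads is flagged; the X-like contig, at roughly half depth in the male, is not.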
Project description:Background: Trombidid mites have a unique lifecycle in which only the larval stage is ectoparasitic. In the superfamily Trombiculoidea (“chiggers”), the larvae feed preferentially on vertebrates, including humans. Species in the genus Leptotrombidium are vectors of a potentially fatal bacterial infection, scrub typhus, which affects 1 million people annually. Moreover, chiggers can cause pruritic dermatitis (trombiculiasis) in humans and domesticated animals. In the Trombidioidea (velvet mites), the larvae feed on other arthropods and are potential biological control agents for agricultural pests. Here, we present the first trombidid mite genomes, obtained for a chigger, Leptotrombidium deliense, and for a velvet mite, Dinothrombium tinctorium. Results: Sequencing was performed on the Illumina MiSeq platform. A 180 Mb draft assembly for D. tinctorium was generated from two paired-end libraries and one mate-pair library using a single adult specimen. For L. deliense, a lower-coverage draft assembly (117 Mb) was obtained from pooled, engorged larvae with a single paired-end library. Remarkably, both genomes exhibited evidence of ancient lateral gene transfer from soil-derived bacteria or fungi. The transferred genes confer functions that are rare in animals, including terpene and carotenoid synthesis. Thirty-seven allergenic protein families were predicted in the L. deliense genome, of which nine were unique. Preliminary proteomic analyses identified several of these putative allergens in larvae. Conclusions: Trombidid mite genomes appear to be more dynamic than those of other acariform mites. A priority for future research is to determine the biological function of terpene synthesis in this taxon and its potential for exploitation in disease control. Project was jointly supervised by Stuart Armstrong and Ben Makepeace.
Project description:Rhizoctonia solani Kühn is a soilborne basidiomycetous fungus that causes significant damage to many economically important crops. R. solani isolates are classified into 13 Anastomosis Groups (AGs) with interspecific subgroups having distinctive morphology, pathogenicity and wide host range. However, the genetic factors that drive the unique fungal pathology are still not well characterized due to the limited number of available annotated genomes. Therefore, we performed genome sequencing, assembly, annotation and functional analysis of 13 R. solani isolates covering 7 AGs and selected subgroups (AG1-IA, AG1-IB, AG1-IC, AG2-2IIIB, AG3-PT, AG3-TB, AG4-HG-I, AG5, AG6, and AG8). Here, we report a pangenome comparative analysis of 13 R. solani isolates covering important groups to elucidate unique and common attributes associated with each isolate, including molecular factors potentially involved in determining AG-specific host preference. Finally, we present the largest repertoire of annotated R. solani genomes, compiled as a comprehensive and user-friendly database, viz. RsolaniDB. Since 7 genomes are reported for the first time, the database stands as a valuable platform for formulating new hypotheses by hosting annotated genomes, with tools for functional enrichment, orthologs and sequence analysis, currently not available with other accessible state-of-the-art platforms hosting Rhizoctonia genome sequences.
Project description:The majority of bacterial genomes have high coding efficiencies, but a few genomes of intracellular bacteria have low gene density. The genome of the endosymbiont Sodalis glossinidius consists of almost 50% pseudogenes, which carry mutations that putatively silence them at the genomic level. We have applied multiple omic strategies, combining single-molecule DNA sequencing and annotation, stranded RNA sequencing, and proteome analysis, to better understand the transcriptional and translational landscape of Sodalis pseudogenes and potential mechanisms for their control. Between 53% and 74% of the Sodalis transcriptome remains active in cell-free culture. Mean sense transcription from Coding Domain Sequences (CDS) is four times greater than that from pseudogenes. Core-genome analysis of six Illumina-sequenced Sodalis isolates from different host Glossina species shows that pseudogenes make up ~40% of the 2,729 genes in the core genome, suggesting that they are stable and/or that Sodalis is a recent introduction across the Glossina genus as a facultative symbiont. These data shed further light on the importance of transcriptional and translational control in deciphering host-microbe interactions, and demonstrate that pseudogenes are more complex than simple degrading DNA sequences. We show that combining genomics, transcriptomics, and proteomics represents an important resource for studying prokaryotic genomes with a view to elucidating evolutionary adaptation to novel environmental niches.
Project description:A chromosome-level nuclear genome and the organelle genomes of the alpine snow alga Chloromonas typhlos were sequenced and assembled by integrating short- and long-read sequencing with a proteogenomic strategy.
Project description:This study describes the combined sequencing of the genomes and transcriptomes of single blastomeres from mouse 8-cell stage embryos.
Project description:High-order three-dimensional (3D) organization of regulatory elements provides a topological basis for gene regulation, but it remains unclear how multiple regulatory elements interact across the mammalian genome in a single cell. To address this, we developed scNanoHi-C, which applies Nanopore sequencing to explore genome-wide proximal high-order chromatin contacts within individual cells. Evaluation of the method suggested that scNanoHi-C reliably and effectively profiles chromosome structure and distinguishes structural subtypes among single cells. It can also be used to detect genomic variations, including CNVs and SVs, and to scaffold de novo assemblies of single-cell genomes. Importantly, our results suggest that high-order structures are widespread in actively transcribed chromatin across the genome, and multiway interactions between enhancers and their target promoters were observed for the first time within single cells. Taken together, scNanoHi-C provides new insights into high-order 3D genome structure at the single-cell level.
Project description:Detailed analyses of the clone-based genome assembly reveal that the recent duplication content of mouse (4.94%) is now comparable to that of human (5.5%), in contrast to previous estimates from the whole-genome shotgun sequence assembly. The architecture of the mouse and human genomes differs dramatically; most mouse duplications are organized into discrete clusters of tandem duplications that are depleted for genes/transcripts and enriched for LINE1 and LTR retroposons. We assessed copy-number variation of the C57BL/6J duplicated regions within 15 mouse strains used for genetic association studies, sequencing, and the mouse phenome project. We determined that over 60% of these basepairs are polymorphic between the strains (on average 20 Mbp of copy-number variable DNA between different mouse strains). Our data suggest that different mouse strains show comparable, if not greater, copy-number polymorphism when compared to human; however, such variation is more locally restricted. We show large and complex patterns of inter-strain copy-number variation restricted to large gene families associated with spermatogenesis, pregnancy, viviparity, pheromone signaling, and immune response. Keywords: comparative genomic hybridization. Genomic DNA of 14 inbred mouse strains was tested against a reference C57BL/6J sample. A C57BL/6J DNA sample from another individual was tested as a negative control.