Project description:The naked mole-rat (NMR; Heterocephalus glaber) has recently gained considerable attention in the scientific community for its unique potential to unveil novel insights in the fields of medicine, biochemistry, and evolution. NMRs exhibit unique adaptations that include protracted fertility, cancer resistance, eusociality, and anoxia. This suite of adaptations is not found in other rodent species, suggesting that interrogating conserved and accelerated regions in the NMR genome will find regions of the NMR genome fundamental to their unique adaptations. However, the current NMR genome assembly has limits that make studying structural variations, heterozygosity, and non-coding adaptations challenging. We present a complete diploid naked-mole rat genome assembly by integrating long-read and 10X-linked read genome sequencing of a male NMR and its parents, and Hi-C sequencing in the NMR hypothalamus (N=2). Reads were identified as maternal, paternal or ambiguous (TrioCanu). We then polished genomes with Flye, Racon and Medaka. Assemblies were then scaffolded using the following tools in order: Scaff10X, Salsa2, 3d-DNA, Minimap2-alignment between assemblies, and the Juicebox Assembly Tools. We then subjected the assemblies to another round of polishing, including short-read polishing with Freebayes. We assembled the NMR mitochondrial genome with mitoVGP. Y chromosome contigs were identified by aligning male and female 10X linked reads to the paternal genome and finding male-biased contigs not present in the maternal genome. Contigs were assembled with publicly available male NMR Fibroblast Hi-C-seq data (SRR820318). Both assemblies have their sex chromosome haplotypes merged so that both assemblies have a high-quality X and Y chromosome. Finally, assemblies were evaluated with Quast, BUSCO, and Merqury, which all reported the base-pair quality and contiguity of both assemblies as high-quality. The assembly will next be annotated by Ensembl using public RNA-seq data from multiple tissues (SRP061363). Together, this assembly will provide a high-quality resource to the NMR and comparative genomics communities.
Project description:The naked mole-rat (NMR; Heterocephalus glaber) has recently gained considerable attention in the scientific community for its unique potential to unveil novel insights in the fields of medicine, biochemistry, and evolution. NMRs exhibit unique adaptations that include protracted fertility, cancer resistance, eusociality, and anoxia. This suite of adaptations is not found in other rodent species, suggesting that interrogating conserved and accelerated regions in the NMR genome will find regions of the NMR genome fundamental to their unique adaptations. However, the current NMR genome assembly has limits that make studying structural variations, heterozygosity, and non-coding adaptations challenging. We present a complete diploid naked-mole rat genome assembly by integrating long-read and 10X-linked read genome sequencing of a male NMR and its parents, and Hi-C sequencing in the NMR hypothalamus (N=2). Reads were identified as maternal, paternal or ambiguous (TrioCanu). We then polished genomes with Flye, Racon and Medaka. Assemblies were then scaffolded using the following tools in order: Scaff10X, Salsa2, 3d-DNA, Minimap2-alignment between assemblies, and the Juicebox Assembly Tools. We then subjected the assemblies to another round of polishing, including short-read polishing with Freebayes. We assembled the NMR mitochondrial genome with mitoVGP. Y chromosome contigs were identified by aligning male and female 10X linked reads to the paternal genome and finding male-biased contigs not present in the maternal genome. Contigs were assembled with publicly available male NMR Fibroblast Hi-C-seq data (SRR820318). Both assemblies have their sex chromosome haplotypes merged so that both assemblies have a high-quality X and Y chromosome. Finally, assemblies were evaluated with Quast, BUSCO, and Merqury, which all reported the base-pair quality and contiguity of both assemblies as high-quality. The assembly will next be annotated by Ensembl using public RNA-seq data from multiple tissues (SRP061363). Together, this assembly will provide a high-quality resource to the NMR and comparative genomics communities.
Project description:We present a draft genome assembly that includes 200 Gb of Illumina reads, 4 Gb of Moleculo synthetic long-reads and 108 Gb of Chicago libraries, with a final size matching the estimated genome size of 2.7 Gb, and a scaffold N50 of 4.8 Mb. We also present an alternative assembly including 27 Gb raw reads generated using the Pacific Biosciences platform. In addition, we sequenced the proteome of the same individual and RNA from three different tissue types from three other species of squid species (Onychoteuthis banksii, Dosidicus gigas, and Sthenoteuthis oualaniensis) to assist genome annotation. We annotated 33,406 protein coding genes supported by evidence and the genome completeness estimated by BUSCO reached 92%. Repetitive regions cover 49.17% of the genome.
Project description:Rhizoctonia solani Kühn is a soilborne basidiomycetous fungus that causes significant damage to many economically important crops. R. solani isolates are classified into 13 Anastomosis Groups (AGs) with interspecific subgroups having distinctive morphology, pathogenicity and wide host range. However, the genetic factors that drive the unique fungal pathology are still not well characterized due to the limited number of available annotated genomes. Therefore, we performed genome sequencing, assembly, annotation and functional analysis of 13 R. solani isolates covering 7 AGs and selected subgroups (AG1-IA, AG1-IB, AG1-IC, AG2-2IIIB, AG3-PT, AG3-TB, AG4-HG-I, AG5, AG6, and AG8). Here, we report a pangenome comparative analysis of 13 R. solani isolates covering important groups to elucidate unique and common attributes associated with each isolate, including molecular factors potentially involved in determining AG-specific host preference. Finally, we present the largest repertoire of annotated R. solani genomes, compiled as a comprehensive and user-friendly database, viz. RsolaniDB. Since 7 genomes are reported for the first time, the database stands as a valuable platform for formulating new hypotheses by hosting annotated genomes, with tools for functional enrichment, orthologs and sequence analysis, currently not available with other accessible state-of-the-art platforms hosting Rhizoctonia genome sequences.
Project description:We first report the use of next-generation massively parallel sequencing technologies and de novo transcriptome assembly to gain insight into the wide range of transcriptome of Hevea brasiliensis. The output of sequenced data showed that more than 12 million sequence reads with average length of 90nt were generated. Totally 48,768 unigenes (mean size = 488 bp) were assembled through transcriptome de novo assembly, which represent more than 3-fold of all the sequences of Hevea brasiliensis deposited in the GenBank. Assembled sequences were annotated with gene descriptions, gene ontology and clusters of orthologous group terms. Total 37,373 unigenes were successfully annotated and more than 10% of unigenes were aligned to known proteins of Euphorbiaceae. The unigenes contain nearly complete collection of known rubber-synthesis-related genes. Our data provides the most comprehensive sequence resource available for study rubber tree and demonstrates the availability of Illumina sequencing and de novo transcriptome assembly in a species lacking genome information. The transcriptome of latex and leaf in Hevea brasiliensis
Project description:We have sequenced a wild Prunus mume and constructed a reference sequence for this genome. In order to improve quality of gene models, RNA samples of five tissues (bud, leaf, root, stem, fruit) were extracted from the Prunus mume. To investigate tissue specific expression using the reference genome assembly and annotated genes, we extracted RNA samples of different tissues and conducted transcriptome sequencing and DEG analysis.