Project description:Microsporidia are intracellular parasitic fungi whose genomes rank among the smallest of all known eukaryotes. A number of outstanding questions remain concerning the evolution of their large-scale variation in genome architecture, responsible for genome size variation of more than an order of magnitude. This genome report presents the first near-chromosomal assembly of a large-genome microsporidium, Hamiltosporidium tvaerminnensis. Combined Oxford Nanopore, Pacific Biosciences (PacBio), and Illumina sequencing led to a genome assembly of 17 contigs, 11 of which represent complete chromosomes. Our assembly is 21.64 Mb in length, has an N50 of 1.44 Mb, and consists of 39.56% interspersed repeats. We introduce a novel approach in microsporidia, PacBio Iso-Seq, as part of a larger annotation pipeline for obtaining high-quality annotations of 3,573 protein-coding genes. Based on direct evidence from the full-length Iso-Seq transcripts, we present evidence for alternative polyadenylation and variation in splicing efficiency, which are potential regulation mechanisms for gene expression in microsporidia. The generated high-quality genome assembly is a necessary resource for comparative genomics that will help elucidate the evolution of genome architecture in response to intracellular parasitism.
Project description:Malaria remains a major healthcare risk to growing economies like India, and a chromosome-level reference genome of Anopheles stephensi is critical for successful vector management and understanding of vector evolution using comparative genomics. We report chromosome-level assemblies of an Indian strain, STE2, and a Pakistani strain, SDA-500, by combining draft genomes of the two strains using a homology-based iterative approach. The resulting assembly IndV3/PakV3 with an L50 of 9/12 and an N50 of 6.3/6.9 Mb had scaffolds long enough for building 90% of the euchromatic regions of the three chromosomes, IndV3s/PakV3s, using low-resolution physical markers and enabled the generation of the next version of genome assemblies, IndV4/PakV4, using HiC data. We have validated these assemblies using contact maps against publicly available HiC raw data from two strains including STE2 and another lab strain of An. stephensi from UCI and compared the quality of the assemblies with other assemblies made available as preprints since the submission of the manuscript. We show that the IndV3s and IndV4 assemblies are sensitive in identifying a homozygous 2Rb inversion in the UCI strain and a 2Rb polymorphism in the STE2 strain. Multiple tandem copies of CYP6a14, 4c1, and 4c21 genes, implicated in insecticide resistance, lie within this inversion locus. Comparison of assembled genomes suggests a variation of 1 in 81 positions between the UCI and STE2 lab strains, 1 in 82 between SDA-500 and UCI strain, and 1 in 113 between SDA-500 and STE2 strains of An. stephensi, which are closer than 1 in 68 variations among individuals from two other lab strains sequenced and reported here. Based on the developmental transcriptome and orthology of all the 54 olfactory receptors (ORs) to those of other Anopheles species, we identify an OR with the potential for host recognition in the genus Anopheles. A comparative analysis of An. stephensi genomes with the completed genomes of a few other Anopheles species suggests limited inter-chromosomal gene flow and loss of synteny within chromosomal arms even among the closely related species.
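The N50 and L50 values quoted in these assembly descriptions follow the usual definitions: with sequences sorted from longest to shortest, N50 is the length of the sequence at which the running total first reaches half of the assembly size, and L50 is the number of sequences needed to get there. The sketch below is only an illustration of that calculation in Python; the function name `n50_l50` and the toy lengths are hypothetical and are not taken from any of the assemblies described here.

```python
def n50_l50(lengths):
    """Return (N50, L50) for a list of contig or scaffold lengths."""
    if not lengths:
        raise ValueError("need at least one sequence length")
    # Walk the sequences from longest to shortest.
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for count, length in enumerate(ordered, start=1):
        running += length
        # The first sequence that brings the running total to at least
        # half of the assembly size defines N50; its rank is L50.
        if running >= half_total:
            return length, count


# Toy lengths (hypothetical, arbitrary units); total is 300, half is 150.
toy_lengths = [80, 70, 50, 40, 30, 20, 10]
n50, l50 = n50_l50(toy_lengths)
print(n50, l50)
```

In the toy example the two longest sequences (80 + 70) already cover half of the total, so N50 is 70 and L50 is 2. A higher N50 together with a lower L50 indicates a more contiguous assembly.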
Project description:The Tasmanian devil, a marsupial carnivore, is endangered due to the emergence of a clonally transmissible cancer known as Devil Facial Tumor Disease (DFTD). This fatal cancer is clonally derived and is an allograft transmitted between devils by biting. We performed a large-scale genetic analysis of DFTD with microsatellite genotyping, mitochondrial genome analysis, as well as deep sequencing of the DFTD transcriptome and miRNAs. These studies confirm that DFTD is a monophyletic clonally transmissible tumor, and suggest that the disease is of Schwann cell origin. On the basis of these results we have generated a diagnostic marker for DFTD and identified a suite of genes of relevance to DFTD pathology and transmission. We provide a genomic dataset for the Tasmanian devil, which is applicable to cancer diagnosis, disease evolution and conservation biology. This submission contains only small RNA sequence data from this study: small RNA (18-24 nt) sequences from 15 Tasmanian devil (Sarcophilus harrisii) tissue samples.
Project description:The Tasmanian devil, a marsupial carnivore, is endangered due to the emergence of a clonally transmissible cancer known as Devil Facial Tumor Disease (DFTD). This fatal cancer is clonally derived and is an allograft transmitted between devils by biting. We performed a large-scale genetic analysis of DFTD with microsatellite genotyping, mitochondrial genome analysis, as well as deep sequencing of the DFTD transcriptome and miRNAs. These studies confirm that DFTD is a monophyletic clonally transmissible tumor, and suggest that the disease is of Schwann cell origin. On the basis of these results we have generated a diagnostic marker for DFTD and identified a suite of genes of relevance to DFTD pathology and transmission. We provide a genomic dataset for the Tasmanian devil, which is applicable to cancer diagnosis, disease evolution and conservation biology. This submission contains only small RNA sequence data from this study.
Project description:Over the past decade, the spotted wing Drosophila, Drosophila suzukii, has invaded Europe and America and has become a major agricultural pest in these areas, thereby prompting intense research activities to better understand its biology. Two draft genome assemblies already exist for this species but contain pervasive assembly errors and are highly fragmented, which limits their value. Our purpose here was to improve the assembly of the D. suzukii genome and to annotate it in a way that facilitates comparisons with D. melanogaster. For this, we generated PacBio long-read sequencing data and assembled a novel, high-quality D. suzukii genome assembly. It is one of the largest Drosophila genomes, notably because of the expansion of its repeatome. We found that despite 16 rounds of full-sib crossings, the D. suzukii strain that we sequenced has maintained high levels of polymorphism in some regions of its genome. As a consequence, the quality of the assembly of these regions was reduced. We explored possible origins of this high residual diversity, including the presence of structural variants and a possible heterogeneous admixture pattern of North American and Asian ancestry. Overall, our assembly and annotation constitute a high-quality genomic resource that can be used for both high-throughput sequencing approaches and manipulative genetic technologies to study D. suzukii.
Project description:Sablefish (Anoplopoma fimbria) are in the suborder Cottioidei, which also includes stickleback and lumpfish. This species inhabits coastal regions of the northeastern and northwestern Pacific Ocean from California to Japan. A commercial fishery for sablefish began to flourish in the 1960s, though a downward trend in stock biomass and landings has been observed since 2010. Aquaculture protocols have been developed for sablefish; eggs and sperm from wild-caught and hatchery-reared captive broodstock are used to generate offspring that reach market size in about two years. Parentage analyses show that survival in aquaculture varies among families. Growth rate and disease resistance also vary among individuals and cohorts, but the extent to which genetics and the environment contribute to this variation is unclear. The sablefish genome assembly reported here will form the foundation for SNP-based surveys designed to detect genetic markers associated with survival, growth rate, and pathogen resistance. Beyond its contribution to sablefish domestication, the sablefish genome can be a resource for the management of the wild sablefish fishery. The assembly generated in this study had a length of 653 Mbp, a scaffold N50 of 26.74 Mbp, a contig N50 of 2.57 Mbp, and contained more than 98% of the 3640 Actinopterygii core genes. We placed 620.9 Mbp (95% of the total) onto 24 chromosomes using a genetic map derived from six full-sib families and Hi-C contact data.