Project description:BACKGROUND: The western mosquitofish (Gambusia affinis) is a sexually dimorphic poeciliid fish known for its worldwide biological invasion and is therefore an important research model for studying invasion biology. This organism may also be used as a suitable model to explore sex chromosome evolution and reproductive development in terms of differentiation of ZW sex chromosomes, ovoviviparity, and specialization of reproductive organs. However, there is a lack of high-quality genomic data for the female G. affinis; hence, this study aimed to generate a chromosome-level genome assembly for it. RESULTS: The chromosome-level genome assembly was constructed using Oxford Nanopore sequencing, BioNano, and Hi-C technology. G. affinis genomic DNA sequences containing 217 contigs with an N50 length of 12.9 Mb and 125 scaffolds with an N50 length of 26.5 Mb were obtained by Oxford Nanopore and BioNano, respectively, and 113 scaffolds (90.4% of scaffolds containing 97.9% nucleotide bases) were assembled into 24 chromosomes (pseudo-chromosomes) by Hi-C. The Z and W chromosomes of G. affinis were identified by comparative genomic analysis of female and male G. affinis, and the mechanism of differentiation of the Z and W chromosomes was explored. Combined with transcriptome data from 6 tissues, a total of 23,997 protein-coding genes were predicted and 23,737 (98.9%) genes were functionally annotated. CONCLUSIONS: The high-quality female G. affinis reference genome provides a valuable omics resource for future studies of comparative genomics and functional genomics to explore the evolution of Z and W chromosomes and the reproductive developmental biology of G. affinis.
Project description:The teleost fish Monopterus albus is emerging as a new model for biological studies due to its natural sex transition and small genome, in addition to its enormous economic and potential medical value. However, no genomic information for the Monopterus is currently available. Here, we sequenced and de novo assembled the genome of M. albus and report the de novo chromosome assembly by FISH walking assisted by conserved synteny (Cafs). Using Cafs, 328 scaffolds were assembled into 12 chromosomes, which covered genomic sequences of 555 Mb, accounting for 81.3% of the sequences assembled in scaffolds (∼689 Mb). A total of 18,660 genes were mapped on the chromosomes and showed a nonrandom distribution along chromosomes. We report the first reference genome of the Monopterus and provide an efficient Cafs strategy for a de novo chromosome-level assembly of the Monopterus genome, which provides a valuable resource, not only for further studies in genetics, evolution, and development, particularly sex determination, but also for breed improvement of the species.
Project description:The axolotl (Ambystoma mexicanum) provides critical models for studying regeneration, evolution, and development. However, its large genome (∼32 Gb) presents a formidable barrier to genetic analyses. Recent efforts have yielded genome assemblies consisting of thousands of unordered scaffolds that resolve gene structures, but do not yet permit large-scale analyses of genome structure and function. We adapted an established mapping approach to leverage dense SNP typing information and for the first time assemble the axolotl genome into 14 chromosomes. Moreover, we used fluorescence in situ hybridization to verify the structure of these 14 scaffolds and assign each to its corresponding physical chromosome. This new assembly covers 27.3 Gb and encompasses 94% of annotated gene models on chromosomal scaffolds. We show the assembly's utility by resolving genome-wide orthologies between the axolotl and other vertebrates, identifying the footprints of historical introgression events that occurred during the development of axolotl genetic stocks, and precisely mapping several phenotypes including a large deletion underlying the cardiac mutant. This chromosome-scale assembly will greatly facilitate studies of the axolotl in biological research.
Project description:Cucumis hystrix Chakr. (2n = 2x = 24) is a wild species that can hybridize with cultivated cucumber (C. sativus L., 2n = 2x = 14), a globally important vegetable crop. However, cucumber breeding is hindered by its narrow genetic base. Therefore, introgression from C. hystrix has been anticipated to bring a breakthrough in cucumber improvement. Here, we report the chromosome-scale assembly of C. hystrix genome (289 Mb). Scaffold N50 reached 14.1 Mb. Over 90% of the sequences were anchored onto 12 chromosomes. A total of 23,864 genes were annotated using a hybrid method. Further, we conducted a comprehensive comparative genomic analysis of cucumber, C. hystrix, and melon (C. melo L., 2n = 2x = 24). Whole-genome comparisons revealed that C. hystrix is phylogenetically closer to cucumber than to melon, providing a molecular basis for the success of its hybridization with cucumber. Moreover, expanded gene families of C. hystrix were significantly enriched in "defense response," and C. hystrix harbored 104 nucleotide-binding site-encoding disease resistance gene analogs. Furthermore, 121 genes were positively selected, and 12 (9.9%) of these were involved in responses to biotic stimuli, which might explain the high disease resistance of C. hystrix. The alignment of whole C. hystrix genome with cucumber genome and self-alignment revealed 45,417 chromosome-specific sequences evenly distributed on C. hystrix chromosomes. Finally, we developed four cucumber-C. hystrix alien addition lines and identified the exact introgressed chromosome using molecular and cytological methods. The assembled C. hystrix genome can serve as a valuable resource for studies on Cucumis evolution and interspecific introgression breeding of cucumber.
Project description:BACKGROUND: Plants exhibit wide chemical diversity due to the production of specialized metabolites that function as pollinator attractants, defensive compounds, and signaling molecules. Lamiaceae (mints) are known for their chemodiversity and have been cultivated for use as culinary herbs, as well as sources of insect repellents, health-promoting compounds, and fragrance. FINDINGS: We report the chromosome-scale genome assembly of Callicarpa americana L. (American beautyberry), a species within the early-diverging Callicarpoideae clade of Lamiaceae, known for its metallic purple fruits and use as an insect repellent due to its production of terpenoids. Using long-read sequencing and Hi-C scaffolding, we generated a 506.1-Mb assembly spanning 17 pseudomolecules with N50 contig and N50 scaffold sizes of 7.5 and 29.0 Mb, respectively. In all, 32,164 genes were annotated, including 53 candidate terpene synthases and 47 putative clusters of specialized metabolite biosynthetic pathways. Our analyses revealed 3 putative whole-genome duplication events, which, together with local tandem duplications, contributed to gene family expansion of terpene synthases. Kolavenyl diphosphate is a gateway to many of the bioactive terpenoids in C. americana; experimental validation confirmed that CamTPS2 encodes kolavenyl diphosphate synthase. Syntenic analyses with Tectona grandis L. f. (teak), a member of the Tectonoideae clade of Lamiaceae known for exceptionally strong wood resistant to insects, revealed 963 collinear blocks and 21,297 C. americana syntelogs. CONCLUSIONS: Access to the C. americana genome provides a road map for rapid discovery of genes encoding plant-derived agrichemicals and a key resource for understanding the evolution of chemical diversity in Lamiaceae.
Project description:The mangrove Kandelia obovata (Rhizophoraceae) is an important coastal shelterbelt and landscape tree distributed in tropical and subtropical areas across East Asia and Southeast Asia. Herein, a chromosome-level reference genome of K. obovata based on PacBio, Illumina, and Hi-C data is reported. The high-quality assembled genome size is 177.99 Mb, with a contig N50 value of 5.74 Mb. A large number of contracted gene families and a small number of expanded gene families, as well as a small number of repeated sequences, may account for the small K. obovata genome. We found that K. obovata experienced two whole-genome polyploidization events: one whole-genome duplication shared with other Rhizophoreae and one shared with most eudicots (γ event). We confidently annotated 19,138 protein-coding genes in K. obovata and identified the MADS-box gene class and the RPW8 gene class, which might be related to flowering and resistance to powdery mildew in K. obovata and Rhizophora apiculata, respectively. The reference K. obovata genome described here will be very useful for further molecular elucidation of various traits, the breeding of this coastal shelterbelt species, and evolutionary studies with related taxa.
Project description:Background: The availability of chromosome-scale genome assemblies is fundamentally important to advance genetics and breeding in crops, as well as for evolutionary and comparative genomics. The improvement of long-read sequencing technologies and the advent of optical mapping and chromosome conformation capture technologies in the last few years significantly promoted the development of chromosome-scale genome assemblies of model plants and crop species. In grasses, chromosome-scale genome assemblies recently became available for cultivated and wild species of the Triticeae subfamily. Development of state-of-the-art genomic resources in species of the Poeae subfamily, which includes important crops like fescues and ryegrasses, is lagging behind the progress in the cereal species. Results: Here, we report a new chromosome-scale genome sequence assembly for perennial ryegrass, obtained by combining PacBio long-read sequencing, Illumina short-read polishing, BioNano optical mapping and Hi-C scaffolding. More than 90% of the total genome size of perennial ryegrass (approximately 2.55 Gb) is covered by seven pseudo-chromosomes that show high levels of collinearity to the orthologous chromosomes of Triticeae species. The transposon fraction of perennial ryegrass was found to be relatively low, approximately 35% of the total genome content, which is less than half of the genome repeat content of cultivated cereal species. We predicted 54,629 high-confidence gene models, 10,287 long non-coding RNAs and a total of 8,393 short non-coding RNAs in the perennial ryegrass genome. Conclusions: The new reference genome sequence and annotation presented here are valuable resources for comparative genomic studies in grasses, as well as for breeding applications, and will expedite the development of productive varieties in perennial ryegrass and related species.
Project description:Background: Accurate and complete reference genome assemblies are fundamental for biological research. Cucumber is an important vegetable crop and model system for sex determination and vascular biology. Low-coverage Sanger sequences and high-coverage short Illumina sequences have been used to assemble draft cucumber genomes, but the incompleteness and low quality of these genomes limit their use in comparative genomics and genetic research. A high-quality and complete cucumber genome assembly is therefore essential. Findings: We assembled single-molecule real-time (SMRT) long reads to generate an improved cucumber reference genome. This version contains 174 contigs with a total length of 226.2 Mb and an N50 of 8.9 Mb, and provides 29.0 Mb more sequence data than previous versions. Using 10X Genomics and high-throughput chromosome conformation capture (Hi-C) data, 89 contigs (∼211.0 Mb) were directly linked into 7 pseudo-chromosome sequences. The newly assembled regions show much higher guanine-cytosine or adenine-thymine content than found previously and are likely to have been inaccessible to Illumina sequencing. The new assembly contains 1,374 full-length long terminal retrotransposons and 1,078 novel genes including 239 tandemly duplicated genes. For example, we found 4 tandemly duplicated tyrosylprotein sulfotransferases, in contrast to the single copy of the gene found previously and in most other plants. Conclusion: This high-quality genome presents novel features of the cucumber genome and will serve as a valuable resource for genetic research in cucumber and plant comparative genomics.
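Several of the descriptions above report contig or scaffold N50 values. As a reminder of what that statistic measures, here is a minimal Python sketch of the usual definition: the length of the shortest sequence in the smallest set of longest sequences that together cover at least half of the total assembly length. The function name `n50` and the toy lengths are illustrative assumptions and are not taken from any of the projects listed.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    Hypothetical helper for illustration: sorts lengths from longest to
    shortest and returns the length at which the running sum first
    reaches half of the total.
    """
    if not lengths:
        raise ValueError("need at least one sequence length")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Toy contig lengths (arbitrary units), not real assembly data.
toy_contigs = [2, 2, 2, 3, 3, 4, 8, 8]
print(n50(toy_contigs))  # prints 8
```

Because N50 depends only on the length distribution, two assemblies with the same N50 can still differ in total size and completeness, which is why the descriptions also report total assembly length and the fraction anchored to chromosomes.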