Project description:Gerbera delavayi Franch., endemic to southwest China, is a rare fiber plant. In this study, the leaf transcriptome of G. delavayi was sequenced on the Illumina HiSeq 2500 platform. Assembly yielded 108694 unigenes, with an N50 of 912 bp and a mean length of 593.90 bp. By comparison against the Nr and Swiss-Prot databases, 40915 unigenes were annotated and 67779 remained unannotated. In addition, 30 unigenes showed homology with the Ces family, 20 with the Cls family, and 11 with SuSy. A total of 11369 unigenes were assigned to 25 categories in the COG database, and 21378 unigenes were assigned to 52 GO terms. Functional annotation against the KEGG database mapped 8087 unigenes to 118 pathways, including 47 unigenes in the “phenylpropanoid biosynthesis” pathway. Furthermore, 4908 unigenes contained 5179 SSRs, an average of 1 SSR every 12.46 kb. Mono-nucleotide repeats were the most abundant SSR type (54.37%), followed by di-nucleotide and tri-nucleotide repeats (22.90% and 21.70%, respectively). These results enrich the genetic information available for G. delavayi and provide basic data for genetic breeding and exploitation of this unique plant resource.
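The summary counts above can be cross-checked with a few lines of arithmetic. The following is a minimal sketch that uses only the figures quoted in the description; the variable names are illustrative, and the assumption that the SSR density was computed over the total unigene length is ours, not stated in the record. Under that assumption the implied mean unigene length is about 594 bp, which agrees with the 593.90 bp figure quoted in the description.

```python
# Figures quoted in the G. delavayi project description.
total_unigenes = 108_694
annotated = 40_915
unannotated = 67_779
unigenes_with_ssr = 4_908
ssr_count = 5_179
kb_per_ssr = 12.46  # "1 SSR every 12.46 kb"

# Annotated and unannotated unigenes should partition the assembly.
assert annotated + unannotated == total_unigenes

# Assumption: SSR density = total unigene length / number of SSRs.
implied_total_bp = ssr_count * kb_per_ssr * 1_000
implied_mean_bp = implied_total_bp / total_unigenes

# Derived proportions (not stated in the description itself).
annotated_pct = 100 * annotated / total_unigenes
ssr_unigene_pct = 100 * unigenes_with_ssr / total_unigenes
ssrs_per_ssr_unigene = ssr_count / unigenes_with_ssr

print(f"implied total length: {implied_total_bp / 1e6:.2f} Mb")
print(f"implied mean unigene length: {implied_mean_bp:.1f} bp")
print(f"annotated unigenes: {annotated_pct:.2f}%")
print(f"unigenes containing SSRs: {ssr_unigene_pct:.2f}%")
print(f"SSRs per SSR-containing unigene: {ssrs_per_ssr_unigene:.3f}")
```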
Project description:Rhododendron delavayi Franch. is globally famous as an ornamental plant. Its distribution in southwest China covers several different habitats and environments. However, little research has been conducted on Rhododendron spp. at the molecular level, which hinders understanding of their evolution, speciation, and synthesis of secondary metabolites, as well as their wide adaptability to different environments. Here, we report the genome assembly and gene annotation of R. delavayi var. delavayi (the second genome sequenced in the Ericaceae), which will facilitate the study of the family. The genome assembly will have further applications in genome-assisted cultivar breeding. The final size of the assembled R. delavayi var. delavayi genome (695.09 Mb) was close to the 697.94 Mb estimated by k-mer analysis. A total of 336.83 gigabases (Gb) of raw Illumina HiSeq 2000 reads were generated from 9 libraries (with insert sizes ranging from 170 bp to 40 kb), achieving a raw sequencing depth of 482.6×. After quality filtering, 246.06 Gb of clean reads were obtained, giving a coverage depth of 352.55×. Assembly using Platanus gave a total scaffold length of 695.09 Mb, with a contig N50 of 61.8 kb and a scaffold N50 of 637.83 kb. Gene prediction resulted in the annotation of 32 938 protein-coding genes. Genome completeness, evaluated by CEGMA and BUSCO, reached 95.97% and 92.8%, respectively. Gene annotation completeness, also evaluated by CEGMA and BUSCO, reached 97.01% and 87.4%, respectively. Genome annotation revealed that 51.77% of the R. delavayi genome is composed of transposable elements, and 37.48% is composed of long terminal repeat elements (LTRs). The de novo assembled genome of R. delavayi var. delavayi (hereinafter referred to as R. delavayi) is the second genomic resource of the family Ericaceae and will provide a valuable resource for future comparative genomic studies in Rhododendron species. The availability of the R. delavayi genome sequence will hopefully provide a tool for scientists to tackle open questions regarding the molecular mechanisms underlying environmental interactions in the genus Rhododendron, to understand the evolutionary processes and systematics of the genus more accurately, to facilitate the identification of genes encoding pharmaceutically important compounds, and to accelerate molecular breeding to release elite varieties.
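The depth figures in the R. delavayi description follow from dividing sequenced bases by genome size. The following minimal sketch reproduces them from the quoted numbers; it assumes the depths were computed against the k-mer estimate (697.94 Mb) rather than the assembled length, which is our inference from the arithmetic and not stated explicitly in the record. The function and variable names are illustrative.

```python
# Figures quoted in the R. delavayi project description.
kmer_genome_mb = 697.94   # genome size estimated by k-mer analysis
assembly_mb = 695.09      # total scaffold length from Platanus
raw_gb = 336.83           # raw Illumina reads
clean_gb = 246.06         # reads remaining after quality filtering


def depth(bases_gb: float, genome_mb: float) -> float:
    """Sequencing depth = total sequenced bases / genome size."""
    return bases_gb * 1_000 / genome_mb


# Assumption: depth is measured against the k-mer genome size estimate.
raw_depth = depth(raw_gb, kmer_genome_mb)
clean_depth = depth(clean_gb, kmer_genome_mb)

# Share of the estimated genome represented in the assembly.
assembled_pct = 100 * assembly_mb / kmer_genome_mb

print(f"raw depth: {raw_depth:.1f}x")
print(f"clean depth: {clean_depth:.2f}x")
print(f"assembled fraction of k-mer estimate: {assembled_pct:.2f}%")
```

Using the assembled length (695.09 Mb) as the denominator instead would give slightly higher depths, so the quoted values are consistent with the k-mer estimate being used.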