Project description:Rhododendron mariesii Hemsley et Wilson, 1907, a typical member of the family Ericaceae, possesses valuable medicinal and horticultural properties. In this research, the complete chloroplast (cp) genome of R. mariesii was sequenced and assembled; it proved to have a typical quadripartite structure with a length of 203,480 bp. In particular, the lengths of the large single copy region (LSC), small single copy region (SSC), and inverted repeat regions (IR) were 113,715 bp, 7,953 bp, and 40,918 bp, respectively. Among the 151 unique genes, 98 were protein-coding genes, 8 were tRNA genes, and 45 were rRNA genes. The structural characteristics of the R. mariesii cp genome were similar to those of other angiosperms. Leucine was the most represented amino acid, while cysteine was the least represented. In total, 30 codons showed obvious codon usage bias, and most were A/U-ending codons. Six highly variable regions were observed, such as trnK-pafI and atpE-rpoB, which could serve as potential markers for future barcoding and phylogenetic research on R. mariesii. Coding regions were more conserved than non-coding regions. Expansion and contraction of the IR region might be the main source of length variation in R. mariesii and related Ericaceae species. Maximum-likelihood (ML) phylogenetic analysis revealed that R. mariesii was relatively close to R. simsii Planchon, 1853 and R. pulchrum Sweet, 1831. This research will supply rich genetic resources for R. mariesii and related species of the Ericaceae.
Project description:Background: Myristicaceae is widely distributed from tropical Asia to Oceania, Africa, and tropical America. There are 3 genera and 10 species of Myristicaceae present in China, mainly distributed in the south of Yunnan Province. Most research on this family focuses on fatty acids, medicine, and morphology. Based on morphology, fatty acid chemotaxonomy, and limited molecular data, the phylogenetic position of Horsfieldia pandurifolia Hu has been controversial. Results: In this study, the chloroplast genomes of two Knema species, Knema globularia (Lam.) Warb. and Knema cinerea (Poir.) Warb., were characterized. A comparison of the genome structure of these two species with those of eight other published species, including three Horsfieldia species, four Knema species, and one Myristica species, showed that the chloroplast genomes of these species were relatively conserved, retaining the same gene order. Sequence divergence analysis showed that 11 genes and 18 intergenic spacers were subject to positive selection; these regions can be used to analyze the population genetic structure of this family. Phylogenetic analysis showed that all Knema species clustered in the same group and formed a sister clade with Myristica species, supported by both high maximum likelihood bootstrap values and Bayesian posterior probabilities; among Horsfieldia species, Horsfieldia amygdalina (Wall.) Warb., Horsfieldia kingii (Hook.f.) Warb., Horsfieldia hainanensis Merr. and Horsfieldia tetratepala C.Y.Wu were grouped together, but H. pandurifolia formed a single group that was sister to the clade of the genera Myristica and Knema. On the basis of the phylogenetic analysis, we support de Wilde's view that H. pandurifolia should be separated from Horsfieldia and placed in the genus Endocomia, namely as Endocomia macrocoma subsp. prainii (King) W.J.de Wilde. Conclusion: The findings of this study provide novel genetic resources for future research in Myristicaceae and molecular evidence for the taxonomic classification of Myristicaceae.
Project description:Herb genomics, at the forefront of traditional Chinese medicine research, combines genomics with traditional practices, facilitating the scientific validation of ancient remedies. This integration enhances public understanding of traditional Chinese medicine's efficacy and broadens its scope in modern healthcare. Stachys species encompass annual or perennial herbs or small shrubs, exhibiting simple petiolate or sessile leaves. Despite their wide-ranging applications across various fields, molecular data have been lacking, hindering the precise identification and taxonomic elucidation of Stachys species. To address this gap, we assembled the complete chloroplast (CP) genome of Stachys geobombycis and conducted reannotation and comparative analysis of seven additional species within the Stachys genus. The findings demonstrate that the CP genomes of these species exhibit quadripartite structures, with lengths ranging from 14,523 to 150,599 bp. Overall, the genome structure remains relatively conserved, hosting 131 annotated genes, including 87 protein-coding genes, 36 tRNA genes, and 8 rRNA genes. Additionally, 78 to 98 SSRs and long repeat sequences were detected, and notably, 6 highly variable regions were identified as potential molecular markers in the CP genome through sequence alignment. Phylogenetic analysis based on Bayesian inference and maximum likelihood methods strongly supported the phylogenetic position of the genus Stachys as a member of the tribe Stachydeae. Overall, this comprehensive bioinformatics study of Stachys CP genomes lays the groundwork for phylogenetic classification, plant identification, genetic engineering, evolutionary studies, and breeding research concerning medicinal plants within the Stachys genus.
Project description:Qat (Catha edulis, Celastraceae) is a woody evergreen species with great economic and cultural importance. It is cultivated for its stimulant alkaloids cathine and cathinone in East Africa and southwest Arabia. However, genome information, especially DNA sequence resources, for C. edulis is limited, hindering studies of interspecific and intraspecific relationships. Herein, the complete chloroplast (cp) genome of C. edulis is reported. This genome is 157,960 bp in length with 37% GC content and is structurally arranged into two 26,577 bp inverted repeats and two single-copy regions. The sizes of the small single-copy and large single-copy regions were 18,491 bp and 86,315 bp, respectively. The C. edulis cp genome consists of 129 genes, including 37 transfer RNA (tRNA) genes, 8 ribosomal RNA (rRNA) genes, and 84 protein-coding genes. Of those genes, 112 are single-copy genes and 17 are duplicated in the two inverted repeats, comprising seven tRNAs, four rRNAs, and six protein-coding genes. The phylogenetic relationships resolved from the cp genomes of qat and 32 other species confirm the monophyly of Celastraceae. The cp genomes of C. edulis, Euonymus japonicus and seven Celastraceae species lack the rps16 intron, which indicates that an intron loss took place in an ancestor of this family. The cp genome of C. edulis provides a highly valuable genetic resource for further phylogenomic research, barcoding and cp transformation in Celastraceae.
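The region sizes reported for C. edulis can be cross-checked against the total length, since a quadripartite plastome is made up of one LSC, one SSC, and two copies of the IR. The following is a minimal sketch of that arithmetic; the function names are hypothetical and are not taken from any of the studies listed here.

```python
def quadripartite_length(lsc: int, ssc: int, ir: int) -> int:
    """Total plastome length implied by one LSC, one SSC and two IR copies."""
    return lsc + ssc + 2 * ir


def length_discrepancy(reported_total: int, lsc: int, ssc: int, ir: int) -> int:
    """Reported total minus the total implied by the region sizes (0 if consistent)."""
    return reported_total - quadripartite_length(lsc, ssc, ir)


# Values as reported for Catha edulis in the abstract above.
implied = quadripartite_length(lsc=86_315, ssc=18_491, ir=26_577)
print(implied)  # 157960
print(length_discrepancy(157_960, lsc=86_315, ssc=18_491, ir=26_577))  # 0
```

A nonzero discrepancy does not by itself show which figure is wrong; it only flags a record worth checking against the deposited sequence.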
Project description:Background: The recent determination of complete chloroplast (cp) genomic sequences of various plant species has enabled numerous comparative analyses as well as advances in plant and genome evolutionary studies. In angiosperms, the complete cp genome sequences of about 70 species have been determined, whereas those of only three gymnosperm species, Cycas taitungensis, Pinus thunbergii, and Pinus koraiensis, have been established. The lack of information regarding the gene content and genomic structure of gymnosperm cp genomes may severely hamper further progress in plant and cp genome evolutionary studies. To address this need, we report here the complete nucleotide sequence of the cp genome of Cryptomeria japonica, the first in the Cupressaceae sensu lato of gymnosperms, and provide a comparative analysis of its gene content and genomic structure that illustrates the unique genomic features of gymnosperms. Results: The C. japonica cp genome is 131,810 bp in length, with 112 single-copy genes and two duplicated genes (trnI-CAU, trnQ-UUG), giving a total of 116 genes. Compared to other land plant cp genomes, the C. japonica cp genome has lost one of the large inverted repeats (IRs) found in angiosperms, ferns, liverworts, and gymnosperms such as Cycas and Ginkgo; in addition, it has completely lost trnR-CCG, partially lost trnT-GGU, and shows diversification of accD. The genomic structure of the C. japonica cp genome also differs significantly from those of other plant species. For example, we estimate that a minimum of 15 inversions would be required to transform the gene organization of the Pinus thunbergii cp genome into that of C. japonica. In the C. japonica cp genome, direct repeat and inverted repeat sequences are observed at the inversion and translocation endpoints, and these sequences may be associated with the genomic rearrangements. Conclusion: The observed differences in genomic structure between C. japonica and other land plants, including pines, strongly support the theory that the large IRs stabilize the cp genome. Furthermore, the deleted large IR and the numerous genomic rearrangements that have occurred in the C. japonica cp genome provide new insights into both the evolutionary lineage of coniferous species in gymnosperms and the evolution of the cp genome.
Project description:Background: The plastome of Barleria prionitis, a medicinal and endangered species in the Kingdom of Saudi Arabia, was sequenced. The plastome was compared with those of seven Acanthoideae species in order to describe the plastome, identify microsatellites, assess the dissimilarities among the sampled plastomes and infer their phylogenetic relationships. Results: The plastome of B. prionitis was 152,217 bp in length, with guanine-cytosine and adenine-thymine contents of 38.3% and 61.7%, respectively. It is circular and quadripartite in structure and consists of a large single copy (LSC, 83,772 bp), a small single copy (SSC, 17,803 bp) and a pair of inverted repeats (IRa and IRb, 25,321 bp each). A total of 131 genes were identified in the plastome, of which 113 are unique and 18 are repeated in the IR region. The genome consists of 4 rRNA, 30 tRNA and 80 protein-coding genes. The analysis of long repeats showed that all types of repeats were present in the plastome and that palindromic repeats had the highest frequency. A total of 98 SSRs were also identified, most of which were mononucleotide adenine-thymine repeats located in non-coding regions. Comparative genomic analysis among the plastomes revealed that the pair of inverted repeats is more conserved than the single copy regions. In addition, higher variation is observed in the intergenic spacer regions than in the coding regions. The genes ycf1 and ndhF are located at the border junction of the small single copy region and the IRb region in all the plastomes. The analysis of sequence divergence in the protein-coding genes indicates that the following genes are under positive selection: atpF, petD, psbZ, rpl20, petB, rpl16, rps16, rpoC, rps7, rpl32 and ycf3. Phylogenetic analysis indicated a sister relationship between Ruellieae and Justicieae. In addition, Barleria, Justicia and Ruellia are paraphyletic, suggesting that Justicieae, Ruellieae, Andrographideae and Barlerieae should be treated as tribes. Conclusions: This study sequenced and assembled the first plastome of the taxon Barleria and reported basic resources for evolutionary studies of B. prionitis and tools for studies of phylogenetic relationships within the core Acanthaceae.
Project description:Stryphnodendron adstringens is a medicinal plant belonging to the Leguminosae family, and it is commonly found in the southeastern savannas, endemic to the Cerrado biome. The goal of this study was to assemble and annotate the chloroplast genome of S. adstringens and to compare it with previously known genomes of the mimosoid clade within Leguminosae. The chloroplast genome was reconstructed using de novo and reference-based assembly of paired-end reads generated by shotgun sequencing of total genomic DNA. The size of the S. adstringens chloroplast genome was 162,169 bp. This genome included a large single-copy (LSC) region of 91,045 bp, a small single-copy (SSC) region of 19,014 bp and a pair of inverted repeats (IRa and IRb) of 26,055 bp each. The S. adstringens chloroplast genome contains a total of 111 functional genes, including 77 protein-coding genes, 30 transfer RNA genes, and 4 ribosomal RNA genes. A total of 137 SSRs and 42 repeat structures were identified in the S. adstringens chloroplast genome, with the highest proportion in the LSC region. A comparison of the S. adstringens chloroplast genome with those from other mimosoid species indicated that gene content and synteny are highly conserved in the clade. The phylogenetic reconstruction using 73 conserved protein-coding genes from 19 Leguminosae species supported a paraphyletic grouping. Furthermore, the noncoding and coding regions with high nucleotide diversity may supply valuable markers for molecular evolutionary and phylogenetic studies at different taxonomic levels in this group.
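SSR counts such as the 137 reported for S. adstringens depend on the motif lengths and minimum repeat numbers chosen for the search, which the abstract does not state. As an illustration only, the sketch below scans a sequence for mono-, di-, and trinucleotide repeats with a regular expression; the thresholds in `MIN_REPEATS` and the function name `find_ssrs` are assumptions, not the settings or tool used in the study.

```python
import re

# Assumed minimum repeat counts per motif length (illustrative, not from the study).
MIN_REPEATS = {1: 10, 2: 5, 3: 4}


def find_ssrs(seq: str) -> list[tuple[int, str, int]]:
    """Return (0-based start, motif, repeat count) for each perfect SSR found."""
    seq = seq.upper()
    hits = []
    for unit_len, min_rep in MIN_REPEATS.items():
        # A motif of unit_len bases followed by at least (min_rep - 1) exact copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, min_rep - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            # Skip di/tri motifs that are really homopolymers (e.g. "AA", "AAA");
            # those runs are already reported by the mononucleotide scan.
            if unit_len > 1 and len(set(motif)) == 1:
                continue
            hits.append((match.start(), motif, len(match.group(0)) // unit_len))
    return sorted(hits)


toy = "GC" + "A" * 12 + "GCGC" + "AT" * 6 + "CCG"
print(find_ssrs(toy))  # [(2, 'A', 12), (18, 'AT', 6)]
```

Only perfect repeats are found here; compound or interrupted SSRs, which dedicated tools can report, are outside the scope of this sketch.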
Project description:Epimedium tianmenshanensis is a rare perennial herb distributed in China, and it is also an important medicinal plant. Here, we used Illumina paired-end sequencing technology to obtain the complete chloroplast genome of E. tianmenshanensis and performed a comparative analysis with related species. The complete chloroplast genome of E. tianmenshanensis is 156,956 bp in length and has a relatively conserved quadripartite structure, including a large single copy (LSC) region of 88,409 bp, a small single copy (SSC) region of 17,448 bp, and a pair of inverted repeat (IRa/IRb) regions of 25,550 bp. The whole genome contains 132 unique genes, including 85 protein-coding genes, 38 tRNA genes, eight rRNA genes and one pseudogene. A total of 87 simple sequence repeats (SSRs) were identified, and most of them were found to be composed of A/T. In addition, 22,923 codons were detected in 78 protein-coding genes of E. tianmenshanensis, and the overall codon bias pattern in the genome favored A/U-ending codons. Phylogenetic analysis demonstrated that all the Epimedium species formed a monophyletic clade, and E. tianmenshanensis had the closest relationship to E. dolichostemon. The results of this study provide useful molecular information about the evolution and molecular biology of E. tianmenshanensis.
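Codon bias of the kind described for E. tianmenshanensis is commonly summarized with relative synonymous codon usage (RSCU): the observed count of a codon divided by the mean count across all synonymous codons for the same amino acid. The abstract does not say which measure was used, so the sketch below is an assumption-labelled illustration using the standard genetic code; the function name `rscu` is hypothetical.

```python
from collections import Counter
from itertools import product

# Standard genetic code, codons enumerated in TCAG order ('*' marks stop codons).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def rscu(cds: str) -> dict[str, float]:
    """RSCU per codon for one in-frame coding sequence (stop codons excluded)."""
    cds = cds.upper().replace("U", "T")
    usable = len(cds) - len(cds) % 3  # ignore a trailing partial codon
    counts = Counter(
        cds[i:i + 3] for i in range(0, usable, 3) if cds[i:i + 3] in CODON_TABLE
    )
    # Group codons into synonymous families by encoded amino acid.
    families: dict[str, list[str]] = {}
    for codon, aa in CODON_TABLE.items():
        families.setdefault(aa, []).append(codon)
    values = {}
    for aa, family in families.items():
        total = sum(counts[c] for c in family)
        if aa == "*" or total == 0:
            continue  # skip stops and amino acids absent from the sequence
        for codon in family:
            values[codon] = counts[codon] * len(family) / total
    return values


# Toy input: three TTA and one CTG, all encoding leucine (six synonymous codons).
usage = rscu("TTATTATTACTG")
print(usage["TTA"], usage["CTG"], usage["CTT"])  # 4.5 1.5 0.0
```

An RSCU above 1 means the codon is used more often than expected under equal synonymous usage, which is how a preference for A/U-ending codons would show up.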
Project description:As one of the most cold- and salt-tolerant mangrove species, Kandelia obovata is widely distributed in China. Here, we report the complete chloroplast genome sequence of K. obovata (Rhizophoraceae) obtained via next-generation sequencing, compare the general features of the sampled plastomes of this species to those of other sequenced mangrove species, and perform a phylogenetic analysis based on the protein-coding genes of these plastomes. The complete chloroplast genome of K. obovata is 160,325 bp in size and has a 35.22% GC content. The genome has a typical circular quadripartite structure, with a pair of inverted repeat (IR) regions 26,670 bp in length separating a large single-copy (LSC) region (91,156 bp) and a small single-copy (SSC) region (15,829 bp). The chloroplast genome of K. obovata contains 128 unique genes, including 80 protein-coding genes, 38 tRNA genes, 8 rRNA genes and 2 pseudogenes (ycf1 in the IRA region and rpl22 in the IRB region). In addition, a simple sequence repeat (SSR) analysis found 108 SSR loci in the chloroplast genome of K. obovata, most of which are A/T rich. IR expansion and contraction regions were compared between K. obovata and five related species: two from Malpighiales and three mangrove species from different orders. The mVISTA results indicated that the genome structure, gene order and gene content are highly conserved among the analyzed species. The phylogenetic analysis using 54 common protein-coding genes from the chloroplast genome showed that the plant most closely related to K. obovata is Ceriops tagal of Rhizophoraceae. The results of this study provide useful molecular information about the evolution and molecular biology of these mangrove trees.
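The GC content quoted for K. obovata is the percentage of G and C bases among the bases of the assembled sequence. A minimal sketch of that calculation follows; the function name `gc_content` is hypothetical, and ignoring ambiguous bases such as N is an assumption made here rather than something stated in the abstract.

```python
def gc_content(seq: str) -> float:
    """Percentage of G and C among the unambiguous bases (A, C, G, T) of seq."""
    bases = [b for b in seq.upper() if b in "ACGT"]  # ambiguous bases are ignored
    if not bases:
        raise ValueError("sequence contains no unambiguous bases")
    gc = sum(1 for b in bases if b in "GC")
    return 100.0 * gc / len(bases)


print(round(gc_content("ATGCGCNN"), 2))  # 66.67
```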
Project description:Pinaceae, the largest family of conifers, has a diversified organization of chloroplast (cp) genomes with two typical highly reduced inverted repeats (IRs). In the current study, we determined the complete sequence of the cp genome of an economically and ecologically important conifer tree, the loblolly pine (Pinus taeda L.), using Illumina paired-end sequencing and compared the sequence with those of other pine species. The results revealed a genome size of 121,531 base pairs (bp) containing a pair of 830-bp IR regions, separated by a small single copy (42,258 bp) and a large single copy (77,614 bp) region. The chloroplast genome of P. taeda encodes 120 genes, comprising 81 protein-coding genes, four ribosomal RNA genes, and 35 tRNA genes, with 151 randomly distributed microsatellites. Approximately 6 palindromic, 34 forward, and 22 tandem repeats were found in the P. taeda cp genome. Whole cp genome comparison with those of other Pinus species exhibited an overall high degree of sequence similarity, with some divergence in intergenic spacers. Higher and lower numbers of indels and single-nucleotide polymorphism substitutions were observed relative to P. contorta and P. monophylla, respectively. Phylogenomic analyses based on the complete genome sequence revealed that 60 shared genes generated trees with the same topologies, and P. taeda was closely related to P. contorta in the subgenus Pinus. Thus, the complete P. taeda genome provides valuable resources for population and evolutionary studies of gymnosperms and can be used to identify related species.