Project description:Celtis julianae C.K. Schneid. is a large deciduous tree of Ulmaceae. In this study, the chloroplast genome of C. julianae was found to be 159,064 bp in length, consisting of a large single-copy (LSC) region of 86,139 bp, a small single-copy (SSC) region of 19,137 bp, and two inverted repeat regions (IRs) of 26,894 bp each. The GC content of the chloroplast genome was 36.3%. The genome contained 127 genes, including 86 protein-coding genes, 37 tRNA genes, and four rRNA genes. The phylogenetic tree showed that C. julianae clustered with C. tetrandra.
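The reported figures can be cross-checked with simple arithmetic. The sketch below is only an illustration: it assumes the stated IR length applies to each of the two copies, and the function and variable names are hypothetical rather than taken from the study.

```python
def quadripartite_total(lsc: int, ssc: int, ir: int) -> int:
    """Total plastome length for a quadripartite genome: LSC + SSC + two IR copies."""
    return lsc + ssc + 2 * ir


# Region lengths (bp) as reported for C. julianae.
LSC_BP = 86_139
SSC_BP = 19_137
IR_BP = 26_894  # assumption: this is the length of EACH inverted repeat copy

# Gene counts as reported for C. julianae.
GENE_COUNTS = {"protein_coding": 86, "tRNA": 37, "rRNA": 4}

total_bp = quadripartite_total(LSC_BP, SSC_BP, IR_BP)
total_genes = sum(GENE_COUNTS.values())

print(total_bp)     # 159064, matching the reported genome length
print(total_genes)  # 127, matching the reported gene total
```

Both totals agree with the values stated in the description, which supports reading the IR length as per copy.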
Project description:With genomes of up to 2.7 Mb propagated in μm-long oblong particles and initially predicted to encode more than 2000 proteins, members of the Pandoraviridae family display the most extreme features of the known viral world. The mere existence of such giant viruses raises fundamental questions about their origin and the processes governing their evolution. A previous analysis of six newly available isolates, independently confirmed by a study including three others, established that the Pandoraviridae pan-genome is open, meaning that each new strain exhibits protein-coding genes not previously identified in other family members. With an average increment of about 60 proteins per new isolate, the gene repertoire shows no sign of reaching a limit and continues to consist largely of genes encoding proteins without recognizable homologs in other viruses or cells (ORFans). To explain these results, we proposed that most new protein-coding genes were created de novo, from pre-existing non-coding regions of the G+C rich pandoravirus genomes. The gene content of a new isolate, pandoravirus celtis, closely related (96% identical genome) to the previously described p. quercus, is now compared with that of p. quercus to test this hypothesis by studying genomic changes on a microevolutionary scale. Our results confirm that the differences between these two similar gene contents mostly consist of protein-coding genes without known homologs, with statistical signatures close to those of intergenic regions. These newborn proteins are under slight negative selection, perhaps to maintain stable folds and prevent protein aggregation pending the eventual emergence of fitness-increasing functions. Our study also revealed several insertion events mediated by a transposase of the hAT family, three copies of which are found in p. celtis and are presumably active. Members of the Pandoraviridae are presently the first viruses known to encode this type of transposase.
Project description:Celtis is a Cannabaceae genus of 60-70 species of trees, or rarely shrubs, commonly known as hackberries. This woody genus consists of valuable forest plants that provide important wildlife habitat for birds and mammals. Although previous studies have identified its phylogenetic position, interspecific relationships within Celtis remain unclear. In this study, we generated genome skimming data from five Celtis species to analyze phylogenetic relationships within the genus and develop genome resources. The plastomes of Celtis ranged in length from 158,989 bp to 159,082 bp, had a typical angiosperm quadripartite structure, and encoded a total of 132 genes, 20 of which were duplicated in the IRs. Comparative analyses showed that plastome content and structure were relatively conserved. Whole plastomes showed no signs of gene loss, translocations, inversions, or genome rearrangement. Six plastid hotspot regions (trnH-psbA, psbA-trnK, trnG-trnR, psbC-trnS, cemA-petA and rps8-rpl14), 4097 polymorphic nuclear SSRs, and 62 low- or single-copy gene fragments were identified within Celtis. Moreover, the phylogenetic relationships based on the complete plastome sequences strongly support the placement of C. biondii as sister to the ((((C. koraiensis, C. sinensis), C. tetrandra), C. julianae), C. cerasifera) clade. These findings and the genetic resources developed here will be conducive to further studies on the genus Celtis involving phylogeny, population genetics, and conservation biology.
Project description:Celtis is a large genus in the Cannabaceae family, with more than 70 species worldwide. However, intraspecific variability in morphological features makes some species difficult to distinguish on the basis of morphology alone. To provide chloroplast (cp) genome resources for species identification in Celtis, the plastome of Celtis sinensis Persoon 1805 was newly sequenced and a comparative genomic analysis was performed. The chloroplast genome was 159,085 bp in length and had a quadripartite structure consisting of two inverted repeats (IRs) separated by a small single-copy (SSC) region and a large single-copy (LSC) region. A total of 133 genes were annotated, including 88 protein-coding genes, eight rRNA genes, and 37 tRNA genes. Among the protein-coding genes, leucine codons were the most frequent and cysteine codons the least frequent. Comparative genomic analysis showed that the IR regions were more conserved than the LSC and SSC regions, with most sequence variation located in intergenic spacers rather than in protein-coding regions. Moreover, sixteen highly divergent hotspots were identified. The ML phylogenetic tree showed that all included Celtis species clustered together, and the plastome reported here has sufficient resolution to distinguish C. sinensis (Pers.) from other Celtis plants. This study provides useful genetic resources for the identification of C. sinensis (Pers.) and is also of great significance for future phylogenetic studies of Celtis.