
Dataset Information


A high-quality genome assembly and annotation of Quercus acutissima Carruth.


ABSTRACT:

Introduction

Quercus acutissima is an economically and ecologically important tree species that is often used for afforestation of arid and semi-arid lands and is considered an excellent tree for soil and water conservation.

Methods

Here, we combined PacBio long reads, Hi-C, and Illumina short reads to assemble the Q. acutissima genome.

Results

We generated a 957.1 Mb genome assembly with a contig N50 of 1.2 Mb and a scaffold N50 of 77.0 Mb. Repetitive sequences constituted 55.63% of the genome; long terminal repeats were the most abundant class, accounting for 23.07% of the genome. Ab initio, homology-based, and RNA sequence-based gene prediction identified 29,889 protein-coding genes, of which 82.6% could be functionally annotated. Phylogenetic analysis showed that Q. acutissima and Q. variabilis diverged around 3.6 million years ago, and we found no evidence of a species-specific whole-genome duplication.
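For readers unfamiliar with the N50 statistic reported in the results, the following is a minimal sketch of how it is commonly computed: sort the sequence lengths from longest to shortest and report the length at which the running total first reaches half of the assembly size. The function name `n50` and the toy lengths are illustrative assumptions and are not taken from the authors' pipeline or data.

```python
def n50(lengths):
    """Return the N50 of a collection of contig or scaffold lengths.

    N50 is the length L such that sequences of length >= L together
    cover at least half of the total assembly length.
    """
    if not lengths:
        raise ValueError("lengths must be non-empty")
    total = sum(lengths)
    running = 0
    # Walk from the longest sequence down, accumulating length.
    for length in sorted(lengths, reverse=True):
        running += length
        # Compare against half the total without floating-point division.
        if 2 * running >= total:
            return length


# Hypothetical toy lengths (arbitrary units), not the Q. acutissima assembly.
toy_contigs = [80, 70, 50, 40, 30, 20, 10]
print(n50(toy_contigs))
```

Applied to the real contig or scaffold lengths of an assembly, the same calculation yields figures of the kind quoted above, where a larger N50 indicates a more contiguous assembly.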

Conclusion

The assembled and annotated high-quality Q. acutissima genome not only promises to accelerate molecular biology studies and breeding of the species, but also supports genome-level evolutionary studies.

SUBMITTER: Liu D 

PROVIDER: S-EPMC9729791 | biostudies-literature | 2022

REPOSITORIES: biostudies-literature


Publications

A high-quality genome assembly and annotation of Quercus acutissima Carruth.

Liu Dan, Xie Xiaoman, Tong Boqiang, Zhou Chengcheng, Qu Kai, Guo Haili, Zhao Zhiheng, El-Kassaby Yousry A, Li Wei, Li Wenqing

Frontiers in Plant Science, 2022-11-24



Similar Datasets

| S-EPMC6121628 | biostudies-literature
| S-EPMC10511450 | biostudies-literature
| PRJNA915792 | ENA
| PRJNA756808 | ENA
| PRJNA757164 | ENA
| S-EPMC7210891 | biostudies-literature
| S-EPMC11797025 | biostudies-literature
| S-EPMC8785237 | biostudies-literature
| S-EPMC8022769 | biostudies-literature
| S-EPMC7679098 | biostudies-literature