Dataset Information

The first chromosome-level genome assembly of Entomobrya proxima Folsom, 1924 (Collembola: Entomobryidae).


ABSTRACT: The Entomobryoidea, the largest superfamily of Collembola, encompasses over 2,000 species worldwide. However, the lack of high-quality genomes hinders our understanding of the evolution and ecology of this group. This study presents a chromosome-level genome of Entomobrya proxima, assembled by combining PacBio long reads, Illumina short reads, and Hi-C data. The genome is 362.37 Mb in size, with a scaffold N50 of 57.67 Mb, and 97.12% (351.95 Mb) of the assembly is anchored to six chromosomes. BUSCO analysis of the assembly indicates a completeness of 96.1% (n = 1,013), comprising 946 (93.4%) single-copy and 27 (2.7%) duplicated BUSCOs. The genome contains 22.16% (80.06 Mb) repeat elements and 20,988 predicted protein-coding genes. Gene family evolution analysis identified 177 gene families in E. proxima that underwent significant expansion; these were primarily associated with detoxification and metabolism. Moreover, inter-genomic synteny analysis showed strong chromosomal synteny between E. proxima and Sinella curviseta. Our study provides valuable genomic information for understanding the evolution and ecology of Collembola.
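
As a rough illustration of how the percentages in the abstract relate to the underlying counts, the short Python sketch below recomputes the chromosome-anchored fraction and the BUSCO figures. The numbers are copied from the abstract. The variable names and the `percent` helper are assumptions introduced here for illustration and are not part of the published analysis.

```python
# Figures copied from the abstract; names are illustrative, not the authors'.
ASSEMBLY_MB = 362.37      # total assembly size
ANCHORED_MB = 351.95      # sequence placed on the six chromosomes
BUSCO_TOTAL = 1013        # number of BUSCO groups searched (n)
BUSCO_SINGLE = 946        # complete, single-copy BUSCOs
BUSCO_DUPLICATED = 27     # complete, duplicated BUSCOs


def percent(part, whole, ndigits):
    """Return part/whole as a percentage rounded to ndigits decimals."""
    return round(100.0 * part / whole, ndigits)


# Share of the assembly anchored to chromosomes (reported to two decimals).
anchored_pct = percent(ANCHORED_MB, ASSEMBLY_MB, 2)

# BUSCO completeness is the sum of single-copy and duplicated complete BUSCOs.
busco_complete = BUSCO_SINGLE + BUSCO_DUPLICATED
complete_pct = percent(busco_complete, BUSCO_TOTAL, 1)
single_pct = percent(BUSCO_SINGLE, BUSCO_TOTAL, 1)
duplicated_pct = percent(BUSCO_DUPLICATED, BUSCO_TOTAL, 1)

print(anchored_pct, complete_pct, single_pct, duplicated_pct)
```

The completeness figure here is simply the single-copy and duplicated counts added together and divided by the number of BUSCO groups searched.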

SUBMITTER: Jin J 

PROVIDER: S-EPMC10432511 | biostudies-literature | 2023 Aug

REPOSITORIES: biostudies-literature

Publications

The first chromosome-level genome assembly of Entomobrya proxima Folsom, 1924 (Collembola: Entomobryidae).

Jianfeng Jin, Yuxin Zhao, Guoqiang Zhang, Zhixiang Pan, Feng Zhang

Scientific Data, 2023 Aug 16, issue 1


Similar Datasets

S-EPMC10410292 | biostudies-literature
S-EPMC11220072 | biostudies-literature
S-EPMC8995043 | biostudies-literature
S-EPMC7707712 | biostudies-literature
S-EPMC10194901 | biostudies-literature
S-EPMC3234423 | biostudies-literature
S-EPMC9346132 | biostudies-literature
S-EPMC10810969 | biostudies-literature
S-EPMC11255207 | biostudies-literature
S-EPMC11882816 | biostudies-literature