Dataset Information

Geometric construction of viral genome space and its applications.


ABSTRACT: Understanding the relationships between genomic sequences is essential to the classification and characterization of living beings. The classes and characteristics of an organism can be identified in the corresponding genome space. In the genome space, the natural metric is important to describe the distribution of genomes. Therefore, the similarity of two biological sequences can be measured. Here, we report that all of the viral genomes are in 32-dimensional Euclidean space, in which the natural metric is the weighted summation of Euclidean distance of k-mer natural vectors. The classification of viral genomes in the constructed genome space further proves the convex hull principle of taxonomy, which states that convex hulls of different families are mutually disjoint. This study provides a novel geometric perspective to describe the genome sequences.
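The metric mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not give the formulas, so the code below assumes the commonly used natural-vector form: for each k-mer, its count, its mean position, and a normalized second central moment of its positions. The function names, the normalization by the total number of k-mers, and the weights passed to `weighted_distance` are all hypothetical choices for illustration, not the authors' published parameters.

```python
from collections import defaultdict
from itertools import product
import math


def kmer_natural_vector(seq, k):
    """Counts, mean positions, and normalized second central moments
    for every k-mer over ACGT, in lexicographic k-mer order.

    Assumption: moments are normalized by (count * total k-mers).
    """
    seq = seq.upper()
    n_total = len(seq) - k + 1  # number of k-mer windows in the sequence
    positions = defaultdict(list)
    for i in range(n_total):
        positions[seq[i:i + k]].append(i + 1)  # 1-based start position

    counts, means, moments = [], [], []
    for kmer in map("".join, product("ACGT", repeat=k)):
        pos = positions.get(kmer, [])
        n = len(pos)
        mu = sum(pos) / n if n else 0.0
        d2 = sum((p - mu) ** 2 for p in pos) / (n * n_total) if n else 0.0
        counts.append(n)
        means.append(mu)
        moments.append(d2)
    return counts + means + moments


def weighted_distance(seq_a, seq_b, weights):
    """Weighted sum of Euclidean distances between k-mer natural vectors.

    `weights` maps k -> weight; the values are illustrative assumptions.
    """
    return sum(
        w * math.dist(kmer_natural_vector(seq_a, k), kmer_natural_vector(seq_b, k))
        for k, w in weights.items()
    )
```

For example, `weighted_distance("ACGTAC", "ACGTTC", {1: 1.0, 2: 0.5})` combines the distances for 1-mers and 2-mers into a single number; identical sequences give a distance of zero.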

SUBMITTER: Sun N 

PROVIDER: S-EPMC8353408 | biostudies-literature | 2021

REPOSITORIES: biostudies-literature

Publications

Geometric construction of viral genome space and its applications.

Nan Sun, Shaojun Pei, Lily He, Changchuan Yin, Rong Lucy He, Stephen S-T Yau

Computational and Structural Biotechnology Journal, 2021-07-27


Similar Datasets

| S-EPMC2885272 | biostudies-literature
| S-EPMC9351466 | biostudies-literature
| S-EPMC4841724 | biostudies-literature
| S-EPMC4020086 | biostudies-literature
| S-EPMC4115532 | biostudies-literature
| S-EPMC3047556 | biostudies-literature
| S-EPMC4425229 | biostudies-literature
| S-EPMC8523713 | biostudies-literature
| S-EPMC6098008 | biostudies-literature
| S-EPMC6364436 | biostudies-literature