Dataset Information

A Novel Method for Alignment-free DNA Sequence Similarity Analysis Based on the Characterization of Complex Networks.


ABSTRACT: Determining sequence similarity is a major step in computational phylogenetic studies, and developing novel mathematical descriptors for similarity analysis is a central task for computational biologists. DNA clustering is an important technique that automatically identifies inherent relationships among large-scale DNA sequences, and comparing the DNA sequences of different species helps determine the phylogenetic relationships among them. Alignment-free approaches have steadily gained interest in sequence analysis applications such as phylogenetic inference and metagenomic classification and clustering, particularly for large-scale sequence datasets. Here, we construct a novel and simple mathematical descriptor based on the characterization of cis sequence complex DNA networks. The approach is based on the code of three cis nucleotides in a gene that can code for an amino acid. Specifically, for each DNA sequence we set up a cis sequence complex network and use it to derive a characterization vector, which we then apply to the analysis of mitochondrial DNA sequence phylogenetic relationships among nine species. The resulting phylogenetic relationships are consistent with the known relationships among these nine species.
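The abstract does not spell out how the network or the characterization vector is built, so the following is only a minimal sketch of the general idea under stated assumptions, not the authors' method. It assumes nodes are the 64 nucleotide triplets, a directed edge links each triplet to the overlapping triplet that starts one position later, the characterization vector is each node's normalized outgoing edge weight, and sequences are compared by Euclidean distance. All function names here are hypothetical.

```python
from collections import Counter
from itertools import product
import math

NUCLEOTIDES = "ACGT"
VALID = set(NUCLEOTIDES)
# Assumption: the network has one node per nucleotide triplet (64 in total).
TRIPLETS = ["".join(p) for p in product(NUCLEOTIDES, repeat=3)]


def triplet_network(seq):
    """Count directed edges between overlapping triplets (assumed construction)."""
    seq = seq.upper()
    edges = Counter()
    for i in range(len(seq) - 3):
        a, b = seq[i:i + 3], seq[i + 1:i + 4]
        # Skip windows containing ambiguous bases such as N.
        if set(a + b) <= VALID:
            edges[(a, b)] += 1
    return edges


def characterization_vector(seq):
    """Return a 64-element vector of normalized outgoing edge weight per node."""
    edges = triplet_network(seq)
    total = sum(edges.values())
    out_strength = Counter()
    for (a, _b), weight in edges.items():
        out_strength[a] += weight
    return [out_strength[t] / total if total else 0.0 for t in TRIPLETS]


def distance(u, v):
    """Euclidean distance between two characterization vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(u, v)))
```

Pairwise distances computed this way for a set of sequences could be passed to any standard clustering or tree-building routine; the choice of network statistic and distance measure here is illustrative only.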

SUBMITTER: Zhou J 

PROVIDER: S-EPMC5054945 | biostudies-literature

REPOSITORIES: biostudies-literature

Similar Datasets

S-EPMC6403383 | biostudies-literature
S-EPMC4410667 | biostudies-literature
S-EPMC3384675 | biostudies-literature
S-EPMC6355110 | biostudies-literature
S-EPMC4427953 | biostudies-literature
S-EPMC3799466 | biostudies-literature
S-EPMC2808352 | biostudies-literature
S-EPMC5870879 | biostudies-literature
S-EPMC1131888 | biostudies-literature
S-EPMC3429886 | biostudies-literature