
Dataset Information


Visualization of genome signatures of eukaryote genomes by batch-learning self-organizing map with a special emphasis on Drosophila genomes.


ABSTRACT: A strategy of evolutionary studies that can compare vast numbers of genome sequences is becoming increasingly important with the remarkable progress of high-throughput DNA sequencing methods. We previously established a sequence alignment-free clustering method "BLSOM" for di-, tri-, and tetranucleotide compositions in genome sequences, which can characterize sequence characteristics (genome signatures) of a wide range of species. In the present study, we generated BLSOMs for tetra- and pentanucleotide compositions in approximately one million sequence fragments derived from 101 eukaryotes, for which almost complete genome sequences were available. BLSOM recognized phylotype-specific characteristics (e.g., key combinations of oligonucleotide frequencies) in the genome sequences, permitting phylotype-specific clustering of the sequences without any information regarding the species. In our detailed examination of 12 Drosophila species, the correlation between their phylogenetic classification and the classification on the BLSOMs was observed to visualize oligonucleotides diagnostic for species-specific clustering.
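The oligonucleotide compositions that the abstract describes as BLSOM input can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `oligo_composition`, the toy fragment, and the choice to skip windows containing ambiguous bases are all assumptions made for illustration. It only shows how a sequence fragment might be turned into a 256-dimensional tetranucleotide frequency vector; the self-organizing map step itself is not shown.

```python
from collections import Counter
from itertools import product


def oligo_composition(seq: str, k: int = 4) -> dict[str, float]:
    """Return the relative frequency of every k-mer over A, C, G, T in seq.

    Windows containing any other character (e.g. N) are skipped;
    this handling is an assumption, not taken from the source.
    """
    seq = seq.upper()
    # Enumerate all 4**k possible k-mers so every vector has the same keys.
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    counts = Counter()
    for i in range(len(seq) - k + 1):
        window = seq[i:i + k]
        if set(window) <= set("ACGT"):
            counts[window] += 1
    total = sum(counts.values())
    return {kmer: (counts[kmer] / total if total else 0.0) for kmer in kmers}


# Hypothetical toy fragment, far shorter than a real genomic fragment.
vector = oligo_composition("ACGTACGTAC", k=4)
```

Setting `k=5` would give the 1024-dimensional pentanucleotide case under the same assumptions.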

SUBMITTER: Abe T 

PROVIDER: S-EPMC3967822 | biostudies-literature | 2014

REPOSITORIES: biostudies-literature


Publications

Visualization of genome signatures of eukaryote genomes by batch-learning self-organizing map with a special emphasis on Drosophila genomes.

Abe Takashi, Hamano Yuta, Ikemura Toshimichi

BioMed Research International, 2014-03-11



Similar Datasets

S-EPMC7350273 | biostudies-literature
S-EPMC7119937 | biostudies-literature
S-EPMC6880984 | biostudies-literature
S-EPMC1274250 | biostudies-literature
S-EPMC2474711 | biostudies-literature
S-EPMC9641310 | biostudies-literature
PXD000816 | Pride | 2016-07-09
S-EPMC4689344 | biostudies-literature
S-EPMC3048497 | biostudies-literature
S-EPMC4227818 | biostudies-literature