
Dataset Information


Whole-proteome phylogeny of large dsDNA virus families by an alignment-free method.


ABSTRACT: The vast sequence divergence among different virus groups has presented a great challenge to alignment-based sequence comparison among different virus families. Using an alignment-free comparison method, we construct the whole-proteome phylogeny for a population of viruses from 11 viral families comprising 142 large dsDNA eukaryote viruses. The method is based on feature frequency profiles (FFP), where the length of the feature (l-mer) is selected to be optimal for phylogenomic inference. We observe that (i) the FFP phylogeny segregates the population into clades whose memberships agree remarkably well with the current classification by the International Committee on Taxonomy of Viruses, with the one exception that mimivirus joins the phycodnavirus family; (ii) the FFP tree detects potential evolutionary relationships among some viral families; (iii) the relative position of the three herpesvirus subfamilies in the FFP tree differs from that in gene alignment-based analysis; (iv) the FFP tree suggests the taxonomic positions of certain "unclassified" viruses; and (v) the FFP method identifies candidates for horizontal gene transfer between virus families.
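As a rough illustration of the idea behind a feature frequency profile, the sketch below counts overlapping l-mers in each protein of a proteome, normalizes the counts to frequencies, and compares two profiles. It is a minimal sketch under stated assumptions, not the authors' implementation: the function names `ffp` and `js_divergence`, the toy sequences, the choice `l = 3`, and the use of Jensen-Shannon divergence as the distance are all assumptions made here for illustration. The abstract does not specify the distance measure or how the optimal l-mer length is selected, and neither is reproduced here.

```python
from collections import Counter
from math import log2


def ffp(proteins, l):
    """Normalized frequencies of overlapping l-mers across a list of proteins.

    Each protein is scanned separately, so no l-mer spans two proteins.
    """
    counts = Counter()
    for seq in proteins:
        counts.update(seq[i:i + l] for i in range(len(seq) - l + 1))
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no l-mers found; every sequence is shorter than l")
    return {feature: n / total for feature, n in counts.items()}


def js_divergence(p, q):
    """Jensen-Shannon divergence (log base 2) between two sparse profiles.

    Assumed here as the profile distance; it ranges from 0 (identical
    profiles) to 1 (profiles with no shared features).
    """
    div = 0.0
    for feature in set(p) | set(q):
        pi, qi = p.get(feature, 0.0), q.get(feature, 0.0)
        mi = (pi + qi) / 2
        if pi > 0:
            div += 0.5 * pi * log2(pi / mi)
        if qi > 0:
            div += 0.5 * qi * log2(qi / mi)
    return div


# Hypothetical toy "proteomes"; real inputs would be full viral proteomes.
proteomes = {
    "virus_a": ["MKVLAAGIV", "KVLAAGMK"],
    "virus_b": ["MKVLSAGIV", "RVLSAGMK"],
    "virus_c": ["QQWPHHTTE", "EDDNNCQQ"],
}
l = 3  # illustrative feature length only
profiles = {name: ffp(seqs, l) for name, seqs in proteomes.items()}

names = sorted(profiles)
for i, a in enumerate(names):
    for b in names[i + 1:]:
        print(a, b, round(js_divergence(profiles[a], profiles[b]), 3))
```

A matrix of such pairwise distances could then be passed to a standard distance-based tree-building method; that step is left out of this sketch.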

SUBMITTER: Wu GA 

PROVIDER: S-EPMC2722272 | biostudies-literature | 2009 Aug

REPOSITORIES: biostudies-literature


Publications

Whole-proteome phylogeny of large dsDNA virus families by an alignment-free method.

Guohong Albert Wu, Se-Ran Jun, Gregory E. Sims, Sung-Hou Kim

Proceedings of the National Academy of Sciences of the United States of America, 2009-06-24, issue 31



Similar Datasets

| S-EPMC1839080 | biostudies-literature
| S-EPMC2806744 | biostudies-literature
| S-EPMC6436989 | biostudies-literature
| S-EPMC3549825 | biostudies-literature
| S-EPMC9602327 | biostudies-literature
| S-EPMC5743538 | biostudies-literature
| S-EPMC4918981 | biostudies-literature
| S-EPMC4678791 | biostudies-literature
| S-EPMC4501066 | biostudies-literature
| S-EPMC4896704 | biostudies-literature