ABSTRACT:
SUBMITTER: Reinert G
PROVIDER: S-EPMC2818754 | biostudies-literature | 2009 Dec
REPOSITORIES: biostudies-literature
AUTHORS: Gesine Reinert, David Chew, Fengzhu Sun, Michael S. Waterman
Journal of Computational Biology: a journal of computational molecular cell biology, 2009-12-01, issue 12
Large-scale comparison of the similarities between two biological sequences is a major issue in computational biology; a fast method, the D(2) statistic, relies on the comparison of the k-tuple content for both sequences. Although it has been known for some years that the D(2) statistic is not suitable for this task, as it tends to be dominated by single-sequence noise, to date no suitable adjustments have been proposed. In this article, we suggest two new variants of the D(2) word count statist ...
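
As an illustration of the k-tuple comparison the abstract refers to, the following is a minimal sketch of the plain D(2) statistic, assuming its usual definition as the sum, over all k-tuples (words) w, of the product of the counts of w in the two sequences. The record itself does not give the formula, and the function names `kmer_counts` and `d2` are hypothetical. The two new variants mentioned in the abstract are not shown, because the truncated text does not define them.

```python
from collections import Counter


def kmer_counts(seq, k):
    """Count every overlapping k-tuple (word of length k) in seq."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def d2(seq_a, seq_b, k):
    """Plain D2 statistic (assumed definition): sum over words w of
    count_a(w) * count_b(w), i.e. the inner product of the two
    k-tuple count vectors."""
    counts_a = kmer_counts(seq_a, k)
    counts_b = kmer_counts(seq_b, k)
    # Counter returns 0 for words absent from seq_b, so only words
    # present in both sequences contribute to the sum.
    return sum(count * counts_b[word] for word, count in counts_a.items())


if __name__ == "__main__":
    print(d2("ACGTACGT", "ACGTTTTT", 2))
```

Because every shared word contributes the product of its two counts, a word that is frequent in only one sequence can still weigh heavily on the total, which is consistent with the abstract's remark that the statistic tends to be dominated by single-sequence noise.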