ABSTRACT:
SUBMITTER: Burden CJ
PROVIDER: S-EPMC3880068 | biostudies-literature | 2014 Jan
REPOSITORIES: biostudies-literature
Conrad J Burden, Paul Leopardi, Sylvain Forêt
Journal of computational biology : a journal of computational molecular cell biology, 2013-10-26, issue 1
Word match counts have traditionally been proposed as an alignment-free measure of similarity for biological sequences. The D(2) statistic, which simply counts the number of exact word matches between two sequences, is a useful test bed for developing rigorous mathematical results, which can then be extended to more biologically useful measures. The distributional properties of the D(2) statistic under the null hypothesis of identically and independently distributed letters have been studied ext ...
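
As a rough illustration of the counting described in the abstract, the D(2) statistic for word length k can be sketched as the number of position pairs (i, j) at which the k-letter word starting at i in the first sequence exactly matches the k-letter word starting at j in the second. The function names `kmer_counts` and `d2`, the parameter `k`, and the choice to count overlapping words in linear (non-circular) sequences are assumptions made for this sketch; the truncated abstract does not specify them.

```python
from collections import Counter


def kmer_counts(seq, k):
    """Count every overlapping word of length k in seq (hypothetical helper)."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def d2(seq_a, seq_b, k):
    """Number of exact k-word matches between seq_a and seq_b.

    Summing count_a(w) * count_b(w) over words w gives the same total as
    comparing every word position in seq_a with every word position in seq_b.
    """
    if k <= 0:
        raise ValueError("k must be positive")
    counts_a = kmer_counts(seq_a, k)
    counts_b = kmer_counts(seq_b, k)
    # Counter returns 0 for words absent from seq_b, so they contribute nothing.
    return sum(n * counts_b[word] for word, n in counts_a.items())


if __name__ == "__main__":
    print(d2("ACGT", "ACGT", 2))
    print(d2("AAAA", "AAA", 2))
```

Grouping by word keeps the work roughly linear in the combined sequence length, rather than quadratic as in a direct comparison of all position pairs.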