ABSTRACT:
SUBMITTER: Tonkin-Hill G
PROVIDER: S-EPMC6582336 | biostudies-literature | 2019 Jun
REPOSITORIES: biostudies-literature
Tonkin-Hill G, Lees JA, Bentley SD, Frost SDW, Corander J
Nucleic Acids Research, 2019 Jun 1, issue 11
We present fastbaps, a fast solution to the genetic clustering problem. Fastbaps rapidly identifies an approximate fit to a Dirichlet process mixture model (DPM) for clustering multilocus genotype data. Our efficient model-based clustering approach can cluster datasets 10-100 times larger than those handled by existing model-based methods, which we demonstrate by analyzing an alignment of over 110 000 sequences of HIV-1 pol genes. We also provide a method for rapidly partitioning an existing hierarchy ...