ABSTRACT: Motivation
Simulation under the coalescent model is ubiquitous in the analysis of genetic data. The rapid growth of real data sets from multiple human populations led to increasing interest in simulating very large sample sizes at whole-chromosome scales. When the sample size is large, the coalescent model becomes an increasingly inaccurate approximation of the discrete time Wright-Fisher model (DTWF). Analytical and computational treatment of the DTWF, however, is generally harder.
Results
We present a simulator (ARGON) for the DTWF process that scales up to hundreds of thousands of samples and whole-chromosome lengths, with a time/memory performance comparable or superior to currently available methods for coalescent simulation. The simulator supports arbitrary demographic history, migration, Newick tree output, variable mutation/recombination rates and gene conversion, and efficiently outputs pairwise identical-by-descent sharing data.
Availability
ARGON (version 0.1) is written in Java, open source, and freely available at https://github.com/pierpal/ARGON
CONTACT: ppalama@hsph.harvard.edu
Supplementary information
Supplementary data are available at Bioinformatics online.
SUBMITTER: Palamara PF
PROVIDER: S-EPMC6191159 | biostudies-literature | 2016 Oct
REPOSITORIES: biostudies-literature
Bioinformatics (Oxford, England), 2016-06-16, issue 19