Dataset Information

ARGON: fast, whole-genome simulation of the discrete time Wright-Fisher process.


ABSTRACT:

Motivation

Simulation under the coalescent model is ubiquitous in the analysis of genetic data. The rapid growth of real data sets from multiple human populations led to increasing interest in simulating very large sample sizes at whole-chromosome scales. When the sample size is large, the coalescent model becomes an increasingly inaccurate approximation of the discrete time Wright-Fisher model (DTWF). Analytical and computational treatment of the DTWF, however, is generally harder.
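To make the contrast with the coalescent concrete, the following is a minimal sketch of a backward-in-time DTWF step. It is not ARGON's algorithm: it assumes a single haploid population of constant size with no recombination, migration, or mutation, and the function names `dtwf_generation` and `simulate_to_mrca` are hypothetical. In each generation every lineage picks a parent uniformly at random, and lineages that pick the same parent merge. The sketch counts two kinds of events that the standard coalescent excludes: mergers of more than two lineages, and generations in which several mergers happen at once.

```python
import random
from collections import Counter


def dtwf_generation(num_lineages, pop_size, rng):
    """One backward-in-time DTWF step (assumed haploid, constant size).

    Each lineage picks a parent uniformly among pop_size individuals of the
    previous generation. Returns the number of lineages that chose each
    distinct parent, so the list length is the new lineage count.
    """
    parents = Counter(rng.randrange(pop_size) for _ in range(num_lineages))
    return list(parents.values())


def simulate_to_mrca(sample_size, pop_size, seed=0):
    """Trace a sample back to its most recent common ancestor.

    Returns (generations, multiple_mergers, simultaneous_mergers).
    multiple_mergers counts parents chosen by more than two lineages;
    simultaneous_mergers counts generations with more than one merger.
    """
    rng = random.Random(seed)
    lineages = sample_size
    generations = 0
    multiple_mergers = 0
    simultaneous_mergers = 0
    while lineages > 1:
        groups = dtwf_generation(lineages, pop_size, rng)
        merged = [g for g in groups if g > 1]
        multiple_mergers += sum(1 for g in merged if g > 2)
        if len(merged) > 1:
            simultaneous_mergers += 1
        lineages = len(groups)
        generations += 1
    return generations, multiple_mergers, simultaneous_mergers


if __name__ == "__main__":
    # Illustrative parameters only; compare a small and a large sample
    # drawn from the same population size.
    for n in (10, 500):
        print(n, simulate_to_mrca(n, pop_size=1000, seed=1))
```

With a sample that is small relative to the population, both counters tend to stay near zero, which is the regime where the coalescent approximation works well. As the sample size approaches the population size, such events become common in the first few generations.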

Results

We present a simulator (ARGON) for the DTWF process that scales up to hundreds of thousands of samples and whole-chromosome lengths, with a time/memory performance comparable or superior to currently available methods for coalescent simulation. The simulator supports arbitrary demographic history, migration, Newick tree output, variable mutation/recombination rates and gene conversion, and efficiently outputs pairwise identical-by-descent sharing data.

Availability

ARGON (version 0.1) is written in Java, open source, and freely available at https://github.com/pierpal/ARGON

Contact

ppalama@hsph.harvard.edu

Supplementary information

Supplementary data are available at Bioinformatics online.

SUBMITTER: Palamara PF 

PROVIDER: S-EPMC6191159 | biostudies-literature | 2016 Oct

REPOSITORIES: biostudies-literature

Publications

ARGON: fast, whole-genome simulation of the discrete time Wright-Fisher process.

Palamara Pier Francesco PF  

Bioinformatics (Oxford, England), 2016-06-16, issue 19



Similar Datasets

| S-EPMC6389312 | biostudies-literature
| S-EPMC5592947 | biostudies-literature
| S-EPMC4649640 | biostudies-literature
| S-EPMC5499119 | biostudies-literature
| S-EPMC7365441 | biostudies-literature
| S-EPMC3953531 | biostudies-literature