Dataset Information

tstrait: a quantitative trait simulator for ancestral recombination graphs.


ABSTRACT:

Summary

Ancestral recombination graphs (ARGs) encode the ensemble of correlated genealogical trees arising from recombination in a compact and efficient structure, and are of fundamental importance in population and statistical genetics. Recent breakthroughs have made it possible to simulate and infer ARGs at biobank scale, and there is now intense interest in using ARG-based methods across a broad range of applications, particularly in genome-wide association studies (GWAS). Sophisticated methods exist to simulate ARGs using population genetics models, but there is currently no software to simulate quantitative traits directly from these ARGs. To apply existing quantitative trait simulators, users must export genotype data, which loses important information about ancestral processes and produces prohibitively large files when applied to the biobank-scale datasets currently of interest in GWAS. We present tstrait, an open-source Python library to simulate quantitative traits on ARGs, and show how this user-friendly software can quickly simulate phenotypes for biobank-scale datasets on a laptop computer.
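
To make "simulating a quantitative trait" concrete, here is a minimal sketch of the standard additive model, in which a phenotype is a genetic value plus environmental noise scaled to a target narrow-sense heritability. It is an illustration only and is not tstrait's code or API: the function name simulate_phenotypes, its parameters, and the toy genotype matrix are all hypothetical. It also works on an explicit genotype matrix, which is the export-based approach the abstract contrasts with computing directly on an ARG.

```python
import random
import statistics


def simulate_phenotypes(genotypes, effect_sizes, h2, seed=None):
    """Hypothetical helper: additive trait model on an explicit genotype matrix.

    genotypes    -- one row per individual, allele counts (0/1/2) per causal site
    effect_sizes -- one effect size per causal site
    h2           -- target narrow-sense heritability, in (0, 1]
    Returns (phenotypes, genetic_values).
    """
    if not 0 < h2 <= 1:
        raise ValueError("h2 must be in (0, 1]")
    rng = random.Random(seed)

    # Genetic value: sum of effect size times allele count over causal sites.
    genetic_values = [
        sum(beta * count for beta, count in zip(effect_sizes, row))
        for row in genotypes
    ]

    # Choose the noise variance so that Var(G) / (Var(G) + Var(E)) equals h2.
    var_g = statistics.pvariance(genetic_values)
    sd_e = (var_g * (1 - h2) / h2) ** 0.5

    phenotypes = [g + rng.gauss(0.0, sd_e) for g in genetic_values]
    return phenotypes, genetic_values


# Toy data (assumed for illustration): 500 individuals, 20 causal sites.
data_rng = random.Random(42)
num_individuals, num_causal = 500, 20
genotypes = [
    [data_rng.choice((0, 1, 2)) for _ in range(num_causal)]
    for _ in range(num_individuals)
]
effect_sizes = [data_rng.gauss(0.0, 1.0) for _ in range(num_causal)]

phenotypes, genetic_values = simulate_phenotypes(
    genotypes, effect_sizes, h2=0.3, seed=1
)
```

The genotype matrix here has one entry per individual per site, so it grows with both sample size and the number of sites. That growth is the file-size problem the abstract attributes to export-based simulators at biobank scale.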

Availability and implementation

tstrait is available for download from the Python Package Index. Full documentation with examples and workflow templates is available at https://tskit.dev/tstrait/docs/, and the development version is maintained on GitHub (https://github.com/tskit-dev/tstrait).

Contact

daiki.tagami@hertford.ox.ac.uk

SUBMITTER: Tagami D 

PROVIDER: S-EPMC10980058 | biostudies-literature | 2024 Mar

REPOSITORIES: biostudies-literature

Publications

tstrait: a quantitative trait simulator for ancestral recombination graphs.

Tagami D, Bisschop G, Kelleher J

bioRxiv: the preprint server for biology, 2024-03-14


Similar Datasets

| S-EPMC11784591 | biostudies-literature
| S-EPMC1698562 | biostudies-literature
| S-EPMC8936483 | biostudies-literature
| S-EPMC4022496 | biostudies-literature
| S-EPMC5289856 | biostudies-literature
| S-EPMC11373519 | biostudies-literature
| S-EPMC4988722 | biostudies-literature
| S-EPMC6304023 | biostudies-literature
| S-EPMC4904167 | biostudies-literature
| S-EPMC10635123 | biostudies-literature