Dataset Information

Rapid detection of identity-by-descent tracts for mega-scale datasets.

ABSTRACT: The ability to identify segments of genomes identical-by-descent (IBD) is a part of standard workflows in both statistical and population genetics. However, traditional methods for finding local IBD across all pairs of individuals scale poorly, leading to a lack of adoption in very large-scale datasets. Here, we present iLASH, an algorithm based on similarity detection techniques that shows equal or improved accuracy in simulations compared to current leading methods and speeds up analysis by several orders of magnitude on genomic datasets, making IBD estimation tractable for millions of individuals. We apply iLASH to the PAGE dataset of ~52,000 multi-ethnic participants, including several founder populations with elevated IBD sharing, identifying IBD segments in ~3 minutes per chromosome compared to over 6 days for a state-of-the-art algorithm. iLASH enables efficient analysis of very large-scale datasets, as we demonstrate by computing IBD across the UK Biobank (~500,000 individuals), detecting 12.9 billion pairwise connections.
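
The scaling argument in the abstract can be checked with simple arithmetic. The sketch below is only an illustration and is not part of iLASH: it counts the unordered pairs an all-pairs comparison would have to visit for the two cohort sizes quoted above, and it derives the speed-up implied by the reported runtimes. The function name `pair_count` is hypothetical, and the cohort sizes are the approximate figures from the abstract, so the results are rough estimates.

```python
from math import comb


def pair_count(n: int) -> int:
    """Number of unordered pairs among n individuals, n * (n - 1) / 2."""
    return comb(n, 2)


# Approximate cohort sizes quoted in the abstract (assumed exact here).
page_pairs = pair_count(52_000)    # PAGE
ukb_pairs = pair_count(500_000)    # UK Biobank

# Reported per-chromosome runtimes on PAGE: "over 6 days" versus "~3 minutes".
# Using exactly 6 days gives a lower bound on the speed-up.
baseline_minutes = 6 * 24 * 60
ilash_minutes = 3
speedup = baseline_minutes / ilash_minutes

print(f"PAGE pairs:       {page_pairs:,}")
print(f"UK Biobank pairs: {ukb_pairs:,}")
print(f"Speed-up (lower bound): {speedup:,.0f}x")
```

With roughly 1.35 billion pairs for PAGE and about 125 billion for the UK Biobank, the quadratic growth of an exhaustive pairwise search is what makes it impractical at this scale. The runtime ratio of at least 2,880 is consistent with the "several orders of magnitude" wording.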

SUBMITTER: Shemirani R

PROVIDER: S-EPMC8192555 | biostudies-literature | 2021 Jun

REPOSITORIES: biostudies-literature

Publications

Rapid detection of identity-by-descent tracts for mega-scale datasets.

Shemirani R, Belbin GM, Avery CL, Kenny EE, Gignoux CR, Ambite JL

Nature Communications, 2021 Jun 10, issue 1

PMID: 34112768

Similar Datasets

Rapid, Phase-free Detection of Long Identity-by-Descent Segments Enables Effective Relationship Classification.
S-EPMC7118564 | biostudies-literature

diCal-IBD: demography-aware inference of identity-by-descent tracts in unrelated individuals.
S-EPMC4296155 | biostudies-literature

Identity-by-descent detection across 487,409 British samples reveals fine scale population structure and ultra-rare variant associations.
S-EPMC7704644 | biostudies-literature

Length distributions of identity by descent reveal fine-scale demographic history.
S-EPMC3487132 | biostudies-literature

RaPID-Query for Fast Identity by Descent Search and Genealogical Analysis.
S-EPMC10244210 | biostudies-literature

Biobank-scale inference of multi-individual identity by descent and gene conversion.
S-EPMC10635131 | biostudies-literature

Accurate detection of identity-by-descent segments in human ancient DNA.
S-EPMC10786714 | biostudies-literature

RaPID: ultra-fast, powerful, and accurate detection of segments identical by descent (IBD) in biobank-scale cohorts.
S-EPMC6659282 | biostudies-literature

PRIMUS: rapid reconstruction of pedigrees from genome-wide estimates of identity by descent.
S-EPMC4225580 | biostudies-literature

Improving the accuracy and efficiency of identity-by-descent detection in population data.
S-EPMC3664855 | biostudies-other