Dataset Information

GraphTyper2 enables population-scale genotyping of structural variation using pangenome graphs.


ABSTRACT: Analysis of sequence diversity in the human genome is fundamental for genetic studies. Structural variants (SVs) are frequently omitted in sequence analysis studies, although each has a relatively large impact on the genome. Here, we present GraphTyper2, which uses pangenome graphs to genotype SVs and small variants using short reads. Comparison to the syndip benchmark dataset shows that our SV genotyping is sensitive, and variant segregation in families demonstrates the accuracy of our approach. We demonstrate that incorporating public assembly data into our pipeline greatly improves sensitivity, particularly for large insertions. We validate 6,812 SVs on average per genome using long-read data of 41 Icelanders. We show that GraphTyper2 can simultaneously genotype tens of thousands of whole genomes by characterizing 60 million small variants and half a million SVs in 49,962 Icelanders, including 80 thousand SVs with high confidence.

SUBMITTER: Eggertsson HP 

PROVIDER: S-EPMC6881350 | biostudies-literature | 2019 Nov

REPOSITORIES: biostudies-literature

Publications

GraphTyper2 enables population-scale genotyping of structural variation using pangenome graphs.

Eggertsson HP, Kristmundsdottir S, Beyter D, Jonsson H, Skuladottir A, Hardarson MT, Gudbjartsson DF, Stefansson K, Halldorsson BV, Melsted P

Nature Communications, 2019-11-27; issue 1


Similar Datasets

| S-EPMC7017486 | biostudies-literature
| S-EPMC8275641 | biostudies-literature
| S-EPMC7141861 | biostudies-literature
| S-EPMC6521551 | biostudies-literature
| S-EPMC9237687 | biostudies-literature
| S-EPMC8420074 | biostudies-literature
| S-EPMC6853660 | biostudies-literature
| S-EPMC8388040 | biostudies-literature
| S-EPMC8756200 | biostudies-literature
| S-EPMC5094049 | biostudies-literature