Dataset Information

Evaluating genotype imputation pipeline for ultra-low coverage ancient genomes.

ABSTRACT: Although ancient DNA data have become increasingly more important in studies about past populations, it is often not feasible or practical to obtain high coverage genomes from poorly preserved samples. While methods of accurate genotype imputation from >1× coverage data have recently become a routine, a large proportion of ancient samples remain unusable for downstream analyses due to their low coverage. Here, we evaluate a two-step pipeline for the imputation of common variants in ancient genomes at 0.05-1× coverage. We use the genotype likelihood input mode in Beagle and filter for confident genotypes as the input to impute missing genotypes. This procedure, when tested on ancient genomes, outperforms a single-step imputation from genotype likelihoods, suggesting that current genotype callers do not fully account for errors in ancient sequences and additional quality controls can be beneficial. We compared the effect of various genotype likelihood calling methods, post-calling, pre-imputation and post-imputation filters, different reference panels, as well as different imputation tools. In a Neolithic Hungarian genome, we obtain ~90% imputation accuracy for heterozygous common variants at coverage 0.05× and >97% accuracy at coverage 0.5×. We show that imputation can mitigate, though not eliminate reference bias in ultra-low coverage ancient genomes.
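
The filtering step that sits between the two imputation passes can be sketched roughly as follows. This is a minimal illustration and not the authors' code: it assumes the first pass yields, for each site, posterior probabilities for the three diploid genotypes (0, 1 or 2 copies of the alternate allele). The function names and the 0.99 cutoff are hypothetical, since the abstract does not state the threshold used.

```python
from typing import List, Optional, Tuple

# Hypothetical cutoff; the abstract does not say which value was used.
MIN_GP = 0.99

GenotypeProbs = Tuple[float, float, float]


def confident_call(gp: GenotypeProbs, min_gp: float = MIN_GP) -> Optional[int]:
    """Return the most probable alt-allele count (0, 1 or 2) if its
    probability reaches min_gp, otherwise None (treated as missing)."""
    best = max(range(3), key=lambda g: gp[g])
    return best if gp[best] >= min_gp else None


def filter_sites(gps: List[GenotypeProbs], min_gp: float = MIN_GP) -> List[Optional[int]]:
    """Keep confident genotypes and mask the rest, so that a second
    imputation pass can fill in the masked sites."""
    return [confident_call(gp, min_gp) for gp in gps]


# Three made-up sites: confident hom-ref, ambiguous, confident hom-alt.
sites = [(0.995, 0.004, 0.001), (0.60, 0.39, 0.01), (0.0, 0.002, 0.998)]
print(filter_sites(sites))  # [0, None, 2]
```

Sites returned as None would be left missing in the input to the second imputation step; a stricter cutoff keeps fewer but more reliable genotypes.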

SUBMITTER: Hui R

PROVIDER: S-EPMC7596702 | biostudies-literature | 2020 Oct

REPOSITORIES: biostudies-literature

Publications

Evaluating genotype imputation pipeline for ultra-low coverage ancient genomes.

Hui R, D'Atanasio E, Cassidy LM, Scheib CL, Kivisild T

Scientific Reports, 29 Oct 2020, issue 1

PMID: 33122697

Similar Datasets

TKGWV2: an ancient DNA relatedness pipeline for ultra-low coverage whole genome shotgun data. | S-EPMC8553948 | biostudies-literature

MTaxi: A comparative tool for taxon identification of ultra low coverage ancient genomes. | S-EPMC10565424 | biostudies-literature

Accurate Genotype Imputation in Multiparental Populations from Low-Coverage Sequence. | S-EPMC6116951 | biostudies-literature

Imputation of ancient human genomes. | S-EPMC10282092 | biostudies-literature

Imputation of low-coverage sequencing data from 150,119 UK Biobank genomes. | S-EPMC10335927 | biostudies-literature

Best practices for genotype imputation from low-coverage sequencing data in natural populations. | S-EPMC10879460 | biostudies-literature

Accurate genotype imputation from low-coverage whole-genome sequencing data of rainbow trout. | S-EPMC11373650 | biostudies-literature

Genotype imputation with thousands of genomes. | S-EPMC3276165 | biostudies-literature

Comparison of Genotype Imputation for SNP Array and Low-Coverage Whole-Genome Sequencing Data. | S-EPMC8762119 | biostudies-literature