Dataset Information

Joint Estimation of Contamination, Error and Demography for Nuclear DNA from Ancient Humans.


ABSTRACT: When sequencing an ancient DNA sample from a hominin fossil, DNA from present-day humans involved in excavation and extraction will be sequenced along with the endogenous material. This type of contamination is problematic for downstream analyses because it introduces a bias towards the population of the contaminating individual(s). Quantifying the extent of contamination is therefore a crucial step, as it allows researchers to account for biases that may arise in downstream genetic analyses. Here, we present an MCMC algorithm to co-estimate the contamination rate, the sequencing error rate and demographic parameters (including drift times and admixture rates) for an ancient nuclear genome obtained from human remains, when the putative contaminating DNA comes from present-day humans. We assume we have a large panel representing the putative contaminant population (e.g. European, East Asian or African). The method is implemented in a C++ program called 'Demographic Inference with Contamination and Error' (DICE). We applied it to simulations and to genome data from ancient Neanderthals and modern humans. With reasonable levels of genome sequence coverage (>3X), we find we can recover accurate estimates of all these parameters, even when the contamination rate is as high as 50%.
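
To illustrate the general idea of co-estimating a contamination rate and an error rate by MCMC, here is a minimal Python sketch. It is not DICE and does not reproduce its model: it assumes the endogenous genotype at each site is known (DICE instead models it through demographic parameters), uses flat priors, and all names (`p_derived_read`, `log_likelihood`, `metropolis`) and the site tuple layout are hypothetical.

```python
import math
import random

def p_derived_read(genotype, c, e, w):
    """Probability that a single read shows the derived allele.

    genotype: derived-allele count in the endogenous genome (0, 1 or 2); assumed known here
    c: contamination rate, e: per-base error rate
    w: derived-allele frequency in the contaminant panel
    """
    endogenous = genotype / 2.0
    true_derived = c * w + (1.0 - c) * endogenous   # mixture of contaminant and endogenous reads
    return true_derived * (1.0 - e) + (1.0 - true_derived) * e  # an error flips the allele

def log_likelihood(sites, c, e):
    """Binomial log-likelihood over sites given as (genotype, w, derived_reads, total_reads)."""
    ll = 0.0
    for genotype, w, derived, total in sites:
        p = p_derived_read(genotype, c, e, w)
        p = min(max(p, 1e-12), 1.0 - 1e-12)         # guard against log(0)
        ll += derived * math.log(p) + (total - derived) * math.log(1.0 - p)
    return ll

def metropolis(sites, steps=2000, step_size=0.05, seed=0):
    """Random-walk Metropolis over (c, e) with flat priors on [0, 1] x [0, 0.5]."""
    rng = random.Random(seed)
    c, e = 0.1, 0.01                                 # arbitrary starting point
    current = log_likelihood(sites, c, e)
    samples = []
    for _ in range(steps):
        c_new = c + rng.gauss(0.0, step_size)
        e_new = e + rng.gauss(0.0, step_size / 10.0)
        # Proposals outside the prior support have zero posterior mass and are rejected.
        if 0.0 <= c_new <= 1.0 and 0.0 <= e_new <= 0.5:
            proposed = log_likelihood(sites, c_new, e_new)
            if rng.random() < math.exp(min(0.0, proposed - current)):
                c, e, current = c_new, e_new, proposed
        samples.append((c, e))
    return samples
```

Calling `metropolis(sites)` on a list of per-site tuples returns a chain of `(c, e)` pairs; discarding an initial burn-in and averaging the rest gives rough posterior means under these simplifying assumptions.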

SUBMITTER: Racimo F 

PROVIDER: S-EPMC4822957 | biostudies-literature |

REPOSITORIES: biostudies-literature

Similar Datasets

| S-EPMC5100944 | biostudies-literature
| S-EPMC7418405 | biostudies-literature
| S-EPMC4601135 | biostudies-literature
| S-EPMC6145933 | biostudies-literature
| S-EPMC4498232 | biostudies-literature
| S-EPMC5499161 | biostudies-literature
| S-EPMC7050530 | biostudies-literature
| S-EPMC6829038 | biostudies-literature
| S-EPMC3926038 | biostudies-literature
| S-EPMC3556025 | biostudies-literature