Ontology highlight
ABSTRACT: Motivation
In order to minimize the effects of genetic confounding on the analysis of high-throughput genetic association studies, e.g. (whole-genome) sequencing (WGS) studies, genome-wide association studies (GWAS), etc., we propose a general framework to assess and to test formally for genetic heterogeneity among study subjects. As the approach fully utilizes the recent ancestor information captured by rare variants, it is especially powerful in WGS studies. Even for relatively moderate sample sizes, the proposed testing framework is able to identify study subjects that are genetically too similar, e.g. cryptic relationships, or that are genetically too different, e.g. population substructure. The approach is computationally fast, enabling the application to whole-genome sequencing data, and straightforward to implement.Results
Simulation studies illustrate the overall performance of our approach. In an application to the 1000 Genomes Project, we outline an analysis/cleaning pipeline that utilizes our approach to formally assess whether study subjects are related and whether population substructure is present. In the analysis of the 1000 Genomes Project data, our approach revealed subjects that are most likely related, but had previously passed standard qc-filters.Availability and implementation
An implementation of our method, Similarity Test for Estimating Genetic Outliers (STEGO), is available in the R package stego from Github at https://github.com/dschlauch/stego .Contact
dschlauch@fas.harvard.edu.Supplementary information
Supplementary data are available at Bioinformatics online.
SUBMITTER: Schlauch D
PROVIDER: S-EPMC5870703 | biostudies-literature | 2017 Jul
REPOSITORIES: biostudies-literature
Schlauch Daniel D Fier Heide H Lange Christoph C
Bioinformatics (Oxford, England) 20170701 13
<h4>Motivation</h4>In order to minimize the effects of genetic confounding on the analysis of high-throughput genetic association studies, e.g. (whole-genome) sequencing (WGS) studies, genome-wide association studies (GWAS), etc., we propose a general framework to assess and to test formally for genetic heterogeneity among study subjects. As the approach fully utilizes the recent ancestor information captured by rare variants, it is especially powerful in WGS studies. Even for relatively moderat ...[more]