Dataset Information

Combining sequence data from multiple studies: Impact of analysis strategies on rare variant calling and association results.


ABSTRACT: Individual sequencing studies often have limited sample sizes and so limited power to detect trait associations with rare variants. A common strategy is to aggregate data from multiple studies. For studying rare variants, jointly calling all samples together is the gold standard strategy but can be difficult to implement due to privacy restrictions and computational burden. Here, we compare joint calling to the alternative of single-study calling in terms of variant detection sensitivity and genotype accuracy as a function of sequencing coverage and assess their impact on downstream association analysis. To do so, we analyze deep-coverage (~82×) exome and low-coverage (~5×) genome sequence data on 2,250 individuals from the Genetics of Type 2 Diabetes study jointly and separately within five geographic cohorts. For rare single nucleotide variants (SNVs): (a) ≥97% of discovered SNVs are found by both calling strategies; (b) nonreference concordance with a set of highly accurate genotypes is ≥99% for both calling strategies; (c) meta-analysis has similar power to joint analysis in deep-coverage sequence data but can be less powerful in low-coverage sequence data. Given similar data processing and quality control steps, we recommend single-study calling as a viable alternative to joint calling for analyzing SNVs of all minor allele frequency in deep-coverage data.

SUBMITTER: Chen Z 

PROVIDER: S-EPMC7231418 | biostudies-literature | 2020 Jan

REPOSITORIES: biostudies-literature

Publications

Combining sequence data from multiple studies: Impact of analysis strategies on rare variant calling and association results.

Zhongsheng Chen, Michael Boehnke, Christian Fuchsberger

Genetic Epidemiology, 2019-09-14, issue 1


Similar Datasets

| S-EPMC3449077 | biostudies-literature
| S-EPMC7205561 | biostudies-literature
| S-EPMC4836983 | biostudies-literature
| S-EPMC2978957 | biostudies-literature
| S-EPMC4937198 | biostudies-other
| S-EPMC2997372 | biostudies-literature
| S-EPMC10841766 | biostudies-literature
| S-EPMC3530907 | biostudies-literature
| S-EPMC8509018 | biostudies-literature
| S-EPMC6283567 | biostudies-other