Dataset Information

Analysis in case-control sequencing association studies with different sequencing depths.

ABSTRACT: With the advent of next-generation sequencing, investigators have access to higher-quality sequencing data. However, sequencing all samples in a study using next-generation sequencing can still be prohibitively expensive. One potential remedy is to combine next-generation sequencing data from cases with publicly available sequencing data for controls, but there could be systematic differences in the quality of the sequenced data, such as sequencing depth, between the sequenced study cases and the publicly available controls. We propose a regression calibration (RC)-based method and a maximum-likelihood method for conducting an association study with such a combined sample by accounting for differential sequencing errors between cases and controls. The methods allow for adjusting for confounding covariates, such as population stratification. Both methods control type I error and, when sequencing depths are sufficiently high but differ between cases and controls, have power comparable to analysis conducted using the true genotypes. We show that, under certain circumstances, the RC method allows for analysis using the naive variance estimate (which closely approximates the true variance in practice) and standard software. We evaluate the performance of the proposed methods using simulation studies and apply our methods to a combined data set of exome-sequenced acute lung injury cases and healthy controls from the 1000 Genomes Project.
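
As a rough illustration of the regression calibration idea mentioned in the abstract, the sketch below replaces an unobserved genotype with its expected value given the observed reads. It is not the authors' implementation: the binomial read model, the Hardy-Weinberg prior, and the names `genotype_posterior`, `expected_genotype`, `error_rate`, and `alt_freq` are all assumptions made for this example.

```python
from math import comb


def genotype_posterior(alt_reads, depth, error_rate=0.01, alt_freq=0.1):
    """Posterior P(G = g | reads) for g = 0, 1, 2 copies of the alternate allele.

    Assumed model (hypothetical, for illustration only): each read shows the
    alternate allele independently, with a probability that depends on the
    genotype and on a per-read error rate.
    """
    # Probability that a single read shows the alternate allele, per genotype.
    p_alt = [error_rate, 0.5, 1.0 - error_rate]
    # Hardy-Weinberg prior on the genotype (an assumption of this sketch).
    q = alt_freq
    prior = [(1 - q) ** 2, 2 * q * (1 - q), q ** 2]
    # Binomial likelihood of the observed read counts under each genotype.
    lik = [
        comb(depth, alt_reads) * p ** alt_reads * (1 - p) ** (depth - alt_reads)
        for p in p_alt
    ]
    joint = [l * pr for l, pr in zip(lik, prior)]
    total = sum(joint)
    return [j / total for j in joint]


def expected_genotype(alt_reads, depth, **model_params):
    """Calibrated genotype E[G | reads], usable as a regression covariate."""
    post = genotype_posterior(alt_reads, depth, **model_params)
    return post[1] + 2 * post[2]


# Balanced reads at depth 20 point to a heterozygote; with no reads at all,
# the calibrated value falls back to the prior mean 2 * alt_freq.
het = expected_genotype(10, 20)
no_data = expected_genotype(0, 0, alt_freq=0.1)
```

In a sketch like this, the calibrated value would stand in for the true genotype in an ordinary logistic regression of case status, so that lower-depth samples contribute values shrunk toward the prior mean rather than hard genotype calls.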

SUBMITTER: Chen S

PROVIDER: S-EPMC7308042 | biostudies-literature | 2020 Jul

REPOSITORIES: biostudies-literature

Publications

Analysis in case-control sequencing association studies with different sequencing depths.

Chen Sixing, Lin Xihong

Biostatistics (Oxford, England), 2020 Jul 1; issue 3

PMID: 30590456

Similar Datasets

Analysis and optimal design for association studies using next-generation sequencing with case-control pools.
| S-EPMC4139478 | biostudies-literature

Unified Analysis of Secondary Traits in Case-Control Association Studies.
| S-EPMC3881430 | biostudies-literature

Analysis of case-control association studies with known risk variants.
| S-EPMC3381970 | biostudies-literature

Robust analysis of secondary phenotypes in case-control genetic association studies.
| S-EPMC5870885 | biostudies-literature

Powerful SNP-set analysis for case-control genome-wide association studies.
| S-EPMC3032061 | biostudies-literature

Genetic model selection in two-phase analysis for case-control association studies.
| S-EPMC3294316 | biostudies-literature

Association between TP73 G4C14-A4T14 polymorphism and different cancer types: an updated meta-analysis of 55 case-control studies.
| S-EPMC9623385 | biostudies-literature

Marker selection for genetic case-control association studies.
| S-EPMC3025519 | biostudies-literature

Association between Myocardial Infarction and Periodontitis: A Meta-Analysis of Case-Control Studies.
| S-EPMC5095113 | biostudies-literature

Association between adipokines and thyroid carcinoma: a meta-analysis of case-control studies.
| S-EPMC7441682 | biostudies-literature