
Dataset Information


Selecting cases and controls for DNA sequencing studies using family histories of disease.


ABSTRACT: Recent improvements in sequencing technology have enabled the investigation of so-called missing heritability, and a large number of affected subjects have been sequenced in order to detect significant associations between human diseases and rare variants. However, the cost of genome sequencing is still high, and a statistically powerful strategy for selecting informative subjects would be useful. Therefore, in this report, we propose a new statistical method for selecting cases and controls for sequencing studies based on family history. We assume that disease status is determined by unobserved liability scores. Our method consists of two steps: first, the conditional means of liability are estimated with the liability threshold model given the individual's disease status and those of their relatives. Second, the informative subjects are selected with the estimated conditional means. Our simulation studies showed that statistical power is substantially affected by the subject selection strategy chosen, and power is maximized when affected (unaffected) subjects with high (low) risks are selected as cases (controls). The proposed method was successfully applied to genome-wide association studies for type 2 diabetes, and our analysis results reveal the practical value of the proposed methods. Copyright © 2017 John Wiley & Sons, Ltd.
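The two steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' estimator: the conditional mean is approximated here by plain Monte Carlo sampling under a liability threshold model, and the prevalence (0.1), heritability (0.5), family layout (subject, father, mother, one sibling), first-degree liability correlation of h2/2, and all function names are assumptions made for illustration only.

```python
from statistics import NormalDist

import numpy as np


def conditional_mean_liability(statuses, corr, prevalence, n_draws=200_000, seed=0):
    """Step 1 (sketch): Monte Carlo estimate of E[liability of subject | statuses].

    statuses: 0/1 per family member; index 0 is the subject being scored.
    corr: liability correlation matrix for the family (assumed known).
    """
    statuses = np.asarray(statuses, dtype=bool)
    rng = np.random.default_rng(seed)
    # Liability threshold model: affected iff standard-normal liability > threshold.
    threshold = NormalDist().inv_cdf(1.0 - prevalence)
    draws = rng.multivariate_normal(np.zeros(len(statuses)), corr, size=n_draws)
    # Keep only simulated families whose affection pattern matches the observed one.
    matches = np.all((draws > threshold) == statuses, axis=1)
    if not matches.any():
        raise ValueError("no simulated family matched; increase n_draws")
    return float(draws[matches, 0].mean())


def first_degree_corr(h2):
    """Hypothetical family: subject, father, mother, one sibling."""
    r = h2 / 2.0  # assumed liability correlation between first-degree relatives
    corr = np.full((4, 4), r)
    corr[1, 2] = corr[2, 1] = 0.0  # parents assumed unrelated
    np.fill_diagonal(corr, 1.0)
    return corr


def select_subjects(scores, affected, n_cases, n_controls):
    """Step 2 (sketch): cases are affected subjects with the highest scores,
    controls are unaffected subjects with the lowest scores."""
    affected = np.asarray(affected, dtype=bool)
    order = np.argsort(np.asarray(scores, dtype=float))  # ascending
    cases = [int(i) for i in order[::-1] if affected[i]][:n_cases]
    controls = [int(i) for i in order if not affected[i]][:n_controls]
    return cases, controls


# Four hypothetical families; each row is (subject, father, mother, sibling).
families = [[1, 1, 0, 0], [1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 0, 0]]
corr = first_degree_corr(h2=0.5)
scores = [conditional_mean_liability(f, corr, prevalence=0.1) for f in families]
cases, controls = select_subjects(scores, [f[0] for f in families], 1, 1)
print(scores, cases, controls)
```

In this toy setting the affected subject with an affected parent receives a higher estimated liability than the affected subject with no family history, and the unaffected subject with no affected relatives receives the lowest, which matches the selection rule stated in the abstract. Rare affection patterns need many draws to be matched, so a real analysis would likely call for a more efficient estimator than rejection sampling.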

SUBMITTER: Kim W 

PROVIDER: S-EPMC5810411 | biostudies-literature | 2017 Jun

REPOSITORIES: biostudies-literature


Publications

Selecting cases and controls for DNA sequencing studies using family histories of disease.

Wonji Kim, Dandi Qiao, Michael H. Cho, Soo Heon Kwak, Kyong Soo Park, Edwin K. Silverman, Pak Sham, Sungho Won

Statistics in Medicine, 2017-02-21, issue 13



Similar Datasets

S-EPMC8909853 | biostudies-literature
S-EPMC4035547 | biostudies-literature
S-EPMC5256917 | biostudies-literature
GSE152547 | GEO | 2020-10-31
GSE152548 | GEO | 2020-10-31
S-EPMC4866073 | biostudies-literature
S-EPMC6699738 | biostudies-literature
S-EPMC3371359 | biostudies-literature
S-EPMC27423 | biostudies-literature
S-EPMC4195845 | biostudies-literature