
Dataset Information


Further improvements to linear mixed models for genome-wide association studies.


ABSTRACT: We examine improvements to the linear mixed model (LMM) that better correct for population structure and family relatedness in genome-wide association studies (GWAS). LMMs rely on the estimation of a genetic similarity matrix (GSM), which encodes the pairwise similarity between every two individuals in a cohort. These similarities are estimated from single nucleotide polymorphisms (SNPs) or other genetic variants. Traditionally, all available SNPs are used to estimate the GSM. In empirical studies across a wide range of synthetic and real data, we find that modifications to this approach improve GWAS performance as measured by type I error control and power. Specifically, when only population structure is present, a GSM constructed from SNPs that predict the phenotype well, used in combination with principal components as covariates, controls type I error and yields more power than the traditional LMM. In any setting, with or without population structure or family relatedness, a GSM consisting of a mixture of two component GSMs, one constructed from all SNPs and the other constructed from SNPs that predict the phenotype well, again controls type I error and yields more power than the traditional LMM. Software implementing these improvements and the experimental comparisons are available at http://microsoft.com/science.
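
The two GSM constructions described in the abstract can be illustrated with a short NumPy sketch. This is not the authors' software. The function names (`gsm`, `top_snps`, `mixed_gsm`), the ranking of SNPs by absolute correlation with the phenotype, and the fixed mixing weight are all assumptions made for illustration; the abstract does not specify how the predictive SNPs are chosen or how the mixture is weighted.

```python
import numpy as np


def gsm(snps):
    """Genetic similarity matrix from an (individuals x SNPs) genotype array.

    Each SNP column is centered and scaled to unit variance; constant
    columns carry no information and are dropped (an assumption here).
    """
    snps = np.asarray(snps, dtype=float)
    std = snps.std(axis=0)
    keep = std > 0
    z = (snps[:, keep] - snps[:, keep].mean(axis=0)) / std[keep]
    return z @ z.T / z.shape[1]


def top_snps(snps, phenotype, k):
    """Hypothetical selection rule: indices of the k SNPs with the largest
    absolute Pearson correlation with the phenotype."""
    snps = np.asarray(snps, dtype=float)
    y = np.asarray(phenotype, dtype=float)
    xc = snps - snps.mean(axis=0)
    yc = y - y.mean()
    denom = np.sqrt((xc ** 2).sum(axis=0) * (yc ** 2).sum())
    score = np.zeros(snps.shape[1])
    ok = denom > 0
    score[ok] = np.abs(xc[:, ok].T @ yc) / denom[ok]
    return np.argsort(score)[::-1][:k]


def mixed_gsm(snps, selected, weight):
    """Mixture of two component GSMs: one from all SNPs, one from the
    selected SNPs. `weight` (between 0 and 1) is an assumed fixed value."""
    snps = np.asarray(snps, dtype=float)
    return (1.0 - weight) * gsm(snps) + weight * gsm(snps[:, selected])
```

Given a genotype array `snps` and a phenotype vector `y`, a call such as `mixed_gsm(snps, top_snps(snps, y, 100), 0.5)` would return an individuals-by-individuals matrix. In practice the number of selected SNPs and the mixing weight would need to be tuned rather than fixed by hand, and selecting SNPs on the same data used for testing requires care to avoid bias.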

SUBMITTER: Widmer C 

PROVIDER: S-EPMC4230738 | biostudies-literature | 2014

REPOSITORIES: biostudies-literature


Publications

Further improvements to linear mixed models for genome-wide association studies.

Christian Widmer, Christoph Lippert, Omer Weissbrod, Nicolo Fusi, Carl Kadie, Robert Davidson, Jennifer Listgarten, David Heckerman

Scientific Reports, 2014-11-12



Similar Datasets

| S-EPMC10387571 | biostudies-literature
| S-EPMC8968846 | biostudies-literature
| S-EPMC6054291 | biostudies-literature
| S-EPMC2931336 | biostudies-literature
| S-EPMC10176706 | biostudies-literature
| S-EPMC4143695 | biostudies-literature
| S-EPMC4211878 | biostudies-literature
| S-EPMC6383949 | biostudies-literature
| S-EPMC3042187 | biostudies-literature
| S-EPMC7684894 | biostudies-literature