Dataset Information

Further improvements to linear mixed models for genome-wide association studies.

ABSTRACT: We examine improvements to the linear mixed model (LMM) that better correct for population structure and family relatedness in genome-wide association studies (GWAS). LMMs rely on the estimation of a genetic similarity matrix (GSM), which encodes the pairwise similarity between every two individuals in a cohort. These similarities are estimated from single nucleotide polymorphisms (SNPs) or other genetic variants. Traditionally, all available SNPs are used to estimate the GSM. In empirical studies across a wide range of synthetic and real data, we find that modifications to this approach improve GWAS performance as measured by type I error control and power. Specifically, when only population structure is present, a GSM constructed from SNPs that well predict the phenotype in combination with principal components as covariates controls type I error and yields more power than the traditional LMM. In any setting, with or without population structure or family relatedness, a GSM consisting of a mixture of two component GSMs, one constructed from all SNPs and another constructed from SNPs that well predict the phenotype again controls type I error and yields more power than the traditional LMM. Software implementing these improvements and the experimental comparisons are available at http://microsoft.com/science.
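The abstract's mixture idea can be illustrated with a minimal NumPy sketch. Everything here is an assumption made for illustration and is not taken from the authors' software: the GSM is computed as a standardized-genotype cross-product, SNPs are ranked by squared correlation with the phenotype, and the mixing weight is supplied by hand. The function names (`gsm`, `top_snps_by_correlation`, `mixed_gsm`) are hypothetical.

```python
import numpy as np

def standardize(snps):
    """Center each SNP column and scale it to unit variance (assumed convention)."""
    snps = np.asarray(snps, dtype=float)
    centered = snps - snps.mean(axis=0)
    std = snps.std(axis=0)
    std[std == 0] = 1.0  # monomorphic SNPs: avoid division by zero
    return centered / std

def gsm(snps):
    """Genetic similarity matrix: individuals x individuals, averaged over SNPs."""
    z = standardize(snps)
    return z @ z.T / z.shape[1]

def top_snps_by_correlation(snps, phenotype, k):
    """Indices of the k SNPs with the largest squared correlation score.

    A stand-in for "SNPs that well predict the phenotype"; the ranking
    criterion is an assumption of this sketch.
    """
    z = standardize(snps)
    y = np.asarray(phenotype, dtype=float)
    y = y - y.mean()
    score = (z.T @ y) ** 2
    return np.argsort(score)[::-1][:k]

def mixed_gsm(snps, phenotype, k, weight):
    """Convex mixture of the all-SNP GSM and the selected-SNP GSM.

    `weight` is a hypothetical hand-set mixing parameter in [0, 1].
    """
    snps = np.asarray(snps, dtype=float)
    selected = top_snps_by_correlation(snps, phenotype, k)
    return (1.0 - weight) * gsm(snps) + weight * gsm(snps[:, selected])

# Toy data: 50 individuals, 200 SNPs coded 0/1/2, phenotype driven by SNP 7.
rng = np.random.default_rng(0)
genotypes = rng.binomial(2, 0.3, size=(50, 200)).astype(float)
trait = 1.5 * genotypes[:, 7] + rng.standard_normal(50)
K = mixed_gsm(genotypes, trait, k=20, weight=0.5)
```

In a real analysis the number of selected SNPs and the mixing weight would have to be chosen from the data, for example by likelihood or cross-validation, rather than fixed as above.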

SUBMITTER: Widmer C

PROVIDER: S-EPMC4230738 | biostudies-literature | 2014 Nov

REPOSITORIES: biostudies-literature

Publications

Further improvements to linear mixed models for genome-wide association studies.

Christian Widmer, Christoph Lippert, Omer Weissbrod, Nicolo Fusi, Carl Kadie, Robert Davidson, Jennifer Listgarten, David Heckerman

Scientific Reports, 2014-11-12

PMID: 25387525

Similar Datasets

Trade-offs of Linear Mixed Models in Genome-Wide Association Studies. | S-EPMC8968846 | biostudies-literature

Federated generalized linear mixed models for collaborative genome-wide association studies. | S-EPMC10387571 | biostudies-literature

Methodological implementation of mixed linear models in multi-locus genome-wide association studies. | S-EPMC6054291 | biostudies-literature

Mixed linear model approach adapted for genome-wide association studies. | S-EPMC2931336 | biostudies-literature

BGWAS: Bayesian variable selection in linear mixed models with nonlocal priors for genome-wide association studies. | S-EPMC10176706 | biostudies-literature

Penalized multivariate linear mixed model for longitudinal genome-wide association studies. | S-EPMC4143695 | biostudies-literature

Efficient multivariate linear mixed model algorithms for genome-wide association studies. | S-EPMC4211878 | biostudies-literature

Regularized multi-trait multi-locus linear mixed models for genome-wide association studies and genomic selection in crops. | S-EPMC10604903 | biostudies-literature

Fast and flexible linear mixed models for genome-wide genetics. | S-EPMC6383949 | biostudies-literature

An efficient hierarchical generalized linear mixed model for pathway analysis of genome-wide association studies. | S-EPMC3042187 | biostudies-literature