
Dataset Information

Genome-wide association studies using binned genotypes.


ABSTRACT: Linear mixed models (LMM) that test trait association one marker at a time have been the most popular methods for genome-wide association studies. However, this approach has potential pitfalls: overconservativeness after Bonferroni correction, neglect of linkage disequilibrium (LD) between neighboring markers, and power reduction due to overfitting SNP effects. Multiple-locus models that can simultaneously estimate and test all markers in the genome are therefore more appropriate. Based on the multiple-locus models, we proposed a bin model that combines markers into bins based on their LD relationships. A bin is treated as a new synthetic marker, and we detect the associations between bins and traits. Since the number of bins can be substantially smaller than the number of markers, a penalized multiple regression method can be adopted by fitting all bins in a single model. We developed an innovative method to bin the neighboring markers and used the least absolute shrinkage and selection operator (LASSO) method. We compared BIN-Lasso with SNP-Lasso and Q + K-LMM in a simulation experiment, and showed that the new method is more powerful and has a lower Type I error rate than the other two methods. We also applied the bin model to a Chinese Simmental beef cattle population in a bone weight association study. The new method identified more significant associations than the classical LMM. The bin model is a new dimension reduction technique that takes advantage of biological information (i.e., LD). The new method will be a significant breakthrough in associative genomics in the big data era.
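The abstract does not spell out how bins are formed, so the following is only a minimal sketch of the general idea under stated assumptions, not the authors' method. It assumes a greedy rule that merges adjacent markers while their pairwise r² stays at or above a hypothetical threshold, represents each bin by the mean genotype of its markers, and fits all bins jointly with a plain coordinate-descent LASSO. The function names (`bin_markers`, `bin_genotypes`, `lasso`), the threshold, and the penalty value are all illustrative choices.

```python
from statistics import mean


def center(values):
    """Subtract the mean so that no intercept term is needed."""
    m = mean(values)
    return [v - m for v in values]


def r2(x, y):
    """Squared Pearson correlation, used here as a simple LD measure."""
    xc, yc = center(x), center(y)
    sxy = sum(a * b for a, b in zip(xc, yc))
    sxx = sum(a * a for a in xc)
    syy = sum(b * b for b in yc)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy * sxy / (sxx * syy)


def bin_markers(markers, threshold=0.5):
    """Assumed rule: a marker joins the current bin if its r2 with the
    previous marker is at least `threshold`; otherwise it starts a new bin."""
    bins = [[0]]
    for j in range(1, len(markers)):
        if r2(markers[j - 1], markers[j]) >= threshold:
            bins[-1].append(j)
        else:
            bins.append([j])
    return bins


def bin_genotypes(markers, bins):
    """Assumed synthetic marker: per-individual mean genotype within a bin."""
    n = len(markers[0])
    return [[mean(markers[j][i] for j in b) for i in range(n)] for b in bins]


def soft_threshold(z, lam):
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0


def lasso(columns, y, lam, n_iter=200):
    """Coordinate descent for (1/(2n)) * ||y - X b||^2 + lam * ||b||_1."""
    n = len(y)
    cols = [center(c) for c in columns]
    resid = center(y)
    beta = [0.0] * len(cols)
    for _ in range(n_iter):
        for k, c in enumerate(cols):
            ss = sum(v * v for v in c) / n
            if ss == 0:
                continue
            # Correlation with the partial residual (current effect added back).
            rho = sum(ci * ri for ci, ri in zip(c, resid)) / n + ss * beta[k]
            new = soft_threshold(rho, lam) / ss
            delta = new - beta[k]
            if delta != 0.0:
                resid = [ri - delta * ci for ri, ci in zip(resid, c)]
                beta[k] = new
    return beta


# Toy data: rows are markers, columns are 8 individuals coded 0/1/2.
markers = [
    [0, 1, 2, 0, 1, 2, 0, 1],
    [0, 1, 2, 0, 1, 2, 0, 2],
    [2, 0, 1, 1, 0, 2, 1, 0],
    [2, 0, 1, 1, 0, 2, 1, 0],
]
bins = bin_markers(markers, threshold=0.5)
bin_cols = bin_genotypes(markers, bins)
trait = [2.0 * g for g in bin_cols[0]]  # toy trait driven by the first bin only
beta = lasso(bin_cols, trait, lam=0.1)
print(bins, beta)
```

In this toy example the four markers collapse into two bins, and the penalty shrinks the effect of the bin that does not drive the trait. A real analysis would use a dedicated LD estimator and a tuned penalty (for example by cross-validation) rather than the fixed values assumed here.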

SUBMITTER: An B 

PROVIDER: S-EPMC6972794 | biostudies-literature | 2020 Feb

REPOSITORIES: biostudies-literature

Similar Datasets

S-EPMC2964405 | biostudies-literature
S-EPMC4143727 | biostudies-literature
S-EPMC8072158 | biostudies-literature
S-EPMC5007749 | biostudies-other
S-EPMC5743780 | biostudies-literature
S-EPMC2908056 | biostudies-literature
S-EPMC7261120 | biostudies-literature
S-EPMC6239891 | biostudies-literature
S-EPMC4420238 | biostudies-literature
S-EPMC4341076 | biostudies-literature