An enhanced machine learning tool for cis-eQTL mapping with regularization and confounder adjustments.
ABSTRACT: Many expression quantitative trait loci (eQTL) studies have been conducted to investigate the biological effects of variants in gene regulation. However, these eQTL studies may suffer from low or moderate statistical power and overly conservative false-discovery rate control. In practice, most algorithms for eQTL identification do not model the joint effects of multiple genetic variants with weak or moderate influence. Here we present a novel machine-learning algorithm, the lasso least-squares kernel machine (LSKM-LASSO), which models the association between multiple genetic variants and phenotypic traits simultaneously in the presence of nongenetic and genetic confounding. With a more general and flexible framework for the estimation of genetic confounding, LSKM-LASSO is able to provide a more accurate evaluation of the joint effects of multiple genetic variants. Our simulations demonstrate that our approach outperforms three state-of-the-art alternatives in terms of eQTL identification and phenotype prediction. We then apply our method to genotype and gene expression data from 11 tissues obtained from the Genotype-Tissue Expression project. Our algorithm identified more genes with eQTLs than the other algorithms. By incorporating a regularization term and combining it with a least-squares kernel machine, LSKM-LASSO provides a powerful tool for eQTL mapping and phenotype prediction.
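To make the idea of combining a lasso penalty with a least-squares kernel machine concrete, here is a minimal sketch. It is not the authors' implementation. It assumes a partially linear model y = X·beta + h(Z) + noise, where X holds the candidate variants, Z holds the covariates used to capture confounding, and h lies in the function space of an RBF kernel. The alternating fit, the RBF kernel choice, and every name (`fit_sketch`, `lam_lasso`, `lam_kernel`, `gamma`) are assumptions made for illustration only.

```python
import numpy as np

def rbf_kernel(Z, gamma=1.0):
    # Pairwise squared distances between rows of Z, then the Gaussian kernel.
    sq = np.sum(Z ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (Z @ Z.T)
    return np.exp(-gamma * np.maximum(d2, 0.0))

def soft_threshold(v, t):
    # Proximal operator of the L1 penalty.
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def lasso_cd(X, r, lam, beta, n_sweeps=50):
    # Coordinate descent for 0.5 * ||r - X beta||^2 + lam * ||beta||_1.
    # beta is updated in place and also returned.
    col_sq = np.sum(X ** 2, axis=0)
    resid = r - X @ beta
    for _ in range(n_sweeps):
        for j in range(X.shape[1]):
            if col_sq[j] == 0.0:
                continue
            resid += X[:, j] * beta[j]          # remove coordinate j from the fit
            beta[j] = soft_threshold(X[:, j] @ resid, lam) / col_sq[j]
            resid -= X[:, j] * beta[j]          # add the updated coordinate back
    return beta

def fit_sketch(X, Z, y, lam_lasso, lam_kernel, gamma=1.0, n_outer=30):
    # Hypothetical alternating scheme (an assumption, not the published algorithm):
    #   1) kernel ridge step on the part of y not explained by the variants
    #   2) lasso step on the part of y not explained by the kernel term
    n = len(y)
    K = rbf_kernel(Z, gamma)
    A = K + lam_kernel * np.eye(n)
    beta = np.zeros(X.shape[1])
    alpha = np.zeros(n)
    for _ in range(n_outer):
        alpha = np.linalg.solve(A, y - X @ beta)
        beta = lasso_cd(X, y - K @ alpha, lam_lasso, beta)
    return beta, alpha
```

In this sketch, nonzero entries of `beta` would play the role of selected variants, while `K @ alpha` absorbs structured variation attributable to the covariates in Z. Both penalty strengths would need tuning, for example by cross-validation.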
SUBMITTER: Yan KK
PROVIDER: S-EPMC7875251 | biostudies-literature
REPOSITORIES: biostudies-literature