eQTL analysis of many thousands of expressed genes while simultaneously controlling for hidden factors
ABSTRACT: Motivation: Identification of eQTL, the genetic loci that contribute to heritable variation in gene expression, can be obstructed by factors that produce variation in expression profiles when those factors are unmeasured or hidden from direct analysis. Methods: We have developed a method for Hidden Expression Factor analysis (HEFT) that identifies individual and pleiotropic effects of eQTL in the presence of hidden factors. The HEFT model accounts for the effects of genotypes while simultaneously learning hidden factors: it uses the complete likelihood of a unified multivariate regression and factor analysis model to derive a ridge estimator for combined factor learning and eQTL detection. HEFT requires no pre-estimation of hidden factor effects and no iterative model selection, provides p-values, and is fast: an eQTL analysis of thousands of expression variables against hundreds of thousands of SNPs completes in a few hours on a standard 8-core 2.6 GHz desktop. Results: By analyzing simulated data, we demonstrate that HEFT can correct for an unknown number of hidden factors and outperforms related hidden factor methods for eQTL analysis; the improvement is particularly evident in the detection of eQTL with multivariate effects. To demonstrate a real-world application, we applied HEFT to identify eQTL affecting gene expression in human lung tissue in a study that included presumptive hidden factors. The analysis identified a number of eQTL with direct relevance to lung disease that could not be found without a hidden factor analysis, including cis-eQTL for GTF2H1 and MTRR, genes that have been independently associated with lung cancer.
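The following is a minimal illustrative sketch of why unmodelled hidden factors obscure eQTL signals. It is not the HEFT estimator, whose derivation is not given in this record. It assumes a simulated data set with one SNP, three hidden factors and 50 genes, and swaps in principal components of the expression matrix as stand-in factor estimates, which are then fitted alongside the genotype with a plain ridge regression. All function names, sample sizes, effect sizes and the penalty value are hypothetical choices made for the example.

```python
import numpy as np


def simulate_expression(n_samples=200, n_genes=50, n_factors=3, seed=0):
    """Simulate one SNP, hidden factors and expression (all values hypothetical)."""
    rng = np.random.default_rng(seed)
    genotype = rng.binomial(2, 0.3, size=n_samples).astype(float)  # 0/1/2 allele counts
    hidden = rng.normal(size=(n_samples, n_factors))               # unmeasured factors
    loadings = 2.0 * rng.normal(size=(n_factors, n_genes))         # factor effects on genes
    effect = np.zeros(n_genes)
    effect[0] = 1.0                                                # SNP is an eQTL for gene 0 only
    noise = rng.normal(size=(n_samples, n_genes))
    expression = np.outer(genotype, effect) + hidden @ loadings + noise
    return genotype, expression


def ridge_coefficients(design, response, penalty):
    """Generic ridge solution (D'D + penalty * I)^-1 D'Y for all genes at once."""
    gram = design.T @ design + penalty * np.eye(design.shape[1])
    return np.linalg.solve(gram, design.T @ response)


def residual_ss(design, response, penalty=1.0):
    """Per-gene residual sum of squares after a ridge fit."""
    coef = ridge_coefficients(design, response, penalty)
    resid = response - design @ coef
    return (resid ** 2).sum(axis=0)


genotype, expression = simulate_expression()
snp = (genotype - genotype.mean())[:, None]
centered = expression - expression.mean(axis=0)

# Stand-in factor estimates (an assumption, not HEFT): top principal
# component scores of the centered expression matrix.
u, s, _ = np.linalg.svd(centered, full_matrices=False)
factors = u[:, :3] * s[:3]

rss_snp_only = residual_ss(snp, centered)
rss_with_factors = residual_ss(np.hstack([snp, factors]), centered)

print("gene 0 residual SS, SNP only:      ", round(float(rss_snp_only[0]), 1))
print("gene 0 residual SS, SNP + factors: ", round(float(rss_with_factors[0]), 1))
```

In this toy setting the residual variation left around the SNP effect shrinks once the factor estimates enter the model, which is what makes a true eQTL easier to detect. Unlike this two-stage stand-in, the abstract describes HEFT as learning the factors and genotype effects jointly, without pre-estimating the factors.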
ORGANISM(S): Homo sapiens
PROVIDER: GSE40364 | GEO | 2014/01/07
SECONDARY ACCESSION(S): PRJNA173725
REPOSITORIES: GEO