Dataset Information

Finding alternative expression quantitative trait loci by exploring sparse model space.


ABSTRACT: Sparse modeling, a feature selection method widely used in the machine-learning community, has recently been applied to identify associations in genetic studies, including expression quantitative trait locus (eQTL) mapping. These studies usually involve high-dimensional data in which the number of features is much larger than the number of samples. This high dimensionality means that multiple solutions can exist when optimizing a sparse model. In such situations, a single optimization result provides only an incomplete view of the data and lacks the power to find alternative features associated with the same trait. In this article, we propose a novel method for detecting alternative eQTLs, in which two genetic variants have alternative relationships regarding their associations with the expression of a particular gene. Our method accomplishes this goal by exploring multiple solutions sampled from the solution space. We justified our method theoretically and demonstrated its use on simulated data. We then applied it to a real eQTL dataset and identified a set of alternative eQTLs with potential biological insights. Additionally, these alternative eQTLs suggest a network view of gene regulation.
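The non-uniqueness the abstract describes can be illustrated with a small sketch. This is not the authors' method: it is a plain lasso fitted by coordinate descent on made-up data, where two hypothetical variants (`snp_a`, `snp_b`) are perfectly correlated. All names, the penalty `lam`, and the simulated genotypes are assumptions for illustration only. Changing the order in which coordinates are visited yields two different selected variants with the same objective value, which is why a single fit gives an incomplete view.

```python
import random

def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold (the lasso proximal step)."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0

def lasso_cd(X, y, lam, order, sweeps=200):
    """Minimise (1/2n)*||y - X b||^2 + lam*||b||_1 by cyclic coordinate descent.

    `order` is the sequence in which coordinates are visited in each sweep.
    """
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    residual = list(y)  # equals y - X @ beta, since beta starts at zero
    for _ in range(sweeps):
        for j in order:
            col = [X[i][j] for i in range(n)]
            z = sum(c * c for c in col) / n
            if z == 0.0:
                continue
            # Correlation of column j with the residual that excludes j itself.
            rho = sum(col[i] * residual[i] for i in range(n)) / n + z * beta[j]
            new_value = soft_threshold(rho, lam) / z
            delta = new_value - beta[j]
            if delta != 0.0:
                for i in range(n):
                    residual[i] -= col[i] * delta
                beta[j] = new_value
    return beta

def objective(X, y, beta, lam):
    """Lasso objective value for a given coefficient vector."""
    n = len(X)
    rss = 0.0
    for i in range(n):
        fitted = sum(X[i][j] * beta[j] for j in range(len(beta)))
        rss += (y[i] - fitted) ** 2
    return rss / (2 * n) + lam * sum(abs(b) for b in beta)

def support(beta, tol=1e-8):
    """Indices of coefficients that are non-zero up to a small tolerance."""
    return [j for j, b in enumerate(beta) if abs(b) > tol]

# Simulated genotypes coded 0/1/2 (hypothetical data, not from the paper).
random.seed(0)
n = 20
snp_a = [random.choice([0, 1, 2]) for _ in range(n)]
snp_b = list(snp_a)  # perfectly linked copy of snp_a
snp_c = [random.choice([0, 1, 2]) for _ in range(n)]
X = [[snp_a[i], snp_b[i], snp_c[i]] for i in range(n)]
y = [1.5 * snp_a[i] + random.gauss(0.0, 0.1) for i in range(n)]  # expression

lam = 0.1
beta_a_first = lasso_cd(X, y, lam, order=[0, 1, 2])
beta_b_first = lasso_cd(X, y, lam, order=[1, 0, 2])

print("visiting snp_a first selects:", support(beta_a_first))
print("visiting snp_b first selects:", support(beta_b_first))
print("objective gap:",
      abs(objective(X, y, beta_a_first, lam) - objective(X, y, beta_b_first, lam)))
```

Real variants are rarely identical copies of one another, so in practice the alternative solutions are near-equivalent rather than exactly tied; the duplicated column here simply makes the effect easy to see.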

SUBMITTER: Wang Z 

PROVIDER: S-EPMC4010169 | biostudies-literature | 2014 May

REPOSITORIES: biostudies-literature

Publications

Finding alternative expression quantitative trait loci by exploring sparse model space.

Wang Zhiyong, Xu Jinbo, Shi Xinghua

Journal of computational biology: a journal of computational molecular cell biology, 2014-04-01, 5



Similar Datasets

| S-EPMC2674843 | biostudies-literature
| S-EPMC6230954 | biostudies-literature
| S-EPMC3090625 | biostudies-literature
2010-06-25 | E-GEOD-7628 | biostudies-arrayexpress
| S-EPMC1893048 | biostudies-literature
2008-03-04 | GSE7628 | GEO
2021-03-04 | MSV000087000 | MassIVE
| S-EPMC7384761 | biostudies-literature
| S-EPMC1088296 | biostudies-literature
| S-EPMC5228698 | biostudies-literature