Dataset Information

Tilting the lasso by knowledge-based post-processing.


ABSTRACT: It is useful to incorporate biological knowledge on the role of genetic determinants in predicting an outcome. It is, however, not always feasible to fully elicit this information when the number of determinants is large. We present an approach to overcome this difficulty. First, using half of the available data, a shortlist of potentially interesting determinants is generated. Second, binary indications of biological importance are elicited for this much smaller number of determinants. Third, an analysis is carried out on this shortlist using the second half of the data. We show through simulations that, compared with adaptive lasso, this approach leads to models containing more biologically relevant variables, while the prediction mean squared error (PMSE) is comparable or even reduced. We also apply our approach to bone mineral density data, and again final models contain more biologically relevant variables and have reduced PMSEs. Our method yields comparable or improved predictive performance, and models with greater face validity and interpretability, while keeping the incorporation of biological knowledge into predictive models feasible.
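
The three steps in the abstract can be sketched roughly as follows. This is an illustrative sketch only, not the authors' implementation: the abstract does not say which analysis is run on the second half, so the sketch assumes a weighted lasso in which determinants flagged as biologically important receive a smaller penalty. The function names (`weighted_lasso`, `split_sample_fit`), the `elicit` callback standing in for expert input, and the parameters `lam1`, `lam2` and `down_weight` are all hypothetical.

```python
import numpy as np


def soft_threshold(z, t):
    """Soft-thresholding operator used by lasso coordinate descent."""
    return np.sign(z) * max(abs(z) - t, 0.0)


def weighted_lasso(X, y, lam, weights, n_iter=200):
    """Minimise (1/2n)||y - Xb||^2 + lam * sum_j weights[j] * |b_j|
    by cyclic coordinate descent (no intercept; centre the data first)."""
    X = np.asarray(X, dtype=float)
    n, p = X.shape
    beta = np.zeros(p)
    col_sq = (X ** 2).sum(axis=0) / n
    resid = np.array(y, dtype=float)  # residual y - X @ beta for beta = 0
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            # Correlation of column j with the partial residual.
            rho = X[:, j] @ resid / n + col_sq[j] * beta[j]
            new = soft_threshold(rho, lam * weights[j]) / col_sq[j]
            resid += X[:, j] * (beta[j] - new)
            beta[j] = new
    return beta


def split_sample_fit(X, y, elicit, lam1, lam2, down_weight=0.5, seed=0):
    """Hypothetical three-step procedure following the abstract."""
    rng = np.random.default_rng(seed)
    n, p = X.shape
    perm = rng.permutation(n)
    first, second = perm[: n // 2], perm[n // 2:]

    # Step 1: ordinary lasso on the first half produces the shortlist.
    beta1 = weighted_lasso(X[first], y[first], lam1, np.ones(p))
    shortlist = np.flatnonzero(beta1)

    # Step 2: binary importance flags are elicited for the shortlist only.
    flags = np.asarray(elicit(shortlist), dtype=bool)

    # Step 3 (assumption): refit on the second half, penalising flagged
    # determinants less so they are more likely to stay in the model.
    weights = np.where(flags, down_weight, 1.0)
    beta2 = weighted_lasso(X[second][:, shortlist], y[second], lam2, weights)
    return shortlist, flags, beta2


if __name__ == "__main__":
    # Simulated data: only the first three determinants carry signal.
    rng = np.random.default_rng(1)
    X = rng.standard_normal((200, 30))
    y = X[:, :3] @ np.array([3.0, -2.0, 1.5]) + rng.standard_normal(200)
    expert = lambda idx: np.isin(idx, [0, 1, 2])  # stand-in for expert input
    shortlist, flags, beta2 = split_sample_fit(X, y, expert, 0.2, 0.2)
    print("shortlist:", shortlist)
    print("flagged:", shortlist[flags])
    print("second-stage coefficients:", np.round(beta2, 2))
```

Using disjoint halves for shortlisting and for the final fit means the elicited flags are applied to data that played no part in choosing the shortlist. The penalty values here are arbitrary and would normally be chosen by cross-validation.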

SUBMITTER: Tharmaratnam K 

PROVIDER: S-EPMC5010709 | biostudies-literature | 2016 Sep

REPOSITORIES: biostudies-literature

Publications

Tilting the lasso by knowledge-based post-processing.

Kukatharmini Tharmaratnam, Matthew Sperrin, Thomas Jaki, Sjur Reppe, Arnoldo Frigessi

BMC Bioinformatics 2016-09-02 1


Similar Datasets

| S-EPMC5303311 | biostudies-literature
| S-EPMC8574949 | biostudies-literature
| S-EPMC3685865 | biostudies-literature
| S-EPMC8745299 | biostudies-literature
| S-EPMC5245167 | biostudies-literature
| S-EPMC4390489 | biostudies-literature
2020-11-30 | GSE158375 | GEO
| S-EPMC5745070 | biostudies-literature
| S-EPMC6393391 | biostudies-literature
| S-EPMC7063084 | biostudies-literature