Dataset Information

Multiple-kernel learning for genomic data mining and prediction.


ABSTRACT:

Background

Advances in medical technology have allowed for customized prognosis, diagnosis, and treatment regimens that utilize multiple heterogeneous data sources. Multiple kernel learning (MKL) is well suited for the integration of multiple high-throughput data sources. MKL remains underutilized by genomic researchers, partly because of the lack of unified guidelines for its use and the lack of benchmark genomic datasets.

Results

We provide three implementations of MKL in R. These methods are applied to simulated data to illustrate that MKL can select appropriate models. We also apply MKL to combine clinical information with miRNA gene expression data from an ovarian cancer study in a single analysis. Lastly, we show that MKL can identify gene sets that are known to play a role in the prognostic prediction of 15 cancer types using gene expression data from The Cancer Genome Atlas, and that it can identify new gene sets for future research.

Conclusion

Multiple kernel learning coupled with modern optimization techniques provides a promising learning tool for building predictive models based on multi-source genomic data. MKL also provides an automated scheme for kernel prioritization and parameter tuning. The methods used in the paper are implemented in an R package called RMKL, which is freely available for download through CRAN at https://CRAN.R-project.org/package=RMKL.
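As a rough illustration of the idea of combining one kernel per data source and letting the data decide how much each kernel matters, the following Python sketch may help. It is not the RMKL package's code and does not reproduce any of the three implementations described in the paper. It swaps in a much simpler heuristic, centered kernel-target alignment, to weight the kernels. All function names (`rbf_kernel`, `alignment_weights`, `combine_kernels`), the simulated "informative" and "noise" sources, and the parameter values are assumptions made for illustration only.

```python
import numpy as np


def linear_kernel(X):
    """Linear kernel matrix for the rows of X."""
    return X @ X.T


def rbf_kernel(X, gamma):
    """Gaussian (RBF) kernel matrix for the rows of X."""
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)
    return np.exp(-gamma * np.maximum(d2, 0.0))


def center(K):
    """Center a kernel matrix in feature space."""
    n = K.shape[0]
    H = np.eye(n) - np.ones((n, n)) / n
    return H @ K @ H


def alignment(K, y):
    """Centered alignment between K and the label kernel y y^T."""
    Kc = center(K)
    Yc = center(np.outer(y, y))
    return np.sum(Kc * Yc) / (np.linalg.norm(Kc) * np.linalg.norm(Yc))


def alignment_weights(kernels, y):
    """Heuristic kernel weights: clipped alignments, normalized to sum to 1.

    This is a stand-in for a real MKL solver, which would optimize the
    weights jointly with the classifier.
    """
    a = np.array([max(alignment(K, y), 0.0) for K in kernels])
    if a.sum() == 0.0:
        # No kernel aligns with the labels: fall back to uniform weights.
        return np.full(len(kernels), 1.0 / len(kernels))
    return a / a.sum()


def combine_kernels(kernels, weights):
    """Non-negative weighted sum of kernels, which is again a valid kernel."""
    combined = np.zeros_like(kernels[0], dtype=float)
    for w, K in zip(weights, kernels):
        combined += w * K
    return combined


# Simulated example (hypothetical data): one informative source, one noise source.
rng = np.random.default_rng(0)
y = np.repeat([-1.0, 1.0], 30)
X_informative = 1.5 * y[:, None] + rng.normal(size=(60, 5))
X_noise = rng.normal(size=(60, 5))

kernels = [linear_kernel(X_informative), rbf_kernel(X_noise, gamma=0.2)]
weights = alignment_weights(kernels, y)
K_combined = combine_kernels(kernels, weights)
print("kernel weights:", weights)
```

The combined matrix `K_combined` could then be passed to any kernel method that accepts a precomputed kernel. A larger weight on a kernel indicates that the corresponding data source, as seen through that kernel, agrees more closely with the labels, which is the sense in which kernel weights can be read as a prioritization of sources.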

SUBMITTER: Wilson CM 

PROVIDER: S-EPMC6694479 | biostudies-literature | 2019 Aug

REPOSITORIES: biostudies-literature

Publications

Multiple-kernel learning for genomic data mining and prediction.

Wilson CM, Li K, Yu X, Kuan PF, Wang X

BMC Bioinformatics, 2019 Aug 15, issue 1


Similar Datasets

| S-EPMC9235505 | biostudies-literature
| S-EPMC5994931 | biostudies-literature
| S-EPMC6737184 | biostudies-literature
| S-EPMC8561914 | biostudies-literature
| S-EPMC6471546 | biostudies-literature
| S-EPMC2906488 | biostudies-literature
| S-EPMC6401099 | biostudies-literature
| S-EPMC5727873 | biostudies-literature
| S-EPMC9310626 | biostudies-literature
| S-EPMC4713389 | biostudies-literature