Dataset Information

Bayesian integrative model for multi-omics data with missingness.


ABSTRACT:

Motivation

Integrative analysis of multi-omics data from different high-throughput experimental platforms provides valuable insight into regulatory mechanisms associated with complex diseases, and gains statistical power to detect markers that are otherwise overlooked by single-platform omics analysis. In practice, a significant portion of samples may not be measured completely because of insufficient tissue or a restricted budget (e.g. gene expression profiles are measured but methylation is not). Current multi-omics integrative methods require complete data. A common practice is to ignore samples with any missing platform and perform complete-case analysis, which leads to a substantial loss of statistical power.

Methods

In this article, inspired by the popular Integrative Bayesian Analysis of Genomics data (iBAG), we propose a full Bayesian model that allows incorporation of samples with missing omics data.

Results

Simulation results show that the new full Bayesian approach improves outcome prediction accuracy and feature selection performance when the sample size is limited and the proportion of missingness is large. When the sample size is large or the proportion of missingness is low, incorporating samples with missingness may introduce extra inference uncertainty and yield worse prediction and feature selection performance. To determine whether and how to incorporate samples with missingness, we propose a self-learning cross-validation (CV) decision scheme. Simulations and a real application to a childhood asthma dataset demonstrate superior performance of the CV decision scheme across the various types of missingness mechanisms evaluated.
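The abstract does not spell out how the CV decision scheme works, so the following is only a generic sketch of the idea: use cross-validated prediction error to decide between complete-case analysis and incorporating incomplete samples. Everything in it is an assumption made for illustration and is not the authors' FBM implementation: ridge regression stands in for the Bayesian model, column-mean imputation stands in for Bayesian handling of missing values, and the names `ridge_fit` and `cv_choose_strategy` are hypothetical.

```python
import numpy as np


def ridge_fit(X, y, lam=1.0):
    """Closed-form ridge regression (assumed stand-in for the Bayesian model)."""
    p = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ y)


def cv_choose_strategy(X, y, n_folds=5, seed=0):
    """Choose 'complete_case' or 'incorporate' by cross-validated MSE.

    X is an (n, p) matrix in which np.nan marks unmeasured values.
    Test folds are drawn from complete samples only, so both strategies
    are scored on the same fully observed held-out data.
    """
    rng = np.random.default_rng(seed)
    complete = ~np.isnan(X).any(axis=1)
    complete_idx = rng.permutation(np.flatnonzero(complete))
    incomplete_idx = np.flatnonzero(~complete)
    folds = np.array_split(complete_idx, n_folds)

    errors = {"complete_case": [], "incorporate": []}
    for k in range(n_folds):
        test = folds[k]
        train_cc = np.concatenate([folds[j] for j in range(n_folds) if j != k])

        # Strategy 1: train on complete samples only.
        beta_cc = ridge_fit(X[train_cc], y[train_cc])
        errors["complete_case"].append(np.mean((y[test] - X[test] @ beta_cc) ** 2))

        # Strategy 2: add incomplete samples, filling gaps with column means
        # (a crude assumption; the paper uses a full Bayesian model instead).
        train_all = np.concatenate([train_cc, incomplete_idx])
        X_all = X[train_all].copy()
        col_means = np.nanmean(X_all, axis=0)
        nan_rows, nan_cols = np.where(np.isnan(X_all))
        X_all[nan_rows, nan_cols] = col_means[nan_cols]
        beta_inc = ridge_fit(X_all, y[train_all])
        errors["incorporate"].append(np.mean((y[test] - X[test] @ beta_inc) ** 2))

    mean_err = {name: float(np.mean(v)) for name, v in errors.items()}
    return min(mean_err, key=mean_err.get), mean_err
```

Calling `cv_choose_strategy(X, y)` returns the name of the strategy with the lower mean held-out error together with both error estimates. Which strategy wins depends on the sample size and the proportion of missingness, which is the trade-off the Results paragraph describes.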

Availability and implementation

Freely available on GitHub at https://github.com/CHPGenetics/FBM.

Supplementary information

Supplementary data are available at Bioinformatics online.

SUBMITTER: Fang Z 

PROVIDER: S-EPMC6223369 | biostudies-literature | 2018 Nov

REPOSITORIES: biostudies-literature

Publications

Bayesian integrative model for multi-omics data with missingness.

Fang Z, Ma T, Tang G, Zhu L, Yan Q, Wang T, Celedón JC, Chen W, Tseng GC

Bioinformatics (Oxford, England), 2018-11-01, issue 22


Similar Datasets

| S-EPMC6455926 | biostudies-literature
| S-EPMC4945831 | biostudies-other
| S-EPMC4133046 | biostudies-literature
| S-EPMC3901289 | biostudies-other
| S-EPMC8796362 | biostudies-literature
| S-EPMC7293045 | biostudies-literature
2020-05-18 | GSE148665 | GEO
| S-EPMC8674330 | biostudies-literature
| S-EPMC7663540 | biostudies-literature
| S-EPMC5952106 | biostudies-literature