Dataset Information

A Cross-Level Information Transmission Network for Hierarchical Omics Data Integration and Phenotype Prediction from a New Genotype.


ABSTRACT:

Motivation

Predicting phenotypes from a new genotype under environmental perturbations remains an unsolved fundamental problem in biology. The emergence of multiple omics data provides new opportunities but poses great challenges for the predictive modeling of genotype-phenotype associations. First, the high dimensionality of genomics data and the lack of coherent labeled data often make existing supervised learning techniques less successful. Second, it is challenging to integrate heterogeneous omics data from different sources. Finally, few works have explicitly modeled the information transmission from DNA to phenotype, which involves multiple intermediate molecular types. Higher-level features (e.g., gene expression) are usually more discriminative and more interpretable than lower-level features (e.g., somatic mutation).

Results

We propose a novel Cross-LEvel Information Transmission network (CLEIT) framework to address the above issues. CLEIT aims to represent the asymmetrical multi-level organization of the biological system by integrating multiple incoherent omics data, and to improve the predictive power of low-level features. CLEIT first learns a latent representation of the high-level domain, then uses it as a ground-truth embedding to improve representation learning in the low-level domain through a contrastive loss. In addition, CLEIT can leverage unlabeled heterogeneous omics data to improve the generalizability of the predictive model. We demonstrate the effectiveness of CLEIT, and a significant performance boost over state-of-the-art methods, in predicting anti-cancer drug sensitivity from somatic mutations with the assistance of gene expression. CLEIT provides a general framework to model information transmission and integrate multi-modal data in a multi-level system.
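
The idea of using a high-level embedding as a target for the low-level encoder can be illustrated with a small sketch. The abstract does not specify the exact loss, so the InfoNCE-style formulation below, the function names, the temperature value, and the random stand-in embeddings are all assumptions made for illustration, not the CLEIT implementation (see the repository linked under Availability for that).

```python
import numpy as np

def l2_normalize(x, eps=1e-12):
    # Scale each row to unit length so dot products become cosine similarities.
    return x / np.maximum(np.linalg.norm(x, axis=1, keepdims=True), eps)

def contrastive_alignment_loss(low_emb, high_emb, temperature=0.1):
    """Hypothetical InfoNCE-style loss (an assumption, not the paper's exact loss).

    Row i of low_emb (e.g. a mutation-derived embedding) should be most similar
    to row i of high_emb (e.g. an expression-derived embedding of the same
    sample), with the other rows of high_emb acting as negatives.
    """
    z_low = l2_normalize(low_emb)
    z_high = l2_normalize(high_emb)  # treated as fixed "ground-truth" targets
    logits = z_low @ z_high.T / temperature
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    # Mean negative log-probability of picking the matching sample.
    return float(-np.mean(np.diag(log_prob)))

# Random stand-ins for encoder outputs: 8 samples, 16-dimensional embeddings.
rng = np.random.default_rng(0)
high = rng.normal(size=(8, 16))
aligned = high + 0.05 * rng.normal(size=(8, 16))  # low-level close to its target
unrelated = rng.normal(size=(8, 16))              # low-level with no relation

loss_aligned = contrastive_alignment_loss(aligned, high)
loss_unrelated = contrastive_alignment_loss(unrelated, high)
print(loss_aligned, loss_unrelated)
```

In this toy setting the loss is small when each low-level embedding sits near the high-level embedding of the same sample and larger when the two are unrelated, which is the signal a low-level encoder would be trained to minimize.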

Availability

The source code is freely available at https://github.com/XieResearchGroup/CLEIT.

Supplementary information

Supplementary data are available at Bioinformatics online.

SUBMITTER: He D 

PROVIDER: S-EPMC8696111 | biostudies-literature |

REPOSITORIES: biostudies-literature

Similar Datasets

| S-EPMC2535605 | biostudies-literature
| S-EPMC5907722 | biostudies-literature
| S-EPMC8664198 | biostudies-literature
| S-EPMC6882790 | biostudies-literature
| S-EPMC1630430 | biostudies-literature
| S-EPMC8278664 | biostudies-literature