Dataset Information

Disease gene identification by random walk on multigraphs merging heterogeneous genomic and phenotype data.

ABSTRACT: BACKGROUND: High-throughput experiments have produced many genomic datasets and hundreds of candidate disease genes. To discover the real disease genes among a set of candidates, computational methods have been proposed that work on various types of genomic data sources. Because a single source of genomic data is prone to bias, incompleteness and noise, integrating different genomic data sources is needed for reliable disease gene identification. RESULTS: In contrast to the commonly adopted data integration approach, which integrates separate lists of candidate genes derived from each single data source, we merge various genomic networks into a multigraph, which can hold multiple edges between a pair of nodes. This novel approach provides a data platform with strong noise tolerance for prioritizing disease genes. A new random walk is then developed to work on multigraphs, using a modified step to calculate the transition matrix. Our method is further enhanced to deal with heterogeneous data types by allowing cross-walk between phenotype and gene networks. Compared on benchmark datasets, our method is shown to be more accurate than the state-of-the-art methods in disease gene identification. We also conducted a case study to identify disease genes for Insulin-Dependent Diabetes Mellitus. Some of the newly identified disease genes are supported by recently published literature. CONCLUSIONS: The proposed RWRM (Random Walk with Restart on Multigraphs) model and CHN (Complex Heterogeneous Network) model are effective in data integration for candidate gene prioritization.
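
The abstract does not spell out the modified transition step, so the following is only a minimal sketch of a random walk with restart on a multigraph, not the authors' RWRM implementation. It assumes that each parallel edge between two genes adds one unit of weight, that transition probabilities are proportional to edge multiplicity, and that the restart probability is 0.7. The function names, the toy networks and the gene labels are hypothetical.

```python
from collections import defaultdict


def merge_into_multigraph(networks):
    """Merge undirected edge lists into one multigraph.

    multiplicity[u][v] counts how many input edges link u and v,
    so a pair supported by two networks gets weight 2 (an assumption).
    """
    multiplicity = defaultdict(lambda: defaultdict(int))
    for edges in networks:
        for u, v in edges:
            multiplicity[u][v] += 1
            multiplicity[v][u] += 1
    return multiplicity


def random_walk_with_restart(multiplicity, seeds, restart=0.7,
                             tol=1e-10, max_iter=1000):
    """Iterate p <- (1 - restart) * W p + restart * p0 until convergence.

    W is the multiplicity matrix normalized by each node's total edge count.
    Every seed is assumed to be a node of the multigraph.
    """
    nodes = sorted(multiplicity)
    degree = {n: sum(multiplicity[n].values()) for n in nodes}
    p0 = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    p = dict(p0)
    for _ in range(max_iter):
        nxt = {n: restart * p0[n] for n in nodes}
        for u in nodes:
            share = (1.0 - restart) * p[u] / degree[u]
            for v, count in multiplicity[u].items():
                nxt[v] += share * count
        change = sum(abs(nxt[n] - p[n]) for n in nodes)
        p = nxt
        if change < tol:
            break
    return p


# Toy example: two hypothetical gene networks that both contain edge A-B.
ppi_edges = [("A", "B"), ("B", "C")]
coexpression_edges = [("A", "B"), ("C", "D")]
graph = merge_into_multigraph([ppi_edges, coexpression_edges])
scores = random_walk_with_restart(graph, seeds={"A"})
ranking = sorted((n for n in scores if n != "A"),
                 key=scores.get, reverse=True)
print(ranking)
```

Candidate genes are ranked by their steady-state score relative to the seed (known disease) genes. Counting parallel edges is one simple way to let agreement between data sources strengthen a link; other weighting schemes are equally plausible under the abstract's description.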

SUBMITTER: Li Y

PROVIDER: S-EPMC3521411 | biostudies-literature | 2012

REPOSITORIES: biostudies-literature

Publications

Disease gene identification by random walk on multigraphs merging heterogeneous genomic and phenotype data.

Li Yongjin, Li Jinyan

BMC Genomics, 2012-12-13

PMID: 23282070

Similar Datasets

Biased Random Walk With Restart on Multilayer Heterogeneous Networks for MiRNA-Disease Association Prediction. | S-EPMC8384471 | biostudies-literature

Three-Layer Heterogeneous Network Combined With Unbalanced Random Walk for miRNA-Disease Association Prediction. | S-EPMC6967737 | biostudies-literature

Predicting LncRNA-Disease Association by a Random Walk With Restart on Multiplex and Heterogeneous Networks. | S-EPMC8417042 | biostudies-literature

PRWHMDA: Human Microbe-Disease Association Prediction by Random Walk on the Heterogeneous Network with PSO. | S-EPMC6036753 | biostudies-literature

An Entropy-Based Directed Random Walk for Cancer Classification Using Gene Expression Data Based on Bi-Random Walk on Two Separated Networks. | S-EPMC10048140 | biostudies-literature

Inferring microRNA-disease association by hybrid recommendation algorithm and unbalanced bi-random walk on heterogeneous network. | S-EPMC6385311 | biostudies-literature

A novel approach for predicting microbe-disease associations by bi-random walk on the heterogeneous network. | S-EPMC5589230 | biostudies-literature

HGPEC: a Cytoscape app for prediction of novel disease-gene and disease-disease associations and evidence collection based on a random walk on heterogeneous network. | S-EPMC5472867 | biostudies-literature

RWCFusion: identifying phenotype-specific cancer driver gene fusions based on fusion pair random walk scoring method. | S-EPMC5308635 | biostudies-literature

Prediction of lncRNA-disease association based on a Laplace normalized random walk with restart algorithm on heterogeneous networks. | S-EPMC8729064 | biostudies-literature