
Dataset Information


Using Machine Learning to Measure Relatedness Between Genes: A Multi-Features Model.


ABSTRACT: Measuring conditional relatedness between a pair of genes is a fundamental technique and remains a significant challenge in computational biology. Such relatedness can be assessed from gene expression similarities, but this approach suffers from high false discovery rates. Meanwhile, other types of features, e.g., prior-knowledge based similarities, are only viable for measuring global relatedness. In this paper, we propose a novel machine learning model, named Multi-Features Relatedness (MFR), for accurately measuring conditional relatedness between a pair of genes by incorporating expression similarities with prior-knowledge based similarities in an assessment criterion. MFR is used to predict gene-gene interactions extracted from the COXPRESdb, KEGG, HPRD, and TRRUST databases using 10-fold cross-validation and test verification, and to identify gene-gene interactions collected from the GeneFriends and DIP databases for further verification. The results show that MFR achieves the highest area under the curve (AUC) values for identifying gene-gene interactions in the development, test, and DIP datasets. Specifically, it improves precision by 1.1% on average for detecting gene pairs with both high expression similarities and high prior-knowledge based similarities in all datasets, compared with other linear models and coexpression analysis methods. For cancer gene network construction and gene function prediction, MFR also yields results with greater biological significance and higher average prediction accuracy than the other compared models and methods. A website for the MFR model and the relevant datasets can be accessed at http://bmbl.sdstate.edu/MFR .
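
The abstract's general idea (scoring a gene pair from both an expression similarity and a prior-knowledge similarity, then judging the ranking by AUC) can be illustrated with a small sketch. This is not the MFR model: the linear blend in `combined_score`, its `weight` parameter, and the toy profiles, prior scores, and labels are all assumptions made up for illustration. Pearson correlation stands in for an expression similarity, and AUC is computed as the fraction of positive/negative pairs ranked in the right order.

```python
from itertools import product
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length expression profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def combined_score(expr_sim, prior_sim, weight=0.5):
    """Hypothetical linear blend of the two similarity types (NOT the MFR model)."""
    return weight * expr_sim + (1.0 - weight) * prior_sim


def auc(scores, labels):
    """AUC as the share of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for s, lab in zip(scores, labels) if lab == 1]
    neg = [s for s, lab in zip(scores, labels) if lab == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p, n in product(pos, neg)
    )
    return wins / (len(pos) * len(neg))


# Toy data (invented): two expression profiles per gene pair,
# an assumed prior-knowledge similarity in [0, 1], and an interaction label.
pairs = [
    ([1, 2, 3, 4], [2, 4, 6, 8], 0.9, 1),
    ([1, 2, 3, 4], [1, 3, 2, 5], 0.7, 1),
    ([1, 2, 3, 4], [4, 3, 2, 1], 0.2, 0),
    ([1, 2, 3, 4], [2, 1, 4, 3], 0.1, 0),
]
scores = [combined_score(pearson(a, b), prior) for a, b, prior, _ in pairs]
labels = [label for *_, label in pairs]
print(auc(scores, labels))
```

In a cross-validated setting such as the 10-fold scheme mentioned in the abstract, the same AUC computation would be applied to the held-out fold in each round and the values averaged.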

SUBMITTER: Wang Y 

PROVIDER: S-EPMC6414665 | biostudies-literature | 2019 Mar

REPOSITORIES: biostudies-literature


Publications

Using Machine Learning to Measure Relatedness Between Genes: A Multi-Features Model.

Wang Yan, Yang Sen, Zhao Jing, Du Wei, Liang Yanchun, Wang Cankun, Zhou Fengfeng, Tian Yuan, Ma Qin

Scientific Reports, 2019-03-12, 1



Similar Datasets

| S-EPMC8150380 | biostudies-literature
| S-EPMC7806470 | biostudies-literature
| S-EPMC3483212 | biostudies-literature
| S-EPMC7599600 | biostudies-literature
| S-EPMC5564607 | biostudies-literature
| S-EPMC8627224 | biostudies-literature
| S-EPMC5378777 | biostudies-literature
| S-EPMC8225597 | biostudies-literature
| S-EPMC5867876 | biostudies-other
| S-EPMC8874745 | biostudies-literature