Dataset Information

GBDTCDA: Predicting circRNA-disease Associations Based on Gradient Boosting Decision Tree with Multiple Biological Data Fusion.


ABSTRACT: Circular RNA (circRNA) is a closed-loop non-coding RNA molecule that plays a significant role in gene regulation. Many previous studies have shown that circRNAs can act as miRNA sponges, which makes circRNA a key target for disease diagnosis, treatment and inference. However, traditional experimental approaches for verifying circRNA-disease associations are time-consuming and costly, and few computational models exist for predicting potential circRNA-disease associations. This motivates us to propose a new computational model. In this study, we propose a machine-learning-based computational model, GBDTCDA, which uses a Gradient Boosting Decision Tree with multiple biological data to predict circRNA-disease associations. The known circRNA-disease association data are downloaded from the CircR2Disease database (http://bioinfo.snnu.edu.cn/CircR2Disease/). The feature vector of each circRNA-disease association pair is composed of four parts: statistical information from different biological networks, graph-theoretic information from different biological networks, circRNA-disease association network information, and circRNA nucleotide sequence information. We use these feature vectors to train a gradient boosting decision tree regression model, and adopt leave-one-out cross-validation (LOOCV) to evaluate its performance. GBDTCDA also obtains good results when predicting circRNAs related to some common diseases: the area under the ROC curve (AUC) values for basal cell carcinoma, non-small cell lung cancer and cervical cancer are 95.8%, 88.3% and 93.5%, respectively. To further illustrate the performance of GBDTCDA, a case study of breast cancer is also included. Based on the experimental results and analyses, our proposed method GBDTCDA is a powerful tool for predicting potential circRNA-disease associations.
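
The evaluation pipeline described in the abstract (a gradient boosting regression model scored with LOOCV and AUC) can be illustrated with a minimal, dependency-free sketch. This is not the authors' implementation: the feature vectors are synthetic random numbers rather than the four-part biological features, the weak learners are single-split stumps rather than full decision trees, and the names `fit_gbdt`, `loocv_scores`, `n_trees` and `lr` are hypothetical choices made only for illustration.

```python
import random

def fit_stump(X, r):
    """Find the single-feature split that best fits residuals r (least squares)."""
    best = None  # (sse, feature, threshold, left_value, right_value)
    for j in range(len(X[0])):
        values = sorted(set(row[j] for row in X))
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2
            left = [ri for row, ri in zip(X, r) if row[j] <= t]
            right = [ri for row, ri in zip(X, r) if row[j] > t]
            lv, rv = sum(left) / len(left), sum(right) / len(right)
            sse = sum((v - lv) ** 2 for v in left) + sum((v - rv) ** 2 for v in right)
            if best is None or sse < best[0]:
                best = (sse, j, t, lv, rv)
    return best[1:] if best else None

def fit_gbdt(X, y, n_trees=10, lr=0.3):
    """Gradient boosting for squared loss: each stump fits the current residuals."""
    base = sum(y) / len(y)
    pred = [base] * len(y)
    stumps = []
    for _ in range(n_trees):
        resid = [yi - pi for yi, pi in zip(y, pred)]
        stump = fit_stump(X, resid)
        if stump is None:  # no feature can be split any further
            break
        j, t, lv, rv = stump
        stumps.append(stump)
        pred = [p + lr * (lv if row[j] <= t else rv) for p, row in zip(pred, X)]
    return base, lr, stumps

def predict(model, row):
    base, lr, stumps = model
    return base + sum(lr * (lv if row[j] <= t else rv) for j, t, lv, rv in stumps)

def loocv_scores(X, y, **kw):
    """Leave-one-out CV: train on all samples but one, score the held-out sample."""
    scores = []
    for i in range(len(X)):
        model = fit_gbdt(X[:i] + X[i + 1:], y[:i] + y[i + 1:], **kw)
        scores.append(predict(model, X[i]))
    return scores

def auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))

# Synthetic stand-in data (assumption): feature 0 carries signal, the rest are noise.
random.seed(0)
y = [i % 2 for i in range(30)]
X = [[0.5 * label + random.random()] + [random.random() for _ in range(3)] for label in y]

scores = loocv_scores(X, y)
cv_auc = auc(y, scores)
print(f"LOOCV AUC on synthetic data: {cv_auc:.3f}")
```

In practice one would likely swap the hand-written booster for a library implementation and the synthetic rows for real circRNA-disease pair features; the LOOCV loop and the AUC calculation would keep the same shape.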

SUBMITTER: Lei X 

PROVIDER: S-EPMC6909967 | biostudies-literature

REPOSITORIES: biostudies-literature

Similar Datasets

| S-EPMC6555260 | biostudies-literature
| S-EPMC5549711 | biostudies-other
| S-EPMC9553945 | biostudies-literature
| S-EPMC9873411 | biostudies-literature
| S-EPMC7757635 | biostudies-literature
| S-EPMC6902475 | biostudies-literature
| S-EPMC10165759 | biostudies-literature
| S-EPMC6311892 | biostudies-other
| S-EPMC10496006 | biostudies-literature
| S-EPMC8489074 | biostudies-literature