
Dataset Information

An Approach for Predicting Essential Genes Using Multiple Homology Mapping and Machine Learning Algorithms.


ABSTRACT: Investigating essential genes is important for understanding the minimal gene set of a cell and for discovering potential drug targets. In this study, a novel approach based on multiple homology mapping and machine learning was introduced to predict essential genes. We focused on 25 bacteria with characterized essential genes. The predictions yielded a highest area under the receiver operating characteristic (ROC) curve (AUC) of 0.9716 in a tenfold cross-validation test. Suitable features were then used to construct models for making predictions in distantly related bacteria. Prediction accuracy was evaluated by the consistency between the predictions and the known essential genes of the target species. A highest AUC of 0.9552 and an average AUC of 0.8314 were achieved when making predictions across organisms. An independent, recently released dataset from Synechococcus elongatus was obtained to further assess the performance of our model. The AUC of these predictions is 0.7855, which is higher than that of other methods. This research shows that features obtained by homology mapping alone can achieve results comparable to, or even better than, those obtained with integrated features. The work also indicates that a machine learning-based method can assign more effective weight coefficients than an empirical formula based on biological knowledge.
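The abstract reports performance as AUC under tenfold cross-validation. As a rough illustration of what those two terms mean, and not the authors' code, the sketch below computes AUC as the probability that a randomly chosen essential gene is scored above a randomly chosen non-essential gene, and builds tenfold train/test index splits. The function names `roc_auc` and `kfold_indices` are hypothetical, no classifier or homology feature is included, and the interleaved fold assignment is an assumption made for simplicity.

```python
from typing import List, Sequence, Tuple


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC via pairwise comparison: the fraction of (positive, negative)
    pairs in which the positive has the higher score; ties count as 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def kfold_indices(n_items: int, k: int = 10) -> List[Tuple[List[int], List[int]]]:
    """Return k (train, test) index splits; item i goes to fold i % k,
    so every item is held out exactly once."""
    folds = [list(range(i, n_items, k)) for i in range(k)]
    splits = []
    for test in folds:
        held_out = set(test)
        train = [j for j in range(n_items) if j not in held_out]
        splits.append((train, test))
    return splits
```

In a cross-validation run, a model would be fitted on each `train` split and `roc_auc` evaluated on the scores it assigns to the matching `test` split (or on the pooled held-out scores). A real analysis would more likely use an established library and stratified folds.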

SUBMITTER: Hua HL 

PROVIDER: S-EPMC5021884 | biostudies-literature | 2016

REPOSITORIES: biostudies-literature

Publications

An Approach for Predicting Essential Genes Using Multiple Homology Mapping and Machine Learning Algorithms.

Hua Hong-Li, Zhang Fa-Zhan, Labena Abraham Alemayehu, Dong Chuan, Jin Yan-Ting, Guo Feng-Biao

BioMed Research International, 2016-08-30



Similar Datasets

| S-EPMC7139829 | biostudies-literature
| S-EPMC8150380 | biostudies-literature
| S-EPMC8493462 | biostudies-literature
| S-EPMC7702310 | biostudies-literature
| S-EPMC8755739 | biostudies-literature
| S-EPMC4610387 | biostudies-literature
| S-EPMC9278327 | biostudies-literature
| S-EPMC8140139 | biostudies-literature
| S-EPMC7863674 | biostudies-literature
| S-EPMC4781880 | biostudies-literature