Dataset Information

Computational prediction of species-specific yeast DNA replication origin via iterative feature representation.


ABSTRACT: Deoxyribonucleic acid replication is one of the most crucial tasks taking place in the cell, and it has to be precisely regulated. This process is initiated at replication origins (ORIs), and thus it is essential to identify such sites for a deeper understanding of the cellular processes and functions related to the regulation of gene expression. Given the important tasks performed by ORIs, several experimental and computational approaches have been developed for predicting such sites. However, existing computational predictors for ORIs have certain limitations, such as building only single-feature encoding models, limited systematic feature engineering efforts and failure to validate model robustness. Hence, we developed a novel species-specific yeast predictor called yORIpred that accurately identifies ORIs in yeast genomes. To develop yORIpred, we first constructed 40 optimal baseline models by exploring eight different sequence-based encodings and five different machine learning classifiers. Subsequently, the predicted probabilities of the 40 models were treated as a novel feature vector, and an iterative feature learning approach was carried out independently using five different classifiers. Our systematic analysis revealed that the feature representation learned by the support vector machine algorithm (yORIpred) discriminated between the distribution characteristics of ORIs and non-ORIs better than the other four algorithms. Comprehensive benchmarking experiments showed that yORIpred achieved superior and stable performance when compared with the existing predictors on the same training datasets. Furthermore, independent evaluation showed that yORIpred achieved the best and most accurate performance, underscoring the significance of iterative feature representation.
To help users obtain their desired results without any mathematical, statistical or computational hassle, we developed a web server for the yORIpred predictor, which is available at: http://thegleelab.org/yORIpred.
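
The core idea in the abstract, using the predicted probabilities of several baseline models as a new feature vector for a second-stage classifier, can be illustrated with a minimal sketch. This is not the yORIpred implementation. The k-mer encodings, the small logistic regression, the toy sequences and all function names below are assumptions chosen only to keep the example self-contained. The sketch uses two encodings and one classifier rather than eight encodings and five classifiers, and it computes probabilities on the training sequences themselves, whereas a real pipeline would more likely use cross-validated probabilities.

```python
import itertools
import math


def kmer_frequencies(seq, k):
    """Normalized k-mer composition of a DNA sequence (one simple encoding)."""
    kmers = ["".join(p) for p in itertools.product("ACGT", repeat=k)]
    counts = dict.fromkeys(kmers, 0)
    for i in range(len(seq) - k + 1):
        window = seq[i:i + k]
        if window in counts:  # skip windows with N or other symbols
            counts[window] += 1
    total = max(sum(counts.values()), 1)
    return [counts[m] / total for m in kmers]


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def train_logistic(X, y, epochs=200, lr=0.5):
    """Plain SGD logistic regression; a stand-in for the paper's classifiers."""
    w, b = [0.0] * len(X[0]), 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            w = [wj - lr * err * xj for wj, xj in zip(w, xi)]
            b -= lr * err
    return w, b


def predict_proba(model, x):
    w, b = model
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)


def probability_features(seqs, labels, encoders):
    """Train one baseline model per encoding and return, for every sequence,
    the vector of baseline predicted probabilities (in-sample, for brevity)."""
    models = [train_logistic([enc(s) for s in seqs], labels) for enc in encoders]
    feats = [[predict_proba(m, enc(s)) for m, enc in zip(models, encoders)]
             for s in seqs]
    return models, feats


# Made-up toy sequences: label 1 = "ORI-like", label 0 = "non-ORI".
seqs = ["ATATATTTAAATAT", "TTTAAATATATAAT", "GCGCGGCCGCGCGG", "CCGGGCGCGCCGCG"]
labels = [1, 1, 0, 0]
encoders = [lambda s: kmer_frequencies(s, 1), lambda s: kmer_frequencies(s, 2)]

# Stage 1: baseline models, one per encoding, produce probability features.
baselines, meta_X = probability_features(seqs, labels, encoders)

# Stage 2: a second model is trained on the probability feature vector.
meta_model = train_logistic(meta_X, labels)
for s, x in zip(seqs, meta_X):
    print(s, [round(v, 3) for v in x], round(predict_proba(meta_model, x), 3))
```

In the setting described above, the second stage would receive a 40-dimensional probability vector per sequence (eight encodings times five classifiers), and the abstract reports that a support vector machine gave the most discriminative representation at that stage.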

SUBMITTER: Manavalan B 

PROVIDER: S-EPMC8294535 | biostudies-literature |

REPOSITORIES: biostudies-literature

Similar Datasets

| S-EPMC8300595 | biostudies-literature
| S-EPMC4947862 | biostudies-literature
| S-EPMC6967417 | biostudies-literature
| S-EPMC3989655 | biostudies-literature
| S-EPMC10622758 | biostudies-literature
| S-EPMC10810365 | biostudies-literature
| S-EPMC3879090 | biostudies-other
| S-EPMC8658322 | biostudies-literature
| S-EPMC6328132 | biostudies-literature
| PRJEB49309 | ENA