Dataset Information

iPro54-PseKNC: a sequence-based predictor for identifying sigma-54 promoters in prokaryote with pseudo k-tuple nucleotide composition.


ABSTRACT: The σ54 promoters are unique in prokaryotic genomes and are responsible for transcribing carbon- and nitrogen-related genes. With the avalanche of genome sequences generated in the postgenomic age, it is highly desirable to develop automated methods for rapidly and effectively identifying the σ54 promoters. Here, a predictor called 'iPro54-PseKNC' was developed. In the predictor, the samples of DNA sequences were formulated by a novel feature vector called 'pseudo k-tuple nucleotide composition', which was further optimized by the incremental feature selection procedure. The performance of iPro54-PseKNC was examined by rigorous jackknife cross-validation tests on a stringent benchmark data set. As a user-friendly web-server, iPro54-PseKNC is freely accessible at http://lin.uestc.edu.cn/server/iPro54-PseKNC. For the convenience of the vast majority of experimental scientists, a step-by-step protocol guide was provided on how to use the web-server to get the desired results without needing to follow the complicated mathematics, which was presented in the paper only for its integrity. Meanwhile, we also discovered through an in-depth statistical analysis that the distribution of distances between the transcription start sites and the translation initiation sites was governed by the gamma distribution, which may provide a fundamental physical principle for studying the σ54 promoters.

SUBMITTER: Lin H 

PROVIDER: S-EPMC4245931 | biostudies-literature |

REPOSITORIES: biostudies-literature

Similar Datasets

| S-EPMC6759546 | biostudies-literature
| S-EPMC4055483 | biostudies-literature
| S-EPMC4510404 | biostudies-literature
2018-12-25 | GSE111317 | GEO
| S-EPMC7670410 | biostudies-literature
2013-06-29 | GSE48410 | GEO