
Dataset Information


Omni-PolyA: a method and tool for accurate recognition of Poly(A) signals in human genomic DNA.


ABSTRACT:

Background

Polyadenylation is a critical stage of RNA processing during the formation of mature mRNA, and is present in most known eukaryotic protein-coding transcripts and many long non-coding RNAs. The correct identification of poly(A) signals (PAS) not only helps to elucidate the 3'-end genomic boundaries of a transcribed DNA region and gene regulatory mechanisms but also gives insight into the multiple transcript isoforms resulting from alternative PAS. Although progress has been made in the in-silico prediction of genomic signals, the recognition of PAS in genomic DNA sequences remains a challenge.

Results

In this study, we analyzed human genomic DNA sequences for the 12 most common PAS variants. Our analysis identified a set of features that help in the recognition of true PAS and that may be involved in the regulation of the polyadenylation process. The proposed features, in combination with a recognition model, resulted in a novel method and tool, Omni-PolyA. Omni-PolyA combines several machine learning techniques, including multiple classifiers arranged in a tree-like decision structure and genetic algorithms, to derive a robust classification model. We compared the results obtained by state-of-the-art methods, deep neural networks, and Omni-PolyA. The results show that Omni-PolyA significantly reduced the average classification error rate, by 35.37% relative to the state-of-the-art results, in the prediction of the 12 considered PAS variants.
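To make the recognition task concrete, the following is a minimal Python sketch, not the Omni-PolyA implementation. It assumes that candidate PAS are located by scanning a DNA sequence for known hexamer variants, each of which a classifier would then label as a true or false signal. The names `PAS_VARIANTS`, `find_pas_candidates`, and `relative_error_reduction` are hypothetical, and only the two canonical hexamers (AATAAA and ATTAAA) are listed because the abstract does not enumerate the 12 variants. The second function shows how a relative reduction in error rate, such as the 35.37% reported above, is computed from a baseline and a new error rate.

```python
from typing import Iterator, Sequence, Tuple

# Illustrative subset only (assumption): the two canonical PAS hexamers.
# The abstract refers to 12 variants but does not list them.
PAS_VARIANTS: Tuple[str, ...] = ("AATAAA", "ATTAAA")


def find_pas_candidates(
    sequence: str, variants: Sequence[str] = PAS_VARIANTS
) -> Iterator[Tuple[int, str]]:
    """Yield (0-based position, hexamer) for every candidate PAS hexamer.

    A hit is only a candidate: deciding whether it is a true PAS is the
    classification problem a tool such as Omni-PolyA addresses.
    """
    seq = sequence.upper()
    wanted = set(variants)
    for i in range(len(seq) - 5):
        hexamer = seq[i:i + 6]
        if hexamer in wanted:
            yield i, hexamer


def relative_error_reduction(baseline_error: float, new_error: float) -> float:
    """Return the fractional reduction of new_error relative to baseline_error."""
    if baseline_error <= 0:
        raise ValueError("baseline_error must be positive")
    return (baseline_error - new_error) / baseline_error


if __name__ == "__main__":
    # Hypothetical toy sequence, not real genomic data.
    for position, hexamer in find_pas_candidates("ccAATAAAggATTAAAtt"):
        print(position, hexamer)
```

In practice, the sequence context upstream and downstream of each hit would be converted into features for the classifier; that step is omitted here because the abstract does not describe the features.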

Conclusions

The results of our study demonstrate that Omni-PolyA is currently the most accurate model for the prediction of PAS in human genomic DNA and can serve as a useful complement to other PAS recognition methods. Omni-PolyA is publicly available as an online tool at www.cbrc.kaust.edu.sa/omnipolya/ .

SUBMITTER: Magana-Mora A 

PROVIDER: S-EPMC5558757 | biostudies-literature | 2017 Aug

REPOSITORIES: biostudies-literature


Publications

Omni-PolyA: a method and tool for accurate recognition of Poly(A) signals in human genomic DNA.

Magana-Mora A, Kalkatawi M, Bajic VB

BMC Genomics, 2017 Aug 15, issue 1



Similar Datasets

| S-EPMC3244764 | biostudies-literature
| S-EPMC7337927 | biostudies-literature
| S-EPMC5325289 | biostudies-literature
| S-EPMC1327677 | biostudies-literature
| S-EPMC6449759 | biostudies-literature
| S-EPMC2912421 | biostudies-literature
| S-EPMC2657899 | biostudies-other
| S-EPMC5499748 | biostudies-literature
| S-EPMC1865077 | biostudies-literature
| S-EPMC1865059 | biostudies-literature