NmSEER V2.0: a prediction tool for 2'-O-methylation sites based on random forest and multi-encoding combination.
ABSTRACT: BACKGROUND: 2'-O-methylation (2'-O-me or Nm) is a post-transcriptional RNA modification in which the 2'-hydroxyl group of the ribose is methylated; it is common in mRNAs and various non-coding RNAs. Previous studies have revealed the significance of Nm in multiple biological processes. As Nm has attracted increasing attention, a revolutionary technique termed Nm-seq was developed to profile Nm sites, mainly in mRNA, with single-nucleotide resolution and high sensitivity. In a recent work supported by the Nm-seq data, we reported an in silico method for predicting Nm sites that relies on nucleotide sequence information, and established an online server named NmSEER. More recently, a higher-confidence dataset produced by refined Nm-seq became available. In this work, we therefore redesigned the prediction model to achieve more robust performance on the new data. RESULTS: We redesigned the prediction model from two perspectives: the machine learning algorithm and the combination of multiple encoding schemes. After optimization by 5-fold cross-validation and evaluation on an independent test set, random forest was selected as the most robust algorithm. One-hot encoding, position-specific dinucleotide sequence profile and K-nucleotide frequency encoding were applied together to build the final predictor. CONCLUSIONS: The updated predictor, named NmSEER V2.0, achieves accurate prediction performance (AUROC = 0.862) and has been deployed on a new server, which is freely available at http://www.rnanut.net/nmseer-v2/.
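As an illustration of two of the encoding schemes named in the abstract, the following is a minimal Python sketch of one-hot encoding and K-nucleotide frequency encoding for an RNA sequence window. It is not the NmSEER V2.0 implementation: the function names (`one_hot`, `k_nucleotide_frequency`, `encode`), the `ACGU` alphabet ordering, the default `k=2`, and the all-zero handling of unknown bases are assumptions made for this example. The position-specific dinucleotide sequence profile is omitted because it depends on labelled positive and negative training sets.

```python
from itertools import product

# Assumed alphabet and ordering; the actual tool may order bases differently.
ALPHABET = "ACGU"


def one_hot(seq):
    """Encode each nucleotide as 4 binary values, concatenated in order.

    Bases outside ALPHABET (e.g. 'N') become four zeros (an assumption).
    """
    vec = []
    for base in seq.upper():
        vec.extend(1 if base == letter else 0 for letter in ALPHABET)
    return vec


def k_nucleotide_frequency(seq, k=2):
    """Return the frequency of every possible k-mer among the overlapping
    k-mers of seq, in lexicographic order of ALPHABET."""
    seq = seq.upper()
    kmers = ["".join(p) for p in product(ALPHABET, repeat=k)]
    counts = dict.fromkeys(kmers, 0)
    total = len(seq) - k + 1  # number of overlapping k-mers in seq
    for i in range(max(total, 0)):
        kmer = seq[i:i + k]
        if kmer in counts:  # skip k-mers containing unknown bases
            counts[kmer] += 1
    return [counts[m] / total if total > 0 else 0.0 for m in kmers]


def encode(seq, k=2):
    """Concatenate the two encodings into a single feature vector."""
    return one_hot(seq) + k_nucleotide_frequency(seq, k)
```

A feature vector built this way could be passed to any random forest implementation; for a window of length n, `encode` returns 4n one-hot values followed by 4^k frequency values.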
SUBMITTER: Zhou Y
PROVIDER: S-EPMC6929462 | biostudies-literature | 2019 Dec
REPOSITORIES: biostudies-literature