TERIUS: accurate prediction of lncRNA via high-throughput sequencing data representing RNA-binding protein association.
ABSTRACT: BACKGROUND: LncRNAs are long regulatory non-coding RNAs, some of which are arguably predicted to have coding potential. Although coding potential classifiers that utilize ribosome profiling data successfully detect actively translated regions, they are less sensitive to lncRNAs. Furthermore, lncRNA annotation is susceptible to false positives arising from 3' untranslated region (UTR) fragments of mRNAs. RESULTS: To address these limitations in lncRNA annotation, we present a novel tool, TERIUS, that provides a two-step filtration process to distinguish bona fide lncRNAs from false ones. The first step separates lncRNAs from protein-coding genes with enhanced sensitivity compared to other methods. To eliminate 3'UTR fragments, the second step takes advantage of the 3'UTR-specific association with regulator of nonsense transcripts 1 (UPF1), leading to refined lncRNA annotation. Importantly, TERIUS enabled the detection of misclassified transcripts in published lncRNA annotations. CONCLUSIONS: TERIUS is a robust method for lncRNA annotation that provides an additional filtration step for 3'UTR fragments. TERIUS successfully re-classified GENCODE and miTranscriptome lncRNA annotations. We believe that TERIUS can benefit the construction of extensive and accurate non-coding transcriptome maps in many genomes.
SUBMITTER: Choi SW
PROVIDER: S-EPMC5836835 | biostudies-literature | 2018 Feb
REPOSITORIES: biostudies-literature