
Dataset Information


RNA-CODE: a noncoding RNA classification tool for short reads in NGS data lacking reference genomes.


ABSTRACT: The number of transcriptomic sequencing projects for non-model organisms continues to grow rapidly. As non-coding RNAs (ncRNAs) are highly abundant in living organisms and play important roles in many biological processes, identifying fragmentary members of ncRNAs in small RNA-seq data is an important step in post-NGS analysis. However, state-of-the-art ncRNA search tools are not optimized for next-generation sequencing (NGS) data, especially for very short reads. In this work, we propose and implement a comprehensive ncRNA classification tool (RNA-CODE) for very short reads. RNA-CODE is specifically designed for ncRNA identification in NGS data that lack quality reference genomes. Given a set of short reads, our tool classifies the reads into different types of ncRNA families. The classification results can be used to quantify the expression levels of different types of ncRNAs in RNA-seq data and to profile ncRNA composition in metagenomic data. Experimental results from applying RNA-CODE to RNA-seq data from Arabidopsis and to a metagenomic data set sampled from the human gut demonstrate that RNA-CODE competes favorably with other tools in both sensitivity and specificity. The source code of RNA-CODE can be downloaded at http://www.cse.msu.edu/~chengy/RNA_CODE.

SUBMITTER: Yuan C 

PROVIDER: S-EPMC3808423 | biostudies-literature | 2013

REPOSITORIES: biostudies-literature


Publications

RNA-CODE: a noncoding RNA classification tool for short reads in NGS data lacking reference genomes.

Yuan C, Sun Y

PLoS One, 2013-10-25, issue 10



Similar Datasets

| S-EPMC4659627 | biostudies-literature
| S-EPMC4274631 | biostudies-literature
| S-EPMC5009518 | biostudies-literature
| S-EPMC5522380 | biostudies-literature
| S-EPMC6690869 | biostudies-literature
2023-08-10 | GSE194237 | GEO
| S-EPMC3106194 | biostudies-literature
| S-EPMC6933617 | biostudies-literature
| S-EPMC5287235 | biostudies-literature
2020-12-31 | GSE135651 | GEO