Dataset Information

Improving the Flexibility of RNA-Seq Data Analysis Pipelines.


ABSTRACT: Accurate quantification of gene or isoform expression with RNA-Seq depends on complete knowledge of the transcriptome. Because a complete genomic annotation does not yet exist, novel isoform discovery is an important component of the RNA-Seq quantification process. Thus, a typical RNA-Seq pipeline includes a transcriptome mapping step to quantify known genes and isoforms, and a reference genome mapping step to discover new genes and isoforms. Several tools implement this approach, but are limited in that they force the use of a single mapping algorithm at both the transcriptome and reference genome mapping stages. The choice of mapping algorithm could affect quantification accuracy on a per-dataset basis. Thus, we describe a method that enables the merging of transcriptome and reference genome mapping stages provided that they conform to the standard SAM/BAM format. This procedure could potentially improve the accuracy of gene or isoform quantification by increasing flexibility when selecting RNA-Seq data analysis pipelines. We demonstrate an example of a flexible RNA-Seq pipeline by assessing its potential for novel isoform discovery and by validating its quantification performance using qRT-PCR.
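The abstract does not spell out how the two mapping stages are merged, so the following is only a minimal sketch of one plausible reading: keep every read that aligned to the transcriptome, and fall back to the reference genome alignment for reads that did not. The function names `parse_sam` and `merge_stages`, the "transcriptome first" priority rule, and the toy read and reference names are all assumptions made for illustration, not the authors' implementation. The sketch relies only on the SAM text layout (tab-separated fields, `@` header lines, and flag bit 0x4 marking an unmapped read), which is why any mapper that emits standard SAM could feed either stage. A real pipeline would also need to convert transcriptome coordinates to genome coordinates, which is omitted here.

```python
UNMAPPED = 0x4  # SAM flag bit: the read has no alignment


def parse_sam(lines):
    """Yield (read_name, flag, fields) per alignment line, skipping headers."""
    for line in lines:
        if not line.strip() or line.startswith("@"):
            continue  # blank line or SAM header line
        fields = line.rstrip("\n").split("\t")
        yield fields[0], int(fields[1]), fields


def merge_stages(transcriptome_sam, genome_sam):
    """Combine two mapping stages (hypothetical priority rule).

    Reads mapped in the transcriptome stage are kept as-is; reads that
    failed there are taken from the genome stage if they mapped there.
    Returns a list of (stage_label, sam_fields) tuples.
    """
    merged = []
    mapped_names = set()
    for name, flag, fields in parse_sam(transcriptome_sam):
        if not flag & UNMAPPED:
            mapped_names.add(name)
            merged.append(("transcriptome", fields))
    for name, flag, fields in parse_sam(genome_sam):
        if name not in mapped_names and not flag & UNMAPPED:
            merged.append(("genome", fields))
    return merged


# Toy input: r1 maps to a made-up transcript, r2 only maps to the genome.
transcriptome_stage = [
    "@HD\tVN:1.0",
    "r1\t0\ttx_example_1\t10\t255\t4M\t*\t0\t0\tACGT\tIIII",
    "r2\t4\t*\t0\t0\t*\t*\t0\t0\tTTTT\tIIII",
]
genome_stage = [
    "r1\t0\tchr1\t100\t255\t4M\t*\t0\t0\tACGT\tIIII",
    "r2\t0\tchr2\t200\t255\t4M\t*\t0\t0\tTTTT\tIIII",
]

for stage, fields in merge_stages(transcriptome_stage, genome_stage):
    print(stage, fields[0], fields[2], fields[3])
```

Because the two inputs are treated as independent SAM streams, the aligner used for the transcriptome stage need not be the one used for the genome stage, which is the flexibility the abstract argues for.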

SUBMITTER: Phan JH 

PROVIDER: S-EPMC4985025 | biostudies-literature | 2012 Dec

REPOSITORIES: biostudies-literature

Publications

Improving the Flexibility of RNA-Seq Data Analysis Pipelines.

Phan JH, Wu PY, Wang MD

IEEE International Workshop on Genomic Signal Processing and Statistics: [proceedings]. 2012 Dec 1


Similar Datasets

S-EPMC6789098 | biostudies-literature
S-EPMC4226638 | biostudies-literature
S-EPMC4842274 | biostudies-literature
S-EPMC5102490 | biostudies-literature
S-EPMC5267345 | biostudies-literature
S-EPMC6275443 | biostudies-literature
S-EPMC7751108 | biostudies-literature
S-EPMC4575116 | biostudies-literature
S-EPMC4325541 | biostudies-literature
S-EPMC7745353 | biostudies-literature