
Dataset Information


The Selection of Quantification Pipelines for Illumina RNA-seq Data Using a Subsampling Approach.


ABSTRACT: RNA sequencing (RNA-seq) is a widely applied technology for extracting gene and transcript expression from biological samples. Given the numerous quantification pipelines for RNA-seq data, one fundamental challenge is to identify a pipeline that can produce the most accurate estimates of gene and/or transcript expression. Exploring all available pipelines requires extensive computational resources. Therefore, we propose a subsampling approach that can improve the efficiency of pipeline performance evaluation for a given RNA-seq dataset. We applied our approach to one simulated and two real RNA-seq datasets and found that expression estimates derived from subsampled data are close surrogates for those derived from original data. In addition, the ranking of quantification pipelines based on the subsampled data was highly concordant with that based on the original data. We conclude that subsampling is a valid approach to facilitating efficient quantification pipeline selection using RNA-seq data.

SUBMITTER: Wu PY 

PROVIDER: S-EPMC5267345 | biostudies-literature | 2016 Feb

REPOSITORIES: biostudies-literature


Publications

The Selection of Quantification Pipelines for Illumina RNA-seq Data Using a Subsampling Approach.

Wu PY, Wang MD

... IEEE-EMBS International Conference on Biomedical and Health Informatics, 2016-02-01



Similar Datasets

| S-EPMC4842274 | biostudies-literature
| S-EPMC2863065 | biostudies-literature
| S-EPMC4985025 | biostudies-literature
| S-EPMC11003973 | biostudies-literature
| S-EPMC4985018 | biostudies-literature
| S-EPMC4226638 | biostudies-literature
| S-EPMC6016759 | biostudies-literature
| S-EPMC4992401 | biostudies-literature
| S-EPMC8275344 | biostudies-literature
| S-EPMC8284643 | biostudies-literature