
Dataset Information


PennSeq: accurate isoform-specific gene expression quantification in RNA-Seq by modeling non-uniform read distribution.


ABSTRACT: Correctly estimating isoform-specific gene expression is important for understanding complicated biological mechanisms and for mapping disease susceptibility genes. However, estimating isoform-specific gene expression is challenging because various biases present in RNA-Seq (RNA sequencing) data complicate the analysis and, if not appropriately corrected, can affect isoform expression estimation and downstream analysis. In this article, we present PennSeq, a statistical method that allows each isoform to have its own non-uniform read distribution. Instead of making parametric assumptions, we give adequate weight to the underlying data by using a non-parametric approach. Our rationale is that regardless of what factors lead to non-uniformity, whether hexamer priming bias, local sequence bias, positional bias, RNA degradation, mapping bias or other unknown reasons, the probability that a fragment is sampled from a particular region will be reflected in the aligned data. This empirical approach thus maximally reflects the true underlying non-uniform read distribution. We evaluate the performance of PennSeq using both simulated data with known ground truth and two real Illumina RNA-Seq data sets, including one with quantitative real-time polymerase chain reaction measurements. Our results indicate superior performance of PennSeq over existing methods, particularly for isoforms demonstrating severe non-uniformity. PennSeq is freely available for download at http://sourceforge.net/projects/pennseq.
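
The idea in the abstract, that each isoform has its own non-uniform read distribution, can be illustrated with a toy example. The sketch below is not PennSeq's implementation and is not taken from the paper: it is a generic expectation-maximization (EM) estimate of mixture proportions. It assumes the per-isoform position profiles are already known, whereas the abstract describes deriving them from the aligned data. The function names, the six-bin gene layout, the profile values and the read counts are all hypothetical.

```python
from typing import List, Sequence


def read_likelihoods(read_bins: Sequence[int],
                     profiles: Sequence[Sequence[float]]) -> List[List[float]]:
    """Look up P(read | isoform) for every read.

    profiles[i][b] is the (assumed, hypothetical) probability that a fragment
    from isoform i falls in bin b; it is 0.0 where isoform i lacks that bin.
    """
    return [[profile[b] for profile in profiles] for b in read_bins]


def em_isoform_fractions(likelihoods: Sequence[Sequence[float]],
                         n_iter: int = 200) -> List[float]:
    """Estimate the fraction of reads from each isoform with plain EM.

    Note: this returns read fractions, not length-normalized abundances.
    """
    k = len(likelihoods[0])
    theta = [1.0 / k] * k  # start from equal fractions
    for _ in range(n_iter):
        totals = [0.0] * k
        for probs in likelihoods:
            # E-step: share each read among its compatible isoforms.
            weighted = [t * p for t, p in zip(theta, probs)]
            norm = sum(weighted)
            if norm == 0.0:
                continue  # read is compatible with no isoform; skip it
            for i in range(k):
                totals[i] += weighted[i] / norm
        total = sum(totals)
        if total == 0.0:
            break  # no usable reads; keep the current estimate
        # M-step: new fractions are the normalized expected read counts.
        theta = [t / total for t in totals]
    return theta


# Hypothetical gene with six bins: isoform A covers bins 0-3, isoform B bins 2-5.
UNIFORM_PROFILES = [
    [0.25, 0.25, 0.25, 0.25, 0.0, 0.0],
    [0.0, 0.0, 0.25, 0.25, 0.25, 0.25],
]
# Made-up non-uniform profiles (coverage rising toward the 3' end).
NONUNIFORM_PROFILES = [
    [0.1, 0.2, 0.3, 0.4, 0.0, 0.0],
    [0.0, 0.0, 0.1, 0.2, 0.3, 0.4],
]
# Made-up aligned reads, given as the bin each read falls in.
READ_BINS = [0] * 2 + [1] * 4 + [2] * 9 + [3] * 12 + [4] * 6 + [5] * 8

uniform_est = em_isoform_fractions(read_likelihoods(READ_BINS, UNIFORM_PROFILES))
nonuniform_est = em_isoform_fractions(read_likelihoods(READ_BINS, NONUNIFORM_PROFILES))
print("uniform profiles:    ", uniform_est)
print("non-uniform profiles:", nonuniform_est)
```

Only reads in the shared bins (2 and 3) are ambiguous. With a uniform profile, the split of those reads depends only on the current isoform fractions, whereas with a non-uniform profile the read's position also matters.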

SUBMITTER: Hu Y 

PROVIDER: S-EPMC3919567 | biostudies-literature | 2014 Feb

REPOSITORIES: biostudies-literature


Publications

PennSeq: accurate isoform-specific gene expression quantification in RNA-Seq by modeling non-uniform read distribution.

Yu Hu, Yichuan Liu, Xianyun Mao, Cheng Jia, Jane F Ferguson, Chenyi Xue, Muredach P Reilly, Hongzhe Li, Mingyao Li

Nucleic Acids Research, 2013 Dec 20, issue 3



Similar Datasets

| S-EPMC5935499 | biostudies-literature
| S-EPMC4380033 | biostudies-literature
2024-07-10 | GSE271530 | GEO
2024-07-10 | GSE271528 | GEO
2024-07-10 | GSE271527 | GEO
| S-EPMC3718502 | biostudies-literature
| S-EPMC2863065 | biostudies-literature
| S-EPMC4428808 | biostudies-literature
| S-EPMC4598124 | biostudies-literature
| S-EPMC8145802 | biostudies-literature