ABSTRACT:
Motivation: RNA sequencing (RNA-Seq) is a powerful new technology for mapping and quantifying transcriptomes using ultra high-throughput next-generation sequencing technologies. Using deep sequencing, gene expression levels of all transcripts including novel ones can be quantified digitally. Although extremely promising, the massive amounts of data generated by RNA-Seq, substantial biases and uncertainty in short read alignment pose challenges for data analysis. In particular, large base-specific variation and between-base dependence make simple approaches, such as those that use averaging to normalize RNA-Seq data and quantify gene expressions, ineffective.
Results: In this study, we propose a Poisson mixed-effects (POME) model to characterize base-level read coverage within each transcript. The underlying expression level is included as a key parameter in this model. Since the proposed model is capable of incorporating base-specific variation as well as between-base dependence that affect read coverage profile throughout the transcript, it can lead to improved quantification of the true underlying expression level.
Availability and implementation: POME can be freely downloaded at http://www.stat.purdue.edu/~yuzhu/pome.html.
Contact: yuzhu@purdue.edu; zhaohui.qin@emory.edu
Supplementary information: Supplementary data are available at Bioinformatics online.
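As a rough illustration of the kind of model the Results paragraph describes, the sketch below simulates base-level read counts from a generic Poisson model with a fixed base-specific effect and a correlated random effect. It is not the POME implementation, and the abstract does not give the model's exact form: the log-linear structure, the AR(1) random effect, and every name and value here (`simulate_coverage`, `base_effects`, `rho`, `sigma`, the expression level of 20) are assumptions made for illustration only.

```python
import math
import random


def sample_poisson(rate, rng):
    """Draw one Poisson variate with Knuth's multiplication method (fine for modest rates)."""
    threshold = math.exp(-rate)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def simulate_coverage(expression, base_effects, rho=0.8, sigma=0.3, seed=0):
    """Simulate base-level read counts along one transcript.

    Assumed form (illustrative, not taken from the source):
        log(rate_i) = log(expression) + base_effects[i] + b_i
    where b_i is a stationary AR(1) Gaussian random effect, so that
    neighbouring bases share part of their deviation.
    """
    rng = random.Random(seed)
    innovation_sd = sigma * math.sqrt(1.0 - rho ** 2)
    b = rng.gauss(0.0, sigma)  # first random effect, drawn from the stationary distribution
    counts = []
    for effect in base_effects:
        rate = math.exp(math.log(expression) + effect + b)
        counts.append(sample_poisson(rate, rng))
        b = rho * b + rng.gauss(0.0, innovation_sd)  # AR(1) step to the next base
    return counts


# Hypothetical transcript of 8 bases with made-up base-specific effects.
base_effects = [0.5, -0.7, 0.2, 0.0, -1.0, 0.9, 0.1, 0.0]
counts = simulate_coverage(expression=20.0, base_effects=base_effects)
naive_mean = sum(counts) / len(counts)  # simple per-base average of the coverage
print(counts, naive_mean)
```

In this toy setup the per-base average depends on the base-specific effects and on the random effects as well as on the expression level, which is the difficulty with averaging that the Motivation paragraph points to.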
SUBMITTER: Hu M
PROVIDER: S-EPMC3244770 | biostudies-literature | 2012 Jan
REPOSITORIES: biostudies-literature
Hu Ming, Zhu Yu, Taylor Jeremy M G, Liu Jun S, Qin Zhaohui S
Bioinformatics (Oxford, England), 2011-11-08, issue 1