Dataset Information

A flexible Bayesian method for detecting allelic imbalance in RNA-seq data.

ABSTRACT:

Background

One method of identifying cis-regulatory differences is to analyze allele-specific expression (ASE) and identify cases of allelic imbalance (AI). RNA-seq is the most common way to measure ASE, and a binomial test is often applied to determine the statistical significance of AI. This implicitly assumes that there is no bias in the estimation of AI. However, bias has been found to result from multiple factors, including genome ambiguity, reference quality, the mapping algorithm, and biases in the sequencing process. Two alternative approaches have been developed to handle bias: adjusting for bias using a statistical model, and filtering regions of the genome suspected of harboring bias. Existing statistical models that account for bias rely on information from DNA controls, which can be cost-prohibitive for large intraspecific studies. In contrast, data filtering is inexpensive and straightforward, but necessarily involves sacrificing a portion of the data.
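As a rough illustration of the binomial test described above, the following sketch computes an exact two-sided p-value for reference-allele read counts, first under the usual no-bias null of 0.5 and then under a shifted null. The read counts and the 0.55 bias value are hypothetical, chosen only for illustration, and the function name `binomial_two_sided_p` is not from the paper.

```python
from math import comb

def binomial_two_sided_p(k, n, p0=0.5):
    """Exact two-sided binomial p-value: sum the probabilities of all
    outcomes that are no more likely than the observed count k under p0."""
    pmf = [comb(n, i) * p0**i * (1 - p0)**(n - i) for i in range(n + 1)]
    observed = pmf[k]
    # Small relative tolerance guards against floating-point ties.
    return min(1.0, sum(p for p in pmf if p <= observed * (1 + 1e-9)))

# Hypothetical counts: 60 of 100 reads carry the reference allele.
ref_reads, total_reads = 60, 100

# Standard test: assumes no bias, so the null proportion is 0.5.
p_no_bias = binomial_two_sided_p(ref_reads, total_reads, p0=0.5)

# Same data, but the null is shifted to an assumed (hypothetical) bias of 0.55.
p_with_bias = binomial_two_sided_p(ref_reads, total_reads, p0=0.55)

print(f"null 0.50: p = {p_no_bias:.4f}")
print(f"null 0.55: p = {p_with_bias:.4f}")
```

The same counts look much less extreme once the null proportion reflects a bias toward the reference allele, which is why an unacknowledged bias can turn into spurious calls of AI.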

Results

Here we propose a flexible Bayesian model for analysis of AI, which accounts for bias and can be implemented without DNA controls. In lieu of DNA controls, this Poisson-Gamma (PG) model uses an estimate of bias from simulations. The proposed model always has a lower type I error rate than the binomial test. Consistent with prior studies, bias dramatically affects the type I error rate. All of the tested models are sensitive to misspecification of bias. The closer the estimate of bias is to the true underlying bias, the lower the type I error rate. Correct estimates of bias result in a level-alpha test.
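The dependence of the type I error rate on bias misspecification can be illustrated with a small exact calculation. The sketch below uses a plain binomial test as a stand-in, not the PG model from the paper: reads are generated with no true AI but with a reference-allele proportion of `true_bias`, and the test uses `assumed_bias` as its null. The read depth (200) and the bias values are hypothetical.

```python
from math import comb

def binom_pmf(n, p):
    """Probability of each count 0..n under Binomial(n, p)."""
    return [comb(n, k) * p**k * (1 - p)**(n - k) for k in range(n + 1)]

def two_sided_p_values(n, p0):
    """Exact two-sided p-value for every possible count k = 0..n."""
    pmf = binom_pmf(n, p0)
    return [min(1.0, sum(q for q in pmf if q <= pmf[k] * (1 + 1e-9)))
            for k in range(n + 1)]

def type_one_error(n, true_bias, assumed_bias, alpha=0.05):
    """Probability of rejecting when there is no real AI: counts follow
    Binomial(n, true_bias), but the test's null proportion is assumed_bias."""
    p_values = two_sided_p_values(n, assumed_bias)
    sampling = binom_pmf(n, true_bias)
    return sum(prob for prob, p in zip(sampling, p_values) if p < alpha)

# Hypothetical scenario: 200 reads, true bias of 0.55 toward the reference.
for assumed in (0.50, 0.53, 0.55):
    rate = type_one_error(200, true_bias=0.55, assumed_bias=assumed)
    print(f"assumed bias {assumed:.2f}: type I error rate = {rate:.3f}")
```

When the assumed bias equals the true bias, the rejection rate stays at or below alpha; ignoring the bias (a null of 0.5) inflates it well above alpha.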

Conclusions

To improve the assessment of AI, some forms of systematic error (e.g., map bias) can be identified using simulation. The resulting estimates of bias can be used to correct for bias in the PG model, without data filtering. Other sources of bias (e.g., unidentified variant calls) can be easily captured by DNA controls, but are missed by common filtering approaches. Consequently, as variant identification improves, the need for DNA controls will be reduced. Filtering does not significantly improve performance and is not recommended, as information is sacrificed without a measurable gain. The PG model developed here performs well when bias is known or only slightly misspecified. The model is flexible and can accommodate differences in experimental design and bias estimation.
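To give a feel for how a bias estimate might enter a Poisson-Gamma analysis, here is a minimal conjugate sketch. It is not the model specified in the paper: it assumes independent Poisson read counts for the two alleles with Gamma(a, b) priors on their rates, draws the posterior of the reference-allele proportion by Monte Carlo, and flags AI when the 95% credible interval excludes the bias estimate. The prior parameters, the counts, and the bias value of 0.55 are all hypothetical.

```python
import random

def posterior_theta_interval(x, y, a=0.5, b=0.5, draws=20000, level=0.95, seed=0):
    """Credible interval for theta = lam_ref / (lam_ref + lam_alt), where
    x ~ Poisson(lam_ref), y ~ Poisson(lam_alt) and each rate has a
    Gamma(shape=a, rate=b) prior (assumed for illustration only)."""
    rng = random.Random(seed)
    scale = 1.0 / (b + 1.0)  # posterior rate is b + 1; gammavariate takes a scale
    thetas = []
    for _ in range(draws):
        lam_ref = rng.gammavariate(a + x, scale)  # posterior draw, reference allele
        lam_alt = rng.gammavariate(a + y, scale)  # posterior draw, alternate allele
        thetas.append(lam_ref / (lam_ref + lam_alt))
    thetas.sort()
    lower = thetas[int((1 - level) / 2 * draws)]
    upper = thetas[int((1 + level) / 2 * draws) - 1]
    return lower, upper

def shows_imbalance(x, y, bias=0.5):
    """Flag AI when the credible interval excludes the expected proportion."""
    lower, upper = posterior_theta_interval(x, y)
    return not (lower <= bias <= upper)

# Hypothetical counts: 120 reference reads, 80 alternate reads.
print("no bias assumed:", shows_imbalance(120, 80, bias=0.50))
print("bias of 0.55   :", shows_imbalance(120, 80, bias=0.55))
```

In this toy setting the call depends on the bias value supplied, which mirrors the abstract's point that the quality of the bias estimate drives the error rate.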

SUBMITTER: Leon-Novelo LG

PROVIDER: S-EPMC4230747 | biostudies-literature | 2014 Oct

REPOSITORIES: biostudies-literature

Publications

A flexible Bayesian method for detecting allelic imbalance in RNA-seq data.

León-Novelo LG, McIntyre LM, Fear JM, Graze RM

BMC Genomics, 2014 Oct 23

PMID: 25339465

Similar Datasets

Power calculator for detecting allelic imbalance using hierarchical Bayesian model. | S-EPMC8626927 | biostudies-literature

Detecting Multivariate Gene Interactions in RNA-Seq Data Using Optimal Bayesian Classification. | S-EPMC4818202 | biostudies-other

Experimental and Computational Methods for Allelic Imbalance Analysis from Single-Nucleus RNA-seq Data. | S-EPMC11343128 | biostudies-literature

Comparison of quantitative trait loci methods: Total expression and allelic imbalance method in brain RNA-seq. | S-EPMC6576752 | biostudies-literature

An Efficient and Flexible Method for Deconvoluting Bulk RNA-Seq Data with Single-Cell RNA-Seq Data. | S-EPMC6830085 | biostudies-literature

Allelic imbalance metre (Allim), a new tool for measuring allele-specific gene expression with RNA-seq data. | S-EPMC3739924 | biostudies-literature

Detecting cell-type-specific allelic expression imbalance by integrative analysis of bulk and single-cell RNA sequencing data. | S-EPMC7963069 | biostudies-literature

MATS: a Bayesian framework for flexible detection of differential alternative splicing from RNA-Seq data. | S-EPMC3333886 | biostudies-literature

A general and flexible method for signal extraction from single-cell RNA-seq data. | S-EPMC5773593 | biostudies-literature

A statistical method for detecting differentially expressed SNVs based on next-generation RNA-seq data. | S-EPMC5151178 | biostudies-literature