Dataset Information

A novel statistical method for quantitative comparison of multiple ChIP-seq datasets.

ABSTRACT: ChIP-seq is a powerful technology to measure the protein binding or histone modification strength in the whole genome scale. Although there are a number of methods available for single ChIP-seq data analysis (e.g. 'peak detection'), rigorous statistical method for quantitative comparison of multiple ChIP-seq datasets with the considerations of data from control experiment, signal to noise ratios, biological variations and multiple-factor experimental designs is under-developed.In this work, we develop a statistical method to perform quantitative comparison of multiple ChIP-seq datasets and detect genomic regions showing differential protein binding or histone modification. We first detect peaks from all datasets and then union them to form a single set of candidate regions. The read counts from IP experiment at the candidate regions are assumed to follow Poisson distribution. The underlying Poisson rates are modeled as an experiment-specific function of artifacts and biological signals. We then obtain the estimated biological signals and compare them through the hypothesis testing procedure in a linear model framework. Simulations and real data analyses demonstrate that the proposed method provides more accurate and robust results compared with existing ones.An R software package ChIPComp is freely available at http://web1.sph.emory.edu/users/hwu30/software/ChIPComp.html.

SUBMITTER: Chen L

PROVIDER: S-EPMC4542775 | biostudies-literature | 2015 Jun

REPOSITORIES: biostudies-literature

ACCESS DATA

Publications

A novel statistical method for quantitative comparison of multiple ChIP-seq datasets.

Chen Li L Wang Chi C Qin Zhaohui S ZS Wu Hao H

Bioinformatics (Oxford, England) 20150213 12

<h4>Motivation</h4>ChIP-seq is a powerful technology to measure the protein binding or histone modification strength in the whole genome scale. Although there are a number of methods available for single ChIP-seq data analysis (e.g. 'peak detection'), rigorous statistical method for quantitative comparison of multiple ChIP-seq datasets with the considerations of data from control experiment, signal to noise ratios, biological variations and multiple-factor experimental designs is under-developed ...[more]

PMID: 25682068

Similar Datasets

Project description:MotivationChIP-seq technology enables investigators to study genome-wide binding of transcription factors and mapping of epigenomic marks. Although the availability of basic analysis tools for ChIP-seq data is rapidly increasing, there has not been much progress on the related design issues. A challenging question for designing a ChIP-seq experiment is how deeply should the ChIP and the control samples be sequenced? The answer depends on multiple factors some of which can be set by the experimenter based on pilot/preliminary data. The sequencing depth of a ChIP-seq experiment is one of the key factors that determine whether all the underlying targets (e.g. binding locations or epigenomic profiles) can be identified with a targeted power.ResultsWe developed a statistical framework named CSSP (ChIP-seq Statistical Power) for power calculations in ChIP-seq experiments by considering a local Poisson model, which is commonly adopted by many peak callers. Evaluations with simulations and data-driven computational experiments demonstrate that this framework can reliably estimate the power of a ChIP-seq experiment at different sequencing depths based on pilot data. Furthermore, it provides an analytical approach for calculating the required depth for a targeted power while controlling the false discovery rate at a user-specified level. Hence, our results enable researchers to use their own or publicly available data for determining required sequencing depths of their ChIP-seq experiments and potentially make better use of the multiplexing functionality of the sequencers. Evaluation of power for multiple public ChIP-seq datasets indicate that, currently, typical ChIP-seq studies are powered well for detecting large fold changes of ChIP enrichment over the control sample, but they have considerably less power for detecting smaller fold changes.AvailabilityAvailable at www.stat.wisc.edu/~zuo/CSSP.Contactkeles@stat.wisc.eduSupplementary informationSupplementary data are available at Bioinformatics online.

Dataset Information

A novel statistical method for quantitative comparison of multiple ChIP-seq datasets.

Publications

A novel statistical method for quantitative comparison of multiple ChIP-seq datasets.

Similar Datasets

OmicsDI is part of the ELIXIR infrastructure

Tweets