Project description: ChIP-seq experiments are a standard experimental procedure for interrogating epigenetic states and protein-DNA interactions. Sequencing experiments are often designed around the trade-off between the need to obtain maximum sequencing coverage and limited funds. Multiplexing samples is a common approach to minimize cost and maximize information yield. We therefore performed an extensive ChIP-seq multiplexing study to gain a better understanding of the effect of multiplexing on the resulting peak detection and genomic annotation, and to provide solid guidelines for multiplexing ChIP-seq studies. For a well-characterized antibody, our results indicate that multiplexing to ~20M reads (roughly 8 samples per sequencing lane) is sufficient to capture most of the biological signal. Multiplexing samples in sequencing experiments is a common approach to maximize information yield while minimizing cost. In most cases, the number of samples that are multiplexed is determined by financial considerations or experimental convenience, with limited understanding of the effects on the experimental results. Here we set out to examine the impact of multiplexing ChIP-seq experiments on the ability to identify a specific epigenetic modification. We performed an analysis of peak detection to determine the effects of multiplexing, including false discovery rates; the size, position and statistical significance of detected peaks; and changes in gene annotation. We found that, for the histone mark H3K4me3, one can multiplex up to 8 samples (7 IP + 1 input) at ~21 million reads each and still detect over 90% of all peaks found when using a full lane per sample. Furthermore, no variation is introduced by indexing or lane batch effects and, importantly, there is no significant reduction in the number of genes with neighboring H3K4me3 peaks.
We conclude that, for a well-characterized antibody, and therefore a model IP condition, multiplexing 8 samples per lane is sufficient to capture most of the biological signal.
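The per-sample depth and peak-recovery figures quoted above can be illustrated with a minimal Python sketch. It is not the study's analysis pipeline. The function names, the lane yield of 168 M reads (simply 8 x 21 M, implied by the figures above rather than reported), and the rule that a full-lane peak counts as recovered when any multiplexed peak overlaps it are all assumptions made for illustration.

```python
from typing import Iterable, Tuple

Peak = Tuple[str, int, int]  # (chromosome, start, end), half-open coordinates


def reads_per_sample(lane_reads: float, n_samples: int) -> float:
    """Expected reads per sample if a lane is split evenly (an idealization)."""
    if n_samples < 1:
        raise ValueError("n_samples must be at least 1")
    return lane_reads / n_samples


def fraction_recovered(reference: Iterable[Peak], multiplexed: Iterable[Peak]) -> float:
    """Fraction of reference (full-lane) peaks overlapped by any multiplexed peak.

    Overlap of at least one base is an assumed criterion, not the study's.
    """
    # Group multiplexed peaks by chromosome for lookup.
    by_chrom = {}
    for chrom, start, end in multiplexed:
        by_chrom.setdefault(chrom, []).append((start, end))

    reference = list(reference)
    if not reference:
        return 0.0

    hits = 0
    for chrom, start, end in reference:
        # Two half-open intervals overlap when each starts before the other ends.
        if any(s < end and start < e for s, e in by_chrom.get(chrom, [])):
            hits += 1
    return hits / len(reference)


if __name__ == "__main__":
    # Hypothetical lane yield chosen so that 8 samples receive 21 M reads each.
    print(reads_per_sample(168e6, 8))

    # Toy peak sets, invented purely to exercise the function.
    full_lane = [("chr1", 100, 200), ("chr1", 500, 600), ("chr2", 50, 80)]
    eight_plex = [("chr1", 150, 250), ("chr2", 10, 60)]
    print(fraction_recovered(full_lane, eight_plex))
```

In practice, peak sets would come from a peak caller's output files, and a stricter overlap criterion (for example, a minimum overlap fraction) may be more appropriate than the single-base rule used here.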
Project description: Methods: The cDNA libraries from 3 pooled samples of cultured cells for each group were sequenced to generate RNA profiles using the Illumina MiSeq platform. Results: The sequencing runs yielded a total of 95.01 M reads with an average length of 73-74 bp. High-throughput sequencing of liver samples with different treatments yielded similar numbers of reads, ranging from 5.57 to 5.74 M, with the same average length. The Strand NGS software (version 2.1) was used with default parameters for pre-alignment and post-alignment quality control analysis, and 100% of the raw reads remained in the dataset. Of these, 19.02 M reads (84%) were mapped to contigs of the rat genome (rn5), identifying 31457 transcripts in liver samples.