Project description: With its capacity for high-resolution data output in one region of interest, chromosome conformation capture combined with high-throughput sequencing (4C-seq) is a state-of-the-art next-generation sequencing technique that provides epigenetic insights and regularly advances current medical research. However, 4C-seq data are complex and prone to biases, and while specialized programs exist, an unbiased, extensive benchmark is still lacking. Furthermore, neither substantial datasets with fully characterized ground truth nor simulation programs for realistic 4C-seq data have been published. We conducted a benchmarking study on 54 4C-seq samples from 12 datasets, including original murine BMM, T-cell, and 416B data, and developed a novel 4C-seq simulation software to allow for more detailed comparisons of 4C-seq algorithms on 50 simulated datasets with 10 to 120 samples each.
Project description: Plants are attacked by diverse herbivores and respond with manifold defenses. To study transcriptional and other early regulation events of these plant responses, herbivory is often mimicked to standardize the temporal and spatial dynamics, which vary tremendously for natural herbivory. Yet to what extent such mimics of herbivory elicit the same plant response as real herbivory remains largely undetermined. We examined the transcriptional response of a new model plant to herbivory by a lepidopteran larva and to a commonly used herbivory mimic, in which the larvae’s oral secretions are applied to standardized wounds. We designed a microarray for Solanum dulcamara and showed that the transcriptional responses to real and to simulated herbivory by Spodoptera exigua overlapped moderately, by about 40%. Interestingly, certain responses were mimicked better than others: 60% of the genes up-regulated by herbivory, but not even a quarter of the genes down-regulated by it, were similarly affected by application of oral secretions to wounds. While the regulation of genes involved in signaling, defense and water stress was mirrored well by the herbivory mimic, most of the genes related to photosynthesis and to carbohydrate and lipid metabolism were regulated exclusively by real herbivory. Thus, wounding combined with elicitor application mimics herbivory-induced defense responses reasonably well, but likely not the re-allocation of primary metabolites induced by real herbivory.
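The overlap figures quoted above can be read as the fraction of herbivory-regulated genes that are regulated in the same direction by the mimic. The following is a minimal sketch of that calculation under this assumption; the description does not state how the percentages were computed, and the function name and gene identifiers are hypothetical placeholders, not data from the study.

```python
def shared_fraction(reference: set, other: set) -> float:
    """Return the fraction of genes in `reference` that also appear in `other`."""
    if not reference:
        return 0.0
    return len(reference & other) / len(reference)


# Hypothetical gene sets, chosen only to illustrate the arithmetic.
herbivory_up = {"g1", "g2", "g3", "g4", "g5"}
mimic_up = {"g1", "g2", "g3", "g9"}
herbivory_down = {"g6", "g7", "g8", "g10", "g11"}
mimic_down = {"g6"}

# Share of herbivory-regulated genes that the mimic regulates the same way.
up_shared = shared_fraction(herbivory_up, mimic_up)
down_shared = shared_fraction(herbivory_down, mimic_down)
```

Because the denominator is the herbivory gene set, this measure is asymmetric: it asks how much of the real response the mimic reproduces, not how much of the mimic response is also seen under real herbivory.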
Project description: 4C-Seq has proven to be a powerful technique for identifying genome-wide interactions with a single locus of interest (or "bait") that can be important for gene regulation. However, analysis of 4C-Seq data is complicated by the many biases inherent to the technique. An important consideration when dealing with 4C-Seq data is that signal resolution differs across the genome as a result of differences in 3D distance from the bait. This leads to the highest signal in the region immediately surrounding the bait and increasingly lower signals in far-cis and trans. Another important aspect of 4C-Seq experiments is the resolution, which is greatly influenced by the choice of restriction enzyme and the frequency at which it cuts the genome. Thus, it is important that a 4C-Seq analysis method is flexible enough to analyze data generated using different enzymes and to identify interactions across the entire genome. Current methods for 4C-Seq analysis identify interactions either in regions near the bait or in regions located in far-cis and trans, but no method comprehensively analyzes 4C signals at different length scales. In addition, some methods fail in experiments where chromatin fragments are generated using frequent-cutter restriction enzymes. Here, we describe 4C-ker, a hidden Markov model-based pipeline that identifies regions throughout the genome that interact with the 4C bait locus. In addition, we incorporate methods for the identification of differential interactions in multiple 4C-Seq datasets collected from different genotypes or experimental conditions. Adaptive window sizes are used to correct for differences in signal coverage in near-bait regions, far-cis and trans chromosomes. Using several datasets, we demonstrate that 4C-ker outperforms all existing 4C-Seq pipelines in its ability to reproducibly identify interaction domains at all genomic ranges with enzymes of different resolution.
4C-Seq experiments from Igh and Cd83 baits in activated B cells and from a Tcrb (Eb) bait in double-negative (DN) T cells and immature B cells. RNA-Seq and ATAC-Seq experiments in DN T cells and immature B cells.
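The idea of adaptive window sizes mentioned in the description above can be illustrated with a small sketch. It assumes that a window is defined by a fixed number k of observed fragments, so that windows are narrow where coverage is dense (near the bait) and wide where it is sparse (far-cis and trans). This is an illustrative assumption, not 4C-ker's actual implementation; the function name and the fragment positions are hypothetical.

```python
from typing import List, Tuple


def adaptive_windows(positions: List[int], k: int) -> List[Tuple[int, int]]:
    """Group fragment positions into consecutive windows of k fragments each.

    Each window is returned as (first_position, last_position). The final
    window may hold fewer than k fragments.
    """
    if k < 1:
        raise ValueError("k must be at least 1")
    ordered = sorted(positions)
    windows = []
    for start in range(0, len(ordered), k):
        chunk = ordered[start:start + k]
        windows.append((chunk[0], chunk[-1]))
    return windows


# Hypothetical fragment positions: dense near a bait around position 1000,
# increasingly sparse further away.
positions = [1000, 1010, 1020, 1030, 1500, 2500, 6000, 12000]
windows = adaptive_windows(positions, k=2)
widths = [end - start for start, end in windows]
```

With these placeholder positions, window width grows as fragment density falls, which is the property that lets a single analysis cover near-bait and distal regions at comparable coverage per window.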
Project description: It is now widely accepted that RNA-seq is a more powerful and adaptable technique than hybridization arrays. Nevertheless, because RNA-seq requires more complex data analysis, it has generated a great deal of research on algorithms and workflows. This has resulted in an exponential increase in the options available at each step of the analysis. Consequently, there is no clear consensus on which algorithms and pipelines should be used to analyse RNA-seq data. In the present study, 192 pipelines were evaluated on 18 samples from 2 human cell lines. Absolute gene expression quantification was assessed with non-parametric statistics to measure precision and accuracy. Relative gene expression performance was estimated by testing 19 differential expression methods. These results were compared with Affymetrix HTA 2.0 microarray data from the same set of samples. All procedures were validated by qRT-PCR on 32 genes in all samples. In addition, this study proposes a new statistical approach for evaluating precision and accuracy on real RNA-seq data. It also weighs up the advantages and disadvantages of the algorithms and pipelines tested and provides a guide for selecting the appropriate pipeline to analyse RNA-seq and microarray data.
Project description: We describe an improved individual-nucleotide resolution CLIP protocol (iiCLIP), which can be completed within 4 days from UV crosslinking to sequencing-ready libraries. For benchmarking, we directly compared PTBP1 iiCLIP libraries with an iCLIP2 library produced under standardised conditions with 1 million HEK293 cells, and with public eCLIP and iCLIP PTBP1 data. This study produced 3 PTBP1 iiCLIP libraries, 1 input iiCLIP library and 1 PTBP1 iCLIP2 library.
Project description: Benchmarking proteomics quantitation in DIA-type data, using real patient material to create a benchmark dataset that captures inter-patient heterogeneity.