Project description: We use nucleosome maps obtained by high-throughput sequencing to study the sequence specificity of intrinsic histone-DNA interactions. In contrast with previous approaches, we employ an analogy between a classical one-dimensional fluid of finite-size particles in an arbitrary external potential and arrays of DNA-bound histone octamers. We derive an analytical solution to infer free energies of nucleosome formation directly from nucleosome occupancies measured in high-throughput experiments. The sequence-specific part of the free energies is then captured by fitting them to a sum of energies assigned to individual nucleotide motifs. We have developed hierarchical models of increasing complexity and spatial resolution, establishing that nucleosome occupancies can be explained by systematic differences in mono- and dinucleotide content between nucleosomal and linker DNA sequences, with periodic dinucleotide distributions and longer sequence motifs playing a secondary role. Furthermore, similar sequence signatures are exhibited by control experiments in which genomic DNA is either sonicated or digested with micrococcal nuclease in the absence of nucleosomes, making it possible that current predictions based on high-throughput nucleosome positioning maps are biased by experimental artifacts. Included are raw (eland) and mapped (wig) reads. The mapped reads are provided in eland and wiggle formats, and the raw reads are included in the eland file. This series includes only MNase control data. The sonicated control is part of this already published accession, as is an in vitro nucleosome map: http://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE15188 We also studied data (in vitro and in vivo maps as well as a model) from http://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE13622 and from: http://www.ncbi.nlm.nih.gov/sra/?term=SRA001023
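The motif-fitting step described above (free energies expressed as a sum of energies assigned to nucleotide motifs) can be illustrated with an ordinary least-squares fit. This is a minimal sketch and not the authors' implementation: the function names `motif_counts` and `fit_motif_energies`, the use of overlapping k-mer counts as features, and the absence of position dependence or regularization are all assumptions made for illustration.

```python
from itertools import product

import numpy as np


def motif_counts(seq, k):
    """Count overlapping k-mers in seq, in lexicographic ACGT order."""
    motifs = ["".join(p) for p in product("ACGT", repeat=k)]
    index = {m: i for i, m in enumerate(motifs)}
    counts = np.zeros(len(motifs))
    for i in range(len(seq) - k + 1):
        j = index.get(seq[i:i + k])  # k-mers with non-ACGT letters are skipped
        if j is not None:
            counts[j] += 1
    return counts


def fit_motif_energies(seqs, energies, k=1):
    """Hypothetical helper: fit energy ~ sum(count[motif] * energy[motif]).

    Uses least squares; if the design matrix is rank-deficient,
    numpy returns the minimum-norm solution.
    """
    X = np.vstack([motif_counts(s, k) for s in seqs])
    y = np.asarray(energies, dtype=float)
    coeffs, *_ = np.linalg.lstsq(X, y, rcond=None)
    motifs = ["".join(p) for p in product("ACGT", repeat=k)]
    return dict(zip(motifs, coeffs))
```

Calling `fit_motif_energies(seqs, energies, k=1)` gives a mononucleotide model and `k=2` a dinucleotide model, which loosely mirrors the hierarchy of models of increasing complexity mentioned in the description.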
Project description: To identify more targets in soybean, particularly specific targets of Cd-stress-responsive miRNAs, high-throughput degradome sequencing was used. In total, we obtained 8913111 raw reads from the library which was constructed from a mixture of four samples (HX3-CK, HX3-Cd-treatment, ZH24-CK and ZH24-Cd-treatment). After removing the reads without the CAGAG adaptor, 5430126 unique raw-reads were obtained. The unique sequences were aligned to the G. max genome database, and 6516276 reads were mapped to the genome. The mapped reads from the libraries represented 51481 annotated G. max genes.
Project description: Genotyping of RpoD mutants via amplicon sequencing from the following manuscript: "Systematic dissection of σ70 sequence diversity and function in bacteria" by Park and Wang (2020). Includes raw sequencing reads from samples from MAGE-seq single-codon saturation mutagenesis and high-throughput fitness competition experiments, as well as the RpoD ortholog mutants generated through recombineering and CRISPR selection.
Project description: Purpose: To ensure that ABX464 acted specifically on HIV splicing and did not significantly or globally affect the splicing events of human genes, we used an assembly approach of HIV (YU2 strain) putative transcripts and human long non-coding sequences from paired-end reads (2x75 bp) captured on a NimbleGen SeqCap® EZ Developer Library (Roche/NimbleGen). Methods: Cells were infected with 80 ng of p24/10^6 cells of the YU2 strain for 4 to 6 hours and then rinsed with PBS before medium renewal, followed by high-throughput RNAseq from custom SeqCap EZ capture libraries. Each raw dataset of the samples contained between 5 and 30 million paired-end reads (75 bp), with an average of approximately 12 million raw reads per sample. Results: The raw reads were then cleaned and assembled per library to generate contigs, giving an average of 930 contigs per sample for further analyses. Conclusions: Our results show that high-throughput analyses coupled with bioinformatics-specific tools offer a comprehensive and more accurate view of mRNA splicing within a cell.
2018-12-31 | GSE120109 | GEO
Project description: High-throughput sequencing. Raw sequence reads.
Project description: Purpose: To ensure that ABX464 acted specifically on HIV splicing and did not significantly or globally affect the splicing events of human genes, we used a high-throughput RNAseq approach. Many genome-wide expression studies of HIV infection are based on analyses of total peripheral blood mononuclear cells (PBMCs), which consist of over a dozen cell subsets, including T cells, B cells, NK cells and monocytes. Methods: The CD4 T cells were uninfected or infected with the YU2 strain and were untreated or treated for 6 days with ABX464, followed by high-throughput RNAseq. Each raw dataset of the samples contained between 44 and 105 million single-end reads (50 bp), with an average of approximately 60 million raw reads per sample. Results: Approximately 98% of the total raw reads were mapped to the human genome sequence (GRCh38), giving an average of 60 million human reads per sample for further analyses. The reads were correctly mapped (approximately 98% of total input reads) to the gene and transcript locations (GTF annotation file). Conclusions: The MDS of our gene expression data showed, without any outliers, that the different donors segregated well and distributed into the DMSO (untreated) and ABX464 treatments that were infected or uninfected. The displayed variance was donor-dependent (clustered by donor) but treatment-independent (no data structure related to the different treatments), which suggests that the ABX464 molecule did not induce a major difference in CD4 T cell gene expression.
Project description: To identify more targets in soybean, particularly specific targets of Cd-stress-responsive miRNAs, high-throughput degradome sequencing was used. In total, we obtained 8913111 raw reads from the library which was constructed from a mixture of four samples (HX3-CK, HX3-Cd-treatment, ZH24-CK and ZH24-Cd-treatment). After removing the reads without the CAGAG adaptor, 5430126 unique raw-reads were obtained. The unique sequences were aligned to the G. max genome database, and 6516276 reads were mapped to the genome. The mapped reads from the libraries represented 51481 annotated G. max genes. Identification of miRNA targets in soybean roots
Project description: Purpose: Here we describe the modulation of a gene expression program involved in cell fate. Methods: We depleted U2AF1 in human induced pluripotent stem cells (hiPSCs) to the level found in differentiated cells using an inducible shRNA system, followed by high-throughput RNAseq, revealing a gene expression program involved in cell fate determination. Results: Approximately 85% of the total raw reads were mapped to the human genome sequence (GRCh37), giving an average of 200 million human reads per sample for total RNA and 15 million human reads per sample for small RNA libraries. Conclusions: Our results show that transcriptional control of gene expression in hiPSCs can be set by the CSF U2AF1, establishing a direct link between transcription and AS during cell fate determination.
Project description: To define the sequence preference of SALL4 C2H2 zinc finger domains, we performed SELEX coupled with high-throughput sequencing (HT-SELEX) using the purified SALL4 ZFC1, ZFC2 and ZFC4 domains, together with a no-protein control experiment. We re-sequenced the libraries from E-MTAB-9236 with very high coverage to estimate the minimum number of reads required per sample for accurate results.
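One common way to estimate a minimum read depth from a deeply sequenced library is to subsample it and check when a summary statistic stops changing. The sketch below is an illustration of that general idea only, not the procedure used for this dataset: the function names, the choice of k-mer frequencies as the statistic, the Pearson-correlation criterion, and the default threshold of 0.99 are all assumptions.

```python
import random
from collections import Counter


def kmer_freqs(reads, k):
    """Relative frequencies of overlapping k-mers across all reads."""
    counts = Counter()
    for r in reads:
        for i in range(len(r) - k + 1):
            counts[r[i:i + k]] += 1
    total = sum(counts.values())
    return {m: c / total for m, c in counts.items()} if total else {}


def pearson(a, b):
    """Pearson correlation of two frequency dicts over the union of keys."""
    keys = sorted(set(a) | set(b))
    n = len(keys)
    if n == 0:
        return 0.0
    x = [a.get(m, 0.0) for m in keys]
    y = [b.get(m, 0.0) for m in keys]
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    return sxy / (sxx * syy) ** 0.5 if sxx > 0 and syy > 0 else 0.0


def min_reads_for_agreement(reads, depths, k=3, threshold=0.99, seed=0):
    """Hypothetical helper: smallest tested depth whose subsample agrees
    with the full library (correlation >= threshold), or None."""
    rng = random.Random(seed)
    full = kmer_freqs(reads, k)
    for depth in sorted(depths):
        sample = rng.sample(reads, min(depth, len(reads)))
        if pearson(kmer_freqs(sample, k), full) >= threshold:
            return depth
    return None
```

In practice the subsampling would be repeated with several seeds per depth, since a single random draw can agree or disagree with the full library by chance.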