Project description:A complete description of the transcriptome of an organism is crucial for a comprehensive understanding of how it functions and how its transcriptional networks are controlled, and may provide insights into the organism's evolution. Despite the status of Saccharomyces cerevisiae as arguably the most well understood model eukaryote, we still do not have a full catalog and understanding of all its genes. In order to interrogate the transcriptome of S. cerevisiae for low abundance or rapidly turned over transcripts, we deleted elements of the RNA degradation machinery with the aim of increasing the relative abundance of such transcripts. We then used high-resolution tiling microarrays and ultra high-throughput sequencing (UHTS) to identify and map the locations of unannotated transcripts that are more abundant in the RNA degradation mutants relative to wild-type cells, revealing 540 currently unannotated, presumably low abundance or short-lived RNAs, of which 231 are previously unknown and unique to this study. It is likely that many of these represent cryptic unstable transcripts (CUTs), which are rapidly degraded and whose function(s) within the cell are still unclear, while others may represent novel functional transcripts. Of the 271 transcripts we identified in current intergenic regions, greater than 90 percent have lower conservation scores amongst closely related yeast species than 95 percent of the verified ORFs in S. cerevisiae; such regions of the genome have typically been less well studied, and by definition encode transcripts that distinguish S. cerevisiae from these closely related species.
Keywords: Saccharomyces cerevisiae, transcriptome, salt shock, xrn1, rrp6, lsm1, pat1, RNA degradation
Project description:Annotating Low Abundance and Transient RNAs in Yeast using Tiling Microarrays and Ultra High-throughput Sequencing Reveals New Transcripts that are not Conserved Across Closely Related Yeast Species
A complete description of the transcriptome of an organism is crucial for a comprehensive understanding of how it functions and how its transcriptional networks are controlled, and may provide insights into the organism’s evolution. Despite the status of Saccharomyces cerevisiae as arguably the most well studied model eukaryote, we still do not have a full catalog or understanding of all its genes. In order to interrogate the transcriptome of S. cerevisiae for low abundance or rapidly turned over transcripts, we deleted elements of the RNA degradation machinery with the goal of preferentially increasing the relative abundance of such transcripts. We then used high-resolution tiling microarrays and ultra high-throughput sequencing (UHTS) to identify, map and validate unannotated transcripts that are more abundant in the RNA degradation mutants relative to wild-type cells. We identified 365 currently unannotated transcripts, the majority presumably representing low abundance or short-lived RNAs, of which 185 are previously unknown and unique to this study. It is likely that many of these are cryptic unstable transcripts (CUTs), which are rapidly degraded and whose function(s) within the cell are still unclear, while others may be novel functional transcripts. Of the 185 transcripts we identified as novel to our study, greater than 80 percent come from regions of the genome that have lower conservation scores amongst closely related yeast species than 85 percent of the verified ORFs in S. cerevisiae. Such regions of the genome have typically been less well studied, and by definition transcripts from these regions will distinguish S. cerevisiae from these closely related species.
Keywords: Saccharomyces cerevisiae, transcriptome, RNA-Seq, RNA degradation, XRN1, RRP6, LSM1, PAT1, Salt Shock, NaCl
Project description:Annotating Low Abundance and Transient RNAs in Yeast using Tiling Microarrays and Ultra High-throughput Sequencing Reveals New Transcripts that are not Conserved Across Closely Related Yeast Species
A complete description of the transcriptome of an organism is crucial for a comprehensive understanding of how it functions and how its transcriptional networks are controlled, and may provide insights into the organism’s evolution. Despite the status of Saccharomyces cerevisiae as arguably the most well studied model eukaryote, we still do not have a full catalog or understanding of all its genes. In order to interrogate the transcriptome of S. cerevisiae for low abundance or rapidly turned over transcripts, we deleted elements of the RNA degradation machinery with the goal of preferentially increasing the relative abundance of such transcripts. We then used high-resolution tiling microarrays and ultra high-throughput sequencing (UHTS) to identify, map and validate unannotated transcripts that are more abundant in the RNA degradation mutants relative to wild-type cells. We identified 365 currently unannotated transcripts, the majority presumably representing low abundance or short-lived RNAs, of which 185 are previously unknown and unique to this study. It is likely that many of these are cryptic unstable transcripts (CUTs), which are rapidly degraded and whose function(s) within the cell are still unclear, while others may be novel functional transcripts. Of the 185 transcripts we identified as novel to our study, greater than 80 percent come from regions of the genome that have lower conservation scores amongst closely related yeast species than 85 percent of the verified ORFs in S. cerevisiae. Such regions of the genome have typically been less well studied, and by definition transcripts from these regions will distinguish S. cerevisiae from these closely related species.
Keywords: Saccharomyces cerevisiae, transcriptome, RNA-Seq, RNA degradation, XRN1, RRP6, LSM1, PAT1, Salt Shock, NaCl
Four samples, a wild-type reference and three mutants (one containing four deletions and the other two containing three each), were subjected to high salt shock, and total RNA was harvested. PolyA RNA was purified, and libraries were generated for the Solexa platform. Each library was run on four lanes of a Solexa flow cell.
Project description:A complete description of the transcriptome of an organism is crucial for a comprehensive understanding of how it functions and how its transcriptional networks are controlled, and may provide insights into the organism's evolution. Despite the status of Saccharomyces cerevisiae as arguably the most well-studied model eukaryote, we still do not have a full catalog or understanding of all its genes. In order to interrogate the transcriptome of S. cerevisiae for low abundance or rapidly turned over transcripts, we deleted elements of the RNA degradation machinery with the goal of preferentially increasing the relative abundance of such transcripts. We then used high-resolution tiling microarrays and ultra high-throughput sequencing (UHTS) to identify, map, and validate unannotated transcripts that are more abundant in the RNA degradation mutants relative to wild-type cells. We identified 365 currently unannotated transcripts, the majority presumably representing low abundance or short-lived RNAs, of which 185 are previously unknown and unique to this study. It is likely that many of these are cryptic unstable transcripts (CUTs), which are rapidly degraded and whose function(s) within the cell are still unclear, while others may be novel functional transcripts. Of the 185 transcripts we identified as novel to our study, greater than 80 percent come from regions of the genome that have lower conservation scores amongst closely related yeast species than 85 percent of the verified ORFs in S. cerevisiae. Such regions of the genome have typically been less well-studied, and by definition transcripts from these regions will distinguish S. cerevisiae from these closely related species.
Project description:BACKGROUND: Mismatched oligonucleotides are widely used on microarrays to differentiate specific from nonspecific hybridization. While many experiments rely on such oligos, the hybridization behavior of various degrees of mismatch (MM) structure has not been extensively studied. Here, we present the results of two large-scale microarray experiments on S. cerevisiae and H. sapiens genomic DNA, to explore MM oligonucleotide behavior with real sample mixtures under tiling-array conditions. RESULTS: We examined all possible nucleotide substitutions at the central position of 36-nucleotide probes, and found that nonspecific binding by MM oligos depends upon the individual nucleotide substitutions they incorporate: C-->A, C-->G and T-->A (yielding purine-purine mispairs) are most disruptive, whereas A-->X were least disruptive. We also quantify a marked GC skew effect: substitutions raising probe GC content exhibit higher intensity (and vice versa). This skew is small in highly-expressed regions (+/- 0.5% of total intensity range) and large (+/- 2% or more) elsewhere. Multiple mismatches per oligo are largely additive in effect: each MM added in a distributed fashion causes an additional 21% intensity drop relative to PM, three-fold more disruptive than adding adjacent mispairs (7% drop per MM). CONCLUSION: We investigate several parameters for oligonucleotide design, including the effects of each central nucleotide substitution on array signal intensity and of multiple MM per oligo. To avoid GC skew, individual substitutions should not alter probe GC content. RNA sample mixture complexity may increase the amount of nonspecific hybridization, magnify GC skew and boost the intensity of MM oligos at all levels.
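The design guidance in this abstract can be illustrated with a short sketch. It is not the authors' analysis code: the function names are hypothetical, the "central position" of an even-length probe is taken to be index len // 2 (an assumption), and the intensity model is a deliberately simple linear reading of the quoted per-mismatch drops (21% distributed, 7% adjacent).

```python
# Illustrative sketch only. The linear model and all names below are assumptions
# layered on the figures quoted in the abstract, not the published method.

GC = {"G", "C"}

# Per-mismatch intensity drop relative to the perfect match (PM), as quoted above.
DROP_PER_MM = {"distributed": 0.21, "adjacent": 0.07}


def gc_change(original: str, substitute: str) -> int:
    """Return +1 if the substitution raises GC content, -1 if it lowers it, 0 if neutral."""
    return int(substitute in GC) - int(original in GC)


def central_substitutions(probe: str):
    """Yield (mismatch_probe, label, gc_change) for every substitution at the central base.

    Assumption: the central base of an even-length probe is at index len(probe) // 2.
    """
    mid = len(probe) // 2
    ref = probe[mid]
    for base in "ACGT":
        if base != ref:
            mm_probe = probe[:mid] + base + probe[mid + 1:]
            yield mm_probe, f"{ref}-->{base}", gc_change(ref, base)


def expected_relative_intensity(n_mismatches: int, layout: str = "distributed") -> float:
    """Fraction of PM intensity under the assumed additive model, floored at zero."""
    return max(0.0, 1.0 - DROP_PER_MM[layout] * n_mismatches)


if __name__ == "__main__":
    probe = "A" * 18 + "C" + "T" * 17  # made-up 36-nt probe with a central C
    for mm_probe, label, delta in central_substitutions(probe):
        note = "GC-neutral" if delta == 0 else "changes GC content"
        print(label, note)
    print(expected_relative_intensity(2, "distributed"))
    print(expected_relative_intensity(2, "adjacent"))
```

Under this reading, only substitutions within the same class (G with C, or A with T) leave probe GC content unchanged, which is the property the conclusion recommends preserving.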
Project description:Traditional microarrays use probes complementary to known genes to quantitate the differential gene expression between two or more conditions. Genomic tiling microarray experiments differ in that probes that span a genomic region at regular intervals are used to detect the presence or absence of transcription. This difference means the same sets of biases and the methods for addressing them are unlikely to be relevant to both types of experiment. We introduce the informatics challenges arising in the analysis of tiling microarray experiments as open problems to the scientific community and present initial approaches for the analysis of this nascent technology.
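As a minimal sketch of what "probes that span a genomic region at regular intervals" means in practice, the following enumerates probe start positions at a fixed step. The function name and parameters are hypothetical; real tiling designs also filter probes for repeats, melting temperature and cross-hybridization, none of which is modeled here.

```python
# Illustrative sketch: place fixed-length probes along a sequence at a regular step.
# tile_probes, probe_length and step are hypothetical names chosen for this example.

def tile_probes(sequence: str, probe_length: int, step: int):
    """Return (start, probe) pairs for probes placed every `step` bases.

    Only probes that fit entirely within the sequence are returned.
    """
    if probe_length <= 0 or step <= 0:
        raise ValueError("probe_length and step must be positive")
    last_start = len(sequence) - probe_length
    return [
        (start, sequence[start:start + probe_length])
        for start in range(0, last_start + 1, step)
    ]


if __name__ == "__main__":
    for start, probe in tile_probes("ACGTACGTAC", probe_length=4, step=3):
        print(start, probe)
```

A step smaller than the probe length gives overlapping probes, while a larger step leaves gaps between probes; the step therefore sets the resolution at which transcription can be detected.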
Project description:BACKGROUND: High-density tiling microarrays are a powerful tool for the characterization of complete genomes. The two major computational challenges associated with custom-made arrays are design and analysis. Firstly, several genome dependent variables, such as the genome's complexity and sequence composition, need to be considered in the design to ensure a high quality microarray. Secondly, since tiling projects today very often exceed the limits of conventional array-experiments, researchers cannot use established computer tools designed for commercial arrays, and instead have to redesign previous methods or create novel tools. PRINCIPAL FINDINGS: Here we describe the multiple aspects involved in the design of tiling arrays for transcriptome analysis and detail the normalisation and analysis procedures for such microarrays. We introduce a novel design method to make two 280,000 feature microarrays covering the entire genome of the bacterial species Escherichia coli and Neisseria meningitidis, respectively, as well as the use of multiple copies of control probe-sets on tiling microarrays. Furthermore, a novel normalisation and background estimation procedure for tiling arrays is presented along with a method for array analysis focused on detection of short transcripts. The design, normalisation and analysis methods have been applied in various experiments and several of the detected novel short transcripts have been biologically confirmed by Northern blot tests. CONCLUSIONS: Tiling-arrays are becoming increasingly applicable in genomic research, but researchers still lack both the tools for custom design of arrays, as well as the systems and procedures for analysis of the vast amount of data resulting from such experiments. We believe that the methods described herein will be a useful contribution and resource for researchers designing and analysing custom tiling arrays for both bacteria and higher organisms.
Project description:We demonstrate the use of a chromosomal walk (or "tiling path") printed as DNA microarrays for mapping protein-DNA interactions across large regions of contiguous genomic DNA in Drosophila melanogaster. Microarrays were constructed with genomic DNA fragments 430-920 bp in length, covering 2.9 million base pairs of the Adh-cactus region of chromosome 2 and 85,000 base pairs of the 82F region of chromosome 3. We performed DNA localization mapping for the heterochromatin protein HP1 and for the sequence-specific GAGA transcription factor, producing a comprehensive, high-resolution map of in vivo protein-DNA interactions throughout these regions of the Drosophila genome.
Project description:RNase Y is a key endoribonuclease affecting global mRNA stability in Bacillus subtilis. Its characterization provided the first evidence that endonucleolytic cleavage plays a major role in the mRNA metabolism of this organism. RNase Y shares important functional features with the RNA decay initiating RNase E from Escherichia coli, notably a similar cleavage specificity and a preference for 5' monophosphorylated substrates. We used high-resolution tiling arrays to analyze the effect of RNase Y depletion on RNA abundance covering the entire genome. The data confirm that this endoribonuclease plays a key role in initiating the decay of a large number of mRNAs as well as non coding RNAs. The downstream cleavage products are likely to be degraded by the 5' exonucleolytic activity of RNases J1/J2 as we show for a specific case. Comparison of the data with that of two other recent studies revealed very significant differences. About two thirds of the mRNAs upregulated following RNase Y depletion were different when compared to either one of these studies and only about 10% were in common in all three studies. This highlights that experimental conditions and data analysis play an important role in identifying RNase Y substrates by global transcriptional profiling. Our data confirmed already known RNase Y substrates and due to the precision and reproducibility of the profiles allow an exceptionally detailed view of the turnover of hundreds of new RNA substrates.
Project description:Single-cell omics provide insight into cellular heterogeneity and function. Recent technological advances have accelerated single-cell analyses, but workflows remain expensive and complex. We present a method enabling simultaneous, ultra-high throughput single-cell barcoding of millions of cells for targeted analysis of proteins and RNAs. Quantum barcoding (QBC) avoids isolation of single cells by building cell-specific oligo barcodes dynamically within each cell. With minimal instrumentation (four 96-well plates and a multichannel pipette), cell-specific codes are added to each tagged molecule within cells through sequential rounds of classical split-pool synthesis. Here we show the utility of this technology in mouse and human model systems for as many as 50 antibodies to targeted proteins and, separately, >70 targeted RNA regions. We demonstrate that this method can be applied to multi-modal protein and RNA analyses. It can be scaled by expansion of the split-pool process and effectively renders sequencing instruments as versatile multi-parameter flow cytometers.