Project description:RNA-Seq is a powerful tool for transcriptome profiling, but it is hampered by sequence-dependent bias and by inaccuracy at low copy numbers, both intrinsic to exponential PCR amplification. We developed a simple strategy for mitigating these complications, allowing truly digital RNA-Seq. Following reverse transcription, a large set of barcode sequences is added in excess, so that nearly every cDNA molecule is uniquely labeled by random attachment of barcode sequences to both ends. After PCR, we applied paired-end deep sequencing to read the two barcodes and the cDNA sequence. Rather than counting the number of reads, RNA abundance is measured by the number of unique barcode sequences observed for a given cDNA sequence. We optimized the barcodes to be unambiguously identifiable even in the presence of multiple sequencing errors. This method allows counting with single-copy resolution despite sequence-dependent bias and PCR amplification noise; it is analogous to digital PCR but amenable to quantifying a whole transcriptome. We demonstrated transcriptome profiling of E. coli with more accurate and reproducible quantification than conventional RNA-Seq.
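The counting step described above can be illustrated with a minimal sketch. It is not the authors' pipeline: the barcode set `KNOWN`, the one-mismatch tolerance, the toy reads, and the function names are all hypothetical, and reads are assumed to be already assigned to a transcript. The sketch maps each observed barcode to the single known barcode within the allowed Hamming distance, then counts distinct barcode pairs per transcript instead of counting reads.

```python
from collections import defaultdict


def hamming(a, b):
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def correct_barcode(observed, known_barcodes, max_mismatches=1):
    """Return the only known barcode within max_mismatches, else None."""
    hits = [
        b for b in known_barcodes
        if len(b) == len(observed) and hamming(observed, b) <= max_mismatches
    ]
    return hits[0] if len(hits) == 1 else None


def digital_counts(reads, known_barcodes, max_mismatches=1):
    """Count unique (left, right) barcode pairs per transcript.

    reads: iterable of (transcript_id, left_barcode, right_barcode).
    Reads whose barcodes cannot be assigned unambiguously are dropped.
    """
    molecules = defaultdict(set)
    for transcript, left, right in reads:
        l = correct_barcode(left, known_barcodes, max_mismatches)
        r = correct_barcode(right, known_barcodes, max_mismatches)
        if l is None or r is None:
            continue
        molecules[transcript].add((l, r))
    return {t: len(pairs) for t, pairs in molecules.items()}


# Hypothetical barcode set and reads, for illustration only.
KNOWN = ["AAAAAA", "CCCCCC", "GGGGGG", "TTTTTT"]
READS = [
    ("geneA", "AAAAAA", "CCCCCC"),
    ("geneA", "AAAAAA", "CCCCCC"),  # PCR duplicate of the first molecule
    ("geneA", "AAAAAT", "CCCCCC"),  # same molecule, one sequencing error
    ("geneA", "GGGGGG", "TTTTTT"),  # a second, distinct molecule
    ("geneB", "CCCCCC", "AAAAAA"),
    ("geneB", "ACGTAC", "AAAAAA"),  # left barcode unassignable, dropped
]

counts = digital_counts(READS, KNOWN)
print(counts)  # {'geneA': 2, 'geneB': 1}
```

In this toy input geneA has four reads but only two distinct barcode pairs, so the digital count is 2; read counting would have reported 4.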
Project description:The Tibellus genus spider is an active hunter that does not spin webs, and its venom composition remains largely uninvestigated. Here, we combine cDNA transcriptome analysis of the venom glands with venom proteome analysis to characterize the composition of Tibellus genus spider venom.
Project description:While a first draft of the equine genome is available and predictions have been made regarding the resulting genes and proteins, little is known about the actual transcriptome. So far, published expressed sequence tags (ESTs) from different horse tissues have generally been rather short (≤600 bp) and poorly annotated, reflecting the problem that good cDNA libraries are very difficult to analyse. In this approach, we aimed to establish and analyse a normalised immune cell cDNA library (using freshly isolated and activated lymphocytes, NK cells, monocytes and DC). In particular, we wanted to test next-generation sequencing combined with a series of bioinformatic approaches. The resulting cDNA library contained 2 × 10^7 clones, of which 1056 were used for initial Sanger sequencing and 4 × 10^6 for the deep sequencing analysis. Through the latter we obtained more than 29,000 sequences, for which more than 5000 matches were found in the equine reference sequences. Additionally, we identified more than 3500 sequences that matched both non-equine RNA sequences and the equine genome. Among these we find both extensions of existing RefSeq models and novel mRNAs. Less than 2% of sequences did not have any match in the mentioned databases.
Project description:Sequencing technologies together with new bioinformatics tools have led to the complete sequencing of various genomes. However, the human transcriptome and its annotation are yet to be completed. The Human Cancer Genome Project, using the ORESTES (open reading frame EST sequences) methodology, contributed to this major objective by generating data from about 1.2 million expressed sequence tags (ESTs). Approximately 30% of these sequences did not align to ESTs in the public databases and were considered no-match ORESTES. On the basis that a set of these ESTs could represent new transcripts, we constructed a cDNA microarray. This platform was hybridized against 12 different normal or tumor tissues. We identified 3,421 transcribed regions not associated with annotated transcripts, representing 83.3% of the platform. A total of 1,007 sequences were differentially expressed, with at least a two-fold difference between tumor and normal samples in one or more tissues. Also, about 28% of the analyzed sequences could represent non-coding RNAs (ncRNAs). Our data reinforce the view that the human genome is pervasively transcribed and point to candidate molecular markers for different cancers. To validate our data, we confirmed by real-time PCR the differential expression of 3 out of 8 potential tumor markers in prostate tissues. A list of the 1,007 differentially expressed sequences, as well as the 291 potentially non-coding molecular markers for different tumors, was provided.
Project description:Transcripts from the human and mouse CAPN10 cDNA sequences cloned into pcDNA and expressed in cultured cells undergo unexpected splicing. Peptide sequences missing from the spliced version of CAPN10 expressed in COS7 cells were determined by MS analysis.