Project description:This data was generated by ENCODE. If you have questions about the data, contact the submitting laboratory directly (Jonathan Preall (Generation 0 Data from Hannon Lab), Carrie Davis (experimental), Alex Dobin (computational), Wei Lin (computational), Tom Gingeras (primary investigator)). If you have questions about the Genome Browser track associated with this data, contact ENCODE ( hg18: This data was produced by the Hannon lab, part of Cold Spring Harbor, as part of the ENCODE Project. The series depicts NextGen sequencing information for RNAs of 20-200 nt isolated from RNA samples from tissues or subcellular compartments of cell lines. hg19: This track depicts NextGen sequencing information for RNAs of 20-200 nt isolated from RNA samples from tissues or subcellular compartments of ENCODE cell lines. The overall goal of the ENCODE project is to identify and characterize all functional elements in the sequence of the human genome. hg19: This cloning protocol generates directional libraries that are read from the 5' ends of the inserts, which should largely correspond to the 5' ends of the mature RNAs. The libraries were sequenced on a Solexa platform for a total of 36, 50 or 76 cycles; however, the reads undergo post-processing that trims their 3' ends. Consequently, the mapped read lengths are variable. For data usage terms and conditions, please refer to and hg18: Small RNAs of 20-200 nt were RiboMinus-treated according to the manufacturer's protocol (Invitrogen) using custom LNA probes targeting ribosomal RNAs (some datasets are also depleted of U snRNAs and highly abundant microRNAs). The RNA was treated with Tobacco Alkaline Pyrophosphatase to eliminate any 5' cap structure. Poly-A Polymerase was used to catalyze the addition of C's to the 3' end. The 5' ends were phosphorylated using T4 PNK and an RNA linker was ligated onto the 5' end. 
Reverse transcription was carried out using a poly-G oligo with a defined 5' extension. The inserts were then amplified using oligos targeting the 5' linker and poly-G extension and containing sequencing adapters. The library was sequenced on an Illumina GA machine for a total of 36, 50 or 76 cycles. Initially, one lane was run. If an appreciable number of mappable reads was obtained, additional lanes were run. Sequence reads underwent quality filtration using the Illumina standard pipeline (GERALD). The read lengths may exceed the insert sizes and consequently introduce 3' adapter sequence into the 3' end of the reads. The 3' sequencing adapter was removed from the reads using a custom clipper program, which aligned the adapter sequence to the short reads, allowing up to 2 mismatches and no indels. Regions that aligned were "clipped" off from the read. The trimmed portions were collapsed into identical reads, their counts noted, and aligned to the human genome (NCBI build 36, hg18 unmasked) using Nexalign (Lassmann et al., not published). The alignment parameters were tuned to tolerate up to 2 mismatches with no indels and allowed trimmed portions as small as 5 nucleotides to be mapped. We report reads that mapped 10 or fewer times. Data obtained from each lane were processed and mapped independently. The processed/mapped data from each lane were then compiled as a single track without additional processing and submitted to UCSC. Consequently, identical reads within a lane were collapsed and their count is reported as the "transfrag" signal value. However, the redundancy between lanes has not been eliminated, so the same transfrag may appear multiple times within a signal. hg19: Small RNAs of 20-200 nt were RiboMinus-treated according to the manufacturer's protocol (Invitrogen) using custom LNA probes targeting ribosomal RNAs (some datasets are also depleted of U snRNAs and highly abundant microRNAs). 
The RNA was treated with Tobacco Alkaline Pyrophosphatase to eliminate any 5' cap structures. Poly-A Polymerase was used to catalyze the addition of C's to the 3' end. The 5' ends were phosphorylated using T4 PNK and an RNA linker was ligated onto the 5' end. Reverse transcription was carried out using a poly-G oligo with a defined 5' extension. The inserts were then amplified using oligos targeting the 5' linker and poly-G extension and containing sequencing adapters. The library was sequenced on an Illumina GA machine for a total of 36, 50 or 76 cycles. Initially, one lane was run. If an appreciable number of mappable reads was obtained, additional lanes were run. Sequence reads underwent quality filtration using the Illumina standard pipeline (GERALD). The Illumina reads were initially trimmed to discard any bases following a quality score less than or equal to 20 and converted into FASTA format, thereby discarding quality information for the rest of the pipeline. As a result, the sequence quality scores in the BAM output are all displayed as "40" to indicate no quality information. The read lengths may exceed the insert sizes and consequently introduce 3' adapter sequence into the 3' end of the reads. The 3' sequencing adapter was removed from the reads using a custom clipper program (available at, which aligned the adapter sequence to the short reads, allowing up to 2 mismatches and no indels. Regions that aligned were "clipped" off from the read. Terminal C nucleotides introduced at the 3' end of the RNA by the cloning procedure were also trimmed. The trimmed portions were collapsed into identical reads, their counts noted, and aligned to the human genome (version hg19, using the gender build appropriate to the sample in question - female/male) using Bowtie (Langmead B. et al.). The alignment parameters allowed 0, 1, or 2 mismatches, applied iteratively. We report reads that mapped 20 or fewer times. 
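The read-processing steps described above (quality trimming, adapter clipping with up to 2 mismatches and no indels, removal of terminal C's, and collapsing of identical reads) can be illustrated with a minimal Python sketch. This is not the custom clipper program itself. The adapter sequence, the `min_overlap` parameter, the function names, and the choice to drop the low-quality base along with everything after it are all assumptions made for illustration.

```python
from collections import Counter

# Hypothetical 3' adapter used only for this example; the real adapter
# sequence is not given in the description.
ADAPTER = "TGGAATTCTCGG"


def quality_trim(seq, quals, threshold=20):
    """Cut the read at the first base with quality <= threshold.

    Assumption: the low-quality base itself is discarded as well.
    """
    for i, q in enumerate(quals):
        if q <= threshold:
            return seq[:i]
    return seq


def clip_adapter(read, adapter=ADAPTER, max_mismatches=2, min_overlap=5):
    """Clip the read at the leftmost ungapped adapter alignment.

    The adapter (or its prefix, when it runs off the read end) is compared
    position by position, so no indels are possible. min_overlap is an
    illustrative guard against very short spurious matches.
    """
    for start in range(len(read)):
        overlap = min(len(adapter), len(read) - start)
        if overlap < min_overlap:
            break
        mismatches = sum(
            1 for a, b in zip(read[start:start + overlap], adapter[:overlap]) if a != b
        )
        if mismatches <= max_mismatches:
            return read[:start]
    return read


def trim_terminal_c(seq):
    """Remove the run of C's added to the 3' end during cloning."""
    return seq.rstrip("C")


def collapse(reads):
    """Collapse identical sequences and record how often each occurred."""
    return Counter(r for r in reads if r)


def process(reads):
    """Run clipping and C-trimming on each read, then collapse the results."""
    return collapse(trim_terminal_c(clip_adapter(r)) for r in reads)
```

For example, `process(["AAGGAAGGAAGGCCCTGGAATTCTC", "AAGGAAGGAAGGCCCTGGAATTGTC"])` collapses both reads (the second carries one adapter mismatch) into the single insert `AAGGAAGGAAGG` with a count of 2. Note that trimming terminal C's in this way also removes any genuine C's at the 3' end of the RNA.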
Discrepancies between hg18 and hg19 versions of CSHL small RNA data: The alignment pipeline for the CSHL small RNA data was updated upon the release of the human genome version hg19, resulting in a few noteworthy discrepancies with the hg18 dataset. First, mapping was conducted with the open-source Bowtie algorithm ( rather than the custom Nexalign software. Because the two algorithms use different alignment strategies, the mapping results may vary even in genomic regions that do not differ between builds. The read processing pipeline also differs slightly, in that we no longer retain information on whether adapter sequence was 'clipped' from a read.
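To make the mapping criteria concrete, the following toy Python sketch shows one way to read "0, 1, or 2 mismatches iteratively" together with a multi-mapping cutoff (20 hits for hg19, 10 for hg18). It uses a naive forward-strand scan over a short reference string and is neither Bowtie nor Nexalign. The interpretation that the search stops at the lowest mismatch level yielding any hit is an assumption, as are the function names.

```python
def hamming_hits(read, reference, max_mismatches):
    """Return start positions where the read aligns without indels
    and with at most max_mismatches mismatches (forward strand only)."""
    hits = []
    for pos in range(len(reference) - len(read) + 1):
        window = reference[pos:pos + len(read)]
        mismatches = sum(1 for a, b in zip(read, window) if a != b)
        if mismatches <= max_mismatches:
            hits.append(pos)
    return hits


def map_iteratively(read, reference, max_hits=20):
    """Try 0, then 1, then 2 mismatches; stop at the first level with hits.

    Returns (mismatch_level, positions), or None when the read is unmapped
    or maps to more than max_hits locations.
    """
    for allowed in (0, 1, 2):
        hits = hamming_hits(read, reference, allowed)
        if hits:
            return (allowed, hits) if len(hits) <= max_hits else None
    return None
```

With the toy reference `"ACGTACGTTTGACCA"`, the read `"TTGAC"` maps exactly at position 8, while `"TTGAG"` maps only once one mismatch is allowed. Passing `max_hits=10` mimics the stricter hg18 reporting threshold.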
Project description:Comparison of the physiological and molecular responses of the diatom Skeletonema costatum to gradual ocean acidification (GC) and one-way ocean acidification (HC)
Project description:This data was generated by ENCODE. If you have questions about the data, contact the submitting laboratory directly (Carrie Davis (experimental), Roderic Guigo and lab (data processing) and Tom Gingeras (primary investigator)). If you have questions about the Genome Browser track associated with this data, contact ENCODE ( These tracks were generated by the ENCODE Consortia. They contain information about mouse RNAs > 200 nucleotides in length obtained as short reads off the Illumina platform. Data are available from biological replicates. For data usage terms and conditions, please refer to and Tissue Samples: Individual tissues were harvested from mouse strain C57BL/6NJ at different timepoints according to ENCODE cell culture protocols. Whenever possible, biological replicates were taken from littermates. Library Preparation: The published cDNA sequencing protocol was used. This protocol generates directional libraries and reports the transcripts' strand of origin. Exogenous RNA spike-ins were added to each endogenous RNA isolate and carried through library construction and sequencing. The spike-in sequences and concentrations are available for download in the supplemental directory. Sequencing and Mapping: The libraries were sequenced on the Illumina platform (either GAIIx or Hi-Seq) in paired-end fashion (either paired-end 76 or paired-end 101) to an average depth of 100 million mate-pairs. The data were mapped against hg19 using Spliced Transcript Alignment and Reconstruction (STAR) written by Alex Dobin (CSHL). More information about STAR, including the parameters used for these data, is available from the Gingeras lab. Verification: FPKM (fragments per kilobase of exon per million fragments mapped) values were calculated for annotated exons, and Spearman correlation coefficients were computed between replicates. In general, rho values are > 0.90 between biological replicates.
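As an illustration of the verification step, the sketch below computes FPKM from its definition (fragments per kilobase of exon per million fragments mapped) and a Spearman correlation between two replicates using only the Python standard library. It is a minimal example under stated assumptions, not the pipeline used for these data; the function names and the tie-handling by average ranks are choices made here.

```python
import math


def fpkm(fragments, exon_length_bp, total_mapped_fragments):
    """Fragments per kilobase of exon per million fragments mapped."""
    return fragments * 1e9 / (exon_length_bp * total_mapped_fragments)


def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def spearman(x, y):
    """Spearman rho: the Pearson correlation of the rank vectors."""
    return pearson(average_ranks(x), average_ranks(y))
```

For instance, an exon of 2,000 bp covered by 100 fragments in a library of 10 million mapped fragments has `fpkm(100, 2000, 10_000_000) == 5.0`. Calling `spearman` on the per-exon FPKM vectors of two biological replicates gives the rho value that the description compares against 0.90.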
Project description:In this study we investigated how changes in pH and ocean chemistry consistent with the scenarios of the Intergovernmental Panel on Climate Change (IPCC) drive major changes in gene expression, respiration, photosynthesis and symbiosis of the coral Acropora millepora, long before they affect biomineralization. Changes in gene expression were consistent with metabolic suppression, an increase in oxidative stress, apoptosis and symbiont loss. Other expression patterns demonstrated up-regulation of membrane transporters, as well as the regulation of genes involved in membrane cytoskeletal interactions and cytoskeletal remodeling. These widespread changes in gene expression emphasize the need to expand future studies of ocean acidification to include a wider spectrum of cellular processes, many of which may occur well before impacts on calcification. We applied a reference microarray design to a three-condition ocean acidification experiment (control pH 8.0-8.2, medium pH 7.8-7.9 and high pH 7.6-7.7) sampled at three time points (time zero, day 1 and day 28). Samples from time zero and control treatments were used to generate the reference sample for the microarray hybridization experiments. A total of 27 microarrays were used in the entire experiment, with 3 biological replicates per treatment and time point. The reference sample on each array was labeled with Cy3, and the experimental sample with Cy5.
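The array count follows directly from the design: 3 treatments x 3 time points x 3 biological replicates = 27 arrays. The short Python sketch below enumerates that layout and shows the log2(Cy5/Cy3) ratio commonly used to express an experimental sample relative to the reference in two-colour designs. The dictionary layout, the function names, and the use of a log2 ratio here are illustrative assumptions; the description does not state how the intensities were summarised.

```python
import math
from itertools import product

TREATMENTS = ["control", "medium", "high"]
TIMEPOINTS = ["time zero", "day 1", "day 28"]
REPLICATES = [1, 2, 3]


def build_design():
    """One array per treatment, time point and biological replicate.

    Each array pairs a Cy5-labeled experimental sample with the
    Cy3-labeled common reference.
    """
    return [
        {"treatment": t, "timepoint": tp, "replicate": r, "cy5": "sample", "cy3": "reference"}
        for t, tp, r in product(TREATMENTS, TIMEPOINTS, REPLICATES)
    ]


def log2_ratio(cy5_intensity, cy3_intensity):
    """Expression of the sample relative to the reference on one spot."""
    return math.log2(cy5_intensity / cy3_intensity)
```

Here `len(build_design())` is 27, and a spot with a Cy5 intensity twice its Cy3 intensity gives `log2_ratio(200, 100) == 1.0`.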