Project description:Purpose: Tag-based global gene expression analysis has been revolutionised by the advent of next-generation sequencing (NGS) technology. The aim of the present study is to present a comprehensive view of differentially expressed genes under cold and freeze stress in seabuckthorn (Hippophae rhamnoides L.). Methods: DeepSAGE, a tag-based approach, was used to identify differentially expressed genes under cold and freeze treatments in seabuckthorn. Thirty-day-old plantlets at the six-leaf stage were subjected to cold stress (CS) at 4°C or freeze stress (FS) at -10°C for 6 hr. Seedlings grown at 28°C were taken as the control (CON). Total RNA was isolated from all three samples. The Illumina Gene Expression Sample Prep Kit and Solexa Sequencing Chip (flowcell) were used for tag preparation, and the main sequencing instruments were the Illumina Cluster Station and the Illumina HiSeq™ 2000 System. Bioinformatics analysis identified a large number of differentially expressed genes under cold and freeze stress. Results: 36.2 million raw tags, including 13.9 million distinct tags, were generated from three leaf tissue libraries (control, cold stress and freeze stress). After removing low-quality tags, 35.5 million clean tags, including 7 million distinct clean tags, were obtained. In total, 11922 differentially expressed genes (DEGs) were identified, comprising 6539 up-regulated and 5383 down-regulated genes. Conclusions: The DeepSAGE data provide a useful resource and reference dataset for further functional genomics analysis in seabuckthorn and other important crops. The present study implicated a large number of genes with diverse biological functions that are differentially expressed in response to cold and freeze stress.
Isolation and further characterization of these genes will help researchers understand their role in cold and freeze tolerance in seabuckthorn and may provide important gene resources for the future development of stress-tolerant crop plants.
Project description:Structural variation has played an important role in the evolutionary restructuring of human and great ape genomes. We generated approximately 10-fold genomic sequence coverage from a western lowland gorilla and integrated these data into a physical and cytogenetic framework to develop a comprehensive view of structural variation. We discovered and validated over 7,665 structural changes within the gorilla lineage including sequence resolution of inversions, deletions, duplications and retrotranspositions. A comparison with human and other ape genomes shows that the gorilla genome has been subjected to the highest rate of segmental duplication. We show that both the gorilla and chimpanzee genomes have experienced independent yet parallel patterns of structural mutation that have not occurred in humans, including the formation of subtelomeric heterochromatic caps, the hyperexpansion of segmental duplications and bursts of retroviral integrations. Our analysis suggests that the chimpanzee and gorilla genomes are structurally more derived than either orangutan or human.
Project description:Prokaryotic genome annotation is highly dependent on automated methods, as manual curation cannot keep up with the exponential growth of sequenced genomes. Current automated techniques depend heavily on sequence context and often underestimate the complexity of the proteome. We developed REPARATION (RibosomeE Profiling Assisted (Re-)AnnotaTION), a de novo algorithm that takes advantage of experimental evidence from ribosome profiling (Ribo-seq) to delineate translated open reading frames (ORFs) in bacteria, independent of genome annotation. Ribo-seq is a next-generation sequencing technique that provides a genome-wide snapshot of the positions of translating ribosomes along mRNAs at the time of the experiment. REPARATION evaluates all possible ORFs in the genome and estimates minimum thresholds to screen out spurious ORFs based on a growth curve model. We applied REPARATION to three annotated bacterial species to obtain a more comprehensive mapping of their translation landscape, supported by experimental data. In all cases, we identified hundreds of novel ORFs, including variants of previously annotated ORFs and novel small ORFs (<71 codons). Our predictions were supported by matching mass spectrometry (MS) proteomics data and sequence conservation analysis. REPARATION is unique in that it makes use of experimental Ribo-seq data to perform de novo ORF delineation in bacterial genomes, and thus can identify putative coding ORFs irrespective of the sequence context of the reading frame.
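To illustrate what "evaluates all possible ORFs in the genome" can mean in practice, the following Python sketch enumerates every start-to-stop ORF on the forward strand of a sequence. It is not REPARATION's implementation: the function name `find_orfs`, the `min_codons` cutoff, and the set of start codons are assumptions made for illustration. The Ribo-seq-based scoring and the growth-curve thresholds described above are not reproduced.

```python
# Illustrative sketch only: enumerates candidate ORFs on the forward strand.
# The start-codon set and the min_codons cutoff are assumptions, not values
# taken from the REPARATION description.
START_CODONS = {"ATG", "GTG", "TTG"}
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(seq, min_codons=10):
    """Return sorted (start, end, n_codons) tuples for all start-to-stop ORFs.

    start is 0-based inclusive; end is exclusive and includes the stop codon;
    n_codons counts coding codons only (the stop codon is excluded).
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        open_starts = []  # start positions seen since the last in-frame stop
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if codon in START_CODONS:
                open_starts.append(i)
            elif codon in STOP_CODONS:
                # Every pending start shares this stop codon.
                for s in open_starts:
                    n_codons = (i - s) // 3
                    if n_codons >= min_codons:
                        orfs.append((s, i + 3, n_codons))
                open_starts = []
    return sorted(orfs)
```

A full scan would also run the same function on the reverse complement. Each candidate would then need to be scored against the Ribo-seq signal, which is the step where the method described above departs from purely sequence-based annotation.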
Project description:Over 2000 publicly accessible human and mouse ChIP-Seq datasets for about 250 transcription factors (TFs) and chromatin complexes from various databases (ENCODE, GEO) were mapped to custom-made human and mouse genomes containing a reference rDNA sequence of the appropriate species (GenBank U13369.1 for human, BK000964.3 for mouse). The read mapping density across the rDNA sequence was then extracted and normalized to the median in that dataset. Unbiased clustering and analysis, followed by curation, was performed to identify high-confidence patterns of rDNA occupancy for numerous hematopoietic TFs and TF families at canonical TF motif sequences. Data processing steps: FASTQs were trimmed using Trimmomatic with the following parameters: LEADING:3 TRAILING:3 SLIDINGWINDOW:4:15 MINLEN:30. Reads were mapped to customized genomes (containing the additional rDNA sequence) using Bowtie2 with the following parameter: -X 2000. Read density across the rDNA sequence was extracted using igvtools.
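The median normalization step can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the function name `normalize_to_median` is hypothetical, the input is assumed to be a list of read-density values for one dataset, and the median is assumed to be taken over that same list. The description does not say exactly which values the median is computed over.

```python
from statistics import median


def normalize_to_median(density):
    """Divide each read-density value by the median of the profile.

    Assumption: `density` holds the read-density values of one dataset and
    the median is taken over those same values.
    """
    med = median(density)
    if med == 0:
        # A zero median would make the ratio undefined.
        raise ValueError("median density is zero; cannot normalize")
    return [d / med for d in density]
```

After this scaling, a value of 1.0 marks a position at the dataset's median density, which makes profiles from datasets with different sequencing depths comparable before clustering.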