Project description:The Zika outbreak, spread by the Aedes aegypti mosquito, highlights the need to create high-quality assemblies of large genomes in a rapid and cost-effective fashion. Here, we combine Hi-C data with existing draft assemblies to generate chromosome-length scaffolds. We validate this method by assembling a human genome, de novo, from short reads alone (67X coverage, Sample GSM1551550). We then combine our method with draft sequences to create genome assemblies of the mosquito disease vectors Aedes aegypti and Culex quinquefasciatus, each consisting of three scaffolds corresponding to the three chromosomes in each species. These assemblies indicate that virtually all genomic rearrangements among these species occur within, rather than between, chromosome arms. The genome assembly procedure we describe is fast, inexpensive, accurate, and can be applied to many species.
Project description:Short-read DNA sequencing technologies provide new tools to answer biological questions. However, high cost and low throughput limit their widespread use, particularly in organisms with smaller genomes such as S. cerevisiae. Although ChIP-Seq in mammalian cell lines is replacing array-based ChIP-chip as the standard for transcription factor binding studies, ChIP-Seq in yeast is still underutilized compared to ChIP-chip. We developed a multiplex barcoding system that allows simultaneous sequencing and analysis of multiple samples using Illumina’s platform. We applied this method to analyze the chromosomal distributions of three yeast DNA binding proteins (Ste12, Cse4 and RNA Pol II) and a reference sample (input DNA) in a single experiment and demonstrate its utility for rapid and accurate results at reduced costs. We developed a barcoding ChIP-Seq method for the concurrent analysis of transcription factor binding sites in yeast. Our multiplex strategy generated high-quality data that were indistinguishable from data obtained with non-barcoded libraries. None of the barcoded adapters induced differences relative to a non-barcoded adapter when applied to the same DNA sample. We used this method to map the binding sites for Cse4, Ste12 and Pol II throughout the yeast genome and found 148 binding targets for Cse4, 823 targets for Ste12 and 2508 targets for Pol II. Cse4 was strongly bound to all yeast centromeres, as expected, and the remaining non-centromeric targets correspond to highly expressed genes in rich media, the latter constituting a novel finding. We designed a multiplex short-read DNA sequencing method to perform efficient ChIP-Seq in yeast and other small-genome model organisms. This method produces accurate results with higher throughput and reduced cost. Given constant improvements in high-throughput sequencing technologies, increased multiplexing will be possible, further decreasing costs per sample and accelerating the completion of large consortium projects such as modENCODE.
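The multiplexing idea in the description above can be illustrated with a minimal sketch of barcode-based read sorting. The description does not give the barcode sequences, their length, or their position in the read, so everything here is an assumption for illustration: the four-base barcodes are invented, they are assumed to sit at the start of each read, and matching is exact. The function name `demultiplex` is hypothetical and is not the authors' pipeline.

```python
from collections import defaultdict

# Hypothetical barcode-to-sample table. The sequences are invented for
# illustration; the description does not list the real barcodes.
BARCODES = {
    "ACGT": "Ste12",
    "TGCA": "Cse4",
    "GATC": "PolII",
    "CTAG": "input",
}


def demultiplex(reads, barcodes=BARCODES):
    """Group reads by an exact-match barcode at the start of each read.

    Returns a dict mapping sample name -> list of reads with the barcode
    removed. Reads without a matching barcode go under "unassigned".
    """
    # Assumption: all barcodes share one length.
    length = len(next(iter(barcodes)))
    bins = defaultdict(list)
    for read in reads:
        sample = barcodes.get(read[:length])
        if sample is None:
            bins["unassigned"].append(read)
        else:
            bins[sample].append(read[length:])
    return dict(bins)
```

Calling `demultiplex(["ACGTTTTT", "TGCAGGGG"])` would place the first read (trimmed to `TTTT`) under `Ste12` and the second under `Cse4`. After this step each sample can be aligned and analyzed as if it had been sequenced on its own.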
Project description:RNA-protein interactions are central to biological regulation. Cross-linking immunoprecipitation (CLIP)-seq is a powerful tool for genome-wide interrogation of RNA-protein interactomes, but current CLIP methods are limited by challenging biochemical steps and fail to detect many classes of noncoding and non-human RNAs. Here we present FAST-iCLIP, an integrated pipeline with improved CLIP biochemistry and an automated informatic pipeline for comprehensive analysis across protein coding, noncoding, repetitive, retroviral, and non-human transcriptomes. FAST-iCLIP of Poly-C binding protein 2 (PCBP2) showed that PCBP2 bound CU-rich motifs in different topologies to recognize mRNAs and noncoding RNAs with distinct biological functions. FAST-iCLIP of PCBP2 in hepatitis C virus-infected cells enabled a joint analysis of the PCBP2 interactome with host and viral RNAs and their interplay. These results show that FAST-iCLIP can be used to rapidly discover and decipher mechanisms of RNA-protein recognition across the diversity of human and pathogen RNAs. Characterization of non-coding and pathogen RNA-protein interactions using an automated computational pipeline and improved iCLIP biochemistry.
Project description:To assess compatibility in sequence analysis, we compared results from Sanger sequencing (with sequencing threshold >15%) and Next Generation Sequencing (with sequencing threshold >5%). In total, 60 patients were included in this part of the study. Here we demonstrate that deep sequencing is a reliable tool for the fast and accurate identification of low-level viral quasispecies.
Project description:To assess compatibility in sequence analysis, we compared results from Sanger sequencing (with sequencing threshold >15%) and Next Generation Sequencing (with sequencing threshold >5%). In total, 48 patients were included in this part of the study, but sequencing failed for one sample. Here we demonstrate that deep sequencing is a reliable tool for the fast and accurate identification of low-level viral quasispecies.
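The effect of the two detection thresholds quoted above (>15% for Sanger sequencing, >5% for Next Generation Sequencing) can be sketched as a simple comparison. This is only an illustration under stated assumptions: each variant is assumed to have a known frequency in the viral population, the variant labels are placeholders, and the function name `compare_thresholds` is hypothetical. It is not the analysis used in the study.

```python
SANGER_THRESHOLD = 0.15  # from the description: Sanger threshold >15%
NGS_THRESHOLD = 0.05     # from the description: NGS threshold >5%


def compare_thresholds(variant_freqs):
    """Split variants by which detection threshold they exceed.

    variant_freqs: dict mapping a variant label -> frequency (0..1).
    Returns (both, ngs_only), each a sorted list of labels:
      both     -> above the Sanger threshold (and so above the NGS one)
      ngs_only -> above the NGS threshold but not the Sanger threshold
    """
    both = sorted(
        v for v, f in variant_freqs.items() if f > SANGER_THRESHOLD
    )
    ngs_only = sorted(
        v for v, f in variant_freqs.items()
        if NGS_THRESHOLD < f <= SANGER_THRESHOLD
    )
    return both, ngs_only
```

With placeholder input such as `{"var_A": 0.40, "var_B": 0.08, "var_C": 0.03}`, `var_A` would be reported by both approaches, `var_B` only at the lower NGS threshold, and `var_C` by neither. The `ngs_only` group corresponds to the low-level quasispecies the description refers to.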
Project description:Bisulfite sequencing is a valuable tool for mapping the position of 5-methylcytosine in the genome at single-base resolution. However, the associated chemistry renders the majority of DNA fragments unsequenceable, thus necessitating PCR amplification. Furthermore, bisulfite conversion generates an A,T-rich DNA library that leads to major PCR biases that may confound methylation analysis. Here we report a method that enables accurate methylation analysis by rebuilding the damaged DNA library after bisulfite treatment. This recovery after bisulfite treatment (ReBuilT) approach enables PCR-free bisulfite sequencing from low nanogram quantities of genomic DNA. We applied the ReBuilT method to whole-methylome analysis of the A,T-rich genome of Plasmodium berghei. We demonstrate substantial improvements in coverage and a reduction of sequence-context biases compared to classical methylome analysis. Our method will be widely applicable for accurate, quantitative methylation analysis, even for technically challenging genomes and where limited sample DNA is available. From the same DNA sample we prepared 3 PCR-free Bisulfite-Seq replicates (ReBuilT) and 2 standard Bisulfite-Seq replicates (PCR-BS).
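The reason bisulfite conversion yields an A,T-rich library can be shown with a small simulation of the general chemistry: unmethylated cytosines are read as thymine after conversion, while methylated cytosines remain cytosine. The sketch below illustrates only that general principle and the usual per-site methylation estimate (C reads divided by C plus T reads). The function names are hypothetical and nothing here reproduces the ReBuilT protocol itself.

```python
def bisulfite_convert(seq, methylated_positions):
    """Simulate bisulfite conversion of one DNA strand.

    Unmethylated C is read as T; C at a methylated position stays C.
    methylated_positions: iterable of 0-based indices into seq.
    """
    methylated = set(methylated_positions)
    return "".join(
        "T" if base == "C" and i not in methylated else base
        for i, base in enumerate(seq)
    )


def methylation_level(c_count, t_count):
    """Fraction of reads supporting methylation at one cytosine.

    c_count: reads showing C (protected, i.e. methylated)
    t_count: reads showing T (converted, i.e. unmethylated)
    Returns None when the site has no coverage.
    """
    total = c_count + t_count
    return c_count / total if total else None
```

For example, `bisulfite_convert("ACGCT", [1])` keeps the methylated cytosine at index 1 and converts the one at index 3. Because most cytosines in a genome are unmethylated, the converted library loses most of its C content, which is the source of the A,T-rich bias mentioned in the description.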