Project description:Single-cell genome sequencing has proven valuable for the detection of somatic variation, particularly in the context of tumor evolution. Current technologies suffer from high library construction costs, which restrict the number of cells that can be assessed and thus impose limitations on the ability to measure heterogeneity within a tissue. Here, we present single-cell combinatorial indexed sequencing (SCI-seq) as a means of simultaneously generating thousands of low-pass single-cell libraries for detection of somatic copy-number variants. We constructed libraries for 16,698 single cells from a combination of cultured cell lines, primate frontal cortex tissue and two human adenocarcinomas, and obtained a detailed assessment of subclonal variation within a pancreatic tumor.
Project description:Identifying genomic features that differ between individuals and cells can help uncover the functional variants that drive phenotypes and disease susceptibilities. For this, single-cell studies are paramount, as it becomes increasingly clear that the contribution of rare but functional cellular subpopulations is important for disease prognosis, management, and progression. Until now, studying these associations has been challenged by our inability to map structural rearrangements accurately and comprehensively. To overcome this, we coupled single-cell sequencing of DNA template strands (Strand-seq) with custom analysis software to rapidly discover, map, and genotype genomic rearrangements at high resolution. This allowed us to explore the distribution and frequency of inversions in a heterogeneous cell population, identify several polymorphic domains in complex regions of the genome, and locate rare alleles in the reference assembly. We then mapped the entire genomic complement of inversions within two unrelated individuals to characterize their distinct inversion profiles and built a nonredundant global reference of structural rearrangements in the human genome. The work described here provides a powerful new framework to study structural variation and genomic heterogeneity in single-cell samples, whether from individuals for population studies or tissue types for biomarker discovery.
Project description:BackgroundWhole genome sequencing of viruses and bacteriophages is often hindered because of the need for large quantities of genomic material. A method is described that combines single plaque sequencing with an optimization of Sequence Independent Single Primer Amplification (SISPA). This method can be used for de novo whole genome next-generation sequencing of any cultivable virus without the need for large-scale production of viral stocks or viral purification using centrifugal techniques.MethodsA single viral plaque of a variant of the 2009 pandemic H1N1 human Influenza A virus was isolated and amplified using the optimized SISPA protocol. The sensitivity of the SISPA protocol presented here was tested with bacteriophage F_HA0480sp/Pa1651 DNA. The amplified products were sequenced with 454 and Illumina HiSeq platforms. Mapping and de novo assemblies were performed to analyze the quality of data produced from this optimized method.ResultsAnalysis of the sequence data demonstrated that from a single viral plaque of Influenza A, a mapping assembly with 3590-fold average coverage representing 100% of the genome could be produced. The de novo assembled data produced contigs with 30-fold average sequence coverage, representing 96.5% of the genome. Using only 10 pg of starting DNA from bacteriophage F_HA0480sp/Pa1651 in the SISPA protocol resulted in sequencing data that gave a mapping assembly with 3488-fold average sequence coverage, representing 99.9% of the reference and a de novo assembly with 45-fold average sequence coverage, representing 98.1% of the genome.ConclusionsThe optimized SISPA protocol presented here produces amplified product that when sequenced will give high quality data that can be used for de novo assembly. The protocol requires only a single viral plaque or as little as 10 pg of DNA template, which will facilitate rapid identification of viruses during an outbreak and viruses that are difficult to propagate.
Project description:BackgroundThe short reads output by first- and second-generation DNA sequencing instruments cannot completely reconstruct microbial chromosomes. Therefore, most genomes have been left unfinished due to the significant resources required to manually close gaps in draft assemblies. Third-generation, single-molecule sequencing addresses this problem by greatly increasing sequencing read length, which simplifies the assembly problem.ResultsTo measure the benefit of single-molecule sequencing on microbial genome assembly, we sequenced and assembled the genomes of six bacteria and analyzed the repeat complexity of 2,267 complete bacteria and archaea. Our results indicate that the majority of known bacterial and archaeal genomes can be assembled without gaps, at finished-grade quality, using a single PacBio RS sequencing library. These single-library assemblies are also more accurate than typical short-read assemblies and hybrid assemblies of short and long reads.ConclusionsAutomated assembly of long, single-molecule sequencing data reduces the cost of microbial finishing to $1,000 for most genomes, and future advances in this technology are expected to drive the cost lower. This is expected to increase the number of completed genomes, improve the quality of microbial genome databases, and enable high-fidelity, population-scale studies of pan-genomes and chromosomal organization.
Project description:The overall purpose of this study is to describe the cellular composition of the human colon and its gene expression using scRNAseq and scATACseq methods. This will potentially provide is with a detailed map of the colon aiding our understanding of how diseases of the colon develop as well as the colons influence on systemic diseases such as type II diabetes.
Project description:This study describes the combined sequencing of the genomes and transcriptomes of single blastomeres from mouse 8-cell stage embryos.
Project description:This study (McConnell, et al. Science 2012) used both SNP array and sequencing data to examine copy number variation in neuronal genomes. Encolsed here are the SNP Array data from the 42 fibroblasts, 19 human induced pluripotent stem cell (hiPSC)-derived neural progenitor cells (NPCs), and 40 hiPSC-derived neurons that were reported in the manuscript. Copy number analysis was performed on .CEL files using Partek Genomics Suite with a custom single cell reference file.
Project description:Next-generation sequencing has become the most widely used sequencing technology in genomics research, but it has inherent drawbacks when dealing with high-GC content genomes. Recently, single-molecule real-time sequencing technology (SMRT) was introduced as a third-generation sequencing strategy to compensate for this drawback. Here, we report that the unbiased and longer read length of SMRT sequencing markedly improved genome assembly with high GC content via gap filling and repeat resolution.
Project description:A central challenge in sequencing single-cell genomes is the accurate determination of point mutations, phasing of these mutations, and identifying copy number variations with few assumptions. Ideally, this is accomplished under as low sequencing coverage as possible. Here we report our attempt to meet these goals with a novel library construction and library amplification methodology. In our approach, single-cell genomic DNA is first fragmented with saturated transposition to make a primary library that uniformly covers the whole genome by short fragments. The library is then amplified by a carefully optimized PCR protocol in a uniform and synchronized fashion for next-generation sequencing. Each step of the protocol can be quantitatively characterized. Our shallow sequencing data show that the library is tightly distributed and is useful for the determination of copy number variations.
Project description:Generating chromosome-level, haplotype-resolved assemblies of heterozygous genomes remains challenging. To address this, we developed gamete binning, a method based on single-cell sequencing of haploid gametes enabling separation of the whole-genome sequencing reads into haplotype-specific reads sets. After assembling the reads of each haplotype, the contigs are scaffolded to chromosome level using a genetic map derived from the gametes. We assemble the two genomes of a diploid apricot tree based on whole-genome sequencing of 445 individual pollen grains. The two haplotype assemblies (N50: 25.5 and 25.8 Mb) feature a haplotyping precision of greater than 99% and are accurately scaffolded to chromosome-level.