Project description:A widespread assumption for single-cell analyses specifies that one cell’s nucleic acids are predominantly captured by one oligonucleotide barcode. However, we show that ~13-21% of cell barcodes from the 10x Chromium scATAC-seq assay may have been derived from a droplet with more than one oligonucleotide sequence, which we call “barcode multiplets”. We demonstrate that barcode multiplets can be derived from at least two different sources. First, we confirm that approximately 4% of droplets from the 10x platform may contain multiple beads. Additionally, we find that approximately 5% of beads may contain detectable levels of multiple oligonucleotide barcodes. We show that this artifact can confound single-cell analyses, including the interpretation of clonal diversity and proliferation of intra-tumor lymphocytes. Overall, our work provides a conceptual and computational framework to identify and assess the impacts of barcode multiplets in single-cell data.
Project description:Barcode swapping results in the mislabeling of sequencing reads between multiplexed samples on the new patterned flow cell Illumina sequencing machines. This may compromise the validity of numerous genomic assays, especially for single-cell studies where many samples are routinely multiplexed together. The severity and consequences of barcode swapping for single-cell transcriptomic studies remain poorly understood. We have used two statistical approaches to robustly quantify the fraction of swapped reads in each of two plate-based single-cell RNA sequencing datasets. We found that approximately 2.5% of reads were mislabeled between samples on the HiSeq 4000 machine, which is lower than previous reports. We observed no correlation between the swapped fraction of reads and the concentration of free barcode across plates. Furthermore, we have demonstrated that barcode swapping may generate complex but artefactual cell libraries in droplet-based single-cell RNA sequencing studies. To eliminate these artefacts, we have developed an algorithm to exclude individual molecules that have swapped between samples in 10X Genomics experiments, exploiting the combinatorial complexity present in the data. This permits the continued use of cutting-edge sequencing machines for droplet-based experiments while avoiding the confounding effects of barcode swapping. This data repository contains the sequencing files associated with the droplet-based scRNA-seq dataset in Griffiths et al. (2018). The data presented here should be used purely for technical analysis; the biological motivation is nonetheless briefly described below: The mammary gland is a unique organ as it undergoes most of its development during puberty and adulthood. Characterising the hierarchy of the various mammary epithelial cells and how they are regulated in response to gestation, lactation and involution is important for understanding how breast cancer develops. 
Recent studies have used numerous markers to enrich, isolate and characterise the different epithelial cell compartments within the adult mammary gland. However, in all of these studies only a handful of markers were used to define and trace cell populations. Therefore, there is a need for an unbiased and comprehensive description of mammary epithelial cells within the gland at different developmental stages. To this end, we used single-cell RNA sequencing (scRNAseq) to determine the gene expression profile of individual mammary epithelial cells across four adult developmental stages: nulliparous, mid gestation, lactation and post weaning (full natural involution).
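The swapped-molecule exclusion mentioned in this entry can be illustrated with a minimal sketch. It assumes that a molecule is identified by its (cell barcode, UMI, gene) combination and that the same combination appearing in more than one multiplexed sample is unlikely to arise by chance. The function name `remove_swapped`, the input tuple layout, and the `min_frac` threshold are hypothetical choices made for illustration; they are not taken from the description above and should not be read as the authors' implementation.

```python
from collections import defaultdict


def remove_swapped(molecules, min_frac=0.8):
    """Drop molecules that appear to have swapped between samples.

    molecules: iterable of (sample, cell_barcode, umi, gene, n_reads).
    A (cell_barcode, umi, gene) combination is kept only in the sample
    that holds at least `min_frac` of its reads; if no sample reaches
    that fraction, the combination is discarded from every sample.
    `min_frac` is an illustrative assumption, not a published value.
    """
    # Sum reads per sample for every (cell, umi, gene) combination.
    reads = defaultdict(dict)
    for sample, cell, umi, gene, n_reads in molecules:
        key = (cell, umi, gene)
        reads[key][sample] = reads[key].get(sample, 0) + n_reads

    kept = []
    for (cell, umi, gene), per_sample in reads.items():
        total = sum(per_sample.values())
        best_sample, best_reads = max(per_sample.items(), key=lambda kv: kv[1])
        if best_reads / total >= min_frac:
            kept.append((best_sample, cell, umi, gene, best_reads))
    return kept


if __name__ == "__main__":
    example = [
        ("S1", "AAAC", "UMI1", "GeneA", 90),  # dominant copy, kept
        ("S2", "AAAC", "UMI1", "GeneA", 10),  # minor copy, treated as swapped
        ("S1", "CCGT", "UMI2", "GeneB", 5),   # ambiguous, removed everywhere
        ("S2", "CCGT", "UMI2", "GeneB", 5),
        ("S2", "GGTA", "UMI3", "GeneC", 7),   # seen in one sample only, kept
    ]
    for molecule in remove_swapped(example):
        print(molecule)
```

In this toy input, the first combination is retained only for the sample holding most of its reads, the evenly split combination is removed from both samples, and the combination seen in a single sample passes through unchanged.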
Project description:High-throughput single-cell assays increasingly require special consideration in experimental design, sample multiplexing, batch effect removal, and data interpretation. Here, we describe a lentiviral barcode-based multiplexing approach, CellTag Indexing, which uses predefined genetic barcodes that are also heritable, enabling cell populations to be tagged, pooled, and tracked over time in the same experimental replicate. We demonstrate the utility of CellTag Indexing by sequencing transcriptomes using a variety of cell types, including long-term tracking of cell engraftment and differentiation in vivo. Together, this presents CellTag Indexing as a broadly applicable genetic multiplexing tool that is complementary to existing single-cell technologies.
Project description:These datasets are test datasets of sample-multiplexed scRNA-seq, consisting of cDNA (transcriptome) and sample barcode read files: Three-sample multiplexing experiment (JS009) is a MULTI-seq dataset containing mesenchyme embryonic hind limb bud cells, embryonic stem (ES) cells, and NIH3T3 cells. Each cell sample was labelled with a distinct MULTI-seq barcode. The barcode sequences were CATAGAGC, TCCTCGAA, and GTGTACCT for the limb bud mesenchyme cells, the ES cells, and the NIH3T3 cells, respectively. Two-sample multiplexing experiment (JS010) is a CellPlex dataset containing ES cells and NIH3T3 cells. Each cell sample was labelled with the 3' CellPlex Kit (10X Genomics). NIH3T3 cells and ES cells were labelled with CMO301 and CMO302, respectively.
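As a rough illustration of how the JS009 sample barcodes listed above map back to samples, the sketch below looks up an 8-nt sample barcode read and tolerates a small number of mismatches. Only the three barcode sequences and their sample labels come from the description; the function names and the `max_mismatches` default are assumptions. This covers the barcode lookup step only and is not the per-cell classification used by MULTI-seq tools.

```python
# Sample barcodes as listed for the three-sample MULTI-seq experiment (JS009).
SAMPLE_BARCODES = {
    "CATAGAGC": "limb bud mesenchyme",
    "TCCTCGAA": "ES",
    "GTGTACCT": "NIH3T3",
}


def hamming(a, b):
    """Count mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def assign_sample(read_barcode, max_mismatches=1):
    """Return the sample whose barcode uniquely matches the read, else None.

    `max_mismatches` is an illustrative tolerance, not a value from the
    dataset description.
    """
    if len(read_barcode) != 8:
        return None  # wrong length, cannot be compared position by position
    hits = [
        sample
        for barcode, sample in SAMPLE_BARCODES.items()
        if hamming(read_barcode, barcode) <= max_mismatches
    ]
    # Require exactly one candidate so ambiguous reads are left unassigned.
    return hits[0] if len(hits) == 1 else None


if __name__ == "__main__":
    for bc in ["CATAGAGC", "CATAGAGA", "AAAAAAAA"]:
        print(bc, assign_sample(bc))
```

An exact or single-mismatch read is assigned to its sample, while a read that is far from every listed barcode returns `None`.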
Project description:Cell-cell interactions are important to numerous biological systems, including tissue microenvironments, the immune system, and cancer. However, current methods for studying cell combinations and interactions are limited in scalability, allowing just hundreds to thousands of multi-cell assays per experiment; this limited throughput makes it difficult to characterize interactions at biologically relevant scales. Here, we describe a new paradigm in cell interaction profiling that allows accurate grouping of cells and characterization of their interactions for tens to hundreds of thousands of combinations. Our approach leverages high throughput droplet microfluidics to construct multicellular combinations in a deterministic process that allows inclusion of programmed reagent mixtures and beads. The combination droplets are compatible with common manipulation and measurement techniques, including imaging, barcode-based genomics, and sorting. We demonstrate the approach by using it to enrich for CAR-T cells that activate upon incubation with target cells, a bottleneck in the therapeutic T cell engineering pipeline. The speed and control of our approach should enable valuable cell interaction studies.
Project description:Using 3' droplet-based single-cell sequencing, we performed the transcriptional profiling of mouse large intestinal epithelial cells at the single-cell level.
Project description:Methods: We create a CRISPR vector containing a polyadenylated RNA barcode and couple it with droplet scRNA-seq to obtain large-scale transcriptional measurements of perturbations. Results: We were able to perform regulatory inference of gene function, observe nonlinear interactions, and perform downsampling analysis to show that gene signature effects can be seen with as few as tens of cells, while gene-level phenotypes, depending on effect size, would require hundreds of cells. Conclusion: Perturb-seq presents a scalable paradigm for obtaining rich genomic profiles of perturbations.