SRiD: A facile DNA barcode generation and management system for high-throughput screening
ABSTRACT: Twenty random DNA barcodes were designed in silico and transfected into PC3 cells. The barcodes were sequenced using Illumina MiSeq technology to determine their sequences and respective copy numbers. This file contains the raw data for these DNA barcodes in FASTQ format.
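As an illustration of how barcode copy numbers might be tallied from FASTQ reads, here is a minimal sketch. It assumes exact substring matching against a known barcode list, which is a simplification and not necessarily how the authors processed their data. The barcode sequences and reads are invented toy values.

```python
from collections import Counter
from io import StringIO


def read_fastq_sequences(handle):
    """Yield the sequence line (2nd of every 4 lines) of each FASTQ record."""
    for i, line in enumerate(handle):
        if i % 4 == 1:
            yield line.strip()


def count_barcodes(handle, barcodes):
    """Count reads containing each known barcode as an exact substring.

    Assumption: a read is credited to the first barcode it contains;
    mismatches and reverse-complement hits are ignored in this sketch.
    """
    counts = Counter({bc: 0 for bc in barcodes})
    for seq in read_fastq_sequences(handle):
        for bc in barcodes:
            if bc in seq:
                counts[bc] += 1
                break
    return counts


# Toy FASTQ content (hypothetical reads, not from the dataset).
example_fastq = StringIO(
    "@r1\nTTACGTACGTAA\n+\nIIIIIIIIIIII\n"
    "@r2\nGGACGTACGTCC\n+\nIIIIIIIIIIII\n"
    "@r3\nAATTGGCCAATT\n+\nIIIIIIIIIIII\n"
)
counts = count_barcodes(example_fastq, ["ACGTACGT", "TTGGCCAA"])
for barcode, n in counts.items():
    print(barcode, n)
```

For real data, the same function could be given an open file handle instead of the in-memory example.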
Project description: This dataset validates SRiD, an algorithm that generates random DNA barcodes that do not match a genome of interest, in this case the human genome. Twenty random DNA barcodes were designed in silico and transfected into PC3 cells, then sequenced using Illumina MiSeq technology to determine their sequences and respective copy numbers. The file contains the raw data for these barcodes in FASTQ format.
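The idea of generating random barcodes that do not match a genome of interest can be sketched as simple rejection sampling. This is not SRiD's actual implementation, which is not described here. The function names, the toy reference string, and the exact-match criterion are all assumptions made for illustration. A real genome would need an indexed search rather than a substring test.

```python
import random


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def generate_barcodes(n, length, reference, seed=0, max_attempts=100000):
    """Draw random barcodes, rejecting any found in the reference.

    Assumption: a barcode "matches" the reference if it, or its reverse
    complement, occurs as an exact substring. Duplicates are also rejected.
    """
    rng = random.Random(seed)
    ref = reference.upper()
    barcodes = []
    seen = set()
    for _ in range(max_attempts):
        if len(barcodes) == n:
            break
        candidate = "".join(rng.choice("ACGT") for _ in range(length))
        if candidate in seen:
            continue
        if candidate in ref or reverse_complement(candidate) in ref:
            continue
        seen.add(candidate)
        barcodes.append(candidate)
    if len(barcodes) < n:
        raise RuntimeError("could not find enough non-matching barcodes")
    return barcodes


# Hypothetical stand-in for a genome; a real run would use a full reference.
toy_reference = "ACGTTGCAAGGCTTAACCGGATATCGCGTACGATCGATTAGC"
barcodes = generate_barcodes(20, 12, toy_reference, seed=1)
print(len(barcodes), "barcodes generated")
```

The `max_attempts` guard keeps the loop from running forever when the reference already contains most sequences of the requested length.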
Project description: Temporal analysis of IRF4 and PU.1 genome binding during B cell activation and differentiation in vitro using antigen (NP-Ficoll), CD40L, and IL-2/4/5 cytokines (see Molecular Systems Biology 7:495 for details of the cellular system). The results provide insight into the target genes and binding specificity of IRF4 and PU.1 during coordination of different programs of B cell differentiation. Regrettably, three of the FASTQ raw sequence files in our study were corrupted during storage. FASTQ data from our experimental and control groups are available for download via GEO SRA; however, two groups are missing select raw sequence files. These include one PU.1 Day 3 group file (Sample GSM1133499) and two of four input files used to generate a concatenated “super” input file (Sample GSM1133490); the raw data provided for input consist of the two input files recovered. Importantly, FASTA sequences for both of these datasets are available as supplementary data through GEO, and we can make available upon request (rsciamma@uchicago.edu) all files in our study in the ELAND-extended alignment format. Please note that GEO no longer supports this format.
Project description: Following the removal of implanted mammary tumors, nude mice develop multiple-organ metastases at a late stage. The metastases may originate from the primary tumors before the resection surgery or, alternatively, from established metastases. Using multiple approaches, we have shown that the bone environment can invigorate cancer cells for further dissemination. This study aims to examine whether metastatic dissemination from bone to other sites occurs in the natural setting of metastatic spread. Here we apply a rapidly evolving barcode system based on homing guide RNA (hgRNA)/Cas9 to trace metastasis formation in mice. hgRNA/Cas9 is a self-targeting CRISPR system that allows mutations to occur in the DNA sequence of the guide RNA itself. Tumor cells were labelled with a doxycycline-inducible evolving barcoding system. Upon doxycycline treatment, the DNA sequence of the hgRNA accumulates mutations over time. The diversity of barcodes in each lesion can be used to infer the timing of seeding, while the mutation patterns of barcodes suggest the phylogenetic relationships among metastases. Several findings were made in this study. First, at the terminal stage, multi-organ metastases are not genetically grouped according to site of metastasis. Nonnegative matrix factorization (NMF) analysis of mutant barcodes suggested that the early disseminated metastases, which have the highest Shannon entropy, feature a common cluster of mutant barcodes irrespective of their location. Second, most metastases are potentially multiclonal, as indicated by multiple clusters of independent mutant barcodes. Third, when we use Shannon entropy as an index of metastasis age, putative parent-child relationships between metastases with unique mutant barcodes clearly exemplify secondary metastatic seeding from bone to other organs.
Finally, we did not observe a clear correlation between tumor burden and Shannon entropy across different metastases, suggesting that putative parental metastases might remain small after seeding further metastases.
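To make the role of Shannon entropy as a barcode-diversity index concrete, the sketch below computes it from per-lesion barcode counts. The base-2 logarithm, the function name, and the toy counts are assumptions for illustration; the study's own normalization and preprocessing are not described here.

```python
import math
from collections import Counter


def shannon_entropy(barcode_counts):
    """Shannon entropy (in bits) of a barcode frequency distribution.

    H = -sum(p_i * log2(p_i)), where p_i is the fraction of reads
    carrying mutant barcode i. Zero counts contribute nothing.
    """
    total = sum(barcode_counts.values())
    if total == 0:
        return 0.0
    entropy = 0.0
    for count in barcode_counts.values():
        if count > 0:
            p = count / total
            entropy -= p * math.log2(p)
    return entropy


# Hypothetical lesions: one with four equally abundant mutant barcodes,
# one dominated by a single barcode.
diverse_lesion = Counter({"bc1": 25, "bc2": 25, "bc3": 25, "bc4": 25})
clonal_lesion = Counter({"bc1": 100})

print(shannon_entropy(diverse_lesion))  # four equal barcodes give 2 bits
print(shannon_entropy(clonal_lesion))   # a single barcode gives 0 bits
```

Under this reading, a lesion whose barcodes have diversified more (higher entropy) would be interpreted as having been seeded earlier.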
Project description: An Illumina sequencing lane for testing our demultiplexer, named Ultraplex, which splits a raw FASTQ file containing barcodes either at a single end or at both the 5’ and 3’ ends of reads, trims the sequencing adaptors and low-quality bases, and moves unique molecular identifiers (UMIs) into the read header, allowing subsequent removal of PCR duplicates. Ultraplex can perform such single or combinatorial demultiplexing on both single- and paired-end sequencing data, and can process an entire Illumina HiSeq lane, consisting of nearly 500 million reads, in less than twenty minutes.
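The general operation described above, assigning a read to a sample by its 5’ barcode and moving the UMI into the read header, can be sketched for a single record as follows. This is not Ultraplex's code or interface. The read layout (barcode, then a fixed-length UMI, then the insert), the underscore-separated header format, and all names are assumptions for illustration. Adaptor and quality trimming are omitted.

```python
def demultiplex_record(header, seq, qual, barcodes, umi_len):
    """Assign one FASTQ record to a sample by exact 5' barcode match.

    Assumed layout: [barcode][UMI of umi_len bases][insert].
    Returns (sample, (header, seq, qual)); sample is None if no barcode
    matches, in which case the record is returned unchanged.
    """
    for sample, barcode in barcodes.items():
        if seq.startswith(barcode):
            umi_start = len(barcode)
            insert_start = umi_start + umi_len
            umi = seq[umi_start:insert_start]
            # Append the UMI to the read name so it survives alignment.
            name, _, rest = header.partition(" ")
            new_header = f"{name}_{umi}" + (f" {rest}" if rest else "")
            return sample, (new_header, seq[insert_start:], qual[insert_start:])
    return None, (header, seq, qual)


# Hypothetical sample barcodes and read.
sample_barcodes = {"sampleA": "ACGT", "sampleB": "TGCA"}
read_seq = "ACGT" + "TTAGC" + "GGGGCCCC"
sample, record = demultiplex_record(
    "@read1 1:N:0", read_seq, "I" * len(read_seq), sample_barcodes, umi_len=5
)
print(sample, record)
```

Reads with identical UMIs that map to the same position can then be collapsed downstream as PCR duplicates.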
Project description: Single cells from human colorectal cancer and normal adjacent colon of 16 patients were used for single-cell RNA-seq, TCR-seq, CITE-seq and Cell hashing. In brief, single cells were incubated for 3 h with or without PMA/Ionomycin, and were treated with Cell hashing and CITE-seq antibodies to distinguish samples, stimulation/non-stimulation, and cell surface proteins. Sorted viable CD3+TCRαβ+ single cells were loaded into a 10x Genomics Chromium™ Controller to make nanoliter-scale droplets, called GEMs, containing uniquely barcoded 5’ gel beads. After GEM-RT and the subsequent cDNA amplification steps, cDNAs derived from cellular mRNA were pooled for downstream processing and library preparation according to the manufacturer’s instructions. The 5’ transcript library was sequenced with Illumina NovaSeq. The single-cell TCR-enriched library was sequenced with Illumina MiSeq using 150 paired-end reads. HTO/ADTs from Cell hashing or CITE-seq were amplified using specific primers that append P5 and P7 sequences for Illumina sequencing (MiSeq or NextSeq). All FASTQ files were demultiplexed. Cell hashing and CITE-seq barcodes are available in the attached text files. FASTQ files from RNA-seq and TCR-seq can be processed through cellranger and vdjranger from 10x Genomics. The datasets include data from independent experiments performed on May 29, June 16, June 23, and Aug 13, 2019. Details are available in Masuda et al., bioRxiv, 2020, The functional and phenotypic diversity of single T-cell infiltrates in human colorectal cancer as correlated with clinical outcome.
Project description: The transcription factor IRF4 regulates immunoglobulin class switch recombination and plasma cell differentiation. Its differing concentrations appear to regulate mutually antagonistic programs of B and plasma cell gene expression. We show IRF4 to be also required for generation of germinal center (GC) B cells. Its transient expression in vivo induced the expression of key GC genes including Bcl6 and Aicda. In contrast, sustained and higher concentrations of IRF4 promoted the generation of plasma cells while antagonizing the GC fate. IRF4 cobound with the transcription factors PU.1 or BATF to Ets or AP-1 composite motifs, associated with genes involved in B cell activation and the GC response. At higher concentrations, IRF4 binding shifted to interferon sequence response motifs; these enriched for genes involved in plasma cell differentiation. Our results support a model of "kinetic control" in which signaling-induced dynamics of IRF4 in activated B cells control their cell-fate outcomes. Regrettably, three of the FASTQ raw sequence files in our study were corrupted during storage. FASTQ data from our experimental and control groups are available for download via GEO SRA; however, two groups are missing select raw sequence files. These include one PU.1 Day 3 group file (Sample GSM1133499) and two of four input files used to generate a concatenated “super” input file (Sample GSM1133490); the raw data provided for input consist of the two input files recovered. Importantly, FASTA sequences for both of these datasets are available as supplementary data through GEO, and we can make available upon request (rsciamma@uchicago.edu) all files in our study in the ELAND-extended alignment format. Please note that GEO no longer supports this format.