Project description:The advent of next-generation sequencing (NGS) has accelerated biomedical research by enabling the high-throughput analysis of DNA sequences at a very low cost. However, NGS has limited ability to detect rare variants (frequency < 1%) because of its high sequencing error rate (> 0.1–1%). NGS errors can be filtered out using molecular barcodes, by comparing replicate reads that share the same barcode. However, these barcoding methods require redundant reads of non-target sequences, resulting in high sequencing cost. Here, we present a cost-effective, barcode-free NGS error validation method. By physically extracting and individually amplifying the DNA clones of erroneous reads, we distinguish true variants with frequencies > 0.003% from systematic NGS errors and selectively validate NGS errors after sequencing. We achieve a PCR-induced error rate of 2.5 × 10⁻⁶ per base per doubling event, using 10 times fewer sequencing reads than previous studies.
Project description:BACKGROUND: Pediatric leukemias have a diverse genomic landscape associated with complex structural variants, including gene fusions, insertions and deletions, and single nucleotide variants. Routine karyotype and fluorescence in situ hybridization (FISH) techniques lack sensitivity for smaller genomic alterations. Next-generation sequencing (NGS) assays are increasingly being used to assess these various lesions. However, standard NGS lacks quantitative sensitivity for minimal residual disease (MRD) surveillance due to an inherently high error rate. METHODS: Primary bone marrow samples from pediatric leukemia (n = 32) and adult leukemia subjects (n = 5), cell line MV4-11, and an umbilical cord sample were used for this study. Samples were sequenced using molecular barcoding with targeted DNA and RNA library enrichment techniques based on anchored multiplexed PCR (AMP®) technology, amplicon-based error-corrected sequencing (ECS), or a human cancer transcriptome assay. Computational analyses were performed to quantitatively assess the limit of detection (LOD) for various DNA and RNA lesions, which could be systematically used for MRD assays. RESULTS: Matched leukemia patient samples were analyzed at three time points: diagnosis, end of induction (EOI), and relapse. Similar to flow cytometry for ALL MRD, the LOD for point mutations by these sequencing strategies was ?0.001. For DNA structural variants, FLT3 internal tandem duplication (ITD) positive cell line and patient samples showed a LOD of ?0.001 in addition to previously unknown copy number losses in leukemia genes. ECS in RNA identified multiple novel gene fusions, including a SPANT-ABL gene fusion in an ALL patient, which could have been used to alter therapy. Collectively, ECS for RNA demonstrated a quantitative and complex landscape of RNA molecules with 12% of the molecules representing gene fusions, 12% exon duplications, 8% exon deletions, and 68% with retained introns. 
Droplet digital PCR validation of ECS-RNA confirmed results to single mRNA molecule quantities. CONCLUSIONS: Collectively, these assays enable a highly sensitive, comprehensive, and simultaneous analysis of various clonal leukemic mutations, which can be tracked across disease states (diagnosis, EOI, and relapse) with a high degree of sensitivity. The approaches and results presented here highlight the ability to use NGS for MRD tracking.
Project description:Advances in high-throughput sequencing have enabled technologies that probe the adaptive immune system with unprecedented depth. We have developed a multiplex PCR method to sequence tens of millions of T cell receptors (TCRs) from a single sample in a few days. A method is presented to test the precision, accuracy, and sensitivity of this assay. T cell clones, each with one fixed productive TCR rearrangement, are doped into complex blood cell samples. TCRs from a total of eleven samples are sequenced, with the doped T cell clones ranging from 10% of the total sample to 0.001% (one cell in 100,000). The assay is able to detect even the rarest clones. The precision of the assay is demonstrated across five orders of magnitude. The accuracy for each clone is within an overall factor of three across the 100,000-fold dynamic range. Additionally, the assay is shown to be highly repeatable.
Project description:Error-corrected sequences (ECSs) that utilize double-stranded DNA sequences are useful in detecting mutagen-induced mutations. However, relatively high frequencies of G:C > T:A (1 × 10⁻⁷ bp) and G:C > C:G (2 × 10⁻⁷ bp) errors decrease the accuracy of detection of rare G:C mutations (approximately 10⁻⁷ bp). Oxidized guanines in single-strand (SS) overhangs generated after shearing could be the source of these errors. To remove these errors, we first computationally discarded up to 20 read bases corresponding to the ends of the DNA fragments. Error frequencies decreased in proportion to the trimming length; however, the results indicated that the errors were not sufficiently removed. To efficiently remove SS overhangs, we evaluated three mechanistically distinct SS-specific nucleases (S1 Nuclease, mung bean nuclease, and RecJf exonuclease) and found that they were more efficient than computational trimming. Consequently, we established Jade-Seq™, an ECS protocol with S1 Nuclease treatment, which reduced G:C > T:A and G:C > C:G errors to 0.50 × 10⁻⁷ bp and 0.12 × 10⁻⁷ bp, respectively. This was probably because S1 Nuclease, owing to its wide substrate specificity, removed SS regions such as gaps and nicks. Subsequently, we evaluated the mutation-detection sensitivity of Jade-Seq™ using DNA samples from TA100 cells exposed to 3-methylcholanthrene and 7,12-dimethylbenz[a]anthracene, which contained the rare G:C > T:A mutation (i.e., 2 × 10⁻⁷ bp). Fold changes of G:C > T:A compared to the vehicle control were 1.2- and 1.3-times higher than those of samples without S1 Nuclease treatment, respectively. These findings indicate the potential of Jade-Seq™ for detecting rare mutations and determining the mutagenicity of environmental mutagens.
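The computational trimming step mentioned in the abstract above (discarding up to 20 read bases at the fragment ends) could be sketched roughly as follows in Python. This is only an illustration: the function name `trim_fragment_ends`, its parameters, and the assumption that a read spans the whole fragment are hypothetical and are not taken from the Jade-Seq protocol.

```python
from typing import Tuple


def trim_fragment_ends(seq: str, qual: str, n: int = 20) -> Tuple[str, str]:
    """Drop n bases (and their quality scores) from both ends of a read.

    Hypothetical helper: assumes the read covers the whole fragment, so
    both read ends correspond to fragment ends.
    """
    if n < 0:
        raise ValueError("n must be non-negative")
    if len(seq) != len(qual):
        raise ValueError("sequence and quality strings differ in length")
    if n == 0:
        return seq, qual  # nothing to trim (seq[0:-0] would be empty)
    if len(seq) <= 2 * n:
        return "", ""  # the read is too short to survive trimming
    return seq[n:-n], qual[n:-n]
```

Trimming longer ends removes more of the overhang-derived errors but also discards more usable bases, which is one way to read the abstract's observation that errors fell in proportion to trimming length without being fully removed.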
Project description:Background: Circulating free DNA sequencing (cfDNA-Seq) can portray cancer genome landscapes, but highly sensitive and specific technologies are necessary to accurately detect mutations with often low variant frequencies. Methods: We developed a customizable hybrid-capture cfDNA-Seq technology using off-the-shelf molecular barcodes and a novel duplex DNA molecule identification tool for enhanced error correction. Results: Modeling based on cfDNA yields from 58 patients showed that this technology, requiring 25 ng of cfDNA, could be applied to >95% of patients with metastatic colorectal cancer (mCRC). cfDNA-Seq of a 32-gene, 163.3-kbp target region detected 100% of single-nucleotide variants, with 0.15% variant frequency in spike-in experiments. Molecular barcode error correction reduced false-positive mutation calls by 97.5%. In 28 consecutively analyzed patients with mCRC, 80 out of 91 mutations previously detected by tumor tissue sequencing were called in the cfDNA. Call rates were similar for point mutations and indels. cfDNA-Seq identified typical mCRC driver mutations in patients in whom biopsy sequencing had failed or did not include key mCRC driver genes. Mutations only called in cfDNA but undetectable in matched biopsies included a subclonal resistance driver mutation to anti-EGFR antibodies in KRAS, parallel evolution of multiple PIK3CA mutations in 2 cases, and TP53 mutations originating from clonal hematopoiesis. Furthermore, cfDNA-Seq off-target read analysis allowed simultaneous genome-wide copy number profile reconstruction in 20 of 28 cases. Copy number profiles were validated by low-coverage whole-genome sequencing. Conclusions: This error-corrected, ultradeep cfDNA-Seq technology with a customizable target region and publicly available bioinformatics tools enables broad insights into cancer genomes and evolution. ClinicalTrials.gov identifier: NCT02112357.
Project description:Droplet-based high-throughput single-cell sequencing techniques have tremendously advanced our insight into cell-to-cell heterogeneity. However, those approaches only allow analysis of one extremity of the transcript after short-read sequencing. As a consequence, information on splicing and sequence heterogeneity is lost. To overcome this limitation, several approaches that use long-read sequencing were introduced recently. Yet, those techniques are limited by low sequencing depth and/or missing or inaccurate assignment of unique molecular identifiers (UMIs), which are critical for eliminating PCR bias and artifacts. We introduce ScNaUmi-seq, an approach that combines the high throughput of Oxford Nanopore sequencing with an accurate cell barcode and UMI assignment strategy. UMI-guided error correction allows the generation of high-accuracy full-length sequence information with the 10x Genomics single-cell isolation system at high sequencing depths. We analyzed transcript isoform diversity in embryonic mouse brain and show that ScNaUmi-seq allows splicing and SNVs (RNA editing) to be defined at the single-cell level.
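The idea behind UMI-guided error correction, as described in the abstract above, can be illustrated with a minimal Python sketch: reads that share a cell barcode and UMI are treated as copies of one molecule and collapsed by majority vote. This is not the ScNaUmi-seq implementation. The function `umi_consensus`, its `min_reads` parameter, and the assumption that reads in a group are already aligned and of equal length are simplifications made here for illustration; real long-read data would need alignment or polishing first.

```python
from collections import Counter, defaultdict


def umi_consensus(reads, min_reads=2):
    """Collapse reads sharing a (cell barcode, UMI) pair into one consensus.

    reads: iterable of (cell_barcode, umi, sequence) tuples. Sequences in a
    group are assumed to be pre-aligned and of equal length (a simplification).
    """
    groups = defaultdict(list)
    for cell, umi, seq in reads:
        groups[(cell, umi)].append(seq)

    consensus = {}
    for key, seqs in groups.items():
        if len(seqs) < min_reads:
            continue  # too few copies to out-vote a sequencing error
        if len({len(s) for s in seqs}) != 1:
            raise ValueError(f"unequal read lengths for {key}")
        # Majority vote per position; ties resolve to the base seen first.
        consensus[key] = "".join(
            Counter(column).most_common(1)[0][0] for column in zip(*seqs)
        )
    return consensus
```

For example, three reads `ACGT`, `ACGA`, `ACGT` with the same cell barcode and UMI collapse to `ACGT`, because the isolated `A` at the last position is out-voted.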
Project description:E18 mouse brain single-cell profiling using the 10x Genomics Chromium instrument workflow, with Illumina short-read sequencing for standard gene profiling and Nanopore PromethION long-read sequencing for isoform profiling.
Project description:Purpose: Persistent molecular disease (PMD) after induction chemotherapy predicts relapse in AML. In this study, we used whole-exome sequencing (WES) and targeted error-corrected sequencing to assess the frequency and mutational patterns of PMD in 30 patients with AML. Materials and methods: The study cohort included 30 patients with adult AML younger than 65 years who were uniformly treated with standard induction chemotherapy. Tumor/normal WES was performed for all patients at presentation. PMD was evaluated in bone marrow samples obtained during clinicopathologic remission using repeat WES and analysis of patient-specific mutations and error-corrected sequencing of 40 recurrently mutated AML genes (MyeloSeq). Results: WES for patient-specific mutations detected PMD in 63% of patients (19/30) using a minimum variant allele fraction (VAF) of 2.5%. In comparison, MyeloSeq identified persistent mutations above 0.1% VAF in 77% of patients (23/30). PMD was usually present at relatively high levels (>2.5% VAFs), such that WES and MyeloSeq agreed for 73% of patients despite differences in detection limits. Mutations in DNMT3A, ASXL1, and TET2 (ie, DTA mutations) were persistent in 16 of 17 patients, but WES also detected non-DTA mutations in 14 of these patients, which for some patients distinguished residual AML cells from clonal hematopoiesis. Surprisingly, MyeloSeq detected additional variants not identified at presentation in 73% of patients that were consistent with new clonal cell populations after chemotherapy. Conclusion: PMD and clonal hematopoiesis are both common in patients with AML in first remission. These findings demonstrate the importance of baseline testing for accurate interpretation of mutation-based tumor monitoring assays for patients with AML and highlight the need for clinical trials to determine whether these complex mutation patterns correlate with clinical outcomes in AML.
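The effect of the two detection thresholds quoted in the abstract above (2.5% VAF for WES, 0.1% VAF for error-corrected sequencing) can be shown with a small Python sketch. The function names, the use of `>=` at the cutoff, and the read counts in the example are assumptions made for illustration and are not taken from the study.

```python
WES_MIN_VAF = 0.025  # 2.5% cutoff quoted for WES in the abstract
ECS_MIN_VAF = 0.001  # 0.1% cutoff quoted for error-corrected sequencing


def variant_allele_fraction(alt_reads: int, total_reads: int) -> float:
    """Fraction of reads at a site that carry the variant allele."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    if not 0 <= alt_reads <= total_reads:
        raise ValueError("alt_reads must lie between 0 and total_reads")
    return alt_reads / total_reads


def is_persistent(alt_reads: int, total_reads: int, min_vaf: float) -> bool:
    """Call a mutation persistent if its VAF reaches the cutoff.

    Using >= at the cutoff is an assumption made for this sketch.
    """
    return variant_allele_fraction(alt_reads, total_reads) >= min_vaf
```

With hypothetical counts of 12 variant reads out of 2,000, the VAF is 0.6%: `is_persistent(12, 2000, WES_MIN_VAF)` is false while `is_persistent(12, 2000, ECS_MIN_VAF)` is true, so such a mutation would be reported by the more sensitive assay only.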
Project description:There is a limited understanding about the impact of rare protein-truncating variants across multiple phenotypes. We explore the impact of this class of variants on 13 quantitative traits and 10 diseases using whole-exome sequencing data from 100,296 individuals. Protein-truncating variants in genes intolerant to this class of mutations increased risk of autism, schizophrenia, bipolar disorder, intellectual disability, and ADHD. In individuals without these disorders, there was an association with shorter height, lower education, increased hospitalization, and reduced age at enrollment. Gene sets implicated from GWASs did not show a significant protein-truncating variants burden beyond what was captured by established Mendelian genes. In conclusion, we provide a thorough investigation of the impact of rare deleterious coding variants on complex traits, suggesting widespread pleiotropic risk.
Project description:The tracking of leukemic clones in acute myeloid leukemia (AML) promises deeper insights into disease development and therapeutic options. We therefore established a fluorescent genetic barcoding (FGB) labeling approach that allows for flow cytometric tracking of color-coded clones in vitro and in vivo. In Hoxa9 and Meis1 (H9M) dependent murine AML, we tracked the growth behavior of 24 clones in parallel and enriched for pre-leukemic clones as well as their de novo expanded counterparts and stably expanded clones from leukemic mice by fluorescence-activated cell sorting. These samples were subjected to RNA sequencing for the assessment of transcriptional changes underlying clonal maintenance and expansion.