Project description:Detection of SNVs/indels in the germline of pediatric cancer patients with a focus on CHEK2 germline variants. Included in this EGA upload are the parents of the children in EGA upload EGAD00001008763.
Project description:Despite relevant clinical and/or familial presentations suggesting a hereditary predisposition (early onset, multiple primary tumors, familial aggregation), targeted genomic analyses based on the phenotype are often non-contributory. As somatic cancer genes are limited in number, the hypothesis is that targeted next-generation sequencing of 200 genes, selected for their implication in cancer, may contribute to the understanding of many selected patients' presentations by identifying deleterious germline mutations, and may identify overlapping phenotypes and/or mosaicism. The focus will be on early-onset breast, ovarian, and colorectal cancers, pediatric cancers, and multiple primary tumors.
Project description:Whole genome sequencing of 10 HCLc tumor and matched-germline T cell samples. Genomic DNA from highly purified HCLc tumor and T cell populations was used for library preparation with the NEBNext Ultra DNA library prep kit. Sequencing was performed as 150 bp paired-end sequencing using four lanes of an Illumina HiSeq4000 to an average depth of 12X. Reads from each library were aligned to the human reference genome GRCh37 using BWA-MEM (v0.7.12). The analysis of somatic genetic alterations in WGS data from the HCLc tumor-germline pairs was divided according to the nature of the mutation, as follows: single-nucleotide variants (SNVs), indels, CNAs and SVs. Moreover, COSMIC mutational signatures and subclonal architecture were inferred for each tumor.
Project description:Purpose: To identify the genetic basis of posterior polymorphous corneal dystrophy 1 (PPCD1). Methods: Next-generation sequencing was performed on DNA samples from 4 affected and 4 unaffected members of a previously reported family with PPCD1 linked to chromosome 20 between D20S182 and D20S195. Custom capture probes were utilized for targeted region capture of the linked interval. Single nucleotide variants (SNVs) and insertions/deletions (indels) were identified using two bioinformatics pipelines and two annotation databases. Candidate variants met the following criteria: quality score ≥20, read depth ≥5X, heterozygous, novel or rare (minor allele frequency (MAF) ≤ 0.05), present in each affected individual and absent in each unaffected individual. Structural variants were detected with two different microarray platforms to identify indels of varying sizes. Results: Sequencing reads aligned to the linked region on chromosome 20, and high coverage was obtained across the sequenced region. The majority of identified variants were detected with both pipelines and annotation databases, although unique variants were identified. Twelve SNVs in 10 genes (2 synonymous variants and 10 noncoding variants) and 9 indels in 7 genes met the filtering criteria and were considered candidate variants for PPCD1. Conclusions: Next-generation sequencing of the PPCD1 interval has identified 17 genes containing novel or rare SNVs and indels that segregate with the affected phenotype in an affected family previously mapped to the PPCD1 locus. We anticipate that screening of these candidate genes in other families previously mapped to the PPCD1 locus will result in the identification of the genetic basis of PPCD1. Four affected and 4 unaffected individuals from a single family were analyzed for copy number variation within the PPCD1 disease locus. Array design and analysis is based on genome build hg19.
Project description:Purpose: To identify the genetic basis of posterior polymorphous corneal dystrophy 1 (PPCD1). Methods: Next-generation sequencing was performed on DNA samples from 4 affected and 4 unaffected members of a previously reported family with PPCD1 linked to chromosome 20 between D20S182 and D20S195. Custom capture probes were utilized for targeted region capture of the linked interval. Single nucleotide variants (SNVs) and insertions/deletions (indels) were identified using two bioinformatics pipelines and two annotation databases. Candidate variants met the following criteria: quality score ≥20, read depth ≥5X, heterozygous, novel or rare (minor allele frequency (MAF) ≤ 0.05), present in each affected individual and absent in each unaffected individual. Structural variants were detected with two different microarray platforms to identify indels of varying sizes. Results: Sequencing reads aligned to the linked region on chromosome 20, and high coverage was obtained across the sequenced region. The majority of identified variants were detected with both pipelines and annotation databases, although unique variants were identified. Twelve SNVs in 10 genes (2 synonymous variants and 10 noncoding variants) and 9 indels in 7 genes met the filtering criteria and were considered candidate variants for PPCD1. Conclusions: Next-generation sequencing of the PPCD1 interval has identified 17 genes containing novel or rare SNVs and indels that segregate with the affected phenotype in an affected family previously mapped to the PPCD1 locus. We anticipate that screening of these candidate genes in other families previously mapped to the PPCD1 locus will result in the identification of the genetic basis of PPCD1.
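The candidate-variant criteria listed in the PPCD1 Methods (quality score ≥20, read depth ≥5X, heterozygous, novel or rare with MAF ≤ 0.05, present in every affected and absent in every unaffected individual) can be illustrated with a minimal Python sketch. The `Call` and `Variant` records, the sample names, and the `is_candidate` helper are hypothetical and are not taken from the study's pipelines. The sketch also assumes that a missing MAF means "novel" and that the quality and depth thresholds are applied to each affected individual's call, which the description does not specify.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Optional


@dataclass
class Call:
    """Genotype call for one sample at one site (hypothetical record)."""
    quality: float
    depth: int
    heterozygous: bool


@dataclass
class Variant:
    """One variant with per-sample calls (hypothetical record)."""
    variant_id: str
    maf: Optional[float]                 # None is treated as "novel" (assumption)
    calls: Dict[str, Optional[Call]]     # sample -> call, None if variant absent


def is_candidate(variant: Variant,
                 affected: Iterable[str],
                 unaffected: Iterable[str],
                 min_quality: float = 20,
                 min_depth: int = 5,
                 max_maf: float = 0.05) -> bool:
    """Apply the filtering criteria quoted in the description."""
    # Novel or rare: no known frequency, or MAF at or below the cutoff.
    if variant.maf is not None and variant.maf > max_maf:
        return False
    # Present, heterozygous, and passing thresholds in every affected sample.
    for sample in affected:
        call = variant.calls.get(sample)
        if call is None:
            return False
        if call.quality < min_quality or call.depth < min_depth:
            return False
        if not call.heterozygous:
            return False
    # Absent in every unaffected sample.
    return all(variant.calls.get(sample) is None for sample in unaffected)


# Toy example with two affected (A1, A2) and two unaffected (U1, U2) samples.
affected = ["A1", "A2"]
unaffected = ["U1", "U2"]
good = Call(quality=35.0, depth=12, heterozygous=True)

segregating = Variant("var1", None, {"A1": good, "A2": good, "U1": None, "U2": None})
in_unaffected = Variant("var2", 0.01, {"A1": good, "A2": good, "U1": good, "U2": None})
too_common = Variant("var3", 0.20, {"A1": good, "A2": good, "U1": None, "U2": None})

candidates = [v.variant_id for v in (segregating, in_unaffected, too_common)
              if is_candidate(v, affected, unaffected)]
print(candidates)
```

Only `var1` passes in this toy data: `var2` is carried by an unaffected sample and `var3` exceeds the MAF cutoff.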
Project description:U87MG is a commonly studied grade IV glioma cell line that has been analyzed in at least 1,700 publications over four decades. In order to comprehensively characterize the genome of this cell line and to serve as a model of broad cancer genome sequencing, we have generated greater than 30x genomic sequence coverage using a novel 50-base mate paired strategy with a 1.4kb mean insert library. A total of 1,014,984,286 mate-end and 120,691,623 single-end two-base encoded reads were generated from five slides. All data were aligned using a custom designed tool called BFAST, allowing optimal color space read alignment and accurate identification of DNA variants. The aligned sequence reads and mate pair information identified 35 interchromosomal translocation events, 1,315 structural variations (>100bp), 191,743 small (<21bp) insertions and deletions (indels), and 2,384,470 single nucleotide variations (SNVs). Among these observations, the known homozygous mutation in PTEN was robustly identified, and genes involved in cell adhesion were overrepresented in the mutated gene list. Data were compared to 219,187 heterozygous single nucleotide polymorphisms assayed by Illumina 1M Duo genotyping array to assess accuracy: 93.83% of all SNPs were reliably detected at filtering thresholds that yield greater than 99.99% sequence accuracy. Protein coding sequences were disrupted predominantly in this cancer cell line due to small indels, large deletions and translocations. In total, 512 genes were homozygously mutated, including 154 by SNVs, 178 by small indels, 145 by large microdeletions and 35 by interchromosomal translocations to reveal a highly mutated cell line genome. Of the small homozygously mutated variants, 8 SNVs and 99 indels were novel events not present in dbSNP. These data demonstrate that routine generation of broad cancer genome sequence is possible outside of genome centers. 
The sequence analysis of U87MG provides an unparalleled level of mutational resolution compared to any cell line to date. Whole genome sequencing of the U87MG brain cancer cell line using the AB SOLiD3 sequencer and genotyping using the Illumina Human1M-Duov3 DNA Analysis BeadChip.
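The counts quoted in the U87MG description can be cross-checked with a few lines of Python. The per-class gene counts and the 93.83% detection rate come from the text above. The approximate number of detected array SNPs is derived here by multiplication and is not a figure reported in the description.

```python
# Homozygously mutated genes by variant class, as quoted in the description.
homozygous_by_class = {
    "SNVs": 154,
    "small indels": 178,
    "large microdeletions": 145,
    "interchromosomal translocations": 35,
}
total_homozygous_genes = sum(homozygous_by_class.values())

# Heterozygous SNPs assayed on the genotyping array and the reported detection rate.
array_het_snps = 219_187
detection_rate = 0.9383

# Derived estimate (not stated in the source): SNPs detected by sequencing.
approx_detected_snps = round(array_het_snps * detection_rate)

print(total_homozygous_genes)   # matches the 512 genes stated in the description
print(approx_detected_snps)
```

The four classes sum to the 512 homozygously mutated genes stated in the description, so the breakdown is internally consistent.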
Project description:CRISPRs and TALENs are efficient systems for gene editing in many organisms including plants. In many cases the CRISPR-Cas or TALEN modules are expressed in the plant cell only transiently. Theoretically, transient expression of the editing modules should limit unexpected effects compared to stable transformation. However, very few studies have measured the off-target and unpredicted effects of editing strategies on the plant genome, and none of them have compared these two major editing systems. We conducted a comprehensive genome-wide investigation of off-target mutations using either a CRISPR-Cas9 or a TALEN strategy. We observed a similar number of SNVs and InDels for the two editing strategies compared to control non-transfected plants, with an average of 8.25 SNVs and 19.5 InDels for the CRISPR-edited plants, and an average of 17.5 SNVs and 32 InDels for the TALEN-edited plants. Interestingly, a comparable number of SNVs and InDels could be detected in the PEG-treated control plants. This shows that, apart from the on-target modifications, the gene editing tools used in this study showed neither significant off-target activity nor unpredicted effects on the genome, and that the PEG treatment itself was probably the main source of the mutations found in the edited plants.