Project description: The study included 15 patients (7 males, 8 females) with juvenile myelomonocytic leukemia (JMML). Peripheral blood and/or bone marrow aspirates were collected on EDTA at diagnosis. Non-hematopoietic tissue (fibroblasts) was derived from a skin biopsy for each patient. Exome sequencing was performed in several distinct series between 2012 and 2017, which explains the differences in capture kit and reference genome versions. Targeted enrichment and massively parallel sequencing were performed on paired genomic DNA from leukocytes and fibroblasts. Exome capture was carried out using the SureSelect Human All Exon V4+UTRs, V5, or V5+UTRs kit or the SureSelect Clinical Research kit (Agilent Technologies, Santa Clara, CA, USA) according to the manufacturer’s instructions and protocols by IntegraGen (Evry, France). Paired-end 75-base sequencing was performed on a HiSeq2000 or HiSeq4000 instrument (Illumina, San Diego, CA, USA). Image analysis and base calling were performed using the Real Time Analysis (RTA) pipeline v. 1.14 (Illumina) with default parameters. Alignment of paired-end reads to the human reference genome (UCSC GRCh37/hg19 or UCSC GRCh38), variant calling, and generation of variant quality scores were carried out using the CASAVA v.1.8 pipeline (Illumina).
Project description: More than 2x10E9 sequences generated on the Illumina platform from the genome of E14 embryonic stem cells cultured in our laboratory were used to build a database of about 2.7x10E6 single nucleotide variants. The database was validated against two sequencing datasets from other laboratories, and a high overlap was observed. The identified variants are enriched in intergenic regions, but several thousand reside in gene exons and regulatory regions, such as promoters, enhancers, splice sites and untranslated regions of RNA, indicating a high probability of an important functional impact on the molecular biology of these cells. We created a new E14 genome assembly that includes the newly identified variants and used it to map reads from next-generation sequencing data generated on the E14 cell line in our laboratory or in others. We observed an increase of about 5% in the number of mapped reads. CpG dinucleotides showed the highest variation frequency, probably because they can be targets of DNA methylation. We performed reduced representation bisulfite sequencing on the E14 cell line to test our new genome assembly against the mm9 reference genome. After mapping and methylation status calling, we obtained an increase of about 120,000 called CpGs and avoided about 20,000 wrong CpG calls. Genotyping of E14 embryonic stem cells (ESCs) and reduced representation bisulfite sequencing (RRBS) of E14 ESCs.
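The idea of building a strain-specific assembly by placing known variants into a reference can be sketched as follows. This is only a toy illustration, not the authors' pipeline: the function name `apply_snvs`, the 0-based coordinate convention, and the example sequence are assumptions made here for clarity.

```python
def apply_snvs(reference: str, snvs: dict[int, tuple[str, str]]) -> str:
    """Substitute single nucleotide variants into a reference sequence.

    `snvs` maps a 0-based position to (expected_reference_base, alternate_base).
    Both the name and the data layout are hypothetical.
    """
    seq = list(reference)
    for pos, (ref_base, alt_base) in snvs.items():
        # Guard against variants that were called on a different reference.
        if seq[pos].upper() != ref_base.upper():
            raise ValueError(
                f"reference mismatch at {pos}: {seq[pos]} != {ref_base}"
            )
        seq[pos] = alt_base
    return "".join(seq)


# Toy data: the reference contains two CpG sites (positions 1-2 and 4-5).
toy_reference = "ACGTCGAT"
toy_snvs = {1: ("C", "T"), 5: ("G", "A")}
print(apply_snvs(toy_reference, toy_snvs))
```

In this toy case both reference CpG sites disappear after substitution, which hints at how mapping against an unmodified reference could produce CpG calls at positions where the cell line has no CpG.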
Project description: Background: Whole exome sequencing (WES) has proven to be a valuable basis for applications such as variant calling and copy number variation (CNV) analyses. For those analyses, read coverage should be evenly balanced throughout protein-coding regions at sufficient read depth. Unfortunately, WES is known for uneven coverage within coding regions due to GC-rich regions or off-target enrichment. Results: To examine the irregularities of WES within genes, we applied Agilent SureSelectXT exome capture to human samples and sequenced them on Illumina in 2x101 paired-end mode. As we suspected the sequenced insert length to be crucial for the uneven coverage of exome-captured samples, we sheared 12 genomic DNA samples to two different insert lengths, namely 130 and 170 bp. Interestingly, although mean coverage of target regions was clearly higher in the 130 bp samples, coverage was more even in the 170 bp samples. Moreover, merging overlapping paired-end reads had a positive effect on evenness, indicating that overlapping reads are another cause of the unevenness. In addition, mutation analysis was performed on a subset of the samples. In these isogenic subclones, almost twice as many mutations failed in the 130 bp samples as in the 170 bp samples. Visual inspection of the discarded mutation sites revealed low coverage at these sites, surrounded by large fluctuations in coverage depth in the affected region. Conclusions: Producing longer inserts could be a good strategy to achieve more uniform read coverage in coding regions and thereby increase the effective sequencing yield, providing an improved basis for subsequent variant calling and CNV analyses.
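The role of overlapping mates can be made concrete with the numbers given above (2x101 reads, 130 bp and 170 bp inserts). The sketch below assumes that "insert length" means the sheared fragment length without adapters and ignores adapter read-through; the description does not state this explicitly, and the function name is hypothetical.

```python
def paired_end_overlap(read_length: int, insert_length: int) -> dict:
    """Return overlap statistics for one read pair on one fragment.

    Assumes insert_length is the fragment length excluding adapters.
    """
    # A mate cannot cover more bases than the fragment contains.
    usable = min(read_length, insert_length)
    # Bases covered by both mates (sequenced twice).
    overlap = max(0, 2 * usable - insert_length)
    # Distinct fragment bases covered by at least one mate.
    unique = 2 * usable - overlap
    return {"sequenced": 2 * usable, "overlap": overlap, "unique": unique}


for insert in (130, 170):
    print(insert, paired_end_overlap(101, insert))
```

Under these assumptions, 72 of the 202 sequenced bases per pair are covered twice at 130 bp, compared with 32 of 202 at 170 bp, which is consistent with the reported benefit of merging overlapping reads.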
Project description: This SuperSeries is composed of the following subset Series: GSE36950: SNP array for CNV calling AUTS2 project [Affymetrix]; GSE37141: Oligo array for CNV calling AUTS2 project [Agilent]; GSE37142: SNP array for CNV calling AUTS2 project [Illumina]; GSE37654: Oligo array for calling CNVs for AUTS2 project [NimbleGen]; GSE37656: Oligo array for CNV calling AUTS2 project [Bluegnome]. Refer to individual Series.