Project description:Background: Whole exome sequencing (WES) has proven to be a valuable basis for applications such as variant calling and copy number variation (CNV) analysis. For these analyses, read coverage should be evenly balanced across protein-coding regions at sufficient read depth. Unfortunately, WES is known for uneven coverage within coding regions, caused by GC-rich regions or off-target enrichment. Results: To examine the irregularities of WES within genes, we applied Agilent SureSelectXT exome capture to human samples and sequenced them on an Illumina platform in 2x101 paired-end mode. Because we suspected the sequenced insert length to be crucial to the uneven coverage of exome-captured samples, we sheared 12 genomic DNA samples to two different insert sizes, 130 and 170 bp. Interestingly, although mean coverage of target regions was clearly higher in the 130 bp samples, coverage was more even in the 170 bp samples. Moreover, merging overlapping paired-end reads improved evenness, indicating that overlapping reads are another cause of the unevenness. In addition, mutation analysis was performed on a subset of the samples. In these isogenic subclones, almost twice as many mutations were discarded in the 130 bp samples as in the 170 bp samples. Visual inspection of the discarded mutation sites revealed low coverage at these sites, embedded in regions with high amplitudes of coverage depth. Conclusions: Using longer inserts could be a good strategy to achieve more uniform read coverage in coding regions, thereby increasing the effective sequencing yield and providing an improved basis for variant calling and CNV analyses.
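The distinction drawn above between mean coverage and evenness of coverage can be illustrated with a minimal sketch. The description does not state which evenness measure the study used, so the coefficient of variation and the fraction of bases at or above 20% of the mean depth are assumptions chosen for illustration, and the function name `coverage_evenness` is hypothetical.

```python
from statistics import mean, pstdev


def coverage_evenness(depths, min_fraction_of_mean=0.2):
    """Summarise per-base read depths over target regions.

    Returns (mean depth, coefficient of variation, fraction of bases whose
    depth is at least min_fraction_of_mean * mean depth).
    The metrics are illustrative assumptions, not the study's own measure.
    """
    if not depths:
        raise ValueError("depths must not be empty")
    mu = mean(depths)
    if mu == 0:
        raise ValueError("mean depth is zero")
    # Lower coefficient of variation means more even coverage.
    cv = pstdev(depths) / mu
    threshold = min_fraction_of_mean * mu
    well_covered = sum(1 for d in depths if d >= threshold) / len(depths)
    return mu, cv, well_covered


# Two toy depth profiles with the same mean but different evenness.
even_profile = [100] * 10
uneven_profile = [10, 190] * 5
print(coverage_evenness(even_profile))
print(coverage_evenness(uneven_profile))
```

Two samples can share the same mean depth while differing sharply in both metrics, which is why mean coverage alone does not predict how many sites remain callable.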
Project description:Whole exome sequencing of a cell line derived from an Rb1 and Trp53 genetically engineered mouse model (GEMM) to assess the baseline copy number landscape of the cells prior to experimental modification.
Project description:Of the multiple anatomical sites represented in oral cancer, squamous cell carcinoma of the tongue (TSCC) shows the highest incidence in the younger age group. Chewing betel leaf, areca nut and slaked lime and smoking tobacco are common practices in India that have direct clinical implications for TSCC carcinogenesis. Here, for the first time, we define the landscape of genomic alterations in TSCC from the Indian diaspora, which would help to identify novel therapeutic targets for clinical intervention and define the genetic basis of TSCC. We performed high-throughput sequencing of fifty-four tongue samples using whole exome sequencing (n=47, 23 paired normal tumor and 1 unpaired) and transcriptome sequencing (n=17, 10 tumor and 5 normal). Mutation and copy number analyses were carried out using the exome sequencing data, and transcriptome analysis provided expressed genes and transcript fusions in tongue cancer patients. Further, integrated analyses were performed to identify biologically relevant alterations. Our preliminary analysis revealed mutations in genes frequently altered in TSCC, including TP53, NOTCH1, CDKN2A, USP6 and KMT2D, consistent with the literature. We observed a high frequency of CG/T(GC/A) transversions in non-CpG islands, a signature associated with tobacco exposure. Somatic copy number analysis revealed copy number gains in known hallmark genes such as CCND1, MYC and ORAOV1, along with copy number alterations in novel genes. A significant positive correlation was observed between copy number gain and increased expression of the affected genes.
Project description:The study involves whole exome sequencing of 20 primary tumors obtained from lung squamous carcinoma patients of Indian origin. With this, we aim to describe the mutational profile of this specific subset of lung cancer patients. This knowledge will further allow us to gain an insight into potentially actionable genomic alterations prevalent in Indian lung squamous carcinoma.
Project description:Current methods for detection of copy number aberrations (CNA) from whole-exome sequencing (WES) data are based on the read counts of the captured exons only. However, accurate CNA determination is complicated by the non-uniform read depth and uneven distribution of exons. Therefore, we developed ENCODER (ENhanced COpy number Detection from Exome Reads), which circumvents these problems. By exploiting the ‘off-target’ sequence reads, it allows for creation of robust copy number profiles from WES. The accuracy of ENCODER is comparable to that of approaches specifically designed for copy number detection, and it outperforms current exon-based WES methods, particularly in samples of low quality. DNA copy number profiles generated with a new tool, ENCODER, were compared to DNA copy number profiles from SNP6, NimbleGen and low-coverage Whole Genome Sequencing.
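The general idea of building a copy number profile from off-target reads can be sketched as follows. This is not ENCODER's actual algorithm, which the description does not detail. It is a minimal illustration that assumes fixed-width genomic bins, a matched reference sample, and median-centred log2 ratios. The function names, the `pseudocount` parameter and the toy coordinates are all hypothetical.

```python
import math
from collections import Counter
from statistics import median


def bin_off_target_reads(read_starts, targets, bin_size):
    """Count reads per fixed-width bin, ignoring reads that start inside
    a captured target interval (half-open [start, end) coordinates)."""
    counts = Counter()
    for pos in read_starts:
        if any(start <= pos < end for start, end in targets):
            continue  # on-target read: excluded from the off-target profile
        counts[pos // bin_size] += 1
    return counts


def log2_ratios(sample_counts, reference_counts, pseudocount=1.0):
    """Median-centred log2(sample / reference) per bin.

    The pseudocount avoids division by zero in empty bins (an assumption
    made for this sketch).
    """
    bins = sorted(set(sample_counts) | set(reference_counts))
    raw = {
        b: math.log2(
            (sample_counts.get(b, 0) + pseudocount)
            / (reference_counts.get(b, 0) + pseudocount)
        )
        for b in bins
    }
    centre = median(raw.values())
    return {b: value - centre for b, value in raw.items()}


# Toy data: one captured exon at [100, 200) and 1000 bp bins.
targets = [(100, 200)]
sample = bin_off_target_reads(
    [10, 20, 150, 1010, 1020, 1030, 1040, 2010, 2020], targets, 1000
)
reference = bin_off_target_reads(
    [10, 20, 1010, 1020, 2010, 2020], targets, 1000
)
profile = log2_ratios(sample, reference)
print(profile)
```

In this toy example the middle bin carries more off-target reads in the sample than in the reference, so its log2 ratio is positive, which would be read as a candidate gain. A real pipeline would also need to handle mappability, GC content and segmentation, none of which are shown here.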