Project description:We performed shallow whole genome sequencing (WGS) on circulating free DNA (cfDNA) extracted from plasma or cerebrospinal fluid (CSF) and on tissue DNA extracted from the biopsy, in order to evaluate the correlation between the two biomaterials. After library construction and sequencing (HiSeq 3000 or Ion Proton), copy number variations were called with WisecondorX.
Project description:Whole genome sequencing (WGS) of tongue cancer samples and a cell line was performed to identify the translocation breakpoint of the fusion gene. Raw WGS data were aligned to the human reference genome (GRCh38.p12) using BWA-MEM (v0.7.17). The resulting BAM files were analysed with SvABA (v1.1.3) to identify translocation breakpoints, which were then annotated with custom scripts against the GENCODE reference GTF (v30). The fusion breakpoints identified by SvABA were additionally confirmed with Manta (v1.6.0).
Project description:In principle, whole-genome sequencing (WGS) of the human genome, even at low coverage, offers higher resolution for genomic copy number variation (CNV) detection than array-based technologies, which are currently the first-tier approach in clinical cytogenetics. There are, however, obstacles to replacing array-based CNV detection with low-coverage WGS, such as cost, turnaround time, and the lack of systematic performance comparisons. With technological advances in WGS library preparation, instrument platforms, and data analysis algorithms, the obstacles imposed by cost and turnaround time are fading. However, a systematic performance comparison between array-based and low-coverage WGS-based CNV detection has yet to be performed. Here, we compared the CNV detection capabilities of WGS (short-insert, 3kb-, and 5kb-mate-pair libraries) at 1X, 3X, and 5X coverage with those of commonly used high-resolution arrays in the 1000 Genomes Project CEU genome NA12878. CNV detection was performed using standard analysis methods, and the results were then compared to a list of Gold Standard NA12878 CNVs distilled from the 1000 Genomes Project. Overall, low-coverage WGS detects drastically more Gold Standard CNVs than arrays (approximately fivefold more on average) and is accompanied by fewer CNV calls without secondary validation. Furthermore, we show that WGS (at ≥1X coverage) detects all seven validated deletions larger than 100 kb in the NA12878 genome, whereas most arrays detect only one of these deletions. Finally, we show that the much larger 15 Mbp Cri-du-chat deletion can be clearly seen even at 1X coverage from short-insert WGS.
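To make the coverage levels above concrete, here is a minimal sketch of the standard relation coverage = reads × read length / genome size. The genome size (3.1 Gbp), read length (100 bp), and bin size (100 kb) are illustrative assumptions and are not taken from the study; the function names are hypothetical.

```python
import math

# Assumed values for illustration only (not stated in the project description).
GENOME_SIZE_BP = 3_100_000_000   # approximate human genome size
READ_LENGTH_BP = 100             # hypothetical read length


def reads_for_coverage(coverage, genome_size_bp=GENOME_SIZE_BP,
                       read_length_bp=READ_LENGTH_BP):
    """Reads needed so that reads * read_length / genome_size >= coverage."""
    return math.ceil(coverage * genome_size_bp / read_length_bp)


def expected_reads_per_bin(coverage, bin_size_bp,
                           read_length_bp=READ_LENGTH_BP):
    """Average number of reads starting in a bin of the given size."""
    return coverage * bin_size_bp / read_length_bp


if __name__ == "__main__":
    for cov in (1, 3, 5):
        print(cov, reads_for_coverage(cov), expected_reads_per_bin(cov, 100_000))
```

Under these assumptions, a 100 kb bin still collects on the order of a thousand reads at 1X, which gives some intuition for why read-depth methods can resolve deletions of that size at low coverage.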
Project description:We evaluated linked-read whole genome sequencing (WGS) for detection of structural chromosomal rearrangements in primary samples of varying DNA quality from 12 patients diagnosed with ALL. Linked-read WGS enabled precise, allele-specific, digital karyotyping at base-pair resolution for a wide range of structural variants, including complex rearrangements, aneuploidy assessment and gene deletions. Additional RNA-sequencing data and copy number aberration (CNA) data from Illumina Infinium arrays were also generated and assessed against the linked-read WGS data. RNA-sequencing data were used to support structural chromosomal rearrangements detected in the linked-read WGS data by detecting fusion genes expressed as a consequence of the rearrangements. Illumina Infinium arrays (450k array and/or SNP array) were used to assess CNA status to further support the findings in the linked-read WGS data. The processed CNA data from the primary ALL patient samples have been deposited in GEO. RNA-sequencing data, linked-read WGS data, and raw SNP array data from the primary ALL patient samples will not be deposited because the patient/parent consent does not cover depositing, in a repository, data that may be used for large-scale determination of germline variants. The ALL samples were collected 10-20 years ago from pediatric patients aged 2-15 years, some of whom are deceased. The linked-read WGS data and the RNA-sequencing data sets generated in the study are available upon reasonable request from the corresponding author Jessica.Nordlund@medsci.uu.se.
Project description:Adapted from manuscript in review: Nearly all prostate cancer deaths are from metastatic castration-resistant prostate cancer (mCRPC), but there have been few whole genome sequencing (WGS) studies of this disease state. We performed linked-read WGS on 23 mCRPC biopsy specimens and analyzed cell-free DNA sequencing data from 86 patients with mCRPC. In addition to frequent rearrangements affecting known prostate cancer genes, we observed complex rearrangements of the AR locus in most cases. These include highly recurrent tandem duplications involving an upstream enhancer of AR. A subset of cases also displayed a genome-wide tandem duplicator phenotype associated with CDK12 inactivation. Our findings highlight the complex genomic structure of mCRPC, nominate new alterations that may inform prostate cancer treatment, and suggest that additional recurrent events in the noncoding mCRPC genome remain to be discovered.
Project description:In this study, two accessions of Arabidopsis thaliana (Columbia and Landsberg erecta) were crossed and tetrads were obtained thanks to the quartet mutation, which keeps the 4 haploid pollen grains from a single meiosis attached together. By fertilising a plant (of ecotype Columbia) with a single tetrad, then selecting seeds from siliques containing exactly four seeds, and finally sequencing the 4 resulting plants, we can access the complete history of meiotic recombination events occurring in a single male meiosis. Comparing the genomic sequences (WGS) of the 4 plants in a tetrad makes it possible to identify meiotic crossover (CO) and non-crossover (NCO) events in each tetrad, thanks to the numerous polymorphisms specific to each of the parental genomes. The analysis of the WGS tetrad data consists of genotyping a series of SNV markers (differentiating Columbia and Landsberg) positioned on the five chromosomes for the 4 individuals of a tetrad (M1, M2, M3, M4), which represent the 4 chromatids of each chromosome. A total of 20 tetrads, the F1, and the two parental accessions Columbia and Landsberg erecta were sequenced.
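The marker-based tetrad analysis described above can be illustrated with a small sketch. It is not the study's pipeline: it assumes each SNV marker has already been reduced to a four-letter string giving the pollen-derived allele in M1 to M4 ("C" for Columbia, "L" for Landsberg), an encoding chosen here purely for illustration. Markers that depart from 2:2 segregation are flagged as candidate gene conversions (which may be NCOs or CO-associated), and a crossover is called between two adjacent 2:2 markers when exactly two chromatids switch parental allele.

```python
from typing import List, Tuple

# (position, alleles of M1..M4), e.g. (1200, "CCLL"). Hypothetical encoding.
Marker = Tuple[int, str]


def segregation(alleles: str) -> str:
    """Return the Columbia:Landsberg ratio among the four chromatids."""
    n_col = alleles.count("C")
    return f"{n_col}:{4 - n_col}"


def find_events(markers: List[Marker]):
    """Scan position-sorted markers of one chromosome in one tetrad.

    Returns (crossovers, conversions):
      crossovers  - (left_pos, right_pos, [indices of switching chromatids])
      conversions - positions of markers that are not 2:2 (3:1, 1:3, ...)
    Intervals where all four chromatids switch are ignored in this sketch.
    """
    conversions = [pos for pos, a in markers if segregation(a) != "2:2"]
    mendelian = [(pos, a) for pos, a in markers if segregation(a) == "2:2"]

    crossovers = []
    for (p1, a1), (p2, a2) in zip(mendelian, mendelian[1:]):
        switched = [i for i in range(4) if a1[i] != a2[i]]
        # Two switches between 2:2 markers are necessarily reciprocal.
        if len(switched) == 2:
            crossovers.append((p1, p2, switched))
    return crossovers, conversions


if __name__ == "__main__":
    example = [
        (100, "CCLL"), (200, "CCLL"),
        (300, "CLLL"),                 # 1:3 marker, candidate conversion
        (400, "CCLL"),
        (500, "CLCL"), (600, "CLCL"),  # M2 and M3 switched: crossover
    ]
    print(find_events(example))
```

Chromatid indices are zero-based, so index 1 corresponds to M2 and index 2 to M3. Real data would additionally need genotype-quality filtering and a rule for how many consecutive markers must support an event, which this sketch leaves out.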