Project description:Primary objectives: The primary objective is to investigate circulating tumor DNA (ctDNA), via deep sequencing for mutation detection and whole-genome sequencing for copy number analysis, before the start of regorafenib treatment (baseline) and at defined time points during regorafenib administration, for treatment efficacy in colorectal cancer patients in terms of overall survival (OS).
Primary endpoints: circulating tumor DNA (ctDNA), via deep sequencing for mutation detection and whole-genome sequencing for copy number analysis, before the start of regorafenib treatment (baseline) and at defined time points during regorafenib administration, for treatment efficacy in colorectal cancer patients in terms of overall survival (OS).
Project description:We applied the RNA-Seq approach to reconstruct the transcriptome of Vitis vinifera cv. Corvina, using RNA pooled from a comprehensive set of tissues sampled from different organs and developmental stages, and we were able to reconstruct some novel and putative private Corvina genes. We analyzed the expression of these genes in three berry developmental conditions, and posit that they may play some role in the formation of the mature organ. Background: Plants display high genetic and phenotypic variability among different cultivars. Understanding the genetic components that contribute to phenotypic diversity is necessary to disentangle genetic factors from the environment. Given the high degree of genetic diversity among plant cultivars, whole-genome sequencing and re-annotation of each variety is required, but a reliable genome assembly is hindered by high heterozygosity and sequence divergence. Results: We show the feasibility of an approach based on sequencing of cDNA by RNA-Seq to analyze varietal diversity between a local grape cultivar, Corvina, and the PN40024 grape reference genome. We detected 15,260 known genes and annotated alternative splicing isoforms for 9,463 genes. Our approach allowed us to define 2,321 putative novel protein-coding genes in unannotated or unassembled regions of the reference genome PN40024 and 180 putative private Corvina genes whose sequence is not shared with the reference genome. Conclusions: With a de novo assembly-based approach we were able to reconstruct a substantial part of the Corvina transcriptome, and we substantially improved the annotation of known genes by better defining their structure, annotating splicing isoforms and detecting unannotated genes. Moreover, our results clearly define sets of private genes which are likely part of the "dispensable" genome and potentially involved in influencing some cultivar-specific characteristics.
In plant biology, a transcriptome de novo assembly approach should not be limited to species where no reference genome is available, as it can improve the annotation and lead to the identification of genes peculiar to a cultivar.
Project description:For this project, we have sequenced, assembled and annotated a transcriptome of a diploid wheat Triticum urartu accession PI 428198. The sequencing libraries were prepared from shoot and root tissues harvested from 2-3-week-old seedlings. All sequencing was carried out on the Illumina HiSeq platform using the 100 bp paired-end protocol (248.5 million reads). The assembly was constructed using a multiple k-mer approach with a de novo assembly algorithm implemented in CLC Genomics Workbench 5.5 and additional redundancy reduction with the CD-HIT and blast2cap3 programs. Open reading frames and proteins were predicted using BLASTX searches and the findorf algorithm.
Project description:The naked mole-rat (NMR; Heterocephalus glaber) has recently gained considerable attention in the scientific community for its unique potential to unveil novel insights in the fields of medicine, biochemistry, and evolution. NMRs exhibit unique adaptations that include protracted fertility, cancer resistance, eusociality, and tolerance of anoxia. This suite of adaptations is not found in other rodent species, suggesting that interrogating conserved and accelerated regions in the NMR genome will reveal regions fundamental to these adaptations. However, the current NMR genome assembly has limits that make studying structural variations, heterozygosity, and non-coding adaptations challenging. We present a complete diploid naked mole-rat genome assembly by integrating long-read and 10X linked-read genome sequencing of a male NMR and its parents, and Hi-C sequencing in the NMR hypothalamus (N=2). Reads were identified as maternal, paternal or ambiguous (TrioCanu). We then polished genomes with Flye, Racon and Medaka. Assemblies were then scaffolded using the following tools in order: Scaff10X, Salsa2, 3d-DNA, Minimap2 alignment between assemblies, and the Juicebox Assembly Tools. We then subjected the assemblies to another round of polishing, including short-read polishing with Freebayes. We assembled the NMR mitochondrial genome with mitoVGP. Y chromosome contigs were identified by aligning male and female 10X linked reads to the paternal genome and finding male-biased contigs not present in the maternal genome. Contigs were assembled with publicly available male NMR fibroblast Hi-C-seq data (SRR820318). Both assemblies have their sex chromosome haplotypes merged so that both assemblies have a high-quality X and Y chromosome. Finally, assemblies were evaluated with Quast, BUSCO, and Merqury, all of which reported high base-pair quality and contiguity for both assemblies.
The assembly will next be annotated by Ensembl using public RNA-seq data from multiple tissues (SRP061363). Together, this assembly will provide a high-quality resource to the NMR and comparative genomics communities.
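The idea of flagging Y chromosome candidates as male-biased contigs can be illustrated with a minimal Python sketch. It assumes that mean read depth per contig has already been computed separately for the male and female linked-read alignments; the function name, the input dictionaries, and both thresholds are hypothetical choices for illustration and are not taken from the project's pipeline.

```python
def male_biased_contigs(male_depth, female_depth,
                        min_male_depth=5.0, max_female_ratio=0.1):
    """Return contigs that are covered in the male but (almost) absent in the female.

    male_depth / female_depth: dict mapping contig name -> mean read depth.
    min_male_depth and max_female_ratio are illustrative thresholds (assumptions).
    """
    hits = []
    for contig, m in male_depth.items():
        f = female_depth.get(contig, 0.0)  # contigs missing from the female set count as depth 0
        # Require real male coverage, and female coverage at most a small fraction of it.
        if m >= min_male_depth and f <= max_female_ratio * m:
            hits.append(contig)
    return sorted(hits)


# Toy depths, invented for this example only.
male = {"ctg1": 20.0, "ctg2": 18.0, "ctg3": 2.0}
female = {"ctg1": 21.0, "ctg2": 0.5}
candidates = male_biased_contigs(male, female)
print(candidates)
```

In this toy input only ctg2 passes both filters: ctg1 is equally covered in both sexes, and ctg3 has too little male coverage to be trusted.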
Project description:Porcine 60K BeadChip genotyping arrays (Illumina) are increasingly being applied in pig genomics to validate SNPs identified by re-sequencing or assembly-versus-assembly methods. Here we report that more than 98% of the SNPs identified with the porcine 60K BeadChip genotyping array (Illumina) were consistent with the SNPs identified by the assembly-based method. This result demonstrates that whole-genome de novo assembly is a reliable approach to deriving accurate maps of SNPs.
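A concordance figure of this kind can be computed as the fraction of shared SNP positions at which two call sets agree. The Python sketch below is only an illustration under stated assumptions: genotypes are represented as two-letter allele strings keyed by (chromosome, position), allele order is ignored, and the function name and toy data are hypothetical rather than drawn from the study.

```python
def snp_concordance(array_calls, assembly_calls):
    """Compare two genotype call sets at the positions they share.

    Each argument maps (chromosome, position) -> genotype string such as "AG".
    Returns (concordance_rate, number_of_shared_sites); the rate is None
    when the two sets share no sites.
    """
    shared = array_calls.keys() & assembly_calls.keys()
    if not shared:
        return None, 0
    # Sort alleles so that "AG" and "GA" count as the same genotype.
    matches = sum(
        1 for site in shared
        if sorted(array_calls[site]) == sorted(assembly_calls[site])
    )
    return matches / len(shared), len(shared)


# Toy call sets, invented for this example only.
array = {("1", 100): "AG", ("1", 200): "CC", ("2", 50): "TT", ("2", 70): "AC"}
assembly = {("1", 100): "GA", ("1", 200): "CC", ("2", 50): "TC"}
rate, n_shared = snp_concordance(array, assembly)
print(f"{rate:.3f} concordance over {n_shared} shared sites")
```

Sites present in only one call set are excluded from the denominator here; whether such sites should instead count as discordant is a design choice that depends on how the comparison is defined.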
Project description:Aedes aegypti mosquitoes infect hundreds of millions of people each year with dangerous viral pathogens including dengue, yellow fever, Zika, and chikungunya. Progress in understanding the biology of this insect, and developing tools to fight it, depends on the availability of a high-quality genome assembly. Here we use DNA proximity ligation (Hi-C) and Pacific Biosciences long reads to create AaegL5 - a highly contiguous A. aegypti reference.
Project description:We sequenced and analyzed the genome of a highly inbred miniature Chinese pig strain, the Banna Minipig Inbred Line (BMI). We conducted whole-genome screening using next-generation sequencing (NGS) technology and performed SNP calling against the Sus scrofa genome assembly Sscrofa11.1.