Project description:We reprocessed RNA-Seq data for 9264 tumor samples and 741 normal samples across 24 cancer types from The Cancer Genome Atlas with "Rsubread". Rsubread is an open-source R package that has shown high concordance with other existing methods of alignment and summarization, yet is simple to use and takes significantly less time to process data. Additionally, we provide clinical variables publicly available as of May 20, 2015 for the tumor samples with matching TCGA IDs.
Project description:Aberrant hypermethylation of CpG dinucleotides located in CpG islands within the promoters of key cancer genes is an epigenetic abnormality associated with heritable transcriptional gene silencing and inactivation in cancer. The genes involved include important tumor suppressors affecting key pathways for tumor initiation and progression. These methylated sequences can serve as potentially valuable markers for cancer risk assessment, diagnosis, prognosis, and prediction of therapeutic responses. In addition, many key cancer genes may be targeted by both epigenetic and genetic alterations, and thus epigenetic analysis can help focus the search for mutations, and vice versa. Studies of major cancer types suggest that any individual patient’s tumor may harbor 300 or more DNA hypermethylated genes. In TCGA, a pilot project is underway to begin defining these genes for GBM via genomic approaches. The epigenetic pilot takes a two-tiered approach. The first tier involves pharmacological treatment of both well-established human GBM cell lines and a cell line grown as a neurosphere to enrich for tumor-propagating cells, with a DNA methylation inhibitor (5-aza-2’-deoxycytidine, DAC) or a histone deacetylase inhibitor (trichostatin A, TSA), followed by an expression transcriptome analysis as previously described (Schuebel et al.). This has resulted in identification of more than 3,700 total candidate genes. In the second tier, the top candidates are then analyzed on a custom Illumina GoldenGate array with the capacity to monitor methylation at a single CpG dinucleotide in the CpG islands of 1,498 gene promoters for the high-throughput analysis of TCGA GBM samples. Keywords: Microarray, Hypermethylome, DNA-hypermethylation, DAC, TSA, Epigenetic, TCGA, The Cancer Genome Atlas, GBM, Glioblastoma, Glioblastoma multiforme, Brain
Project description:<p>The Cancer Genome Atlas (TCGA) is a comprehensive and coordinated effort to accelerate our understanding of the molecular basis of cancer through the application of genome analysis technologies, including large-scale genome sequencing. TCGA is a joint effort of the National Cancer Institute (NCI) and the National Human Genome Research Institute (NHGRI), which are both part of the National Institutes of Health, U.S. Department of Health and Human Services.</p> <p>TCGA projects are organized by cancer type or subtype. Click <a href="http://cancergenome.nih.gov/cancersselected" target="_blank">here</a> for a current list of cancer types selected for study in TCGA.</p> <p>Data from TCGA (e.g., gene expression, copy number variation and clinical information), are available via the <a href="https://gdc.cancer.gov/" target="_blank">Genomic Data Commons (GDC)</a>.</p> <p>Data from TCGA projects are organized into two tiers: <b>Open Access and Controlled Access</b>. <ul> <li>Open Access data tier contains data that cannot be attributed to an individual research participant. The Open Access data tier does not require user certification. Data in Open Access tier are available in the TCGA Data Portal.</li> <li>Controlled Access data tier contains individual-level genotype data that are unique to an individual. Access to data in the Controlled Access data tier requires user certification through <a href="https://dbgap.ncbi.nlm.nih.gov/aa/wga.cgi?login=&page=login" target="_blank">dbGaP Authorized Access</a>.</li> <li>Controlled Access data types consist of the following: <ul> <li>Individual germline variant data (SNP .cel files)</li> <li>Primary sequence data (.bam files), which are available at GDC</li> <li>Clinical free text fields</li> <li>Exon Array files (for Glioblastoma and Ovarian projects only)</li> </ul> </li> </ul> </p> <p><b>NOTE: TCGA strives to release most data in the open access tier. 
Individual genotype or sequence files are prominent exceptions. Commonly requested files such as descriptions of somatic mutations or clinical data are open access.</b></p> <p>Please go to this page: <a href="https://tcga-data.nci.nih.gov/docs/publications/" target="_blank">https://tcga-data.nci.nih.gov/docs/publications/</a> to access all data associated with TCGA tumor specific publications.</p> <p><b>The TCGA study is utilized in the following dbGaP substudies.</b> To view genotypes and other molecular data collected in these substudies, please click on the following substudies below or in the "Substudies" section of this top-level study page phs000178 TCGA study. <ul> <li><a href="./study.cgi?study_id=phs000854">phs000854</a> Genome-wide Analysis of Noncoding Regulatory Mutations in Cancer</li> </ul> </p>
Project description:<p>The Cancer Genome Atlas (TCGA) is a comprehensive and coordinated effort to accelerate our understanding of the molecular basis of cancer through the application of genome analysis technologies, including large-scale genome sequencing. TCGA is a joint effort of the National Cancer Institute (NCI) and the National Human Genome Research Institute (NHGRI), which are both part of the National Institutes of Health, U.S. Department of Health and Human Services.</p> <p>TCGA projects are organized by cancer type or subtype. Click <a href="http://cancergenome.nih.gov/cancersselected" target="_blank">here</a> for a current list of cancer types selected for study in TCGA.</p> <p>Data from TCGA (e.g., gene expression, copy number variation and clinical information), are available via the <a href="https://tcga-data.nci.nih.gov/tcga/" target="_blank">TCGA Data Portal</a>, EXCEPT for the genomic sequence data (.bam files), which are hosted at the <a href="https://cghub.ucsc.edu/" target="_blank">Cancer Genomics Hub (CGHub)</a>.</p> <p>Data from TCGA projects are organized into two tiers: <b>Open Access and Controlled Access</b>. <ul> <li>Open Access data tier contains data that cannot be attributed to an individual research participant. The Open Access data tier does not require user certification. Data in Open Access tier are available in the TCGA Data Portal.</li> <li>Controlled Access data tier contains individual-level genotype data that are unique to an individual. 
Access to data in the Controlled Access data tier requires user certification through <a href="https://dbgap.ncbi.nlm.nih.gov/aa/wga.cgi?login=&page=login" target="_blank">dbGaP Authorized Access</a>.</li> <li>Controlled Access data types consist of the following: <ul> <li>Individual germline variant data (SNP .cel files)</li> <li>Primary sequence data (.bam files), which are available at CGHub</li> <li>Clinical free text fields</li> <li>Exon Array files (for Glioblastoma and Ovarian projects only)</li> </ul> </li> </ul> </p> <p><b>NOTE: TCGA strives to release most data in the open access tier. Individual genotype or sequence files are prominent exceptions. Commonly requested files such as descriptions of somatic mutations or clinical data are open access.</b></p> <p><b>The TCGA study is utilized in the following dbGaP substudies.</b> To view genotypes and other molecular data collected in these substudies, please click on the following substudies below or in the "Substudies" box located on the right hand side of this top-level study page phs000178 TCGA study. <ul> <li><a href="./study.cgi?study_id=phs000441">phs000441</a> Integrated Genomic Analyses of Ovarian Carcinoma (OV)</li> <li><a href="./study.cgi?study_id=phs000489">phs000489</a> Comprehensive Genomic Characterization Defines Human Glioblastoma Genes and Core Pathways</li> </ul> </p>
Project description:To investigate gene expression in the tree shrew pancreatic cancer model, tumor samples were analyzed by RNA-seq, and the expression profiles were compared with those of human pancreatic cancer samples (n = 30) carrying KRAS, TP53, and CDKN2A/B mutations, downloaded from The Cancer Genome Atlas (https://tcga-data.nci.nih.gov/), and of mouse pancreatic cancer samples (accession no. GSE87388), which showed similar histological features.
Project description:Large-scale initiatives like The Cancer Genome Atlas (TCGA) have performed omics studies on hundreds of kidney cancer patients, but predominantly on Caucasians. Here, we investigated the genomics of Chinese clear cell renal cell carcinoma (ccRCC) patients.
Project description:To address the need to study frozen clinical specimens using next-generation RNA, DNA, and chromatin immunoprecipitation (ChIP) sequencing and protein analyses, we developed a biobank workflow to prospectively collect biospecimens from patients with renal cell carcinoma (RCC). We describe our standard operating procedures and workflow to annotate pathologic results and clinical outcomes. We report quality control outcomes and nucleic acid yields of our RCC submissions (N=16) to The Cancer Genome Atlas (TCGA) project, as well as newer discovery platforms, by describing mass spectrometry analysis of albumin oxidation in plasma and 6 ChIP sequencing libraries generated from nephrectomy specimens after histone H3 lysine 36 trimethylation (H3K36me3) immunoprecipitation. From June 1, 2010, through January 1, 2013, we enrolled 328 patients with RCC. Our mean (SD) TCGA RNA integrity numbers (RINs) were 8.1 (0.8) for papillary RCC, with a 12.5% overall rate of sample disqualification for RIN <7. Banked plasma had significantly less albumin oxidation (by mass spectrometry analysis) than plasma kept at 25°C (P<.001). For ChIP sequencing, the FastQC score for average read quality was at least 30 for 91-95% of paired-end reads. In parallel, we analyzed frozen tissue by RNA sequencing; after genome alignment, only 0.2-0.4% of total reads failed the default quality check steps of Bowtie2, which was comparable to the disqualification ratio (0.1%) of the 786-O RCC cell line, prepared under optimal RNA isolation conditions. The overall correlation coefficients for gene expression between the Mayo Clinic and TCGA tissues ranged from 0.75 to 0.82. These data support the generation of high-quality nucleic acids for genomic analyses from banked RCC. Importantly, the protocol does not interfere with routine clinical care. Collections over defined time points during disease treatment further enhance collaborative efforts to integrate genomic information with outcomes.
Examination of H3K36me3 in ccRCC