Project description:The Genetic Association Information Network (GAIN) Data Access Committee was established in June 2007 to provide prompt and fair access to data from six genome-wide association studies through the database of Genotypes and Phenotypes (dbGaP). Of 945 project requests received through 2011, 749 (79%) have been approved; median receipt-to-approval time decreased from 14 days in 2007 to 8 days in 2011. Over half (54%) of the proposed research uses were for GAIN-specific phenotypes; other uses were for method development (26%) and adding controls to other studies (17%). Eight data-management incidents, defined as compromises of any of the data-use conditions, occurred among nine approved users; most were procedural violations, and none violated participant confidentiality. Over 5 years of experience with GAIN data access has demonstrated substantial use of GAIN data by investigators from academic, nonprofit, and for-profit institutions with relatively few and contained policy violations. The availability of GAIN data has allowed for advances in both the understanding of the genetic underpinnings of mental-health disorders, diabetes, and psoriasis and the development and refinement of statistical methods for identifying genetic and environmental factors related to complex common diseases.
Project description:The epigenome is the dynamic interface between our changing environment and the static genome, and understanding it is a goal of immense importance to human health. We will map reference cell epigenomes of the brain, breast, blood and approved embryonic stem cells, inclusive of males and females and different racial groups. This cooperative work will transform our understanding of the short- and long-lasting consequences of environmental impact on human health and disease. We are working cooperatively with other Mapping Centers and the Data Coordination Center (EDACC) to comprehensively map epigenomes of select human cells with significant relevance to complex human disease. Our group, consisting of scientists at UCSF, UC Davis, UCSC and the British Columbia Genome Sciences Centre, will focus on cells relevant to human health and complex disease, including cells from the blood, brain, breast and U.S. Government-approved lines of human embryonic stem cells. We will incorporate high-quality, homogeneous cells from males and females, and two predominant racial groups, and biological replicates of each cell type. Production of comprehensive maps will include 6 histone modifications selected for their opposing roles in regulating active and inactive chromatin, DNA methylation and miRNA and gene expression. These epigenetic data, along with genetic and expression data, will be integrated using advanced informatics to address fundamental roles of epigenetics in differentiation, maintenance of cell-type identity and gene expression. Our cell and data production pipeline will incorporate verification and data validation with independent methods, and will operate under a model motivated by increased data production and decreased cost. We summarize the analysis capacity of our instruments and our explicit strategy for data sharing of our proposed REMC-generated resources including biological specimens, protocols, data, software tools and intellectual resources.
We envision that our group, in conjunction with the other REMC teams, the EDACC, ENCODE, future EHHD (Epigenetics of Human Health and Disease) centers and the NIH Roadmap program, will develop methods, tools and reference epigenome maps for the research community that will make the promise of epigenetics in understanding and treating human complex disease a reality. Our reference epigenomes will enable new disciplines including human population epigenetics, comparative epigenomics, neuroepigenetics, and therapeutic epigenetics for tissue regeneration and reversal of disease. Contributor: BCCA Genome Sciences Centre. For data usage terms and conditions, please refer to: http://www.drugabuse.gov/funding/funding-opportunities/nih-common-fund/epigenomics-data-access-policies
Project description:Integrative analysis of multi-omics data is a powerful approach for gaining functional insights into biological and medical processes. Conducting these multifaceted analyses on human samples is often complicated by the fact that the raw sequencing output is rarely available under open access. The Personal Genome Project UK (PGP-UK) is one of the few resources that recruit their participants under open consent and make the resulting multi-omics data freely and openly available. As part of this resource, we describe the PGP-UK multi-omics reference panel consisting of ten genomic, methylomic and transcriptomic data. Specifically, we outline the data processing, quality control and validation procedures that were implemented to ensure data integrity and exclude sample mix-ups. In addition, we provide a REST API to facilitate the download of the entire PGP-UK dataset. The data are also available from two cloud-based environments, providing platforms for free integrated analysis. In conclusion, the genotype-validated PGP-UK multi-omics human reference panel described here provides a valuable new open access resource for integrated analyses in support of personal and medical genomics.
Project description:The 1,000 plants (1KP) project is an international multi-disciplinary consortium that has generated transcriptome data from over 1,000 plant species, with exemplars for all of the major lineages across the Viridiplantae (green plants) clade. Here, we describe how to access the data used in a phylogenomics analysis of the first 85 species, and how to visualize our gene and species trees. Users can develop computational pipelines to analyse these data, in conjunction with data of their own that they can upload. Computationally estimated protein-protein interactions and biochemical pathways can be visualized at another site. Finally, we comment on our future plans and how they fit within this scalable system for the dissemination, visualization, and analysis of large multi-species data sets.
Project description:Brain tumors are the most common solid tumors of childhood, and the genetic drivers and optimal therapeutic strategies for many of the different subtypes remain unknown. We performed targeted next-generation sequencing of approximately 500 cancer-associated genes on a cohort of 13 pediatric bithalamic diffuse gliomas, a lethal brain tumor of childhood for which the genetic basis is largely unknown. We found that bithalamic diffuse gliomas harbor frequent mutations in the EGFR oncogene in the absence of accompanying gene amplification, with only rare histone H3 mutation. These EGFR mutations were either small in-frame insertions within exon 20 (intracellular tyrosine kinase domain) or missense mutations within exon 7 (extracellular ligand-binding domain). Accompanying alterations included frequent TP53 mutation, CDK6 amplification or CDKN2C mutation, and BCOR and BCORL1 mutation or deletion.
Project description:The 1000 Genomes Project was launched as one of the largest distributed data collection and analysis projects ever undertaken in biology. In addition to the primary scientific goals of creating both a deep catalog of human genetic variation and extensive methods to accurately discover and characterize variation using new sequencing technologies, the project makes all of its data publicly available. Members of the project data coordination center have developed and deployed several tools to enable widespread data access.
Project description:Chordoid glioma is a rare brain tumor thought to arise from specialized glial cells of the lamina terminalis along the anterior wall of the third ventricle. Despite being histologically low-grade, chordoid gliomas are often associated with poor outcome, as their stereotypic location in the third ventricle makes resection challenging and efficacious adjuvant therapies have not been developed. Here we performed genomic profiling on 13 chordoid gliomas and identified a recurrent D463H missense mutation in PRKCA in all tumors, which localizes in the kinase domain of the encoded protein kinase C alpha (PKCα). Expression of mutant PRKCA in immortalized human astrocytes led to increased phospho-ERK and anchorage-independent growth that could be blocked by MEK inhibition. These studies define PRKCA as a recurrently mutated oncogene in human cancer and identify a potential therapeutic vulnerability in this uncommon brain tumor.