Project description:Illumina methylation 27 array (Illumina) analysis was performed on 24 HapMap individuals, including one CEU trio (family 1463: NA12878, NA12891, NA12892) and one YRI trio (family Y117: NA19240, NA19238, NA19239).
Project description:This dataset contains DNase-seq data and CTCF ChIP-seq data for 6 lymphoblastoid cell lines. There are 3 cell lines from a YRI trio and 3 cell lines from a CEU trio (HapMap GM19238, GM19239, GM19240, GM12891, GM12892, GM12878). For data usage terms and conditions, please refer to http://www.genome.gov/27528022 and http://www.genome.gov/Pages/Research/ENCODE/ENCODEDataReleasePolicyFinal2008.pdf
Project description:Genetic variation amongst individual humans occurs on many different scales, ranging from gross alterations in the human karyotype to single-nucleotide changes. In this manuscript we explore variation on an intermediate scale, particularly insertions, deletions, and inversions affecting from a few thousand to a few million base pairs. We employed a clone-based method to interrogate this intermediate structural variation in eight individuals of diverse geographic ancestry. Our analysis provides a comprehensive overview of the normal pattern of structural variation present in these genomes, refining the location of 1695 structural variants. We find that 50% were seen in more than one individual and that nearly half lay outside regions of the genome previously described as structurally variant. We discover 525 new insertion sequences that are not present in the human reference genome and show that many of these are variable in copy number among individuals. Sequencing of a subset of structural variants reveals considerable locus complexity and provides insights into the different mutational processes that have shaped the human genome. These data provide the first high-resolution sequence map of human structural variation, an important standard for genotyping platforms and a prelude to future individual genome sequencing projects. Keywords: comparative genomic hybridization, copy number variation, structural variation, fosmid end sequencing. CGH analysis targeted against sites identified by fosmid end sequencing. 8 HapMap samples (sources of libraries ABC7-ABC14) are hybridized against NA15510 (source of fosmid library G248).
Project description:Single-cell whole-genome haplotyping allows simultaneous detection of haplotypes associated with monogenic diseases and of chromosome copy number, and has subsequently revealed mosaicism in embryos and embryonic stem cells. Methods such as karyomapping and haplarithmisis were deployed as a generic, genome-wide approach for preimplantation genetic testing (PGT) and are replacing traditional PGT methods. While current methods primarily rely on SNP arrays, we envision sequencing-based methods becoming more accessible and cost-efficient. Here, we developed a novel sequencing-based methodology to haplotype and copy-number profile single cells. Following DNA amplification, genomic size and complexity are reduced through restriction enzyme digestion, and the DNA is genotyped through sequencing. This single-cell genotyping-by-sequencing (scGBS) is the input for haplarithmisis, an algorithm we previously developed for SNP array-based single-cell haplotyping. We established technical parameters and developed an analysis pipeline enabling accurate concurrent haplotyping and copy-number profiling of single cells. We demonstrate its value in human blastomere and trophectoderm samples as an application for PGT for monogenic disorders. Furthermore, we demonstrate that the method works in other species by analyzing blastomeres of bovine embryos. Our scGBS method opens up the path for single-cell haplotyping of any species with diploid genomes and could make its way into the clinic as a PGT application.
Project description:This is the validation data for candidate de novo CNV calls made in the CEU HapMap by Itsara et al., Genome Research 2010. In this study, de novo CNV calls were initially made with Illumina 1M SNP arrays. Validation of CNV calls was performed with Nimblegen custom array CGH using the extended CEPH pedigrees. A truly de novo CNV would be unobserved in the first generation (CEU trio parents), validated in the second generation (CEU trio children), and, assuming no selective effects, transmitted to approximately half of the individuals in the third generation. We attempted validation of 4 de novo CNVs in 3 extended CEPH pedigrees: 1358, 1408, and 1459.
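The three-generation criterion described in this entry can be illustrated with a minimal sketch. It is not the study's validation pipeline: the function names are hypothetical, the per-individual CNV calls are assumed to be available as booleans, and the exact binomial test is an illustrative choice for checking the "approximately half" expectation.

```python
from math import comb


def is_candidate_de_novo(in_father: bool, in_mother: bool, in_child: bool) -> bool:
    """A CNV is a de novo candidate if the child carries it and neither parent does."""
    return in_child and not (in_father or in_mother)


def binom_pmf(k: int, n: int, p: float = 0.5) -> float:
    """Probability of exactly k carriers among n offspring at transmission rate p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def transmission_p_value(carriers: int, offspring: int, p: float = 0.5) -> float:
    """Two-sided exact binomial p-value for the observed number of carriers
    in the third generation, under the assumption of no selective effects
    (each offspring inherits the CNV with probability p)."""
    observed = binom_pmf(carriers, offspring, p)
    # Sum the probabilities of all outcomes at least as unlikely as the observed one.
    total = sum(
        binom_pmf(i, offspring, p)
        for i in range(offspring + 1)
        if binom_pmf(i, offspring, p) <= observed * (1 + 1e-9)
    )
    return min(1.0, total)
```

A small p-value from `transmission_p_value` would indicate that the third-generation carrier count departs from the expected one-half transmission; with pedigrees of this size the test has limited power, so it serves as a consistency check rather than proof.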
Project description:Data include all available Affymetrix SNP data from a cohort of pediatric malignant glioma samples isolated from formalin-fixed, paraffin-embedded tissue. No clinical data are available. Copy number analysis of Affymetrix 250K Sty SNP arrays was performed for 28 pediatric malignant gliomas. The VN algorithm was used to generate the reference signal based on the 48 Mapping 500K HapMap Trio Dataset template.
Project description:Rapid advances in biochemical technologies have enabled several strategies for typing candidate HLA alleles, but linking them into a single MHC haplotype structure remains challenging. Here we have developed a multi-locus haplotype phasing technique and demonstrate its utility for phasing the MHC and KIR loci in human samples. We accurately (~99%) reconstruct the complete haplotypes for over 90% of sequence variants spanning the 4-megabase region of these two loci. By haplotyping a majority of coding and non-coding alleles at the MHC and KIR loci in a single assay, this method has the potential to assist transplantation matching and facilitate investigation of the genetic basis of human immunity and disease. Complete haplotype phasing of 2 loci (MHC and KIR) in 1 human cell line.
Project description:This dataset contains DNase-seq data and CTCF ChIP-seq data for 6 lymphoblastoid cell lines. There are 3 cell lines from a YRI trio and 3 cell lines from a CEU trio (HapMap GM19238, GM19239, GM19240, GM12891, GM12892, GM12878). For data usage terms and conditions, please refer to http://www.genome.gov/27528022 and http://www.genome.gov/Pages/Research/ENCODE/ENCODEDataReleasePolicyFinal2008.pdf DNase-seq and ChIP-seq data from each of the 6 cell lines.