Project description:Constructing high-quality haplotype-resolved genome assemblies has substantially improved the ability to detect and characterize genetic variants. A targeted approach that provides ready access to the rich information in haplotype-resolved genome assemblies will appeal to basic researchers and medical scientists focused on specific genomic regions. Here, using the 4.5-megabase, notoriously difficult-to-assemble major histocompatibility complex (MHC) region as an example, we demonstrated an approach for constructing a haplotype-resolved assembly of a targeted genomic region using CRISPR-based enrichment. Compared with haplotype-resolved whole-genome assembly, our targeted approach achieved comparable completeness and accuracy while reducing computational complexity, sequencing cost, and the amount of starting material. Moreover, using the personal MHC haplotypes obtained by targeted assembly as the reference both improves quantification accuracy for sequencing data and enables allele-specific functional genomics analyses of the MHC region. Given its highly efficient use of resources, our approach can greatly facilitate population genetic studies of targeted regions and may open a new way to elucidate the molecular mechanisms of disease etiology.
Project description:This Series contains data from 845 participants (188 men and 657 women) in the EPIC-Italy cohort; the data were produced at the Human Genetics Foundation (HuGeF) in Turin, Italy. At the last follow-up (2010), 424 participants remained cancer-free, 235 had developed primary breast cancer, 166 had developed primary colorectal cancer, and 20 had developed other primary cancers. Anthropometric measurements and questionnaire-derived dietary and lifestyle information are also available. A total of 845 samples from the EPIC-Italy cohort were analyzed.
Project description:Heteromeric protein complexes are key macromolecular machines of the cell, but their description remains incomplete. We previously reported an experimental strategy for global characterization of native protein assemblies based on chromatographic fractionation of biological extracts coupled to precision mass spectrometry analysis (CF/MS), but the resulting data can be challenging to process and interpret. Here, we describe EPIC (Elution Profile-based Inference of Complexes), a software toolkit for automated scoring of CF/MS data for large-scale determination of high-confidence physical interaction networks and macromolecular assemblies from diverse biological specimens. As a case study, we used EPIC to map the global interactome of Caenorhabditis elegans, defining 590 putative worm protein complexes linked to diverse biological processes, including assemblies unique to nematodes. The EPIC software is freely available as a Jupyter notebook packaged in a Docker container (https://hub.docker.com/r/baderlab/bio-epic/), and the open source code is available via GitHub (https://github.com/BaderLab/EPIC).
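The idea behind elution profile-based inference, that proteins in the same complex tend to appear in the same chromatographic fractions, can be illustrated with a minimal sketch. This is not EPIC's actual scoring scheme, which the description above does not detail; it assumes plain Pearson correlation as a single stand-in co-elution score, and the protein names (`protA`, `protB`, `protC`), the counts, and the function names are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation between two equal-length elution profiles."""
    n = len(x)
    if n == 0 or n != len(y):
        raise ValueError("profiles must be non-empty and of equal length")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        # A flat profile carries no co-elution signal.
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def coelution_scores(profiles):
    """Score every unordered protein pair by profile correlation."""
    names = sorted(profiles)
    return {
        (a, b): pearson(profiles[a], profiles[b])
        for i, a in enumerate(names)
        for b in names[i + 1:]
    }


# Hypothetical spectral counts across six fractions (illustrative only).
profiles = {
    "protA": [0, 2, 10, 4, 0, 0],
    "protB": [0, 1, 5, 2, 0, 0],  # same peak shape as protA
    "protC": [6, 3, 0, 0, 1, 8],  # elutes in different fractions
}

scores = coelution_scores(profiles)
for pair, score in sorted(scores.items()):
    print(pair, round(score, 3))
```

In this toy input, `protA` and `protB` peak in the same fraction and score close to 1, while `protC` elutes elsewhere and scores negatively against both. A real pipeline would combine several such features and calibrate them against known complexes before calling interactions.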