Project description:Constructing high-quality haplotype-resolved genome assemblies has substantially improved the ability to detect and characterize genetic variants. A targeted approach providing ready access to the rich information from haplotype-resolved genome assemblies will appeal to basic researchers and medical scientists focused on specific genomic regions. Here, using the 4.5-megabase, notoriously difficult-to-assemble major histocompatibility complex (MHC) region as an example, we demonstrate an approach to construct a haplotype-resolved assembly of a targeted genomic region with CRISPR-based enrichment. Compared to the results from haplotype-resolved genome assembly, our targeted approach achieved comparable completeness and accuracy with reduced computing complexity, sequencing cost, and amount of starting material. Moreover, using the targeted, assembled personal MHC haplotypes as the reference both improves quantification accuracy for sequencing data and enables allele-specific functional genomics analyses of the MHC region. Given its highly efficient use of resources, our approach can greatly facilitate population genetic studies of targeted regions and may pave a new way to elucidate the molecular mechanisms of disease etiology.
Project description:Lactobacillus casei is remarkably adaptable to diverse habitats. To understand the evolution and adaptation of Lb. casei strains isolated from different environments, the gene content of 22 Lb. casei strains isolated from various habitats (cheeses, n=8; plant materials, n=8; and human sources, n=6) was examined by comparative genome hybridization with an Lb. casei ATCC 334-based microarray.
Project description:Lactobacillus casei is remarkably adaptable to diverse habitats. To understand the evolution and adaptation of Lb. casei strains isolated from different environments, the gene content of 22 Lb. casei strains isolated from various habitats (cheeses, n=8; plant materials, n=8; and human sources, n=6) was examined by comparative genome hybridization with an Lb. casei ATCC 334-based microarray. Comparative genome hybridization was performed against an Affymetrix custom microarray designed to include 2,661 (97%) chromosomal and 17 (85%) plasmid CDSs predicted to occur in Lb. casei ATCC 334, as well as all predicted CDSs in the draft Lb. helveticus CNRZ 32 genome. All CDSs that were not included in the microarray design were transposase-encoding genes.
Project description:Somatic structural variants (SVs) are widespread in cancer, but their impact on disease evolution is understudied due to a lack of methods to directly characterize their molecular consequences. We present a computational method, scNOVA, that integrates haplotype-resolved SV discovery with nucleosome occupancy analysis using Strand-seq, to functionally characterize SVs in single cells.
Project description:The Zygnematophyceae are the closest algal relatives of land plants and are therefore of interest for understanding land plant evolution. Species of the genus Serritaenia have an aerophytic lifestyle and form colorful, mucilaginous capsules, which surround the cells and block harmful solar radiation. Under laboratory conditions, the production of this “sunscreen mucilage” can be induced by ultraviolet B radiation. The present dataset provides insights into the cellular response of this alga to UV radiation (a major stressor in terrestrial habitats) and allows comparisons with other algae and land plants to draw evolutionary conclusions.
Project description:Genetic variation is regarded as a prerequisite for evolution. Theoretical models suggest that epigenetic information inherited independently of DNA sequence can also enable evolution. However, whether epigenetic inheritance mediates phenotypic evolution in natural populations is unknown. Here we show that natural epigenetic DNA methylation variation in gene bodies regulates gene expression and thereby influences the natural variation of complex traits in Arabidopsis thaliana. Notably, the effects of methylation variation on phenotypic diversity and gene expression variance are comparable with those of DNA sequence polymorphism. We also identify methylation epialleles in numerous genes associated with environmental conditions in native habitats, suggesting that intragenic methylation facilitates adaptation to fluctuating environments. Our results demonstrate that methylation variation fundamentally shapes phenotypic diversity in natural populations and provides an epigenetic basis for adaptive Darwinian evolution independent of genetic polymorphism.