Project description:Somatic mosaicism (SM), referring to the presence of somatic mutations in sub-populations of cells within healthy individuals, is associated with an increased risk of a variety of diseases, including cancer. Blood is at particularly high risk of SM, given its rapid turnover and functionally heterogeneous cell-type composition. While the roles of point mutations and large-scale rearrangements in blood SM have been scrutinised in recent years, the functional impact of mosaic structural variants (mSVs) remains poorly understood.
Using haplotype-resolved single-cell multi-omics, we explored the mSV landscape of human hematopoietic stem and progenitor cells (HSPCs).
Project description:Accurate detection of somatic structural variation (SV) in cancer genomes remains a challenging problem. This is in part due to the lack of high-quality, gold-standard datasets that enable the benchmarking of experimental approaches and bioinformatic analysis pipelines. Here, we performed somatic SV analysis of the paired melanoma and normal lymphoblastoid COLO829 cell lines using four different sequencing technologies. Based on the evidence from multiple technologies combined with extensive experimental validation, we compiled a comprehensive set of carefully curated and validated somatic SVs, comprising all SV types. We demonstrate the utility of this resource by determining the SV detection performance as a function of tumor purity and sequence depth, highlighting the importance of assessing these parameters in cancer genomics projects. The truth somatic SV dataset as well as the underlying raw multi-platform sequencing data are freely available and are an important resource for community somatic benchmarking efforts.
Project description:A key mutational process in cancer is structural variation, in which rearrangements delete, amplify or reorder genomic segments that range in size from kilobases to whole chromosomes [1-7]. Here we develop methods to group, classify and describe somatic structural variants, using data from the Pan-Cancer Analysis of Whole Genomes (PCAWG) Consortium of the International Cancer Genome Consortium (ICGC) and The Cancer Genome Atlas (TCGA), which aggregated whole-genome sequencing data from 2,658 cancers across 38 tumour types [8]. Sixteen signatures of structural variation emerged. Deletions have a multimodal size distribution, assort unevenly across tumour types and patients, are enriched in late-replicating regions and correlate with inversions. Tandem duplications also have a multimodal size distribution, but are enriched in early-replicating regions, as are unbalanced translocations. Replication-based mechanisms of rearrangement generate varied chromosomal structures with low-level copy-number gains and frequent inverted rearrangements. One prominent structure consists of 2-7 templates copied from distinct regions of the genome strung together within one locus. Such cycles of templated insertions correlate with tandem duplications and, in liver cancer, frequently activate the telomerase gene TERT. A wide variety of rearrangement processes are active in cancer, which generate complex configurations of the genome upon which selection can act.
Project description:GRIDSS2 is the first structural variant caller to explicitly report single breakends, that is, breakpoints in which only one side can be unambiguously determined. By treating single breakends as a fundamental genomic rearrangement signal on par with breakpoints, GRIDSS2 can explain 47% of somatic centromere copy number changes using single breakends to non-centromere sequence. On a cohort of 3782 deeply sequenced metastatic cancers, GRIDSS2 achieves an unprecedented 3.1% false negative rate and 3.3% false discovery rate and identifies a novel 32-100 bp duplication signature. GRIDSS2 simplifies complex rearrangement interpretation through phasing of structural variants, with 16% of somatic calls phasable using paired-end sequencing.
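The false negative rate and false discovery rate quoted above follow the usual definitions based on counts of true positives, false positives, and false negatives. The following is a minimal sketch of that arithmetic only; the function name and the counts are hypothetical and are not taken from GRIDSS2 or its benchmark.

```python
def sv_call_error_rates(true_pos: int, false_pos: int, false_neg: int):
    """Return (false negative rate, false discovery rate) for a call set.

    FNR = FN / (TP + FN): fraction of real variants that were missed.
    FDR = FP / (TP + FP): fraction of reported calls that are wrong.
    """
    if true_pos + false_neg == 0 or true_pos + false_pos == 0:
        raise ValueError("rates are undefined without truth variants or calls")
    fnr = false_neg / (true_pos + false_neg)
    fdr = false_pos / (true_pos + false_pos)
    return fnr, fdr


# Hypothetical counts, chosen only to illustrate the arithmetic.
fnr, fdr = sv_call_error_rates(true_pos=950, false_pos=50, false_neg=50)
print(f"FNR={fnr:.3f}, FDR={fdr:.3f}")
```

Note that the two rates use different denominators: the FNR is relative to the truth set, while the FDR is relative to the reported calls.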
Project description:Motivation: Single-cell multi-omics assays simultaneously measure different molecular features from the same cell. A key question is how to benefit from the complementary data available and perform cross-modal clustering of cells. Results: We propose Single-Cell Multi-omics Clustering (scMoC), an approach to identify cell clusters from data with co-measurements of scRNA-seq and scATAC-seq from the same cell. We overcome the high sparsity of the scATAC-seq data by using an imputation strategy that exploits the less-sparse scRNA-seq data available from the same cell. Subsequently, scMoC identifies clusters of cells by merging clusterings derived from both data domains individually. We tested scMoC on datasets generated using different protocols with variable data sparsity levels. We show that scMoC (i) is able to generate informative scATAC-seq data due to its RNA-guided imputation strategy and (ii) results in integrated clusters based on both RNA and ATAC information that are biologically meaningful either from the RNA or from the ATAC perspective. Availability and implementation: The data used in this manuscript is publicly available, and we refer to the original manuscript for their description and availability. For convenience, sci-CAR data is available at NCBI GEO under the accession number GSE117089. SNARE-seq data is available at NCBI GEO under the accession number GSE126074. The 10X multiome data is available at the following link: https://www.10xgenomics.com/resources/datasets/pbmc-from-a-healthy-donor-no-cell-sorting-3-k-1-standard-2-0-0. Supplementary information: Supplementary data are available at Bioinformatics Advances online.
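The abstract above does not spell out how the RNA-guided imputation works. One plausible reading, shown here purely as an assumption, is to smooth each cell's sparse ATAC profile over its nearest neighbours in RNA space. The function name, the Euclidean distance, and the toy data below are all hypothetical and are not taken from scMoC.

```python
import math


def rna_guided_impute(rna, atac, k=2):
    """Average each cell's ATAC profile with its k nearest cells in RNA space.

    rna, atac: lists of equal length; row i of each describes the same cell.
    The cell itself is always included in the average.
    """
    imputed = []
    for i, profile in enumerate(rna):
        # Rank all other cells by Euclidean distance in RNA space.
        others = sorted(
            (j for j in range(len(rna)) if j != i),
            key=lambda j: math.dist(profile, rna[j]),
        )
        group = [i] + others[:k]
        n_peaks = len(atac[i])
        imputed.append(
            [sum(atac[j][p] for j in group) / len(group) for p in range(n_peaks)]
        )
    return imputed


# Toy data: cells 0-1 and cells 2-3 form two groups in RNA space.
rna = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
atac = [[1, 0, 0], [0, 0, 0], [0, 0, 1], [0, 1, 0]]
imputed = rna_guided_impute(rna, atac, k=1)
print(imputed)
```

In this toy case, cell 1 has an all-zero ATAC profile but borrows signal from cell 0, its nearest neighbour in RNA space, which is the effect the imputation is meant to achieve on sparse data.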
Project description:In this article, we evaluated the performance of statistical methods in single-group and multi-group analysis approaches for testing group difference in indirect effects and for testing simple indirect effects in each group. We also investigated whether the performance of the methods in the single-group approach was affected when the assumption of equal variance was not satisfied. The assumption was critical for the performance of the two methods in the single-group analysis: the method using a product term for testing the group difference in a single path coefficient, and the Wald test for testing the group difference in the indirect effect. Bootstrap confidence intervals in the single-group approach and all methods in the multi-group approach were not affected by the violation of the assumption. We compared the performance of the methods and provided recommendations.
Project description:Cell classes in the human retina are highly heterogeneous with their abundance varying by several orders of magnitude. Here, we generated and integrated a multi-omics single-cell atlas of the adult human retina, including more than 250,000 nuclei for single-nuclei RNA-seq and 137,000 nuclei for single-nuclei ATAC-seq. Cross-species comparison of the retina atlas among human, monkey, mice, and chicken revealed relatively conserved and non-conserved types. Interestingly, the overall cell heterogeneity in primate retina decreases compared with that of rodent and chicken retina. Through integrative analysis, we identified 35,000 distal cis-element-gene pairs, constructed transcription factor (TF)-target regulons for more than 200 TFs, and partitioned the TFs into distinct co-active modules. We also revealed the heterogeneity of the cis-element-gene relationships in different cell types, even from the same class. Taken together, we present a comprehensive single-cell multi-omics atlas of the human retina as a resource that enables systematic molecular characterization at individual cell-type resolution.
Project description:The preconceptual, intrauterine, and early life environments can have a profound and long-lasting impact on the developmental trajectories and health outcomes of the offspring. Given the relatively low success rates of assisted reproductive technologies (ART; ∼25%), additives and adjuvants, such as glucocorticoids, are used to improve the success rate. Considering the dynamic developmental events that occur during this window, these exposures may alter blastocyst formation at a molecular level, and as such, affect not only the viability of the embryo and the ability of the blastocyst to implant, but also the developmental trajectory of the first three cell lineages, ultimately influencing the physiology of the embryo. In this study, we present a comprehensive single-cell transcriptome, methylome, and small RNA atlas in the day 7 human embryo. We show that, despite no change in morphology and developmental features, preimplantation glucocorticoid exposure reprograms the molecular profile of the trophectoderm (TE) lineage, and these changes are associated with an altered metabolic and inflammatory response. Our data also suggest that glucocorticoids can precociously mature the TE sublineages, supported by the presence of extravillous trophoblast markers in the polar sublineage and the presence of X chromosome dosage compensation. Further, we have elucidated that epigenetic regulation, namely DNA methylation and microRNAs (miRNAs), likely underlies the transcriptional changes observed. This study suggests that exposures to exogenous compounds during preimplantation may unintentionally reprogram the human embryo, possibly leading to suboptimal development and longer-term health outcomes.
Project description:Rapid developments in cryogenic electron microscopy have opened new avenues to probe the structures of protein assemblies in their near-native states. Recent studies have begun applying single-particle analysis to heterogeneous mixtures, revealing the potential of structural-omics approaches that combine the power of mass spectrometry and electron microscopy. Here we highlight advances and challenges in sample preparation, data processing, and molecular modeling for handling increasingly complex mixtures. Such advances will help structural-omics methods extend to cellular-level models of structural biology.
Project description:BACKGROUND: Genomic rearrangements exert a heavy influence on the molecular landscape of cancer. New analytical approaches integrating somatic structural variants (SSVs) with altered gene features represent a framework by which we can assign global significance to a core set of genes, analogous to established methods that identify genes non-randomly targeted by somatic mutation or copy number alteration. While recent studies have defined broad patterns of association involving gene transcription and nearby SSV breakpoints, global alterations in DNA methylation in the context of SSVs remain largely unexplored. RESULTS: By data integration of whole genome sequencing, RNA sequencing, and DNA methylation arrays from more than 1400 human cancers, we identify hundreds of genes and associated CpG islands (CGIs) for which the nearby presence of an SSV breakpoint is recurrently associated with altered expression or DNA methylation, respectively, independently of copy number alterations. CGIs with SSV-associated increased methylation are predominantly promoter-associated, while CGIs with SSV-associated decreased methylation are enriched for gene body CGIs. Rearrangement of genomic regions normally having higher or lower methylation is often involved in SSV-associated CGI methylation alterations. Across cancers, the overall structural variation burden is associated with a global decrease in methylation, increased expression in methyltransferase genes and DNA damage response genes, and decreased immune cell infiltration. CONCLUSION: Genomic rearrangement appears to have a major role in shaping the cancer DNA methylome, to be considered alongside commonly accepted mechanisms including histone modifications and disruption of DNA methyltransferases.