SDePER: a hybrid machine learning and regression method for cell-type deconvolution of spatial barcoding-based transcriptomic data
ABSTRACT: Spatial barcoding-based transcriptomic (ST) data require cell type deconvolution for cellular-level downstream analysis. Here we present SDePER, a hybrid machine learning and regression method, to deconvolve ST data using reference single-cell RNA sequencing (scRNA-seq) data. SDePER removes the systematic difference between the ST and scRNA-seq data (platform effects) explicitly and efficiently to ensure a linear relationship between ST data and cell type-specific expression profiles. It also considers the sparsity of cell types per capture spot and the spatial correlation of cell type compositions across spots. Based on these estimates, SDePER imputes cell type compositions and gene expression at enhanced resolution. We assessed the performance of SDePER and six existing methods using simulations and four real datasets. All results showed that SDePER achieved significantly more accurate and robust results than the existing methods, suggesting the importance of considering platform effects, sparsity and spatial correlation in cell type deconvolution.
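The linear relationship mentioned in the abstract can be illustrated with a small sketch. This is not SDePER's implementation: it omits platform-effect removal, the sparsity penalty and the spatial correlation term, and shows only a plain non-negative least-squares fit of one spot's expression to reference cell type profiles. The function name `deconvolve_spot`, the toy `profiles` matrix and the `spot` vector are hypothetical, and the solver is a simple projected gradient descent chosen so that the example needs no external libraries.

```python
def deconvolve_spot(profiles, spot, n_iter=5000):
    """Estimate cell type proportions for one capture spot.

    profiles: list of gene rows; each row holds the mean expression of that
              gene in every reference cell type (genes x cell types).
    spot:     observed expression of the same genes in one spot.
    Minimizes 0.5 * ||profiles @ theta - spot||^2 subject to theta >= 0.
    """
    n_genes = len(profiles)
    n_types = len(profiles[0])
    # The squared Frobenius norm bounds the largest eigenvalue of B^T B,
    # so 1 / that value is a safe gradient step size.
    step = 1.0 / sum(v * v for row in profiles for v in row)
    theta = [1.0 / n_types] * n_types
    for _ in range(n_iter):
        # Residual of the current linear fit, one entry per gene.
        resid = [
            sum(row[k] * theta[k] for k in range(n_types)) - y
            for row, y in zip(profiles, spot)
        ]
        # Gradient B^T (B theta - y), one entry per cell type.
        grad = [
            sum(profiles[g][k] * resid[g] for g in range(n_genes))
            for k in range(n_types)
        ]
        # Gradient step followed by projection onto the non-negative orthant.
        theta = [max(0.0, t - step * d) for t, d in zip(theta, grad)]
    total = sum(theta)
    # Rescale so the estimates read as proportions.
    return [t / total for t in theta] if total > 0 else theta


# Toy reference: 4 genes x 3 cell types (made-up values).
profiles = [
    [10.0, 1.0, 0.0],
    [0.0, 8.0, 1.0],
    [1.0, 0.0, 9.0],
    [2.0, 2.0, 2.0],
]
# Spot simulated as a 70% / 30% / 0% mixture of the three cell types.
true_props = [0.7, 0.3, 0.0]
spot = [sum(b * p for b, p in zip(row, true_props)) for row in profiles]

estimate = deconvolve_spot(profiles, spot)
print([round(p, 3) for p in estimate])
```

In this noise-free toy case the fit recovers the simulated mixture. With real data, the systematic difference between ST and scRNA-seq measurements would break the linear assumption, which is the motivation the abstract gives for removing platform effects first.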
Project description:Recent advances in spatial transcriptomics (ST) techniques provide valuable insights into the organization and interactions of cells within the tumor microenvironment (TME). While various analytical tools have been developed for tasks such as spatial clustering, spatially variable gene identification, and cell type deconvolution, most of them are general methods that do not consider histological features in spatial data analysis. This limitation reduces the performance and interpretability of their results when studying the TME. Here, we present a computational framework named Morphology-Enhanced Spatial Transcriptome Analysis Integrator (METI) to address this gap. METI is an end-to-end framework capable of spatial mapping of both cancer cells and various TME cell components, robust stratification of cell type and transcriptional states, and cell co-localization analysis. By integrating spatial transcriptomics, cell morphology and curated gene signatures, METI enhances our understanding of the molecular landscape and cellular interactions within the tissue, facilitating detailed investigations of the TME and its functional implications. The performance of METI has been evaluated on ST data generated from various tumor tissues, including gastric, lung, and bladder cancers, as well as premalignant tissues. Across all these tissues and conditions, METI has demonstrated robust and consistent performance.
Project description:Spatial transcriptomics workflows using barcoded capture arrays are commonly used for resolving gene expression in tissues. However, existing techniques are either limited by capture array density or are cost prohibitive for large scale atlasing. We present Nova-ST, a dense nano-patterned spatial transcriptomics technique derived from randomly barcoded Illumina sequencing flow cells. Nova-ST enables customized, low cost, flexible, and high-resolution spatial profiling of large tissue sections. Benchmarking on mouse brain sections demonstrates significantly higher sensitivity compared to existing methods, at reduced cost.
Project description:The cellular composition of heterogeneous samples can be predicted from reference gene expression profiles that represent the homogeneous, constituent populations of the heterogeneous samples. However, existing methods fail when the reference profiles are not representative of the constituent populations. We developed PERT, a new probabilistic expression deconvolution method, to address this limitation. PERT was used to deconvolve cellular composition of variably sourced and treated heterogeneous human blood samples. Our results indicate that even after correcting batch effects, cells presenting the same cell surface antigens display different transcriptional programs when they are uncultured versus culture-derived. Given gene expression profiles of culture-derived heterogeneous samples and profiles of uncultured reference populations, PERT was able to accurately recover proportions of pure populations composing the heterogeneous samples. We anticipate that PERT will be widely applicable to expression deconvolution problems using profiles from reference populations that vary from the corresponding constituent populations in cellular state but not cellular identity. Human umbilical cord blood-derived lineage-negative cells and mononucleated cells. Cellular compositions of mononucleated cell and lineage-negative cell compartments were deconvolved based on the gene expression profiles.
Project description:Deconvolution methods infer quantitative cell type estimates from bulk measurement of mixed samples including blood and tissue. DNA methylation sequencing measures multiple CpGs per read, but few existing deconvolution methods leverage this within-read information. We develop CelFiE-ISH, which extends an existing method (CelFiE) to use within-read haplotype information. CelFiE-ISH outperforms CelFiE and other existing methods, achieving 30% better accuracy and more sensitive detection of rare cell types. We also demonstrate the importance of marker selection and tailoring markers for haplotype-aware methods. While here we use gold-standard short-read sequencing data, haplotype-aware methods will be well-suited for long-read sequencing.
Project description:Tumor heterogeneity is a major challenge for oncology drug discovery and development. Understanding the spatial tumor landscape is key to identifying new targets and impactful model systems. Here, we test the utility of spatial transcriptomics (ST) for oncology discovery by profiling 40 tissue sections and 80,024 capture spots across a diverse set of tissue types, sample formats, and RNA capture chemistries. We verify the accuracy and fidelity of ST by leveraging matched pathology analysis that provides a ground truth for tissue section composition. We then use spatial data to demonstrate the capture of key tumor depth features, identifying hypoxia, necrosis, vasculature, and extracellular matrix variation. We also leverage spatial context to identify relative cell type locations, showing the anti-correlation of tumor and immune cells in syngeneic cancer models. Lastly, we demonstrate target identification approaches in clinical pancreatic adenocarcinoma samples, highlighting tumor-intrinsic biomarkers and paracrine signaling.
Project description:The coordinated differentiation of progenitor cells into specialized cell types and their spatial organization into distinct domains is central to embryogenesis. Here, we applied a new unbiased spatially resolved single-cell transcriptomics method to identify the genetic programs underlying the emergence of specialized cell types during limb development and their spatial integration. We identify multiple transcription factors whose expression patterns are predominantly associated with cell type specification or spatial position, suggesting two parallel yet highly interconnected regulatory systems. We demonstrate that the embryonic limb undergoes a complex multi-scale re-organization upon perturbation of one of its spatial organizing centers, including the loss of specific cell populations, specific alterations of pre-existing cell states’ molecular identities and changes in their relative spatial distribution. Our study shows how multi-dimensional single-cell, spatially resolved molecular atlases can allow the deconvolution of spatial identity and cell fate and reveal the interconnected genetic networks that regulate organogenesis and its reorganization upon genetic alterations.
Project description:Immunological memory is key to productive adaptive immunity. Unbiased, high-throughput gene expression profiling of tissue-resident memory T cells residing in various anatomical locations within the lung is fundamental to understanding lung immunity but is still lacking. In this study, using a well-established Klebsiella pneumoniae model, we performed an integrative analysis of the spatial transcriptome with single-cell RNA-seq and single-cell ATAC-seq on lung cells from mice after immunization, using the 10x Genomics Chromium and Visium platforms. We employed several deconvolution algorithms and established an optimized deconvolution pipeline to accurately decipher specific cell-type composition by location. We identified and located 12 major cell types by scRNA-seq and spatial transcriptomic analysis. Integrating scATAC-seq data from the same cells processed in parallel with scRNA-seq, we found that epigenomic profiles provide more robust cell type identification, especially for lineage-specific T helper cells. When combining all three data modalities, we observed a dynamic change in the location of T helper cells as well as their corresponding chemokines for chemotaxis. Furthermore, cell-cell communication analysis of the spatial transcriptome provided evidence of lineage-specific T helper cells receiving designated cytokine signaling. In summary, our first-in-class study demonstrated the power of multi-omics analysis to uncover intrinsic spatial- and cell-type-dependent molecular mechanisms of lung immunity. Our data provide a rich resource of single-cell multi-omics data as a reference for understanding the spatial dynamics of lung immunization.
Project description:During host-pathogen encounters, the complex interactions between different immune cell types can determine the outcome of infection. Advances in single-cell RNA-seq (scRNA-seq) make it possible to probe this complexity of immunity and have provided the basis for deconvolution algorithms that infer cell-type compositions from bulk RNA-seq measurements. However, immune activation, an important aspect of immune surveillance, is not represented in current algorithms. Here, using scRNA-seq of human peripheral blood cells infected with Salmonella, we developed a novel deconvolution algorithm to infer dynamic immune states from bulk measurements. We applied our dynamic deconvolution algorithm both to cohorts of healthy individuals challenged ex vivo with Salmonella and to cohorts of tuberculosis patients during different stages of disease. We revealed cell-type-specific immune responses associated not only with ex vivo infection phenotype but also with clinical disease stage. We propose that our approach provides predictive power to identify risk for disease and can be applied to comprehensively study human infection outcome.