Ensemble learning for classifying single-cell data and projection across reference atlases
ABSTRACT: Single-cell data are being generated at an accelerating pace. How best to project data across single-cell atlases is an open problem. We developed a boosted learner that overcomes the greatest challenge with status quo classifiers: low sensitivity, especially when dealing with rare cell types. By comparing novel and published data from distinct scRNA-seq modalities that were acquired from the same tissues, we show that this approach preserves cell-type labels when mapping across diverse platforms.
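To make the sensitivity problem concrete, here is a minimal sketch of generic boosting (AdaBoost with decision stumps) on a toy one-dimensional dataset. It is not the authors' implementation: the data, the function names, and the single "marker score" feature are all hypothetical and chosen only for illustration. A single weak learner labels every cell as the common type and so has zero sensitivity for the rare type. A few boosting rounds, each re-weighting the cells that were misclassified, recover it.

```python
import math

def stump_predict(x, threshold, left_label):
    """Decision stump: left_label below the threshold, the opposite label otherwise."""
    return left_label if x < threshold else -left_label

def fit_stump(xs, ys, weights):
    """Return (weighted_error, threshold, left_label) of the best stump."""
    thresholds = [min(xs) - 0.5] + [x + 0.5 for x in sorted(set(xs))]
    best = None
    for t in thresholds:
        for left in (-1, 1):
            err = sum(w for x, y, w in zip(xs, ys, weights)
                      if stump_predict(x, t, left) != y)
            if best is None or err < best[0]:
                best = (err, t, left)
    return best

def adaboost(xs, ys, rounds):
    """Train an AdaBoost ensemble of stumps; labels must be +1 or -1."""
    n = len(xs)
    weights = [1.0 / n] * n
    ensemble = []
    for _ in range(rounds):
        err, t, left = fit_stump(xs, ys, weights)
        err = min(max(err, 1e-12), 1 - 1e-12)  # guard against log(0)
        alpha = 0.5 * math.log((1 - err) / err)
        ensemble.append((alpha, t, left))
        # Up-weight misclassified cells, down-weight correctly classified ones.
        weights = [w * math.exp(-alpha * y * stump_predict(x, t, left))
                   for x, y, w in zip(xs, ys, weights)]
        total = sum(weights)
        weights = [w / total for w in weights]
    return ensemble

def predict(ensemble, x):
    """Sign of the alpha-weighted vote of all stumps."""
    score = sum(alpha * stump_predict(x, t, left) for alpha, t, left in ensemble)
    return 1 if score > 0 else -1

def sensitivity(ensemble, xs, ys, label=1):
    """Fraction of cells with the given true label that are predicted as that label."""
    hits = sum(1 for x, y in zip(xs, ys)
               if y == label and predict(ensemble, x) == label)
    return hits / sum(1 for y in ys if y == label)

# Hypothetical toy data: 10 cells, marker score 0..9; the rare type (+1) is cells 4 and 5.
xs = list(range(10))
ys = [1 if x in (4, 5) else -1 for x in xs]

single = adaboost(xs, ys, rounds=1)   # one weak learner
boosted = adaboost(xs, ys, rounds=3)  # three boosting rounds

print("rare-type sensitivity, 1 round :", sensitivity(single, xs, ys))
print("rare-type sensitivity, 3 rounds:", sensitivity(boosted, xs, ys))
```

On this toy data the single stump is 80% accurate overall yet misses both rare cells, which is why per-class sensitivity, not overall accuracy, is the metric to watch for rare cell types.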
Project description:Summary: Single-cell data are being generated at an accelerating pace. How best to project data across single-cell atlases is an open problem. We developed a boosted learner that overcomes the greatest challenge with status quo classifiers: low sensitivity, especially when dealing with rare cell types. By comparing novel and published data from distinct scRNA-seq modalities that were acquired from the same tissues, we show that this approach preserves cell-type labels when mapping across diverse platforms. Availability and implementation: https://github.com/diazlab/ELSA. Contact: aaron.diaz@ucsf.edu. Supplementary information: Supplementary data are available at Bioinformatics online.
Project description:Single cell atlases of platyhelminth Müller’s larva and mollusc trochophore larva reveal homologous cell types and phylum specific novelties
Project description:COVID-19, caused by SARS-CoV-2, can result in acute respiratory distress syndrome and multiple-organ failure, but little is known about its pathophysiology. Here, we generated single-cell atlases of 23 lung, 16 kidney, 15 liver and 18 heart COVID-19 autopsy donor tissue samples, and spatial atlases of 14 lung donors. Integrated computational analysis uncovered substantial remodeling in the lung epithelial, immune and stromal compartments, with evidence of multiple paths of failed tissue regeneration, including defective alveolar type 2 differentiation and expansion of myofibroblasts and putative TP63+ intrapulmonary basal-like progenitor cells. Viral RNAs were enriched in mononuclear phagocytic and endothelial lung cells, which induced specific host programs. Spatial analysis in lung distinguished inflammatory host responses in lung regions with and without viral RNA. Analysis of the other tissue atlases showed transcriptional alterations in multiple cell types in COVID-19 donor heart tissue, and mapped cell types and genes implicated in disease severity based on COVID-19 GWAS. Our foundational dataset elucidates the biological impact of severe SARS-CoV-2 infection across the body, a key step towards new treatments.
Project description:We describe MCProj, an algorithm for analyzing query scRNA-seq data by projections over reference single-cell atlases. We represent the reference as a manifold of annotated metacell gene expression distributions. We then interpret query metacells as mixtures of atlas distributions while correcting for technology-specific gene biases. This approach distinguishes and tags query cells that are consistent with atlas states from unobserved (novel or artifactual) behaviors. It also identifies expression differences observed in successfully mapped query states. We showcase MCProj functionality by projecting scRNA-seq data on a blood cell atlas, deriving precise, quantitative, and interpretable results across technologies and datasets.
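The idea of reading a query profile as a mixture of atlas states can be sketched in a few lines. This is a simplified illustration under stated assumptions, not MCProj's algorithm or API: it uses only two atlas states, a plain least-squares fit, and no correction for technology-specific gene biases. The function name, the expression vectors, and the use of the residual as a novelty signal are all hypothetical.

```python
def mix_two_states(query, state_a, state_b):
    """Least-squares weight w in [0, 1] such that query ~ w*state_a + (1-w)*state_b.

    Returns (w, residual), where residual is the Euclidean distance between
    the query and its best mixture. A large residual suggests the query is
    not explained by the two atlas states.
    """
    diff = [a - b for a, b in zip(state_a, state_b)]
    denom = sum(d * d for d in diff)
    if denom == 0:
        raise ValueError("atlas states are identical")
    # Unconstrained optimum, then clip to a valid mixture weight.
    w = sum((q - b) * d for q, b, d in zip(query, state_b, diff)) / denom
    w = min(1.0, max(0.0, w))
    fitted = [w * a + (1 - w) * b for a, b in zip(state_a, state_b)]
    residual = sum((q - f) ** 2 for q, f in zip(query, fitted)) ** 0.5
    return w, residual

# Hypothetical expression fractions over four genes.
atlas_a = [0.6, 0.3, 0.1, 0.0]
atlas_b = [0.1, 0.2, 0.3, 0.4]
consistent_query = [0.35, 0.25, 0.2, 0.2]  # halfway between the two states
novel_query = [0.0, 0.0, 0.0, 1.0]         # not a mixture of the atlas states

w_ok, res_ok = mix_two_states(consistent_query, atlas_a, atlas_b)
w_new, res_new = mix_two_states(novel_query, atlas_a, atlas_b)
print(f"consistent query: w={w_ok:.2f}, residual={res_ok:.3f}")
print(f"novel query:      w={w_new:.2f}, residual={res_new:.3f}")
```

A query that lies between the two states is fitted with a near-zero residual, while a profile the atlas cannot explain leaves a large one. A real projection would fit many atlas states at once and would need a principled threshold on the residual.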
Project description:How a neuronal cell type is defined and how this relates to its transcriptome are still open questions. The Drosophila olfactory projection neurons (PNs) are among the best-characterized neuronal types: different PN classes target dendrites to distinct olfactory glomeruli, and PNs of the same class exhibit indistinguishable anatomical and physiological properties. Using single-cell RNA-sequencing, we comprehensively characterized the transcriptomes of 40 PN classes and unequivocally identified transcriptomes for 6 classes. We found a new lineage-specific transcription factor that instructs PN dendrite targeting. Transcriptomes of closely related PN classes exhibit the largest differences during circuit assembly, but become indistinguishable in adults, suggesting that neuronal subtype diversity peaks during development. Genes encoding transcription factors and cell-surface molecules are the most differentially expressed, indicating their central roles in specifying neuronal identity. Finally, we show that PNs use highly redundant combinatorial molecular codes to distinguish subtypes, enabling robust specification of cell identity and circuit assembly.
Project description:The definition of neuronal type and how it relates to the transcriptome are open questions. Drosophila olfactory projection neurons (PNs) are among the best-characterized neuronal types: different PN classes target dendrites to distinct olfactory glomeruli, while PNs of the same class exhibit indistinguishable anatomical and physiological properties. Using single-cell RNA sequencing, we comprehensively characterized the transcriptomes of most PN classes and unequivocally mapped transcriptomes to specific olfactory function for six classes. Transcriptomes of closely related PN classes exhibit the largest differences during circuit assembly but become indistinguishable in adults, suggesting that neuronal subtype diversity peaks during development. Transcription factors and cell-surface molecules are the most differentially expressed genes between classes and are highly informative in encoding cell identity, enabling us to identify a new lineage-specific transcription factor that instructs PN dendrite targeting. These findings establish that neuronal transcriptomic identity corresponds with anatomical and physiological identity defined by connectivity and function.