Project description: Fast, robust and technology-independent computational methods are needed for supervised cell type annotation of single-cell RNA sequencing data. We present SciBet, a supervised cell type identifier that accurately predicts cell identity for newly sequenced cells with an order-of-magnitude speed advantage. We enable web client deployment of SciBet for rapid local computation without uploading local data to the server. Given the exponential growth in the size of single-cell RNA sequencing datasets, this user-friendly and cross-platform tool can be widely useful for cell type identification.
Project description: Cell type identification is a key step toward downstream analysis of single-cell RNA-seq experiments. Although the primary objective is to identify known cell populations, good identifiers should also recognize unknown clusters, which may represent a previously unidentified subpopulation of a known cell type or tumor cells of an unknown phenotype. Herein, we present MarkerCount, which utilizes the number of expressed markers, regardless of their expression level. MarkerCount works in both reference- and marker-based modes: the latter utilizes existing lists of markers, while the former uses a pre-annotated dataset to find the markers to be used for cell type identification. In both modes, MarkerCount first utilizes the "marker count" to identify cell populations and, after rejecting uncertain cells, reassigns cell types and/or makes corrections on a cluster basis. The performance of MarkerCount was evaluated and compared with existing identifiers, both marker- and reference-based, that can be customized using publicly available datasets and marker databases. The results show that, for most of the datasets analyzed, MarkerCount performs better than other reference- and marker-based cell type identifiers in the identification of both known and unknown populations.
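The three steps named above (count expressed markers, reject uncertain cells, correct on a cluster basis) can be illustrated with a minimal Python sketch. This is not MarkerCount's actual implementation or API. The function names, the normalization by marker-list size, the `min_fraction` rejection threshold, and the majority-vote correction are all assumptions made for illustration, as are the toy marker lists.

```python
from collections import Counter

UNASSIGNED = "unassigned"  # hypothetical label for rejected cells


def assign_cell(expression, markers_by_type, min_fraction=0.5):
    """Label one cell by the fraction of each type's markers that are expressed.

    A marker counts as expressed whenever its value is > 0, so the
    expression level itself is ignored. `min_fraction` is an assumed
    rejection threshold, not a value taken from the paper.
    """
    expressed = {gene for gene, value in expression.items() if value > 0}
    fractions = {
        cell_type: len(expressed & markers) / len(markers)
        for cell_type, markers in markers_by_type.items()
        if markers
    }
    best = max(fractions, key=fractions.get)
    return best if fractions[best] >= min_fraction else UNASSIGNED


def correct_by_cluster(labels, clusters):
    """Give every cell the majority label of its cluster (assumed rule).

    Rejected cells do not vote; a cluster with no votes stays unassigned.
    """
    votes = {}
    for label, cluster in zip(labels, clusters):
        if label != UNASSIGNED:
            votes.setdefault(cluster, Counter())[label] += 1
    return [
        votes[c].most_common(1)[0][0] if c in votes else UNASSIGNED
        for c in clusters
    ]


# Toy marker lists and cells, for illustration only.
markers = {
    "T cell": {"CD3D", "CD3E", "CD2"},
    "B cell": {"CD79A", "MS4A1", "CD19"},
}
cells = [
    {"CD3D": 5.0, "CD3E": 0.1, "CD2": 1.0},   # all three T cell markers
    {"CD3D": 1.0, "CD79A": 1.0},              # ambiguous, will be rejected
    {"CD79A": 2.0, "MS4A1": 9.0},             # two of three B cell markers
    {"CD3D": 0.0, "CD3E": 3.0, "CD2": 1.0},   # two of three T cell markers
]
clusters = [0, 0, 1, 0]  # cluster ids from any upstream clustering step

labels = [assign_cell(cell, markers) for cell in cells]
corrected = correct_by_cluster(labels, clusters)
```

In this toy run the second cell is rejected at the counting step and then picks up the majority label of its cluster, which mirrors the reject-then-correct order described in the abstract.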
Project description: Single-cell RNA sequencing has a central role in immune profiling, identifying specific immune cells as disease markers and suggesting therapeutic target genes of immune cells. Immune cell-type annotation from single-cell transcriptomics is in high demand for dissecting complex immune signatures from multicellular blood and organ samples. However, accurate cell type assignment from single-cell RNA sequencing data alone is complicated by a high level of gene expression heterogeneity. Many computational methods have been developed to respond to this challenge, but the accuracy of immune cell annotation remains unsatisfactory. We present ImmunIC, a simple and robust tool for immune cell identification and classification that combines marker genes with a machine learning method. With over two million immune cells and half a million non-immune cells from 66 single-cell RNA sequencing studies, ImmunIC shows 98% accuracy in the identification of immune cells. ImmunIC outperforms existing immune cell classifiers, categorizing cells into ten immune cell types with 92% accuracy. We determine peripheral blood mononuclear cell compositions of severe COVID-19 cases and healthy controls using previously published single-cell transcriptomic data, permitting the identification of immune cell-type-specific differential pathways. Our publicly available tool can maximize the utility of single-cell RNA profiling by functioning as a stand-alone bioinformatic cell sorter, advancing cell-type-specific immune profiling for the discovery of disease-specific immune signatures and therapeutic targets.