Project description:Fast, robust and technology-independent computational methods are needed for supervised cell type annotation of single-cell RNA sequencing data. We present SciBet, a supervised cell type identifier that accurately predicts cell identity for newly sequenced cells with an order-of-magnitude speed advantage. We enable web client deployment of SciBet for rapid local computation without uploading local data to the server. Given the exponential growth in the size of single-cell RNA sequencing datasets, this user-friendly and cross-platform tool can be widely useful for cell type identification.
Project description:Cell type identification is a key step toward downstream analysis of single-cell RNA-seq experiments. Although the primary objective is to identify known cell populations, good identifiers should also recognize unknown clusters, which may represent a previously unidentified subpopulation of a known cell type or tumor cells of an unknown phenotype. Herein, we present MarkerCount, which utilizes the number of expressed markers, regardless of their expression level. MarkerCount works in both reference- and marker-based modes: the latter utilizes existing lists of markers, while the former uses a pre-annotated dataset to find the markers to be used for cell type identification. In both modes, MarkerCount first utilizes the "marker count" to identify cell populations and, after rejecting uncertain cells, reassigns cell types and/or makes corrections on a cluster basis. The performance of MarkerCount was evaluated and compared with existing marker- and reference-based identifiers that can be customized using publicly available datasets and marker databases. The results show that, for most of the datasets analyzed, MarkerCount performs better than the other reference- and marker-based cell type identifiers in identifying both known and unknown populations.
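The marker-based mode described above (count expressed markers regardless of level, reject uncertain cells, then correct on a cluster basis) can be illustrated with a minimal sketch. This is not the MarkerCount implementation. The function names, the normalization of the count to a fraction of each marker list, the `min_score` threshold, and the majority vote used for the cluster-level correction are all assumptions made for illustration. The gene lists and expression values are toy data.

```python
from collections import Counter

import numpy as np


def marker_count_scores(expr, gene_names, markers):
    """Score each cell by the fraction of each type's markers it expresses.

    Hypothetical helper: expression is binarized (value > 0), so the
    expression level itself is ignored and only presence is counted.
    """
    gene_index = {g: i for i, g in enumerate(gene_names)}
    expressed = np.asarray(expr) > 0
    cell_types = sorted(markers)
    scores = np.zeros((expressed.shape[0], len(cell_types)))
    for j, cell_type in enumerate(cell_types):
        cols = [gene_index[g] for g in markers[cell_type] if g in gene_index]
        if cols:
            # Assumption: normalize the count so marker lists of
            # different lengths are comparable.
            scores[:, j] = expressed[:, cols].sum(axis=1) / len(cols)
    return cell_types, scores


def annotate(expr, gene_names, markers, clusters, min_score=0.5):
    """Assign per-cell labels, reject uncertain cells, then correct by cluster."""
    cell_types, scores = marker_count_scores(expr, gene_names, markers)
    labels = [cell_types[k] for k in scores.argmax(axis=1)]
    # Rejection step (assumed rule): a low best score means "unknown".
    labels = [
        label if best >= min_score else "unknown"
        for label, best in zip(labels, scores.max(axis=1))
    ]
    # Cluster-basis correction (assumed rule): majority vote per cluster.
    corrected = list(labels)
    for cluster in set(clusters):
        members = [i for i, c in enumerate(clusters) if c == cluster]
        majority = Counter(labels[i] for i in members).most_common(1)[0][0]
        for i in members:
            corrected[i] = majority
    return labels, corrected


# Toy example: rows are cells, columns are genes.
genes = ["CD3E", "CD19", "MS4A1", "CD2"]
marker_lists = {"B cell": ["CD19", "MS4A1"], "T cell": ["CD3E", "CD2"]}
counts = [
    [5, 0, 0, 1],
    [2, 0, 0, 0],
    [0, 3, 9, 0],
    [0, 0, 0, 0],
    [0, 1, 0, 0],
]
cluster_ids = [0, 0, 1, 1, 1]

per_cell, per_cluster = annotate(counts, genes, marker_lists, cluster_ids)
print(per_cell)
print(per_cluster)
```

In this toy run, the fourth cell expresses no markers and is rejected at the per-cell step. The majority vote then reassigns it to the label that dominates its cluster. A real identifier would need a more careful rejection rule so that genuinely unknown clusters are not absorbed this way.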
Project description:In the original version of this Article the dataset identifier in the Data Availability statement was incorrect. The correct dataset identifier is PXD009500. This has been corrected in the HTML and PDF versions of this Article.
Project description:In the originally published HTML and PDF versions of this Article, gel images in Figures 7c and 8c were not prepared as per the Nature journal policy. These figure panels have now been corrected in both the PDF and HTML versions of the Article. In Fig. 7c, the lane labelled 'Ha' was inappropriately duplicated to represent the lane labelled 'Ich13'. The corrected version of Fig. 7c includes PCR-RFLP on DNA from the Ichkeul 13 line, which had been run on a separate gel. The original unprocessed gel images are provided in Supplementary Figure 1 associated with this correction, with the relevant corresponding bands denoted. A repeat experiment of the PCR-RFLP test is also presented as Supplementary Figure 2. In Fig. 8c, the image was assembled from two separate gels without clear demarcation. The corrected Fig. 8c clearly separates lanes from the two gels, and the original unprocessed gel images are provided in the Supplementary Information associated with this correction. These corrections do not alter the original meaning of the experiments, their results, their interpretation, or the conclusions of the paper. We apologize for any confusion this may have caused to the readers of Nature Communications.