Dataset Information

Evaluation of some aspects in supervised cell type identification for single-cell RNA-seq: classifier, feature selection, and reference construction.


ABSTRACT:

Background

Cell type identification is one of the most important questions in single-cell RNA sequencing (scRNA-seq) data analysis. With the accumulation of public scRNA-seq data, supervised cell type identification methods have gained popularity because of their better accuracy, robustness, and computational performance. Despite these advantages, the performance of supervised methods relies heavily on several key factors: feature selection, prediction method, and, most importantly, the choice of reference dataset.

Results

In this work, we perform extensive real data analyses to systematically evaluate these factors in supervised cell type identification. We first benchmark nine classifiers along with six feature selection strategies and investigate the impact of reference data size and number of cell types on cell type prediction. Next, we examine how discrepancies between reference and target datasets, and data preprocessing steps such as imputation and batch effect correction, affect prediction performance. We also investigate strategies for pooling and purifying reference data.

Conclusions

Based on our analysis results, we provide guidelines for using supervised cell typing methods. We suggest combining all individuals from available datasets to construct the reference dataset and using a multi-layer perceptron (MLP) as the classifier, along with the F-test as the feature selection method. All the code used for our analysis is available on GitHub ( https://github.com/marvinquiet/RefConstruction_supervisedCelltyping ).
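
To make the feature selection recommendation concrete, the following is a minimal sketch of F-test-based gene ranking. It assumes that the F-test here means a one-way ANOVA F statistic computed per gene across the reference cell types. The function names (`f_statistic`, `select_top_genes`), the gene names, and the toy expression values are hypothetical and are not taken from the authors' repository.

```python
from collections import defaultdict


def f_statistic(values, labels):
    """One-way ANOVA F statistic for a single gene across cell type groups."""
    groups = defaultdict(list)
    for value, label in zip(values, labels):
        groups[label].append(value)

    n, k = len(values), len(groups)
    if k < 2 or n <= k:
        raise ValueError("need at least two groups and more cells than groups")

    grand_mean = sum(values) / n
    means = {label: sum(g) / len(g) for label, g in groups.items()}

    # Between-group and within-group sums of squares.
    between = sum(len(g) * (means[label] - grand_mean) ** 2
                  for label, g in groups.items())
    within = sum((v - means[label]) ** 2
                 for label, g in groups.items() for v in g)

    if within == 0:
        return float("inf") if between > 0 else 0.0
    return (between / (k - 1)) / (within / (n - k))


def select_top_genes(expr, labels, gene_names, top_k):
    """Rank genes by F statistic (highest first) and keep the top_k names."""
    scores = {
        gene: f_statistic([row[j] for row in expr], labels)
        for j, gene in enumerate(gene_names)
    }
    return sorted(gene_names, key=lambda g: scores[g], reverse=True)[:top_k]


# Toy reference data (hypothetical): rows are cells, columns are genes.
expr = [
    [5.0, 1.0, 2.0],
    [6.0, 1.2, 2.1],
    [0.5, 1.1, 2.0],
    [0.4, 0.9, 2.2],
]
labels = ["T cell", "T cell", "B cell", "B cell"]
gene_names = ["geneA", "geneB", "geneC"]

selected = select_top_genes(expr, labels, gene_names, top_k=1)
print(selected)
```

In this sketch, only the selected genes would be passed on to a classifier such as an MLP. The classifier itself is omitted here, and the number of genes to keep is left as a free parameter.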

SUBMITTER: Ma W 

PROVIDER: S-EPMC8427961 | biostudies-literature |

REPOSITORIES: biostudies-literature

Similar Datasets

S-EPMC8406783 | biostudies-literature
S-EPMC8644062 | biostudies-literature
S-EPMC10547911 | biostudies-literature
S-EPMC10048047 | biostudies-literature
S-EPMC7335186 | biostudies-literature
S-EPMC6927135 | biostudies-literature
S-EPMC8418522 | biostudies-literature
S-EPMC6085558 | biostudies-literature
S-EPMC4248652 | biostudies-literature
S-EPMC10470905 | biostudies-literature