
Dataset Information

CAKE: a flexible self-supervised framework for enhancing cell visualization, clustering and rare cell identification.


ABSTRACT: Single-cell sequencing technology has provided unprecedented opportunities for comprehensively deciphering cell heterogeneity. Nevertheless, the high dimensionality and intricate nature of cell heterogeneity present substantial challenges to computational methods. Numerous clustering methods have been proposed to address this issue, but none of them achieves consistently better performance across different biological scenarios. In this study, we developed CAKE, a novel and scalable self-supervised clustering method, which consists of a contrastive learning model with a mixture neighborhood augmentation for cell representation learning, and a self-Knowledge Distiller model for the refinement of clustering results. These designs provide more condensed and cluster-friendly cell representations and improve clustering performance in terms of accuracy and robustness. Furthermore, in addition to accurately identifying the major cell types, CAKE can also find more biologically meaningful cell subgroups and rare cell types. Comprehensive experiments on real single-cell RNA sequencing datasets demonstrated the superiority of CAKE in visualization and clustering over the compared methods, and indicated its broad applicability to cell heterogeneity analysis. Contact: Ruiqing Zheng (rqzheng@csu.edu.cn).

SUBMITTER: Liu J 

PROVIDER: S-EPMC10749894 | biostudies-literature | 2023 Nov

REPOSITORIES: biostudies-literature

Publications

CAKE: a flexible self-supervised framework for enhancing cell visualization, clustering and rare cell identification.

Jin Liu, Weixing Zeng, Shichao Kan, Min Li, Ruiqing Zheng

Briefings in Bioinformatics, 2023-11-01, issue 1



Similar Datasets

| S-EPMC10539043 | biostudies-literature
| S-EPMC7397036 | biostudies-literature
| S-EPMC8157426 | biostudies-literature
| S-EPMC9048682 | biostudies-literature
| S-EPMC8694357 | biostudies-literature
| S-EPMC11562840 | biostudies-literature
2008-08-30 | GSE12627 | GEO
| S-EPMC6221474 | biostudies-literature
| S-EPMC10041520 | biostudies-literature
| S-EPMC10958986 | biostudies-literature