Dataset Information

scKWARN: Kernel-weighted-average robust normalization for single-cell RNA-seq data.


ABSTRACT:

Motivation

Single-cell RNA-seq normalization is an essential step to correct unwanted biases caused by sequencing depth, capture efficiency, dropout, and other technical factors. Existing normalization methods primarily reduce biases arising from sequencing depth by modeling the count-depth relationship and/or assuming a specific distribution for read counts. However, these methods may over- or under-correct because technical biases extend beyond sequencing depth and because their assumptions about models and distributions are restrictive.

Results

We present scKWARN, a Kernel Weighted Average Robust Normalization designed to correct known or hidden technical confounders without assuming specific data distributions or count-depth relationships. scKWARN generates a pseudo expression profile for each cell by borrowing information from its fuzzy technical neighbors through a kernel smoother. It then compares this profile against the reference derived from cells with the same bimodality patterns to determine the normalization factor. As demonstrated in both simulated and real datasets, scKWARN outperforms existing methods in removing a variety of technical biases while preserving true biological heterogeneity.
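The kernel-smoothing idea described above can be illustrated with a small, simplified sketch. It is not the scKWARN implementation: the Gaussian kernel, the single per-cell technical covariate, the fixed bandwidth, the gene-wise mean reference, and the median-ratio factor are all assumptions chosen for illustration, and the function names are hypothetical. In particular, the abstract describes a reference built from cells with the same bimodality patterns, which this sketch does not attempt.

```python
import math
import statistics


def gaussian_kernel(u):
    """Unnormalized Gaussian kernel (an assumed choice of smoother)."""
    return math.exp(-0.5 * u * u)


def kernel_weighted_profiles(counts, covariate, bandwidth):
    """Build a pseudo expression profile for each cell.

    counts    : list of cells, each a list of per-gene counts
    covariate : one technical covariate per cell (e.g. log depth); assumed
    bandwidth : kernel bandwidth; small values borrow less from neighbors
    """
    n_cells = len(counts)
    n_genes = len(counts[0])
    profiles = []
    for i in range(n_cells):
        # Weight every cell by its technical closeness to cell i.
        weights = [
            gaussian_kernel((covariate[i] - covariate[j]) / bandwidth)
            for j in range(n_cells)
        ]
        total = sum(weights)
        # Kernel-weighted average of counts, gene by gene.
        profiles.append([
            sum(w * counts[j][g] for j, w in enumerate(weights)) / total
            for g in range(n_genes)
        ])
    return profiles


def size_factors(profiles, reference):
    """Median ratio of each pseudo profile to the reference profile."""
    factors = []
    for profile in profiles:
        ratios = [x / r for x, r in zip(profile, reference) if x > 0 and r > 0]
        # Fall back to 1.0 when no gene has usable counts.
        factors.append(statistics.median(ratios) if ratios else 1.0)
    return factors


if __name__ == "__main__":
    # Toy data: the second cell is the first cell sequenced twice as deeply.
    counts = [[10, 20, 30], [20, 40, 60], [5, 10, 15]]
    covariate = [math.log(sum(cell)) for cell in counts]
    profiles = kernel_weighted_profiles(counts, covariate, bandwidth=0.5)
    # Assumed reference: gene-wise mean over all cells.
    reference = [statistics.mean(col) for col in zip(*counts)]
    factors = size_factors(profiles, reference)
    normalized = [[x / f for x in cell] for cell, f in zip(counts, factors)]
    print(factors)
    print(normalized)
```

In this sketch, a larger bandwidth makes each pseudo profile borrow more from technically similar cells, which pulls the resulting factors toward one another, while a very small bandwidth reduces each profile to the cell's own counts.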

Availability and implementation

scKWARN is freely available at https://github.com/cyhsuTN/scKWARN.

SUBMITTER: Hsu CY 

PROVIDER: S-EPMC10868328 | biostudies-literature | 2024 Feb

REPOSITORIES: biostudies-literature

Publications

scKWARN: Kernel-weighted-average robust normalization for single-cell RNA-seq data.

Hsu Chih-Yuan, Chang Chia-Jung, Liu Qi, Shyr Yu

Bioinformatics (Oxford, England), 2024-02-01, issue 2


Similar Datasets

| S-EPMC5473255 | biostudies-literature
| S-EPMC9458465 | biostudies-literature
| S-EPMC8696108 | biostudies-literature
| S-EPMC7019105 | biostudies-literature
| S-EPMC8419999 | biostudies-literature
| S-EPMC11366752 | biostudies-literature
| S-EPMC5499114 | biostudies-literature
| S-EPMC7374962 | biostudies-literature
| S-EPMC7571410 | biostudies-literature
| S-EPMC8721966 | biostudies-literature