
Dataset Information


Bayesian correlation is a robust gene similarity measure for single-cell RNA-seq data.


ABSTRACT: Assessing similarity is highly important for bioinformatics algorithms that determine correlations between biological entities. A common problem is that similarity can appear by chance, particularly for lowly expressed entities. This is especially relevant in single-cell RNA-seq (scRNA-seq) data because read counts are much lower than in bulk RNA-seq. Recently, a Bayesian correlation scheme that assigns low similarity to genes with low-confidence expression estimates was proposed to assess similarity for bulk RNA-seq. Our goal is to extend the properties of the Bayesian correlation to scRNA-seq data by considering three ways to compute similarity. First, we compute the similarity of pairs of genes over all cells. Second, we identify specific cell populations and compute the correlation within those populations. Third, we compute the similarity of pairs of genes over all clusters, by considering the total mRNA expression. We demonstrate that Bayesian correlations are more reproducible than Pearson correlations. Compared to Pearson correlations, Bayesian correlations depend less on the number of input cells. We show that the Bayesian correlation algorithm assigns high similarity values to genes with biological relevance in a specific population. We conclude that Bayesian correlation is a robust similarity measure in scRNA-seq data.
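
The idea in the abstract, that genes with low-confidence expression estimates should receive low similarity, can be illustrated with a small sketch. This record does not give the paper's formulas, so the code below is an assumption-laden illustration rather than the authors' method: it assumes a uniform Beta(1, 1) prior on each gene's expression fraction per cell, and it inflates each gene's variance by the average posterior variance. The function names (pearson, beta_posterior, shrunk_correlation) and the toy counts are hypothetical.

```python
import math


def pearson(x, y):
    """Plain Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / n
    vx = sum((a - mx) ** 2 for a in x) / n
    vy = sum((b - my) ** 2 for b in y) / n
    if vx == 0 or vy == 0:
        return 0.0
    return cov / math.sqrt(vx * vy)


def beta_posterior(count, total, alpha0=1.0, beta0=1.0):
    """Posterior mean and variance of a gene's expression fraction in one cell.

    Assumption: Beta(alpha0, beta0) prior with a binomial likelihood.
    """
    a = alpha0 + count
    b = beta0 + total - count
    mean = a / (a + b)
    var = a * b / ((a + b) ** 2 * (a + b + 1))
    return mean, var


def shrunk_correlation(counts_x, counts_y, totals, alpha0=1.0, beta0=1.0):
    """Correlation of posterior means, damped by posterior uncertainty.

    Each gene's variance is the variance of its posterior means across cells
    plus its average posterior variance, so uncertain (low-count) genes get
    a larger denominator and therefore a smaller correlation.
    """
    post_x = [beta_posterior(c, t, alpha0, beta0) for c, t in zip(counts_x, totals)]
    post_y = [beta_posterior(c, t, alpha0, beta0) for c, t in zip(counts_y, totals)]
    n = len(totals)
    mx = sum(m for m, _ in post_x) / n
    my = sum(m for m, _ in post_y) / n
    cov = sum((px[0] - mx) * (py[0] - my) for px, py in zip(post_x, post_y)) / n
    vx = sum((m - mx) ** 2 for m, _ in post_x) / n + sum(v for _, v in post_x) / n
    vy = sum((m - my) ** 2 for m, _ in post_y) / n + sum(v for _, v in post_y) / n
    return cov / math.sqrt(vx * vy)


# Toy data: six cells with 1000 reads each (hypothetical numbers).
totals = [1000] * 6
high = [100, 200, 300, 100, 200, 300]  # well-covered gene
low = [0, 1, 2, 0, 1, 2]               # same pattern, but only a few reads

print(pearson(high, high), pearson(low, low))
print(shrunk_correlation(high, high, totals), shrunk_correlation(low, low, totals))
```

In this toy setup Pearson correlation cannot tell the two cases apart, while the uncertainty-aware version scores the low-count pattern well below the high-count one. The same function could be applied over all cells, within one cell population, or to per-cluster totals by changing which counts and totals are passed in.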

SUBMITTER: Sanchez-Taltavull D 

PROVIDER: S-EPMC7671344 | biostudies-literature | 2020 Mar

REPOSITORIES: biostudies-literature


Publications

Bayesian correlation is a robust gene similarity measure for single-cell RNA-seq data.

Daniel Sanchez-Taltavull, Theodore J. Perkins, Noelle Dommann, Nicolas Melin, Adrian Keogh, Daniel Candinas, Deborah Stroka, Guido Beldi

NAR Genomics and Bioinformatics, 2020-01-24, issue 1



Similar Datasets

2019-12-31 | GSE134134 | GEO
| PRJNA554015 | ENA
| S-EPMC5473255 | biostudies-literature
| S-EPMC4818202 | biostudies-other
| S-EPMC7374962 | biostudies-literature
| S-EPMC7571410 | biostudies-literature
| S-EPMC4739097 | biostudies-literature
| S-EPMC7961184 | biostudies-literature
| S-EPMC6909514 | biostudies-literature
| S-EPMC5843666 | biostudies-literature