
Dataset Information


Corruption of the Pearson correlation coefficient by measurement error and its estimation, bias, and correction under different error models.


ABSTRACT: Correlation coefficients are widely used in the life sciences. They may serve only for simple exploratory analysis or for constructing association networks for visualization, but they are also basic ingredients of sophisticated multivariate data analysis methods. It is therefore important to have reliable estimates of correlation coefficients. In the modern life sciences, comprehensive measurement techniques are used to measure metabolites, proteins, gene expression and other types of data. All these measurement techniques have errors. When measurements were simple, the errors were simple too; that is no longer the case. Errors are heterogeneous, non-constant and not independent, which seriously degrades the quality of the estimated correlation coefficients. We discuss the different types of errors present in modern comprehensive life science data and show with theory, simulations and real-life data how these affect the correlation coefficients. We briefly discuss ways to improve the estimation of such coefficients.
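
As a rough illustration of the effect the abstract describes, the sketch below simulates only the simplest textbook case: additive, independent, constant-variance error on two unit-variance variables. This is an assumption made for illustration and is not one of the error models or procedures from the publication; the names `pearson` and `simulate_attenuation` and all parameter values are hypothetical. Under these assumptions the expected observed correlation is rho / (1 + sd_error^2), so the estimate shrinks toward zero.

```python
import math
import random


def pearson(x, y):
    """Sample Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def simulate_attenuation(rho=0.8, sd_error=1.0, n=50_000, seed=0):
    """Compare correlation before and after adding measurement error.

    Assumption (illustrative only): both true signals have unit variance and
    each receives independent Gaussian error with standard deviation sd_error.
    """
    rng = random.Random(seed)
    x_true, y_true, x_obs, y_obs = [], [], [], []
    for _ in range(n):
        # Build a pair of standard normal variables with correlation rho.
        z1 = rng.gauss(0.0, 1.0)
        z2 = rng.gauss(0.0, 1.0)
        x = z1
        y = rho * z1 + math.sqrt(1.0 - rho ** 2) * z2
        x_true.append(x)
        y_true.append(y)
        # Additive, uncorrelated, constant-variance measurement error.
        x_obs.append(x + rng.gauss(0.0, sd_error))
        y_obs.append(y + rng.gauss(0.0, sd_error))
    r_true = pearson(x_true, y_true)
    r_obs = pearson(x_obs, y_obs)
    # Theoretical value under the assumptions stated above.
    expected = rho / (1.0 + sd_error ** 2)
    return r_true, r_obs, expected


if __name__ == "__main__":
    r_true, r_obs, expected = simulate_attenuation()
    print(f"error-free r = {r_true:.3f}")
    print(f"observed r   = {r_obs:.3f}")
    print(f"expected r   = {expected:.3f}")
```

The heterogeneous, non-constant and correlated errors that the abstract highlights do not follow this simple formula, which is why this toy case should be read only as a baseline.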

SUBMITTER: Saccenti E 

PROVIDER: S-EPMC6965177 | biostudies-literature | 2020 Jan

REPOSITORIES: biostudies-literature


Publications

Corruption of the Pearson correlation coefficient by measurement error and its estimation, bias, and correction under different error models.

Edoardo Saccenti, Margriet H. W. B. Hendriks, Age K. Smilde

Scientific Reports, 2020-01-16, issue 1



Similar Datasets

| S-EPMC6240329 | biostudies-literature
| S-EPMC7248308 | biostudies-literature
| S-EPMC6781837 | biostudies-literature
| S-EPMC4183069 | biostudies-literature
| S-EPMC3169665 | biostudies-literature
| S-EPMC5378630 | biostudies-literature
| S-EPMC5798808 | biostudies-literature
| S-EPMC8408353 | biostudies-literature
| S-EPMC7221498 | biostudies-literature
| S-EPMC3100171 | biostudies-literature