Dataset Information

Inference with Transposable Data: Modeling the Effects of Row and Column Correlations.


ABSTRACT: We consider the problem of large-scale inference on the row or column variables of data in the form of a matrix. Many of these data matrices are transposable, meaning that neither the row variables nor the column variables can be considered independent instances. An example of this scenario is detecting significant genes in microarrays when the samples may be dependent due to latent variables or unknown batch effects. By modeling this matrix data using the matrix-variate normal distribution, we study and quantify the effects of row and column correlations on procedures for large-scale inference. We then propose a simple solution to the myriad of problems presented by unanticipated correlations: we simultaneously estimate row and column covariances and use these to sphere, or de-correlate, the noise in the underlying data before conducting inference. This procedure yields data with approximately independent rows and columns, so that test statistics more closely follow null distributions and multiple testing procedures correctly control the desired error rates. Results on simulated models and real microarray data demonstrate major advantages of this approach: (1) increased statistical power, (2) less bias in estimating the false discovery rate, and (3) reduced variance of the false discovery rate estimators.
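
The sphering step described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' procedure: it assumes the row and column covariances are already known, whereas the paper estimates them simultaneously from the data. The helper names (inv_sqrt, sphere, ar1_cov) and the AR(1) covariance structure are hypothetical choices made only for this illustration.

```python
import numpy as np

def inv_sqrt(cov):
    """Symmetric inverse square root of a positive-definite matrix."""
    vals, vecs = np.linalg.eigh(cov)
    # vecs @ diag(1 / sqrt(vals)) @ vecs.T
    return (vecs / np.sqrt(vals)) @ vecs.T

def sphere(X, row_cov, col_cov):
    """De-correlate rows and columns: row_cov^{-1/2} @ X @ col_cov^{-1/2}."""
    return inv_sqrt(row_cov) @ X @ inv_sqrt(col_cov)

def ar1_cov(n, rho):
    """Hypothetical AR(1) covariance used only to build an example."""
    idx = np.arange(n)
    return rho ** np.abs(idx[:, None] - idx[None, :])

rng = np.random.default_rng(0)
n_rows, n_cols = 50, 20
row_cov = ar1_cov(n_rows, 0.6)   # assumed known here, estimated in the paper
col_cov = ar1_cov(n_cols, 0.4)   # assumed known here, estimated in the paper

# Draw matrix-variate normal noise: X = A Z B^T, with A A^T = row_cov
# and B B^T = col_cov, where Z has independent standard normal entries.
A = np.linalg.cholesky(row_cov)
B = np.linalg.cholesky(col_cov)
Z = rng.standard_normal((n_rows, n_cols))
X = A @ Z @ B.T

# After sphering, the entries are again uncorrelated with unit variance.
X_sphered = sphere(X, row_cov, col_cov)
```

With the true covariances, the sphered matrix is an orthogonal transformation of Z on each side, so its entries are independent standard normals. With estimated covariances this holds only approximately, which matches the "approximately independent rows and columns" wording of the abstract.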

SUBMITTER: Allen GI 

PROVIDER: S-EPMC8649963 | biostudies-literature

REPOSITORIES: biostudies-literature

Similar Datasets

| S-EPMC4133154 | biostudies-literature
| S-EPMC6472738 | biostudies-literature
| S-EPMC9275730 | biostudies-literature
| S-EPMC2943396 | biostudies-literature
| S-EPMC3233205 | biostudies-literature
| S-EPMC7723344 | biostudies-literature
| S-EPMC7856889 | biostudies-literature
| S-EPMC3796028 | biostudies-literature
| S-EPMC3944972 | biostudies-literature
| S-EPMC2719771 | biostudies-other