Dataset Information

Highly efficient hypothesis testing methods for regression-type tests with correlated observations and heterogeneous variance structure.

ABSTRACT:

Background

In many practical hypothesis testing (H-T) applications, the data are correlated and/or have a heterogeneous variance structure. The regression t-test for weighted linear mixed-effects regression (LMER) is a legitimate choice because it accounts for complex covariance structure; however, high computational costs and occasional convergence issues make it impractical for analyzing high-throughput data. In this paper, we propose computationally efficient parametric and semiparametric tests based on a set of specialized matrix techniques dubbed the PB-transformation. The PB-transformation has two advantages: (1) the PB-transformed data have a scalar variance-covariance matrix, and (2) the original H-T problem is reduced to an equivalent one-sample H-T problem. The transformed problem can then be approached with either the one-sample Student's t-test or the Wilcoxon signed-rank test.
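The two properties described above can be illustrated with a minimal Python sketch. This is not the PBtest implementation: it assumes a known covariance matrix, a single covariate, and no intercept, and the function names (pb_style_transform, one_sample_t, gls_t) and the toy data are hypothetical. The first step whitens the data so that its covariance becomes scalar; the second applies an orthogonal reflection that turns the whitened covariate into a constant vector, so that testing the regression coefficient becomes a one-sample test on the transformed response.

```python
import numpy as np

def pb_style_transform(y, x, sigma):
    """Whiten (y, x) with sigma, then rotate so the covariate becomes constant."""
    n = len(y)
    # Whitening: sigma = L L^T, so L^{-1} y has identity (scalar) covariance.
    L = np.linalg.cholesky(sigma)
    y_w = np.linalg.solve(L, y)
    x_w = np.linalg.solve(L, x)
    # Rotation: Householder reflection sending x_w/||x_w|| to (1,...,1)/sqrt(n).
    u = x_w / np.linalg.norm(x_w)
    v = np.ones(n) / np.sqrt(n)
    w = u - v
    if np.linalg.norm(w) < 1e-12:
        return y_w  # covariate is already constant after whitening
    w /= np.linalg.norm(w)
    return y_w - 2.0 * w * (w @ y_w)

def one_sample_t(z):
    """One-sample Student's t-statistic for H0: mean(z) = 0."""
    return z.mean() * np.sqrt(len(z)) / z.std(ddof=1)

def gls_t(y, x, sigma):
    """Reference: generalized least squares t-statistic for the slope."""
    si = np.linalg.inv(sigma)
    xtx = x @ si @ x
    beta = (x @ si @ y) / xtx
    r = y - x * beta
    s2 = (r @ si @ r) / (len(y) - 1)
    return beta / np.sqrt(s2 / xtx)

# Toy data (assumed): 8 samples, 3 correlated pairs, unequal variances.
rng = np.random.default_rng(0)
n = 8
x = np.array([1, 1, 1, 1, -1, -1, -1, -1], dtype=float)
corr = np.eye(n)
for i, j in [(0, 4), (1, 5), (2, 6)]:
    corr[i, j] = corr[j, i] = 0.8
sd = np.array([1, 1, 1, 1, 2, 2, 2, 2], dtype=float)
sigma = np.outer(sd, sd) * corr
y = 0.5 * x + np.linalg.cholesky(sigma) @ rng.standard_normal(n)

z = pb_style_transform(y, x, sigma)
print("one-sample t on transformed data:", one_sample_t(z))
print("GLS regression t on original data:", gls_t(y, x, sigma))
```

Under these simplifying assumptions the two printed statistics agree, because an orthogonal transformation leaves the least-squares fit unchanged. In practice the covariance structure has to be estimated, which this sketch does not attempt.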

Results

In simulation studies, the proposed methods outperform commonly used alternative methods under both normal and double exponential distributions. In particular, the PB-transformed t-test produces notably better results than the weighted LMER test, especially in the high-correlation case, at only a small fraction of the computational cost (3 s versus 933 s). We apply these two methods to a set of RNA-seq gene expression data collected in a breast cancer study. Pathway analyses show that the PB-transformed t-test reveals more biologically relevant findings in relation to breast cancer than the weighted LMER test.

Conclusions

As fast and numerically stable replacements for the weighted LMER test, the PB-transformed tests are especially suitable for "messy" high-throughput data that include both independent and matched/repeated samples. With our method, practitioners no longer have to choose between using partial data (applying paired tests to only the matched samples) and ignoring the correlation in the data (applying two-sample tests to data with some correlated samples). Our method is implemented as an R package 'PBtest' and is available at https://github.com/yunzhang813/PBtest-R-Package .

SUBMITTER: Zhang Y

PROVIDER: S-EPMC6466736 | biostudies-literature | 2019 Apr

REPOSITORIES: biostudies-literature

Publications

Highly efficient hypothesis testing methods for regression-type tests with correlated observations and heterogeneous variance structure.

Zhang Y, Bandyopadhyay G, Topham DJ, Falsey AR, Qiu X

BMC Bioinformatics, 2019-04-15, issue 1

PMID: 30987598

Similar Datasets

Hypothesis testing for differentially correlated features.
| S-EPMC5031944 | biostudies-literature

Efficient alternatives for Bayesian hypothesis tests in psychology.
| S-EPMC9561355 | biostudies-literature

HYPOTHESIS TESTING FOR HIGH-DIMENSIONAL SPARSE BINARY REGRESSION.
| S-EPMC4522432 | biostudies-literature

MixTwice: large-scale hypothesis testing for peptide arrays by variance mixing.
| S-EPMC8428605 | biostudies-literature

An algorithm for testing the efficient market hypothesis.
| S-EPMC3812129 | biostudies-literature

Correlated STORM-homoFRET imaging reveals highly heterogeneous membrane receptor structures.
| S-EPMC9539790 | biostudies-literature

Global and Simultaneous Hypothesis Testing for High-Dimensional Logistic Regression Models.
| S-EPMC8375316 | biostudies-literature

Testing a hypothesis of unidirectional hybridization in plants: observations on Sonneratia, Bruguiera and Ligularia.
| S-EPMC2409324 | biostudies-literature

Biclustering with heterogeneous variance.
| S-EPMC3725096 | biostudies-literature