Dataset Information

Accounting for unobserved covariates with varying degrees of estimability in high-dimensional biological data.


ABSTRACT: An important phenomenon in high-throughput biological data is the presence of unobserved covariates that can have a significant impact on the measured response. When these covariates are also correlated with the covariate of interest, ignoring or improperly estimating them can lead to inaccurate estimates of and spurious inference on the corresponding coefficients of interest in a multivariate linear model. We first prove that existing methods to account for these unobserved covariates often inflate Type I error for the null hypothesis that a given coefficient of interest is zero. We then provide alternative estimators for the coefficients of interest that correct the inflation, and prove that our estimators are asymptotically equivalent to the ordinary least squares estimators obtained when every covariate is observed. Lastly, we use previously published DNA methylation data to show that our method can more accurately estimate the direct effect of asthma on DNA methylation levels compared to existing methods, the latter of which likely fail to recover and account for latent cell type heterogeneity.
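To make the abstract's central concern concrete, the following is a minimal simulation sketch of how an unobserved covariate that is correlated with the covariate of interest can distort coefficient estimates in a multivariate linear model. It is not the authors' estimator: it only contrasts a naive regression that ignores the latent covariate with the "oracle" ordinary least squares fit in which every covariate is observed, which is the benchmark the abstract refers to. The function name `simulate_bias`, the parameter values, and the data-generating model are all assumptions chosen for illustration.

```python
import numpy as np


def simulate_bias(n=500, p=200, beta=0.5, rho=0.7, seed=0):
    """Illustrative only: hypothetical data-generating model, not the paper's method.

    n samples, p features (e.g. methylation sites), one observed covariate x,
    one latent covariate c with correlation rho to x. All variables are
    mean zero, so no intercept is fitted.
    """
    rng = np.random.default_rng(seed)
    x = rng.standard_normal(n)                                  # covariate of interest
    c = rho * x + np.sqrt(1 - rho**2) * rng.standard_normal(n)  # latent covariate
    ell = rng.standard_normal(p)                                # latent effect per feature

    # Response matrix (n x p): same true coefficient beta on x for every feature.
    Y = beta * x[:, None] + c[:, None] * ell[None, :] + rng.standard_normal((n, p))

    # Naive fit: regress each feature on x alone, ignoring c.
    b_naive = np.linalg.lstsq(x[:, None], Y, rcond=None)[0][0]

    # Oracle fit: regress on both x and c, as if c had been observed.
    X_full = np.column_stack([x, c])
    b_oracle = np.linalg.lstsq(X_full, Y, rcond=None)[0][0]

    return b_naive, b_oracle, ell


if __name__ == "__main__":
    beta = 0.5
    b_naive, b_oracle, ell = simulate_bias(beta=beta)
    print("mean abs error, naive :", np.mean(np.abs(b_naive - beta)))
    print("mean abs error, oracle:", np.mean(np.abs(b_oracle - beta)))
```

In this toy setting the naive error for each feature tracks that feature's latent effect scaled by the correlation between `x` and `c`, which is the textbook omitted-variable bias. In practice the latent covariate must be estimated from the data, and the quality of that estimate is the issue the abstract addresses.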

SUBMITTER: McKennan C 

PROVIDER: S-EPMC6845853 | biostudies-literature | 2019 Dec

REPOSITORIES: biostudies-literature

Publications

Accounting for unobserved covariates with varying degrees of estimability in high-dimensional biological data.

Chris McKennan, Dan Nicolae

Biometrika, 2019-09-16, issue 4


Similar Datasets

| S-EPMC5870402 | biostudies-literature
| S-EPMC3963210 | biostudies-literature
| S-EPMC4935555 | biostudies-other
| S-EPMC4563215 | biostudies-literature
| S-EPMC7313320 | biostudies-literature
| S-EPMC6449749 | biostudies-literature
| S-EPMC10858611 | biostudies-literature
| S-EPMC2669665 | biostudies-literature
| S-EPMC3767131 | biostudies-literature
| S-EPMC7073148 | biostudies-literature