Dataset Information

Simultaneous dimension reduction and adjustment for confounding variation.


ABSTRACT: Dimension reduction methods are commonly applied to high-throughput biological datasets. However, the results can be hindered by confounding factors, either biological or technical in origin. In this study, we extend principal component analysis (PCA) to propose AC-PCA for simultaneous dimension reduction and adjustment for confounding (AC) variation. We show that AC-PCA can adjust for (i) variations across individual donors present in a human brain exon array dataset and (ii) variations of different species in a model organism ENCODE RNA sequencing dataset. Our approach is able to recover the anatomical structure of neocortical regions and to capture the shared variation among species during embryonic development. For gene selection purposes, we extend AC-PCA with sparsity constraints and propose and implement an efficient algorithm. The methods developed in this paper can also be applied to more general settings. The R package and MATLAB source code are available at https://github.com/linzx06/AC-PCA.
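As a rough illustration of how PCA can be extended to penalize confounding variation, the sketch below assumes one plausible formulation that is not stated in this record: loadings are taken as the leading eigenvectors of X^T (I - lambda * K) X, where X is the samples-by-features data matrix, Y holds the confounder covariates, and K = Y Y^T is a linear kernel on the confounders. The function name ac_pca_sketch, the parameter lam, and the choice of kernel are hypothetical; the authors' own implementation is in the R package and MATLAB code linked above.

```python
import numpy as np


def ac_pca_sketch(X, Y, lam, n_components=2):
    """Hypothetical sketch of confounder-penalized PCA (not the authors' code).

    X   : (n_samples, n_features) data matrix
    Y   : (n_samples, n_confounders) confounder covariates
    lam : non-negative penalty weight; lam = 0 reduces to ordinary PCA
    Returns an (n_features, n_components) matrix of loading vectors.
    """
    # Center both matrices so variation is measured around the mean.
    X = X - X.mean(axis=0)
    Y = Y - Y.mean(axis=0)

    # Assumed linear kernel on the confounders.
    K = Y @ Y.T

    # Penalized matrix: variance term minus the confounder-aligned term.
    M = X.T @ (np.eye(X.shape[0]) - lam * K) @ X
    M = (M + M.T) / 2  # enforce exact symmetry before eigh

    # eigh returns eigenvalues in ascending order; keep the largest ones.
    eigvals, eigvecs = np.linalg.eigh(M)
    order = np.argsort(eigvals)[::-1][:n_components]
    return eigvecs[:, order]
```

With lam set to 0 this reduces to ordinary PCA on the centered data; increasing lam pushes the leading loadings away from directions whose projections covary with the confounders. How the penalty weight should be tuned is not covered by this sketch.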

SUBMITTER: Lin Z 

PROVIDER: S-EPMC5187682 | biostudies-literature | 2016 Dec

REPOSITORIES: biostudies-literature

Publications

Simultaneous dimension reduction and adjustment for confounding variation.

Zhixiang Lin, Can Yang, Ying Zhu, John Duchi, Yao Fu, Yong Wang, Bai Jiang, Mahdi Zamanighomi, Xuming Xu, Mingfeng Li, Nenad Sestan, Hongyu Zhao, Wing Hung Wong

Proceedings of the National Academy of Sciences of the United States of America, 2016 Dec 7, issue 51



Similar Datasets

S-EPMC8008432 | biostudies-literature
S-EPMC10947425 | biostudies-literature
S-EPMC7592711 | biostudies-literature
S-EPMC6949275 | biostudies-literature
S-EPMC10702096 | biostudies-literature
S-EPMC9278763 | biostudies-literature
S-EPMC7162480 | biostudies-literature
S-EPMC7236769 | biostudies-literature
S-EPMC3777433 | biostudies-literature
S-EPMC3018713 | biostudies-literature