A comparison of principal component analysis and factor analysis strategies for uncovering pleiotropic factors.
ABSTRACT: Principal component analysis (PCA) and factor analysis (FA) are often used to uncover genetic factors that contribute to complex disease phenotypes. The purpose of such an analysis is to distill a genetic signal from a large number of correlated phenotype measurements. That signal can then be used in genetic analyses (e.g., linkage analysis), presumably leading to greater success at finding genes than one would achieve with any single raw trait. Although both PCA and FA have been used this way, their performance has not been compared in the literature. We compared the ability of these two procedures to extract unobserved underlying genetic components from complex simulated data on nuclear families. We first simulated seven underlying genetically and environmentally determined traits. We then derived two sets of 50 complex (observed) traits using algebraic combinations of the underlying components, and performed PCA and FA on the complex traits. We assessed two aspects of the methods' performance: (1) their ability to detect the underlying genetic components; and (2) whether they worked better when applied to raw traits or to residuals (after regressing out significant environmental covariates). Our results indicate that both methods behave similarly in most cases, although FA generally produced factors that had stronger correlations with the underlying traits. We also found that using residuals in PCA or FA greatly increased the probability that the PCs or factors detected common genetic components instead of common environmental factors, except when there was a statistical interaction between genetic and environmental factors.
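The raw-versus-residual comparison described in the abstract can be illustrated with a small sketch. This is not the authors' simulation: it assumes one latent genetic component and one observed environmental covariate (rather than seven underlying traits in nuclear families), independent individuals, purely additive effects with no gene-environment interaction, and arbitrary weights. All names (`g`, `e`, `w_g`, `w_e`, `first_pc_scores`, `residualize`) are hypothetical. Only PCA is shown, computed with a plain SVD; a factor analysis routine could be substituted at the same step.

```python
import numpy as np

rng = np.random.default_rng(0)
n, n_traits = 2000, 50

# Hypothetical latent components (assumptions, not the paper's design):
g = rng.standard_normal(n)  # unobserved genetic component
e = rng.standard_normal(n)  # observed environmental covariate

# Observed traits as additive combinations; the environment is given
# larger weights so that it dominates the raw trait correlations.
w_g = rng.uniform(0.3, 0.7, n_traits)
w_e = rng.uniform(1.0, 2.0, n_traits)
traits = np.outer(g, w_g) + np.outer(e, w_e) + rng.standard_normal((n, n_traits))


def first_pc_scores(X):
    """Scores on the first principal component, via SVD of centered data."""
    Xc = X - X.mean(axis=0)
    U, S, _ = np.linalg.svd(Xc, full_matrices=False)
    return U[:, 0] * S[0]


def residualize(X, covariate):
    """Regress each column of X on an intercept plus the covariate (OLS)."""
    A = np.column_stack([np.ones(len(covariate)), covariate])
    beta, *_ = np.linalg.lstsq(A, X, rcond=None)
    return X - A @ beta


def abs_corr(a, b):
    """Absolute Pearson correlation (the sign of a PC is arbitrary)."""
    return abs(np.corrcoef(a, b)[0, 1])


residuals = residualize(traits, e)
r_raw = abs_corr(first_pc_scores(traits), g)     # PC1 of raw traits vs. g
r_res = abs_corr(first_pc_scores(residuals), g)  # PC1 of residuals vs. g

print(f"|corr(PC1, g)| on raw traits: {r_raw:.2f}")
print(f"|corr(PC1, g)| on residuals:  {r_res:.2f}")
```

In this additive setup the first PC of the raw traits mostly tracks the environmental covariate, while the first PC of the residuals tracks the genetic component much more closely. The sketch does not reproduce the interaction case noted in the abstract, where residualizing was less effective.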
SUBMITTER: Wang X
PROVIDER: S-EPMC3042259 | biostudies-literature | 2009 May
REPOSITORIES: biostudies-literature