A general framework for studying genetic effects and gene-environment interactions with missing data.
ABSTRACT: Missing data arise in genetic association studies when genotypes are unknown or when haplotypes are of direct interest. We provide a general likelihood-based framework for making inference on genetic effects and gene-environment interactions with such missing data. We allow genetic and environmental variables to be correlated while leaving the distribution of environmental variables completely unspecified. We consider three major study designs (cross-sectional, case-control, and cohort) and construct appropriate likelihood functions for all common phenotypes (e.g. case-control status, quantitative traits, and potentially censored ages at onset of disease). The likelihood functions involve both finite- and infinite-dimensional parameters. The maximum likelihood estimators are shown to be consistent, asymptotically normal, and asymptotically efficient. Expectation-Maximization (EM) algorithms are developed to implement the corresponding inference procedures. Extensive simulation studies demonstrate that the proposed inferential and numerical methods perform well in practical settings. An illustration with a genome-wide association study of lung cancer is provided.
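To give a feel for how an EM algorithm handles unobserved haplotype phase, the following is a minimal sketch and not the authors' procedure. It estimates haplotype frequencies for two biallelic SNPs from unphased genotypes, and it assumes Hardy-Weinberg equilibrium and no phenotype or environmental covariates, whereas the framework in the abstract builds those into the likelihood. The names `compatible_pairs` and `em_haplotype_freqs`, and the 0/1/2 genotype coding, are illustrative assumptions.

```python
from itertools import product

# The four possible haplotypes over two biallelic SNPs (alleles coded 0/1).
HAPS = [(0, 0), (0, 1), (1, 0), (1, 1)]


def compatible_pairs(genotype):
    """Ordered haplotype pairs whose per-SNP allele counts match the genotype."""
    return [
        (h1, h2)
        for h1, h2 in product(HAPS, repeat=2)
        if (h1[0] + h2[0], h1[1] + h2[1]) == tuple(genotype)
    ]


def em_haplotype_freqs(genotypes, n_iter=200, tol=1e-10):
    """EM estimate of haplotype frequencies from unphased two-SNP genotypes.

    Each genotype is a pair of allele counts (0, 1, or 2) per SNP.
    Assumes Hardy-Weinberg equilibrium (illustrative simplification).
    """
    freqs = {h: 0.25 for h in HAPS}  # uniform starting values
    for _ in range(n_iter):
        counts = {h: 0.0 for h in HAPS}
        # E-step: posterior weight of each phase configuration given current freqs.
        for g in genotypes:
            pairs = compatible_pairs(g)
            weights = [freqs[h1] * freqs[h2] for h1, h2 in pairs]
            total = sum(weights)
            for (h1, h2), w in zip(pairs, weights):
                counts[h1] += w / total
                counts[h2] += w / total
        # M-step: expected haplotype counts divided by the number of chromosomes.
        new_freqs = {h: c / (2 * len(genotypes)) for h, c in counts.items()}
        delta = max(abs(new_freqs[h] - freqs[h]) for h in HAPS)
        freqs = new_freqs
        if delta < tol:
            break
    return freqs


# Toy data: only the double heterozygotes (1, 1) have ambiguous phase.
toy = [(0, 0)] * 4 + [(2, 2)] * 4 + [(1, 1)] * 2
estimates = em_haplotype_freqs(toy)
```

Only the double heterozygotes contribute uncertainty here; every other genotype determines its haplotype pair uniquely, so the E-step leaves those counts fixed.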
SUBMITTER: Hu YJ
PROVIDER: S-EPMC3294269 | biostudies-literature | 2010 Oct
REPOSITORIES: biostudies-literature