
Dataset Information


Genetically-guided algorithm development and sample size optimization for age-related macular degeneration cases and controls in electronic health records from the VA Million Veteran Program.


ABSTRACT: Electronic health records (EHRs) linked to extensive biorepositories and supplemented with lifestyle, behavioral, and environmental exposure data have enormous potential to contribute to genomic discovery, a necessary step in the pathway towards translational or precision medicine. A major bottleneck in incorporating EHRs into genomic studies is the extraction of research-grade variables for analysis, particularly when gold-standard measurements are not available or accessible. Here we develop algorithms to identify cases of age-related macular degeneration (AMD), a common cause of blindness among the elderly, and controls free of AMD. These computable phenotypes were developed using billing codes (ICD-9-CM and ICD-10-CM) and Current Procedural Terminology (CPT) codes and evaluated at two study sites of the Veterans Affairs Million Veteran Program (VA MVP): the Louis Stokes Cleveland VA Medical Center and the Providence VA Medical Center. After high overall positive and negative predictive values (93% and 95%, respectively) were established through manual chart review, the candidate algorithm was deployed in the full VA MVP dataset of >500,000 participants. The algorithm was then optimized in a data cube using a variety of approaches, including adjusting inclusion age thresholds and examining previously reported genetic associations for CFH (rs10801555, a proxy for rs1061170) and ARMS2 (rs10490924). The algorithm with the smallest p-values for the known genetic associations was selected for downstream and ongoing AMD genomic discovery efforts. This two-phase approach to developing research-grade case/control variables for AMD genomic studies capitalizes on established genetic associations, resulting in high precision and optimized sample sizes. The approach can be applied to other large-scale biobanks linked to EHRs for precision medicine research.
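
The two phases described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the candidate labels, and all counts are hypothetical, and a simple 1-degree-of-freedom allelic chi-square test stands in for whatever association model the study actually used. Phase one computes positive and negative predictive values from chart-review counts; phase two picks the candidate algorithm whose case/control definition yields the smallest p-value for a known variant.

```python
from math import erfc, sqrt


def ppv_npv(tp, fp, tn, fn):
    """Return (PPV, NPV) from chart-review counts.

    tp/fp: algorithm-defined cases confirmed / not confirmed by review.
    tn/fn: algorithm-defined controls confirmed / not confirmed by review.
    """
    return tp / (tp + fp), tn / (tn + fn)


def allelic_chi2_pvalue(case_alt, case_ref, ctrl_alt, ctrl_ref):
    """P-value of a Pearson chi-square test (1 df) on a 2x2 allele-count table.

    Assumption: this simple test is a stand-in for the study's association model.
    """
    n = case_alt + case_ref + ctrl_alt + ctrl_ref
    numerator = n * (case_alt * ctrl_ref - case_ref * ctrl_alt) ** 2
    denominator = (
        (case_alt + case_ref)
        * (ctrl_alt + ctrl_ref)
        * (case_alt + ctrl_alt)
        * (case_ref + ctrl_ref)
    )
    chi2 = numerator / denominator
    # Survival function of a chi-square variable with 1 df.
    return erfc(sqrt(chi2 / 2))


def select_algorithm(candidates):
    """Return the candidate name with the smallest association p-value.

    candidates maps a name to (case_alt, case_ref, ctrl_alt, ctrl_ref).
    """
    return min(candidates, key=lambda name: allelic_chi2_pvalue(*candidates[name]))


if __name__ == "__main__":
    # Hypothetical chart-review counts, chosen only to illustrate the arithmetic.
    ppv, npv = ppv_npv(tp=93, fp=7, tn=95, fn=5)
    print(f"PPV={ppv:.2f} NPV={npv:.2f}")

    # Hypothetical allele counts for two made-up inclusion age thresholds.
    toy_candidates = {
        "cases_age_50_plus": (300, 700, 200, 800),
        "cases_age_60_plus": (350, 650, 200, 800),
    }
    print(select_algorithm(toy_candidates))
```

In practice the comparison would cover both known loci (CFH and ARMS2) and would need to account for covariates and changes in sample size across thresholds, none of which this sketch attempts.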

SUBMITTER: Halladay CW 

PROVIDER: S-EPMC6568141 | biostudies-literature | 2019

REPOSITORIES: biostudies-literature


Publications

Genetically-guided algorithm development and sample size optimization for age-related macular degeneration cases and controls in electronic health records from the VA Million Veteran Program.

Halladay CW, Hadi T, Anger MD, Greenberg PB, Sullivan JM, Konicki PE, Peachey NS, Igo RP, Iyengar SK, Wu WC, Crawford DC

AMIA Joint Summits on Translational Science Proceedings, 2019-05-06



Similar Datasets

| S-EPMC8636485 | biostudies-literature
| S-EPMC6710266 | biostudies-literature
2004-09-02 | GSE1719 | GEO
| phs001672 | dbGaP
| S-EPMC6069794 | biostudies-literature
| S-EPMC6547827 | biostudies-literature
| S-EPMC4530462 | biostudies-literature
| S-EPMC7020617 | biostudies-literature