Development and Validation of a Large Synthetic Cohort for the Study of Cardiovascular Health Across the Life Span.
ABSTRACT: We developed and validated a synthetic cohort approach to examine numbers of cardiovascular risk factors (CRFs) and adverse clinical events, including incident cardiovascular disease and all-cause mortality, across the life span from ages 20 years to 90 years. The current analysis included 40,875 participants from 7 large, population-based longitudinal epidemiologic studies (1948-2016). Using a joint multilevel imputation model, we multiply imputed each participant's life-span numbers of CRFs and events from the available records. To validate the imputed values, we removed part of the observed data and then compared the imputed values with the observed values. The complete life-span synthetic data set closely reflected the trends in the original observed data. In our validation sample, the distributions of imputed CRFs and events were close to the observed distributions but showed less variability. Bland-Altman plots showed a slightly negative overall trend and a relatively small agreement bias for the continuous CRFs. A hypothetical linear regression model suggested that the relationships between the CRFs and events were preserved in the imputed data set. This approach generated valid estimates of CRFs and events across the life span for African-American and White participants. The synthetic cohort may be sufficiently accurate to be useful in assessing the origins and timing of accumulating cardiovascular risk, which can inform efforts to prevent the development of cardiovascular disease.
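The validation step described in the abstract, comparing imputed values with held-out observed values through Bland-Altman agreement, can be illustrated with a minimal sketch. This is not the authors' code: the function name `bland_altman`, the 1.96 multiplier for approximate 95% limits of agreement, and the systolic blood pressure readings are all assumptions chosen for illustration.

```python
from statistics import mean, stdev


def bland_altman(observed, imputed):
    """Return (bias, lower_limit, upper_limit, slope) for paired values.

    Hypothetical helper for illustration only.
    bias  : mean of (imputed - observed)
    limits: bias +/- 1.96 * SD of the differences (approximate 95% limits)
    slope : least-squares slope of the differences on the pairwise means;
            a negative slope corresponds to a negative trend in the plot
    """
    if len(observed) != len(imputed) or len(observed) < 2:
        raise ValueError("need two equal-length sequences with >= 2 pairs")

    diffs = [i - o for o, i in zip(observed, imputed)]
    means = [(i + o) / 2 for o, i in zip(observed, imputed)]

    bias = mean(diffs)
    sd = stdev(diffs)  # sample standard deviation of the differences

    # Ordinary least-squares slope of diffs regressed on means.
    m_bar = mean(means)
    sxx = sum((m - m_bar) ** 2 for m in means)
    sxy = sum((m - m_bar) * (d - bias) for m, d in zip(means, diffs))
    slope = sxy / sxx if sxx else 0.0

    return bias, bias - 1.96 * sd, bias + 1.96 * sd, slope


# Made-up systolic blood pressure values (mm Hg), not data from the study.
observed_sbp = [120, 130, 140, 150]
imputed_sbp = [122, 131, 139, 148]
bias, lower, upper, slope = bland_altman(observed_sbp, imputed_sbp)
print(f"bias={bias:.2f}, limits=({lower:.2f}, {upper:.2f}), slope={slope:.3f}")
```

In this toy input the imputed values overshoot at low readings and undershoot at high ones, which is one pattern that produces a negative trend together with reduced variability in the imputed distribution.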
SUBMITTER: Ning H
PROVIDER: S-EPMC8633456 | biostudies-literature |
REPOSITORIES: biostudies-literature