
Dataset Information


Mixture models for undiagnosed prevalent disease and interval-censored incident disease: applications to a cohort assembled from electronic health records.


ABSTRACT: For cost-effectiveness and efficiency, many large-scale general-purpose cohort studies are being assembled within large health-care providers that use electronic health records. Two key features of such data are that incident disease is interval-censored between irregular visits and that there can be pre-existing (prevalent) disease. Because prevalent disease is not always immediately diagnosed, some disease diagnosed at later visits is actually undiagnosed prevalent disease. We consider prevalent disease as a point mass at time zero for clinical applications where there is no interest in the time of prevalent disease onset. We demonstrate that the naive Kaplan-Meier cumulative risk estimator underestimates risks at early time points and overestimates later risks. We propose a general family of mixture models for undiagnosed prevalent disease and interval-censored incident disease that we call prevalence-incidence models. Parameters for parametric prevalence-incidence models, such as the logistic regression and Weibull survival (logistic-Weibull) model, are estimated by direct likelihood maximization or by the EM algorithm. Non-parametric methods are proposed to calculate cumulative risks for cases without covariates. We compare naive Kaplan-Meier, logistic-Weibull, and non-parametric estimates of cumulative risk in the cervical cancer screening program at Kaiser Permanente Northern California. Kaplan-Meier provided poor estimates, while the logistic-Weibull model was a close fit to the non-parametric estimates. Our findings support our use of logistic-Weibull models to develop the risk estimates that underlie current US risk-based cervical cancer screening guidelines. Published 2017. This article has been contributed to by US Government employees and their work is in the public domain in the USA.
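
The mixture structure described in the abstract can be illustrated with a minimal sketch. It assumes a single covariate, a logistic model for the point mass at time zero, and a Weibull distribution for time to incident disease. The function names, the parameterization (`beta0`, `beta1`, `shape`, `scale`), and the way likelihood cases are encoded are illustrative assumptions, not the authors' implementation.

```python
import math

def prevalence(x, beta0, beta1):
    """Logistic probability of prevalent disease at time zero (assumed form)."""
    return 1.0 / (1.0 + math.exp(-(beta0 + beta1 * x)))

def weibull_cdf(t, shape, scale):
    """Weibull CDF for incident disease among those disease-free at time zero."""
    if t <= 0:
        return 0.0
    if math.isinf(t):
        return 1.0
    return 1.0 - math.exp(-((t / scale) ** shape))

def cumulative_risk(t, p, shape, scale):
    """Point mass p at time zero plus (1 - p) times the incident-disease CDF."""
    return p + (1.0 - p) * weibull_cdf(t, shape, scale)

def likelihood(left, right, p, shape, scale, left_is_negative_visit=True):
    """Likelihood contribution of one subject (hypothetical encoding).

    left_is_negative_visit=True: subject tested negative at `left`, so disease
    is incident and falls in (left, right]; right=inf means right-censored.
    left_is_negative_visit=False: no negative test before diagnosis at `right`,
    so the disease may be prevalent or incident; right=0 means diagnosed at
    time zero.
    """
    if left_is_negative_visit:
        return (1.0 - p) * (weibull_cdf(right, shape, scale)
                            - weibull_cdf(left, shape, scale))
    return cumulative_risk(right, p, shape, scale)

# Example with made-up parameter values.
p = prevalence(x=1.0, beta0=-3.0, beta1=0.5)
risk_at_3_years = cumulative_risk(3.0, p, shape=1.2, scale=40.0)
```

In this sketch the cumulative risk starts at `p` rather than zero, which is the feature a naive Kaplan-Meier estimate cannot represent. A full fit would sum the log of `likelihood` over subjects and maximize it numerically or by EM, as the abstract describes.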

SUBMITTER: Cheung LC 

PROVIDER: S-EPMC5583012 | biostudies-literature | 2017 Sep

REPOSITORIES: biostudies-literature


Publications

Mixture models for undiagnosed prevalent disease and interval-censored incident disease: applications to a cohort assembled from electronic health records.

Cheung LC, Pan Q, Hyun N, Schiffman M, Fetterman B, Castle PE, Lorey T, Katki HA

Statistics in Medicine, 2017-06-28, issue 22



Similar Datasets

| S-EPMC6586434 | biostudies-other
| S-EPMC5057324 | biostudies-literature
| S-EPMC5978779 | biostudies-literature
| S-EPMC7770078 | biostudies-literature
| PRJNA158491 | ENA
| S-EPMC10653491 | biostudies-literature
| S-EPMC4300240 | biostudies-literature
| S-EPMC7297278 | biostudies-literature
| S-EPMC3558546 | biostudies-literature
| S-EPMC9713425 | biostudies-literature