Project description:Epidemiologic studies and disease prevention trials often seek to relate an exposure variable to a failure time that suffers from interval-censoring. When the failure rate is low and the time intervals are wide, a large cohort is often required so as to yield reliable precision on the exposure-failure-time relationship. However, large cohort studies with simple random sampling could be prohibitive for investigators with a limited budget, especially when the exposure variables are expensive to obtain. Alternative cost-effective sampling designs and inference procedures are therefore desirable. We propose an outcome-dependent sampling (ODS) design with interval-censored failure time data, where we enrich the observed sample by selectively including certain more informative failure subjects. We develop a novel sieve semiparametric maximum empirical likelihood approach for fitting the proportional hazards model to data from the proposed interval-censoring ODS design. This approach employs the empirical likelihood and sieve methods to deal with the infinite-dimensional nuisance parameters, which greatly reduces the dimensionality of the estimation problem and eases the computational difficulty. The consistency and asymptotic normality of the resulting regression parameter estimator are established. The results from our extensive simulation study show that the proposed design and method work well in practical situations and are more efficient than the alternative designs and competing approaches. An example from the Atherosclerosis Risk in Communities (ARIC) study is provided for illustration.
Project description:The case-cohort design has been widely used as a means of cost reduction in assembling or measuring expensive covariates in large cohort studies. The existing literature on the case-cohort design is mainly focused on right-censored data. In practice, however, the failure time is often subject to interval-censoring; it is known only to fall within some random time interval. In this paper, we consider the case-cohort study design for interval-censored failure time and develop a sieve semiparametric likelihood approach for analyzing data from this design under the proportional hazards model. We construct the likelihood function using inverse probability weighting and build the sieves with Bernstein polynomials. The consistency and asymptotic normality of the resulting regression parameter estimator are established and a weighted bootstrap procedure is considered for variance estimation. Simulations show that the proposed method works well for practical situations, and an application to real data is provided.
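The two ingredients named in this abstract, a Bernstein-polynomial sieve for the baseline cumulative hazard and an inverse-probability-weighted likelihood, can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the single scalar covariate, the exponentiated cumulative-sum parameterization of the coefficients, and the weights (1 for cases, 1/q for subcohort non-cases sampled with probability q) are all choices made here for the example.

```python
import math


def bernstein_basis(k, m, t, lo, hi):
    """k-th Bernstein basis polynomial of degree m, rescaled to [lo, hi]."""
    u = (t - lo) / (hi - lo)
    return math.comb(m, k) * u**k * (1.0 - u) ** (m - k)


def cumulative_hazard(t, phi, lo, hi):
    """Sieve approximation Lambda(t) = sum_k coef_k * B_k(t).

    Assumption: coef_k is the running sum of exp(phi_j), so the coefficients
    are positive and nondecreasing, which makes Lambda nondecreasing in t.
    Note that Lambda(lo) = exp(phi_0) > 0 under this parameterization.
    """
    m = len(phi) - 1
    coef, running = [], 0.0
    for p in phi:
        running += math.exp(p)
        coef.append(running)
    return sum(c * bernstein_basis(k, m, t, lo, hi) for k, c in enumerate(coef))


def weighted_loglik(beta, phi, data, lo, hi, q):
    """Inverse-probability-weighted log-likelihood under proportional hazards.

    data rows are (L, R, x, is_case): the failure time lies in (L, R],
    with R = math.inf for subjects whose event was not observed.
    Assumed weights: 1 for cases, 1/q for sampled non-cases.
    """
    total = 0.0
    for L, R, x, is_case in data:
        risk = math.exp(beta * x)
        s_left = math.exp(-cumulative_hazard(L, phi, lo, hi) * risk)
        if R == math.inf:
            s_right = 0.0
        else:
            s_right = math.exp(-cumulative_hazard(R, phi, lo, hi) * risk)
        weight = 1.0 if is_case else 1.0 / q
        total += weight * math.log(s_left - s_right)
    return total
```

In practice one would maximize `weighted_loglik` over `beta` and `phi` with a numerical optimizer; the unconstrained `phi` parameterization is one convenient way to keep the hazard monotone without explicit inequality constraints.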
Project description:Assessing causal treatment effect on a time-to-event outcome is of key interest in many scientific investigations. Instrumental variable (IV) analysis is a useful tool for mitigating the impact of endogenous treatment selection and attaining unbiased estimation of the causal treatment effect. Existing IV methodology, however, has not addressed outcomes subject to interval censoring, which are ubiquitous in studies with intermittent follow-up but are challenging to handle in terms of both theory and computation. In this work, we fill this important gap by studying a general class of causal semiparametric transformation models with interval-censored data. We propose a nonparametric maximum likelihood estimator of the complier causal treatment effect. Moreover, we design a reliable and computationally stable expectation-maximization (EM) algorithm, which has a tractable objective function in the maximization step via the use of Poisson latent variables. The asymptotic properties of the proposed estimators, including the consistency, asymptotic normality, and semiparametric efficiency, are established with empirical process techniques. We conduct extensive simulation studies and an application to a colorectal cancer screening data set, showing satisfactory finite-sample performance of the proposed method as well as its prominent advantages over naive methods.
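To illustrate why Poisson latent variables make the maximization step tractable, the sketch below runs an EM algorithm for the simplest possible case: a single sample with no covariates, no instrument, and no transformation. It is not the algorithm of this paper. It assumes the cumulative hazard is a step function with jumps lam_k at the observed finite endpoints, and that each subject carries independent latent counts W_k ~ Poisson(lam_k); observing the failure in (L, R] then means all counts up to L are zero and at least one count in (L, R] is positive. All names are hypothetical.

```python
import math


def em_interval_censored(intervals, n_iter=200):
    """EM for a step-function cumulative hazard from interval-censored data.

    intervals: list of (L, R) meaning the failure time lies in (L, R],
    with 0 <= L < R and R = math.inf for right-censored subjects.
    Returns the jump locations and the estimated jump sizes.
    """
    # Candidate jump points: every finite, positive endpoint.
    times = sorted({t for L, R in intervals for t in (L, R) if 0 < t < math.inf})
    lam = [1.0 / len(times)] * len(times)
    for _ in range(n_iter):
        num = [0.0] * len(times)
        den = [0.0] * len(times)
        for L, R in intervals:
            finite = R < math.inf
            upper = R if finite else L  # last time point carrying a latent count
            # Total intensity inside (L, R]; only needed when R is finite.
            mass = sum(l for t, l in zip(times, lam) if L < t <= R) if finite else 0.0
            for k, t in enumerate(times):
                if t > upper:
                    continue
                den[k] += 1.0
                if finite and L < t <= R:
                    # E-step: E[W_k | at least one event in (L, R]].
                    num[k] += lam[k] / -math.expm1(-mass)
                # Counts at t <= L are known to be zero and add nothing.
        # M-step: closed-form Poisson-mean update for each jump size.
        lam = [n / d for n, d in zip(num, den)]
    return times, lam


def log_likelihood(intervals, times, lam):
    """Observed-data log-likelihood sum_i log{S(L_i) - S(R_i)}."""
    def cum(t):
        return sum(l for s, l in zip(times, lam) if s <= t)

    total = 0.0
    for L, R in intervals:
        s_left = math.exp(-cum(L))
        s_right = math.exp(-cum(R)) if R < math.inf else 0.0
        total += math.log(s_left - s_right)
    return total
```

Because the complete-data likelihood is a product of Poisson terms, each jump size is updated by a simple ratio, with no numerical optimization over the nuisance parameter. With covariates, the Poisson means would be scaled by a regression term and the regression coefficients would need their own update step.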
Project description:In standard survival analysis, it is generally assumed that every individual will eventually experience the event of interest. However, this is not always the case, as some individuals may not be susceptible to this event. Also, in medical studies, patients are often seen only at scheduled visits, so the event is known only to have occurred between two visits. That is, the data are interval-censored with a cure fraction. Variable selection in such a setting is of considerable interest. Covariates impacting the survival are not necessarily the same as those impacting the probability of experiencing the event. The objective of this paper is to develop a parametric but flexible statistical model to analyze data that are interval-censored and include a fraction of cured individuals when the number of potential covariates may be large. We use the parametric mixture cure model with an accelerated failure time regression model for the survival, along with the extended generalized gamma for the error term. To overcome the issue of unstable and discontinuous variable selection procedures, we extend the adaptive LASSO to our model. By means of simulation studies, we show good performance of our method and discuss the behavior of estimates with varying cure and censoring proportions. Lastly, our proposed method is illustrated with a real dataset studying the time until conversion to mild cognitive impairment, a possible precursor of Alzheimer's disease.
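The structure of a mixture cure likelihood with interval censoring can be sketched in a few lines. This is an illustration only, with several assumptions that differ from the paper: a log-normal error stands in for the extended generalized gamma, the incidence part is taken to be logistic, each part uses a single scalar covariate (x for survival, z for incidence), and the adaptive LASSO penalty uses weights 1/|initial estimate|. All function and parameter names are hypothetical.

```python
import math


def normal_cdf(z):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def susceptible_survival(t, x, beta, sigma):
    """S_u(t | x) for an AFT model log T = beta * x + sigma * eps.

    Assumption: eps is standard normal (log-normal AFT), used here as a
    simple stand-in for the extended generalized gamma error.
    """
    if t <= 0.0:
        return 1.0
    if t == math.inf:
        return 0.0
    return 1.0 - normal_cdf((math.log(t) - beta * x) / sigma)


def incidence(z, gamma0, gamma1):
    """P(susceptible | z), assumed logistic in the covariate z."""
    return 1.0 / (1.0 + math.exp(-(gamma0 + gamma1 * z)))


def likelihood_contribution(L, R, x, z, beta, sigma, gamma0, gamma1):
    """Contribution of one subject whose event time lies in (L, R].

    R = math.inf means the event was not observed: the subject is either
    cured or susceptible with an event time beyond L.
    """
    p = incidence(z, gamma0, gamma1)
    s_left = susceptible_survival(L, x, beta, sigma)
    if R == math.inf:
        return (1.0 - p) + p * s_left
    return p * (s_left - susceptible_survival(R, x, beta, sigma))


def adaptive_lasso_penalty(coefs, initial_estimates, tuning):
    """Adaptive LASSO penalty: tuning * sum_j |coef_j| / |initial_j|."""
    return tuning * sum(abs(c) / abs(b0) for c, b0 in zip(coefs, initial_estimates))
```

The penalized objective would be the sum of the log contributions minus the penalty, applied separately or jointly to the survival and incidence coefficients. Keeping x and z distinct reflects the point made above that the two parts need not share covariates.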
Project description:Instrumental variable (IV) analysis has been widely used in economics, epidemiology, and other fields to estimate the causal effects of covariates on outcomes, in the presence of unobserved confounders and/or measurement errors in covariates. However, IV methods for time-to-event outcome with censored data remain underdeveloped. This paper proposes a Bayesian approach for IV analysis with censored time-to-event outcome by using a two-stage linear model. A Markov chain Monte Carlo sampling method is developed for parameter estimation for both normal and non-normal linear models with elliptically contoured error distributions. The performance of our method is examined by simulation studies. Our method largely reduces bias and greatly improves coverage probability of the estimated causal effect, compared with the method that ignores the unobserved confounders and measurement errors. We illustrate our method on the Women's Health Initiative Observational Study and the Atherosclerosis Risk in Communities Study.
Project description:Objective: Novice drivers who delay driving licensure may miss the safety benefits of Graduated Driver Licensing (GDL) programs, potentially putting themselves at higher crash risk. Time to licensure links teens' access to independent transportation with potential future economic and educational opportunities. The objective of this study was to explore associations of time to licensure with teens' race/ethnicity and GDL restrictions. Methods: Secondary analysis using all seven annual assessments of the NEXT Generation Health Study, a nationally representative longitudinal study starting with 10th grade (N = 2785; 2009-2010 school year). Data were collected in U.S. public/private schools, colleges, workplaces, and other settings. The outcome variable was interval-censored time to licensure (event = obtained driving licensure). Independent variables included race/ethnicity and state-specific GDL restrictions. Covariates included family affluence, parent education, nativity, sex, and urbanicity. Proportional hazards (PH) models for interval-censored survival analysis were fitted, using stepwise backward elimination to build multivariate models and taking complex survey features into account. In the PH models, a hazard ratio (HR) estimates a greater (>1) or lesser (<1) likelihood of licensure at all timepoints. Results: Median time to licensure after reaching legal driving age for Latinos, African Americans, and Non-Latino Whites was 3.47, 2.90, and 0.41 years, respectively. Multivariate PH models showed that Latinos were 46% less likely (HR = 0.54, 95%CI: 0.35-0.72) and African Americans were 56% less likely (HR = 0.44, 95%CI: 0.32-0.56) to have obtained licensure at any time compared to Non-Latino Whites. Only the learner minimum age GDL restriction was associated with time to licensure. Living in a state with a required learner driving minimum age of ≥16 years (HR = 0.57, 95%CI: 0.16-0.98) also corresponded with 43% lower likelihood of licensure at legal eligibility compared to living in other states with a required learner driving minimum age of <16 years. Conclusion: Latino and African American teens obtained their license approximately three years after eligibility on average, much later than Non-Latino Whites. Time to licensure was associated with race/ethnicity and required minimum age of learner permit, indicating important implications for teens of different racial/ethnic groups in relation to licensure, access to independent transportation, and exposure to GDL programs.
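The percentages quoted in the Results follow directly from the hazard ratios: a hazard ratio HR corresponds to a (HR - 1) x 100% change in the hazard relative to the reference group. The short sketch below reproduces that arithmetic; the function name is hypothetical. Strictly, this is a change in the instantaneous rate of licensure, which the abstract reads as "less likely".

```python
def percent_change_from_hr(hr):
    """Percent change in the hazard relative to the reference group."""
    return (hr - 1.0) * 100.0


# Hazard ratios reported in the abstract.
for label, hr in [
    ("Latino vs. Non-Latino White", 0.54),
    ("African American vs. Non-Latino White", 0.44),
    ("Learner minimum age >=16 vs. <16", 0.57),
]:
    print(f"{label}: HR = {hr:.2f}, change = {percent_change_from_hr(hr):+.0f}%")
```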
Project description:We propose interval censored recursive forests (ICRF), an iterative tree ensemble method for interval censored survival data. This nonparametric regression estimator addresses the splitting bias problem of existing tree-based methods and iteratively updates survival estimates in a self-consistent manner. Consistent splitting rules are developed for interval censored data, convergence is monitored using out-of-bag samples, and kernel-smoothing is applied. The ICRF is uniformly consistent and displays high prediction accuracy in both simulations and applications to avalanche and national mortality data. An R package icrf is available on CRAN and Supplementary Materials for this article are available online.
Project description:Interval-censored competing risks data arise when each study subject may experience an event or failure from one of several causes and the failure time is not observed directly but rather is known to lie in an interval between two examinations. We formulate the effects of possibly time-varying (external) covariates on the cumulative incidence or sub-distribution function of competing risks (i.e., the marginal probability of failure from a specific cause) through a broad class of semiparametric regression models that captures both proportional and non-proportional hazards structures for the sub-distribution. We allow each subject to have an arbitrary number of examinations and accommodate missing information on the cause of failure. We consider nonparametric maximum likelihood estimation and devise a fast and stable EM-type algorithm for its computation. We then establish the consistency, asymptotic normality, and semiparametric efficiency of the resulting estimators for the regression parameters by appealing to modern empirical process theory. In addition, we show through extensive simulation studies that the proposed methods perform well in realistic situations. Finally, we provide an application to a study on HIV-1 infection with different viral subtypes.
Project description:In prevalent cohort design, subjects who have experienced an initial event but not the failure event are preferentially enrolled and the observed failure times are often length-biased. Moreover, the prospective follow-up may not be continuously monitored and failure times are subject to interval censoring. We study the nonparametric maximum likelihood estimation for the proportional hazards model with length-biased interval-censored data. Direct maximization of likelihood function is intractable, thus we develop a computationally simple and stable expectation-maximization algorithm through introducing two layers of data augmentation. We establish the strong consistency, asymptotic normality and efficiency of the proposed estimator and provide an inferential procedure through profile likelihood. We assess the performance of the proposed methods through extensive simulations and apply the proposed methods to the Massachusetts Health Care Panel Study.
Project description:Failure time data subject to various types of censoring commonly arise in epidemiological and biomedical studies. Motivated by an AIDS clinical trial, we consider regression analysis of failure time data that include exact and left-, interval-, and/or right-censored observations, which are often referred to as partly interval-censored failure time data. We study the effects of potentially time-dependent covariates on partly interval-censored failure time via a class of semiparametric transformation models that includes the widely used proportional hazards model and the proportional odds model as special cases. We propose an EM algorithm for the nonparametric maximum likelihood estimation and show that it unifies some existing approaches developed for traditional right-censored data or purely interval-censored data. In particular, the proposed method reduces to the partial likelihood approach in the case of right-censored data under the proportional hazards model. We establish that the resulting estimator is consistent and asymptotically normal. In addition, we investigate the proposed method via simulation studies and apply it to the motivating AIDS clinical trial.