ABSTRACT:
SUBMITTER: Goldberg Y
PROVIDER: S-EPMC3385950 | biostudies-literature | 2012 Feb
REPOSITORIES: biostudies-literature
We develop methodology for a multistage decision problem with a flexible number of stages in which the rewards are survival times that are subject to censoring. We present a novel Q-learning algorithm that is adjusted for censored data and allows a flexible number of stages. We provide finite-sample bounds on the generalization error of the policy learned by the algorithm, and show that when the optimal Q-function belongs to the approximation space, the expected survival time for policies obtained ...