ABSTRACT:
SUBMITTER: Wu H
PROVIDER: S-EPMC7419136 | biostudies-literature | 2018
REPOSITORIES: biostudies-literature
Proceedings of Machine Learning Research, 2018-01-01
Off-policy learning, the task of evaluating and improving policies using historical data collected from a logging policy, is important because on-policy evaluation is usually expensive and has adverse impacts. One of the major challenges of off-policy learning is to derive counterfactual estimators that also have low variance and thus low generalization error. In this work, inspired by learning bounds for importance sampling problems, we present a new counterfactual learning principle for off-policy ...