ABSTRACT:
SUBMITTER: Zehfroosh A
PROVIDER: S-EPMC8982074 | biostudies-literature | 2022
REPOSITORIES: biostudies-literature
Ashkan Zehfroosh, Herbert G. Tanner
Frontiers in Robotics and AI, 2022-03-09
This paper offers a new hybrid probably approximately correct (PAC) reinforcement learning (RL) algorithm for Markov decision processes (MDPs) that intelligently maintains favorable features of both model-based and model-free methodologies. The designed algorithm, referred to as the Dyna-Delayed Q-learning (DDQ) algorithm, combines model-free Delayed Q-learning and model-based R-max algorithms while outperforming both in most cases. The paper includes a PAC analysis of the DDQ algorithm and a de ...