Oracular Partially Observable Markov Decision Processes: A Very Special Case
Abstract
We introduce the Oracular Partially Observable Markov Decision Process (OPOMDP), a type of POMDP in which the world produces no observations; instead there is an "oracle," available in any state, that tells the agent its exact state for a fixed cost. The oracle may be a human or a highly accurate sensor. At each timestep the agent must choose whether to take a domain-level action or consult the oracle. This formulation comprises a factorization between information-gathering actions and domain-level actions, allowing us to characterize the value of information and to examine the problem of planning under uncertainty from a novel perspective. We propose an algorithm to capitalize on this factorization and the special structure of the OPOMDP, and we test the algorithm's performance on a new sample domain. On this new domain, we are able to solve a problem with hundreds of thousands of action-states and vastly outperform a previous state-of-the-art approximate technique.
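The sketch below illustrates the OPOMDP structure described in the abstract: domain-level actions produce no observations, while a special oracle action reveals the exact state at a fixed cost. This is a minimal, hypothetical rendering for clarity; the class and function names (OPOMDP, step, ORACLE) are illustrative and not taken from the paper or its algorithm.

```python
import random
from dataclasses import dataclass

ORACLE = "QUERY_ORACLE"  # special information-gathering action


@dataclass
class OPOMDP:
    """Toy OPOMDP: a POMDP with no observations plus a fixed-cost oracle."""
    states: list          # finite state space
    actions: list         # domain-level actions (the oracle action is separate)
    transition: dict      # (state, action) -> list of (next_state, probability)
    reward: dict          # (state, action) -> immediate reward
    oracle_cost: float    # fixed cost charged for consulting the oracle

    def step(self, state, action):
        """Apply a domain-level action or consult the oracle.

        Returns (next_state, reward, observation), where observation is the
        exact state if the oracle was consulted and None otherwise.
        """
        if action == ORACLE:
            # The oracle leaves the state unchanged, reveals it exactly,
            # and charges a fixed cost.
            return state, -self.oracle_cost, state
        next_states, probs = zip(*self.transition[(state, action)])
        next_state = random.choices(next_states, weights=probs)[0]
        # Domain-level actions yield no observation of the resulting state.
        return next_state, self.reward[(state, action)], None


# Usage: a two-state example where the agent may pay 0.1 to learn its state.
if __name__ == "__main__":
    model = OPOMDP(
        states=["s0", "s1"],
        actions=["go"],
        transition={("s0", "go"): [("s1", 0.8), ("s0", 0.2)],
                    ("s1", "go"): [("s1", 1.0)]},
        reward={("s0", "go"): 0.0, ("s1", "go"): 1.0},
        oracle_cost=0.1,
    )
    state, r, obs = model.step("s0", "go")       # acts blindly, obs is None
    state, r, obs = model.step(state, ORACLE)    # pays 0.1, obs == state
    print(state, r, obs)
```

Because domain-level actions are observation-free, the agent's belief evolves by the transition model alone, and consulting the oracle collapses the belief to a point mass; this separation of information gathering from acting is the factorization the abstract refers to.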
BibTeX
@conference{Armstrong-Crews-2007-9699,
author = {Nicholas Armstrong-Crews and Manuela Veloso},
title = {Oracular Partially Observable Markov Decision Processes: A Very Special Case},
booktitle = {Proceedings of (ICRA) International Conference on Robotics and Automation},
year = {2007},
month = {April},
pages = {2477--2482},
}