Mo’States Mo’Problems: Emergency Stop Mechanisms from Observation
Conference Paper, Proceedings of (NeurIPS) Neural Information Processing Systems, pp. 15156 - 15166, December, 2019
Abstract
In many environments, only a relatively small subset of the complete state space is necessary in order to accomplish a given task. We develop a simple technique using emergency stops (e-stops) to exploit this phenomenon. Using e-stops significantly improves sample complexity by reducing the amount of required exploration, while retaining a performance bound that efficiently trades off the rate of convergence with a small asymptotic sub-optimality gap. We analyze the regret behavior of e-stops and present empirical results in discrete and continuous settings demonstrating that our reset mechanism can provide order-of-magnitude speedups on top of existing reinforcement learning methods.
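The idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes a toy one-dimensional chain environment and a support set built from a single observed state trajectory. The names ChainEnv, EStopWrapper, support_from_demos, and states_visited are hypothetical and chosen only for this illustration. The wrapper ends an episode as soon as the agent steps outside the observed set of states, which is what limits exploration.

```python
import random


class ChainEnv:
    """Toy 1-D chain (assumed for illustration): states 0..n-1, goal at the right end."""

    def __init__(self, n=11):
        self.n = n
        self.start = n // 2
        self.state = self.start

    def reset(self):
        self.state = self.start
        return self.state

    def step(self, action):
        # action is -1 (left) or +1 (right); the walls clamp the move.
        self.state = min(self.n - 1, max(0, self.state + action))
        done = self.state == self.n - 1
        reward = 1.0 if done else 0.0
        return self.state, reward, done


class EStopWrapper:
    """Ends the episode as soon as the agent leaves the allowed state set."""

    def __init__(self, env, allowed_states, estop_reward=0.0):
        self.env = env
        self.allowed = set(allowed_states)
        self.estop_reward = estop_reward  # assumed value, not taken from the paper

    def reset(self):
        return self.env.reset()

    def step(self, action):
        state, reward, done = self.env.step(action)
        if state not in self.allowed:
            # Emergency stop: terminate early with the e-stop reward.
            return state, self.estop_reward, True
        return state, reward, done


def support_from_demos(demos):
    """Collect every state that appears in the observed trajectories."""
    return {s for traj in demos for s in traj}


def states_visited(env, episodes=200, horizon=50, seed=0):
    """Run a uniformly random policy and record every state it reaches."""
    rng = random.Random(seed)
    seen = set()
    for _ in range(episodes):
        seen.add(env.reset())
        for _ in range(horizon):
            state, _, done = env.step(rng.choice([-1, 1]))
            seen.add(state)
            if done:
                break
    return seen


if __name__ == "__main__":
    demos = [[5, 6, 7, 8, 9, 10]]  # one observed state trajectory (made up)
    wrapped = EStopWrapper(ChainEnv(11), support_from_demos(demos))
    print("without e-stop:", sorted(states_visited(ChainEnv(11))))
    print("with e-stop:   ", sorted(states_visited(wrapped)))
```

In this sketch the wrapped environment can never move more than one step outside the demonstrated states, so a learner spends its samples on the part of the state space the task needs. The cost is that any better route through the excluded states becomes unreachable, which corresponds to the sub-optimality gap the abstract mentions.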
BibTeX
@conference{Ainsworth-2019-122659,
author = {Samuel Ainsworth and Matt Barnes and Siddhartha Srinivasa},
title = {Mo'States Mo'Problems: Emergency Stop Mechanisms from Observation},
booktitle = {Proceedings of (NeurIPS) Neural Information Processing Systems},
year = {2019},
month = {December},
pages = {15156--15166},
}