Adaptive Variance for Changing Sparse-Reward Environments
Conference Paper, Proceedings of (ICRA) International Conference on Robotics and Automation, pp. 3210 - 3216, May, 2019
Abstract
Robots that are trained to perform a task in a fixed environment often fail when facing unexpected changes to the environment due to a lack of exploration. We propose a principled way to adapt the policy for better exploration in changing sparse-reward environments. Unlike previous work that explicitly models environmental changes, we analyze the relationship between the value function and the optimal exploration for a Gaussian-parameterized policy. We show that our theory leads to an effective strategy for adjusting the variance of the policy, enabling fast adaptation to changes in a variety of sparse-reward environments.
BibTeX
@conference{Lin-2019-113049,
author = {Xingyu Lin and Pengsheng Guo and Carlos Florensa and David Held},
title = {Adaptive Variance for Changing Sparse-Reward Environments},
booktitle = {Proceedings of (ICRA) International Conference on Robotics and Automation},
year = {2019},
month = {May},
pages = {3210--3216},
}
Copyright notice: This material is presented to ensure timely dissemination of scholarly and technical work. Copyright and all rights therein are retained by authors or by other copyright holders. All persons copying this information are expected to adhere to the terms and constraints invoked by each author's copyright. These works may not be reposted without the explicit permission of the copyright holder.