Generalizing Regrasping with Supervised Policy Learning
Abstract
We present a method for learning a general regrasping behavior using supervised policy learning. First, we use reinforcement learning to learn linear regrasping policies, with a small number of parameters, for individual objects. Next, a general high-dimensional regrasping policy is learned in a supervised manner from the outputs of these individual policies. In our experiments with multiple objects, we show that learning low-dimensional policies makes reinforcement learning feasible with a small amount of data. Our experiments further indicate that the general high-dimensional policy learned with our method outperforms each linear policy on the object it was trained on. Moreover, the general policy generalizes to a novel object that was not present during training.
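To make the two-stage pipeline in the abstract concrete, the sketch below illustrates the general idea under stated assumptions: per-object linear policies stand in for the RL-learned low-dimensional policies (the paper learns their parameters with reinforcement learning; here they are fixed placeholder gain matrices), and a single higher-capacity policy is then fit by supervised regression on the state-action pairs those policies produce. All names, dimensions, the random stand-in for trial states, and the choice of scikit-learn's MLPRegressor are illustrative assumptions, not the paper's implementation.

import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)

STATE_DIM, ACTION_DIM = 10, 4   # assumed dimensions, not from the paper

# Stage 1 stand-in: one linear policy u = K s + k per training object.
# In the paper, these low-dimensional parameters are learned with RL.
object_policies = [
    (rng.normal(size=(ACTION_DIM, STATE_DIM)), rng.normal(size=ACTION_DIM))
    for _ in range(3)  # three hypothetical training objects
]

# Build a supervised dataset by querying each linear policy on states
# sampled as a placeholder for that object's regrasping trials.
states, actions = [], []
for K, k in object_policies:
    S = rng.normal(size=(500, STATE_DIM))  # placeholder state samples
    U = S @ K.T + k                        # linear policy outputs
    states.append(S)
    actions.append(U)
X = np.vstack(states)
y = np.vstack(actions)

# Stage 2: fit one high-dimensional policy to imitate all linear
# policies at once (supervised policy learning / distillation).
general_policy = MLPRegressor(hidden_layer_sizes=(64, 64), max_iter=2000)
general_policy.fit(X, y)

# The general policy can now be queried on states from any object,
# including states outside any single linear policy's training object.
print(general_policy.predict(X[:1]))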
BibTeX
@conference{Chebotar-2016-112197,
  author    = {Yevgen Chebotar and Karol Hausman and Oliver Kroemer and Gaurav S. Sukhatme and Stefan Schaal},
  title     = {Generalizing Regrasping with Supervised Policy Learning},
  booktitle = {Proceedings of International Symposium on Experimental Robotics (ISER '16)},
  year      = {2016},
  month     = {October},
  pages     = {622--632},
}