Carnegie Mellon University
11:00 am to 12:00 pm
NSH 4305
Abstract:
Learning for control can acquire controllers for novel task scenarios, paving the way to autonomous robots. However, typical learning approaches can be prohibitively expensive in terms of robot experiments, and policies learned in simulation do not transfer directly to hardware due to modelling inaccuracies. This motivates learning information from simulation that has a higher chance of transferring to hardware. In this thesis, we explore methods that learn from simulation to improve learning performance on actual robots.
One way to improve sample efficiency is to use parametric, expert-designed controllers. In this context, Bayesian optimization has emerged as a promising approach for automatically learning controller parameters. However, when performing Bayesian optimization on hardware for high-dimensional policies, sample efficiency can still be an issue. We develop an approach that uses simulation to map the original parameter space into a domain-informed space. During Bayesian optimization, similarity between controllers is then computed in this transformed space, so that distances on hardware are informed by behavior in simulation. Our hardware experiments on the ATRIAS robot show that these features capture important aspects of walking and accelerate learning on hardware.
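To make the idea concrete, below is a minimal sketch (not the implementation from the thesis) of one Bayesian optimization step in which the Gaussian process kernel compares controllers in a simulation-derived feature space rather than in the raw parameter space. The feature map sim_features and the projection W are hypothetical placeholders for the domain-informed transform learned from simulation.

    import numpy as np

    rng = np.random.default_rng(0)
    dim = 8                              # dimensionality of the controller parameters
    W = 0.5 * rng.normal(size=(dim, 3))  # placeholder projection "learned" in simulation

    def sim_features(theta):
        """Hypothetical transform: map controller parameters to features
        summarizing the behavior they produce in simulation."""
        return np.tanh(theta @ W)

    def rbf_kernel(A, B, length_scale=1.0):
        """Squared-exponential kernel evaluated on the transformed inputs."""
        d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
        return np.exp(-0.5 * d2 / length_scale**2)

    def gp_posterior(X, y, Xq, noise=1e-4):
        """GP posterior mean/variance, with similarity computed on features."""
        F, Fq = sim_features(X), sim_features(Xq)
        K = rbf_kernel(F, F) + noise * np.eye(len(X))
        L = np.linalg.cholesky(K)
        alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))
        mu = rbf_kernel(Fq, F) @ alpha
        v = np.linalg.solve(L, rbf_kernel(F, Fq))
        var = 1.0 - (v ** 2).sum(0)
        return mu, np.maximum(var, 1e-12)

    # One optimization step: parameters already tested on hardware (X) and
    # their measured walking performance (y, stand-in values) drive the
    # choice of the next controller to try via an upper confidence bound.
    X = rng.uniform(-1.0, 1.0, (5, dim))
    y = rng.normal(size=5)
    candidates = rng.uniform(-1.0, 1.0, (256, dim))
    mu, var = gp_posterior(X, y, candidates)
    next_theta = candidates[np.argmax(mu + 2.0 * np.sqrt(var))]

Because the kernel operates on simulated behavior rather than raw parameters, two controllers that walk similarly are treated as close even if their parameter vectors differ, which is what lets a few hardware trials generalize further.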
An alternative to directly optimizing policies on hardware is to learn robust policies in simulation that transfer to hardware. We study the effect of different policy structures on the robustness of high-dimensional neural network policies. Our experiments on the ATRIAS robot show that neural network policies with an expert-designed structure transfer from simulation to hardware at a higher rate than unstructured policies.
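As an illustration of what such structure might look like (an assumption for exposition, not the specific ATRIAS controller), the sketch below contrasts an unstructured policy, whose network outputs motor torques directly, with a structured one, whose network only chooses setpoints for an expert-designed PD feedback law; restricting the network's role in this way limits the policy to physically reasonable behaviors and can make it less sensitive to sim-to-real discrepancies.

    import numpy as np

    def mlp(x, weights):
        """Tiny fully connected network with tanh hidden layers."""
        for Wl, b in weights[:-1]:
            x = np.tanh(x @ Wl + b)
        Wl, b = weights[-1]
        return x @ Wl + b

    def unstructured_policy(state, weights):
        """The network output is used directly as motor torques."""
        return mlp(state, weights)

    def structured_policy(state, weights, kp=50.0, kd=2.0):
        """The network only outputs desired joint positions; an
        expert-designed PD law turns them into torques."""
        q, qd = np.split(state, 2)         # joint positions / velocities
        q_des = mlp(state, weights)        # network chooses setpoints only
        return kp * (q_des - q) - kd * qd  # expert feedback does the rest

    # Example call with random weights (illustrative values only).
    rng = np.random.default_rng(0)
    n_joints = 4
    state = rng.normal(size=2 * n_joints)
    weights = [(0.1 * rng.normal(size=(2 * n_joints, 16)), np.zeros(16)),
               (0.1 * rng.normal(size=(16, n_joints)), np.zeros(n_joints))]
    torques = structured_policy(state, weights)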
Thesis Committee:
Christopher G. Atkeson, Chair
Hartmut Geyer
Oliver Kroemer
Stefan Schaal, University of Southern California