Provably Efficient Imitation Learning from Observation Alone

Wen Sun, Anirudh Vemula, Byron Boots, and J. Andrew Bagnell

Conference Paper, Proceedings of (ICML) International Conference on Machine Learning, pp. 6036-6045, June 2019

Abstract

We study Imitation Learning (IL) from Observations alone (ILFO) in large-scale MDPs. While most IL algorithms rely on an expert to directly provide actions to the learner, in this setting the expert only supplies sequences of observations. We design a new model-free algorithm for ILFO, Forward Adversarial Imitation Learning (FAIL), which learns a sequence of time-dependent policies by minimizing an Integral Probability Metric between the observation distributions of the expert policy and the learner. FAIL is the first provably efficient algorithm in the ILFO setting: it learns a near-optimal policy with a number of samples that is polynomial in all relevant parameters but independent of the number of unique observations. The resulting theory extends the domain of provably sample-efficient learning algorithms beyond existing results, which typically only consider tabular reinforcement learning settings or settings that require access to a near-optimal reset distribution. We also investigate an extension of FAIL to a model-based setting. Finally, we demonstrate the efficacy of FAIL on multiple OpenAI Gym control tasks.
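
The Integral Probability Metric mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `empirical_ipm`, the three-function discriminator class, and the one-dimensional Gaussian "observations" are all assumptions chosen for illustration. The sketch estimates the largest gap in mean discriminator value between learner and expert observation samples, which is the quantity an algorithm in this family would try to drive down by adjusting the learner's policy.

```python
import math
import random


def empirical_ipm(learner_obs, expert_obs, discriminators):
    """Empirical IPM over a finite discriminator class (illustrative only).

    Returns max over f of |mean f(learner_obs) - mean f(expert_obs)|.
    """
    def mean(values):
        return sum(values) / len(values)

    return max(
        abs(mean([f(x) for x in learner_obs]) - mean([f(x) for x in expert_obs]))
        for f in discriminators
    )


# Hypothetical discriminator class of bounded scalar functions (an assumption).
discriminators = [math.tanh, math.sin, lambda x: max(-1.0, min(1.0, x))]

# Synthetic one-dimensional observations standing in for expert and learner data.
rng = random.Random(0)
expert_obs = [rng.gauss(1.0, 1.0) for _ in range(5000)]
matched_obs = [rng.gauss(1.0, 1.0) for _ in range(5000)]   # learner close to expert
shifted_obs = [rng.gauss(-1.0, 1.0) for _ in range(5000)]  # learner far from expert

ipm_matched = empirical_ipm(matched_obs, expert_obs, discriminators)
ipm_shifted = empirical_ipm(shifted_obs, expert_obs, discriminators)
print(f"matched learner: {ipm_matched:.3f}, shifted learner: {ipm_shifted:.3f}")
```

A learner whose observations match the expert's yields an estimate near zero, while a mismatched learner yields a clearly larger value. Note that only observations enter the computation; no expert actions are needed, which is the defining feature of the ILFO setting.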

BibTeX

@conference{Sun-2019-118906,
author = {Wen Sun and Anirudh Vemula and Byron Boots and J. Andrew Bagnell},
title = {Provably Efficient Imitation Learning from Observation Alone},
booktitle = {Proceedings of (ICML) International Conference on Machine Learning},
year = {2019},
month = {June},
pages = {6036--6045},
}