Directed-Info GAIL: Learning Hierarchical Policies from Unsegmented Demonstrations using Directed Information

Mohit Sharma, Arjun Sharma, Nicholas Rhinehart, and Kris M. Kitani

Conference Paper, Proceedings of the International Conference on Learning Representations (ICLR), May 2019

Abstract

The use of imitation learning to learn a single policy for a complex task that has multiple modes or hierarchical structure can be challenging. In fact, previous work has shown that when the modes are known, learning separate policies for each mode or sub-task can greatly improve the performance of imitation learning. In this work, we discover the interaction between sub-tasks from their resulting state-action trajectory sequences using a directed graphical model. We propose a new algorithm based on the generative adversarial imitation learning framework which automatically learns sub-task policies from unsegmented demonstrations. Our approach maximizes the directed information flow in the graphical model between sub-task latent variables and their generated trajectories. We also show how our approach connects with the existing Options framework, which is commonly used to learn hierarchical policies.

BibTeX

@conference{Sharma-2019-117961,
author = {Mohit Sharma and Arjun Sharma and Nicholas Rhinehart and Kris M. Kitani},
title = {Directed-Info GAIL: Learning Hierarchical Policies from Unsegmented Demonstrations using Directed Information},
booktitle = {Proceedings of the International Conference on Learning Representations (ICLR)},
year = {2019},
month = {May},
}

Copyright notice: This material is presented to ensure timely dissemination of scholarly and technical work. Copyright and all rights therein are retained by authors or by other copyright holders. All persons copying this information are expected to adhere to the terms and constraints invoked by each author's copyright. These works may not be reposted without the explicit permission of the copyright holder.