Carnegie Mellon University
2:00 pm to 3:00 pm
NSH 4513
Most deep reinforcement and imitation learning methods are data-driven and do not utilize the underlying structure of the problem. While these methods have achieved great success on many challenging tasks, several key problems such as generalization, data efficiency, and compositionality remain open. Utilizing problem structure in the form of architecture design, priors, or domain knowledge may be a viable strategy for addressing some of these problems. In this thesis, we present two approaches to integrating problem structure with deep reinforcement and imitation learning methods.
In the first part of the thesis, we consider reinforcement learning problems where the parameters of the model vary with its phase as the agent learns through its interactions with the environment. We propose phase-parameterized policies and value function approximators that explicitly impose a phase structure on the policy or value space to better model such environments. Our approach improves performance and sample efficiency on trajectory optimization and locomotion problems.
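To make the idea of phase conditioning concrete, below is a minimal sketch of a policy network that takes an explicit phase input alongside the state. The network shape, layer sizes, and the cyclic sine/cosine phase encoding are illustrative assumptions, not the architecture from the thesis.

```python
import torch
import torch.nn as nn


class PhaseParameterizedPolicy(nn.Module):
    """Illustrative policy conditioned on a scalar phase in [0, 1)."""

    def __init__(self, state_dim: int, action_dim: int, hidden_dim: int = 64):
        super().__init__()
        # Input is the state plus a 2-D phase encoding (sin, cos).
        self.net = nn.Sequential(
            nn.Linear(state_dim + 2, hidden_dim),
            nn.Tanh(),
            nn.Linear(hidden_dim, action_dim),
        )

    def forward(self, state: torch.Tensor, phase: torch.Tensor) -> torch.Tensor:
        # Encode the cyclic phase so that phase 0.99 and 0.01 map to nearby
        # features; this is one plausible choice, not the thesis's encoding.
        angle = 2 * torch.pi * phase.unsqueeze(-1)
        phase_feats = torch.cat([torch.sin(angle), torch.cos(angle)], dim=-1)
        return self.net(torch.cat([state, phase_feats], dim=-1))


# Example: a batch of 4 states from a hypothetical 10-D environment,
# each observed at a different phase of the cycle.
policy = PhaseParameterizedPolicy(state_dim=10, action_dim=3)
states = torch.randn(4, 10)
phases = torch.tensor([0.0, 0.25, 0.5, 0.75])
actions = policy(states, phases)
print(actions.shape)  # torch.Size([4, 3])
```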
In the second part, we present a framework that incorporates structure into imitation learning by modeling the imitation of complex tasks or activities as a composition of simpler sub-tasks. We propose a new algorithm, Directed-Information GAIL, which leverages the idea of directed (causal) information to segment demonstrations of complex tasks into simpler sub-tasks and to learn sub-task policies that can then be composed to perform complicated activities. We experiment with both discrete and continuous state-action environments and show that our proposed approach finds meaningful sub-tasks from unsegmented trajectories.
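As a rough illustration of what segmenting a demonstration into sub-tasks looks like, the sketch below labels each step of a trajectory with a discrete latent sub-task using a learned posterior q(c_t | c_{t-1}, s_t). The discrete latent, the posterior's form, and the greedy labeling loop are assumptions for illustration only; the directed-information objective used to train such a model is not shown.

```python
import torch
import torch.nn as nn


class SubTaskPosterior(nn.Module):
    """Illustrative q(c_t | c_{t-1}, s_t): scores the current sub-task label."""

    def __init__(self, state_dim: int, num_subtasks: int, hidden_dim: int = 64):
        super().__init__()
        self.num_subtasks = num_subtasks
        self.net = nn.Sequential(
            nn.Linear(state_dim + num_subtasks, hidden_dim),
            nn.ReLU(),
            nn.Linear(hidden_dim, num_subtasks),
        )

    def forward(self, state: torch.Tensor, prev_c: torch.Tensor) -> torch.Tensor:
        return self.net(torch.cat([state, prev_c], dim=-1))


def segment(posterior: SubTaskPosterior, states: torch.Tensor) -> list[int]:
    """Greedily assign each step of an unsegmented trajectory to a sub-task."""
    c = torch.zeros(posterior.num_subtasks)  # fixed initial context
    labels = []
    for s in states:
        logits = posterior(s, c)
        k = int(logits.argmax())
        labels.append(k)
        # One-hot encode the chosen sub-task as context for the next step.
        c = torch.zeros(posterior.num_subtasks)
        c[k] = 1.0
    return labels


# Example: segment a 20-step trajectory in a hypothetical 10-D state space
# into (at most) 4 sub-tasks; with an untrained posterior the labels are
# arbitrary, but the interface mirrors the segmentation step described above.
posterior = SubTaskPosterior(state_dim=10, num_subtasks=4)
trajectory = torch.randn(20, 10)
print(segment(posterior, trajectory))
```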
Committee:
Prof. Kris M. Kitani
Prof. David Held
Nick Rhinehart