Carnegie Mellon University
1:00 pm to 2:00 pm
GHC 4405
Abstract
This thesis focuses on moment-based and kernel-based methods for applications in Robotics and Natural Language Processing. Kernel and moment-based learning leverage information about correlated data, which allows the design of compact representations and efficient learning algorithms.
We explore kernel algorithms for planning by leveraging inherently continuous properties of reproducing kernel Hilbert spaces. We introduce a kernel-based robot motion planner that uses gradient optimization in a space of smooth trajectories, namely a reproducing kernel Hilbert space. We further study kernel-based approaches in the context of prediction, for learning a generative model, and in the context of planning, for learning to interact with a controlled process.
We further explore two variants of moment-based learning: spectral techniques and anchor-based methods.
Spectral learning describes a more expressive model, which uses hidden state variables implicitly. We use it to obtain a predictive model with which we can learn to control an interactive agent, in the context of reinforcement learning.
We propose a combination of predictive representations with deep reinforcement learning to produce a recurrent network that is able to learn continuous policies under partial observability. We introduce an efficient end-to-end learning algorithm that maximizes cumulative reward while minimizing prediction error. We apply this approach to several environments with continuous observations and actions.
Anchor learning, on the other hand, represents state variables explicitly by relating states to unambiguous observations. We rely on anchor-based techniques to recover the model parameters explicitly, in particular when states have a discrete representation, as in many Natural Language tasks. This family of methods makes it easier to integrate supervised information during the learning process. We apply anchor-based algorithms to word labelling tasks in Natural Language Processing, namely semi-supervised part-of-speech tagging, where annotations are learned from a large amount of raw text and a small amount of annotated data.
Thesis Committee Members:
Geoffrey J. Gordon, Chair
Siddhartha S. Srinivasa, University of Washington
Matthew T. Mason
André F. T. Martins, Unbabel/IT, Instituto Superior Técnico
João P. Costeira, Instituto Superior Técnico
Shay B. Cohen, University of Edinburgh