Loading Events

PhD Thesis Proposal

October

26
Mon
Zita Marinho Carnegie Mellon University
Monday, October 26
10:00 am to 12:00 am
Moment-based Algorithms for Structured Prediction

Event Location: GHC 4405

Abstract: Latent variable models constitute a compact representation for complex, high dimensional data. This is useful in many applications in Robotics and Natural Language Processing (NLP). 

Latent models are however very hard to learn, in part because of the non-convexity of the likelihood function. The main difficult is associated with estimating the latent (unobserved) states, which need to be estimated indirectly by looking at correlations among observations. Maximum likelihood learning is statistically consistent but leads to intractable optimization problems. Local methods such as the Expectation Maximization (EM) algorithm present a tractable solution, but provide only local convergence guarantees, being sensitive to initialization.

We focus on an alternative school of thought, the so called Method of Moments (MoM), whose goal is to find model parameters that are in agreement with certain statistical moments of data. These methods yield (asymptotically) statistically optimal solutions, that can be computed efficiently. 

Some of these methods rely in their core on a spectral decomposition of observed statistics. These are known as spectral methods. Other approaches involve finding observations that unambiguously identify hidden states. These are known as anchor learning methods.

Moment-based algorithms are faster to compute and usually require more data to build good empirical estimates, compared with likelihood methods. In many applications, unsupervised data is inexpensive to compute, which makes moment- based algorithms a preferable choice under unsupervised settings, or even when labeled data is scarce.

 In this work we propose different forms of learning hidden state models using moment matching techniques, with large quantities of unsupervised data and little supervised data. We intend to design algorithms that are capable of handling large amounts of data in weakly supervised settings, that are flexible, and that allow the inclusion of model constraints. 

In this thesis, we propose to study moment-based learning for structured prediction tasks. We exploit the two variants of moment-based learning: spectral techniques and anchor-based methods. The former describes a more expressive model, which implicitly uses the hidden state variables. We propose to use these spectral techniques to learn controllable Predictive State Representations (PSRs) and apply them to two different domains: robotic manipulation tasks and a transition-based parser. Anchor-based methods provide a form of explicitly recovering the model parameters, which may then be used to integrate labeled data during the learning process. We propose to apply these anchor-based methods to two sequence labeling tasks: (1) semi-supervised part-of-speech tagging, where we learn from a small, annotated Twitter corpus, and (2) building weakly supervised named entity recognition system for different languages. 

Committee:Siddhartha Srinivasa, Co-chair

Geoffrey Gordon, Co-chair

Matt Mason

André Martins, Priberam Labs

Shay Cohen, University of Edinburgh

João Paulo Costeira, Instituto Superior Técnico