Semi-Supervised Learning of Sequence Models with Method of Moments
Conference Paper, Proceedings of Empirical Methods for Natural Language Processing Conference (EMNLP '16), November, 2016
Abstract
We propose a fast and scalable method for semi-supervised learning of sequence models, based on anchor words and moment matching. Our method can handle hidden Markov models with feature-based log-linear emissions. Unlike other semi-supervised methods, no decoding passes are necessary on the unlabeled data and no graph needs to be constructed—only one pass is necessary to collect moment statistics. The model parameters are estimated by solving a small quadratic program for each feature. Experiments on part-of-speech (POS) tagging for Twitter and for a low-resource language (Malagasy) show that our method can learn from very few annotated sentences.
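To illustrate the kind of "small quadratic program" the abstract mentions, here is a minimal sketch, not the authors' implementation. It assumes a simplified word-level setting rather than the feature-based log-linear emissions of the paper. Each anchor word is assumed to be tied to a single tag, so its context moment vector stands in for that tag's context distribution. Any other word's moment vector is then approximately a convex combination of the anchor rows, with weights interpreted as P(tag | word). The names `project_to_simplex`, `solve_simplex_qp`, `anchor_moments`, and `word_moment` are hypothetical, and projected gradient descent stands in for a generic QP solver.

```python
import numpy as np


def project_to_simplex(v):
    """Euclidean projection of v onto the probability simplex."""
    u = np.sort(v)[::-1]                      # sort in decreasing order
    css = np.cumsum(u) - 1.0                  # cumulative sums minus the target total
    idx = np.arange(1, len(v) + 1)
    rho = np.nonzero(u - css / idx > 0)[0][-1]
    theta = css[rho] / (rho + 1)
    return np.maximum(v - theta, 0.0)


def solve_simplex_qp(anchor_moments, word_moment, steps=2000):
    """Minimize 0.5 * ||word_moment - anchor_moments.T @ gamma||^2
    subject to gamma >= 0 and sum(gamma) == 1.

    anchor_moments: (K, D) array, one context moment vector per tag (assumed).
    word_moment:    (D,) array, the context moment vector of one word.
    """
    num_tags = anchor_moments.shape[0]
    gram = anchor_moments @ anchor_moments.T  # (K, K) quadratic term
    lin = anchor_moments @ word_moment        # (K,) linear term
    # Step size 1/L, where L is the largest eigenvalue of the Gram matrix.
    step = 1.0 / (np.linalg.eigvalsh(gram)[-1] + 1e-12)
    gamma = np.full(num_tags, 1.0 / num_tags)
    for _ in range(steps):
        grad = gram @ gamma - lin
        gamma = project_to_simplex(gamma - step * grad)
    return gamma


# Toy data (made up for illustration): 2 tags, 3 context features.
anchor_moments = np.array([[0.7, 0.2, 0.1],
                           [0.1, 0.3, 0.6]])
true_gamma = np.array([0.25, 0.75])
word_moment = anchor_moments.T @ true_gamma   # exact mixture of the anchor rows
gamma = solve_simplex_qp(anchor_moments, word_moment)
```

In this sketch the moment vectors would be accumulated in a single counting pass over the unlabeled data, and one such problem would be solved per word. The problem size depends only on the number of tags, which is why each program stays small.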
BibTeX
@conference{Marinho-2016-5622,
author = {Zita Alexandra Magalhaes Marinho and Andre F. T. Martins and Shay B. Cohen and Noah A. Smith},
title = {Semi-Supervised Learning of Sequence Models with Method of Moments},
booktitle = {Proceedings of Empirical Methods for Natural Language Processing Conference (EMNLP '16)},
year = {2016},
month = {November},
}