Approximate Kalman Filters for Embedding Author-Word Co-occurrence Data over Time
Abstract
We address the problem of embedding entities into Euclidean space over time based on co-occurrence data. We extend the CODE model of Globerson et al. (2004) to a dynamic setting. This leads to a non-standard factored state space model with real-valued hidden parent nodes and discrete observation nodes. We investigate variational approximations to the observation model that allow us to formulate the entire dynamic model as a Kalman filter. Applying this model to temporal co-occurrence data yields posterior distributions over entity coordinates in Euclidean space that are updated over time. Initial results on per-year co-occurrences of authors and words in the NIPS corpus and on synthetic data, including videos of dynamic embeddings, suggest that the model produces embeddings of co-occurrence data that are meaningful both temporally and contextually.
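To make the filtering idea concrete, the following is a minimal sketch of one Kalman predict/update cycle over an entity's coordinates. It is not the paper's algorithm: it assumes a random-walk transition and a generic linear-Gaussian observation in place of the variationally approximated discrete observation model, and the names `kalman_step`, `H`, `Q`, and `R` are hypothetical choices made for illustration.

```python
import numpy as np

def kalman_step(mean, cov, obs, H, Q, R):
    """One predict/update cycle, assuming a random-walk state x_t = x_{t-1} + noise."""
    # Predict: identity transition, so the mean is unchanged and uncertainty grows by Q.
    pred_mean = mean
    pred_cov = cov + Q
    # Update: fold in an (assumed Gaussian) observation obs = H x + noise with covariance R.
    S = H @ pred_cov @ H.T + R                # innovation covariance
    K = pred_cov @ H.T @ np.linalg.inv(S)     # Kalman gain
    new_mean = pred_mean + K @ (obs - H @ pred_mean)
    new_cov = (np.eye(len(mean)) - K @ H) @ pred_cov
    return new_mean, new_cov

# Hypothetical data: one entity with 2-D coordinates, observed directly with noise.
mean = np.zeros(2)
cov = np.eye(2)
H = np.eye(2)
Q = 0.1 * np.eye(2)   # assumed drift between time steps
R = 0.5 * np.eye(2)   # assumed observation noise
for obs in [np.array([1.0, 0.0]), np.array([1.2, 0.1]), np.array([1.1, -0.1])]:
    mean, cov = kalman_step(mean, cov, obs, H, Q, R)
```

After each step, `mean` and `cov` describe a Gaussian posterior over the entity's coordinates, which is the kind of quantity the abstract describes as being updated over time.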
BibTeX
@workshop{Sarkar-2006-17010,
author = {Purnamrita Sarkar and Sajid Siddiqi and Geoffrey Gordon},
title = {Approximate Kalman Filters for Embedding Author-Word Co-occurrence Data over Time},
booktitle = {Proceedings of ICML '06 Workshop on Statistical Network Analysis},
year = {2006},
month = {June},
}