2:00 pm to 12:00 am
Event Location: Newell Simon Hall 1305
Abstract: In numerous real-world applications, from sensor networks to patient monitoring to intelligent buildings, probabilistic inference is necessary to draw conclusions about the system in question in the face of uncertainty. The key problem in all of these settings is to compute the probability distribution over some random variables of interest (the query) given the known values of other random variables (the evidence). Probabilistic graphical models (PGMs) have become the approach of choice for representing and reasoning with probability distributions. This thesis proposes algorithms for learning PGMs and for approximate inference in PGMs that aim to answer queries more accurately by exploiting information about the query variables and the evidence assignment more fully than existing approaches do.
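In standard notation (the symbols here are ours, not the abstract's): writing \mathbf{Q} for the query variables, \mathbf{E} = \mathbf{e} for the evidence, and \mathbf{X} for the remaining nuisance variables, the quantity to compute is

    P(\mathbf{Q} \mid \mathbf{E} = \mathbf{e})
      = \frac{\sum_{\mathbf{x}} P(\mathbf{Q},\, \mathbf{X} = \mathbf{x},\, \mathbf{E} = \mathbf{e})}
             {\sum_{\mathbf{q}, \mathbf{x}} P(\mathbf{Q} = \mathbf{q},\, \mathbf{X} = \mathbf{x},\, \mathbf{E} = \mathbf{e})}

and the summations over the nuisance variables are what make exact inference intractable in general.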
The contributions of this thesis fall into three categories. First, we propose a polynomial-time algorithm for learning the structure of graphical models that guarantees both the approximation quality of the resulting model and that the model admits efficient exact inference. Ours is the first efficient algorithm to provide this type of guarantee. A key theoretical insight of our approach is a tractable upper bound on the mutual information of arbitrarily large sets of random variables, which yields exponential speedups over the exact computation.
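To see why such a bound matters: the straightforward empirical estimate of mutual information between two sets of variables works over their joint state space, which grows exponentially with the set sizes. A minimal illustration of that exact computation (our sketch, not the thesis's algorithm; NumPy and all names here are our assumptions):

    import numpy as np

    def exact_mutual_information(samples, A, B):
        """Empirical mutual information I(A; B) between two sets of binary
        variables. The joint over A and B has 2**(len(A)+len(B)) states, so
        the cost and sample requirements of this estimate grow exponentially
        with the set sizes -- the blowup the thesis's upper bound avoids."""
        n = samples.shape[0]

        def empirical(cols):
            counts = {}
            for row in samples[:, cols]:
                key = tuple(row)
                counts[key] = counts.get(key, 0) + 1
            return {k: v / n for k, v in counts.items()}

        pA, pB, pAB = empirical(A), empirical(B), empirical(A + B)
        mi = 0.0
        for ab, p in pAB.items():
            a, b = ab[:len(A)], ab[len(A):]
            mi += p * np.log(p / (pA[a] * pB[b]))
        return mi

    rng = np.random.default_rng(0)
    x = rng.integers(0, 2, size=(1000, 1))
    y = x ^ (rng.random((1000, 1)) < 0.1).astype(int)  # y agrees with x ~90% of the time
    data = np.hstack([x, y, rng.integers(0, 2, size=(1000, 2))])
    print(exact_mutual_information(data, [0], [1]))  # dependent pair: clearly positive
    print(exact_mutual_information(data, [0], [2]))  # independent pair: near zero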
Second, we propose several algorithms that exploit the information available at test time about the query variables and the evidence assignment to focus existing approximate inference approaches on the query, reasoning about nuisance variables only to the extent necessary for computing the conditional distribution over the query. In particular, we propose an efficient algorithm for simplifying the original model while preserving the conditional distribution over the query as much as possible. Based on the same theoretical analysis, we also propose a modification to the family of residual loopy belief propagation algorithms that focuses inference on the query and away from the nuisance variables in a principled way.
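Residual belief propagation schedules message updates by how much each message would change; a query-focused variant additionally biases that schedule toward the query. A minimal sketch of the idea on a pairwise model (ours; the distance-based decay weight is an illustrative stand-in for the thesis's principled weighting, and all names are our assumptions):

    import heapq
    import numpy as np

    def query_weighted_residual_bp(nodes, edges, unary, pairwise, query,
                                   decay=0.5, tol=1e-6, max_updates=10000):
        """Loopy sum-product BP where each directed message's scheduling
        priority is its residual scaled by a weight that decays with graph
        distance from the message's target node to the query."""
        nbrs = {v: set() for v in nodes}
        for i, j in edges:
            nbrs[i].add(j)
            nbrs[j].add(i)

        # BFS distances to the query; decay**distance is our illustrative
        # stand-in for the thesis's query weighting.
        dist = {v: np.inf for v in nodes}
        frontier = list(query)
        for q in query:
            dist[q] = 0
        while frontier:
            v = frontier.pop(0)
            for u in nbrs[v]:
                if dist[u] == np.inf:
                    dist[u] = dist[v] + 1
                    frontier.append(u)
        weight = {v: decay ** dist[v] for v in nodes}

        def pot(i, j):  # pairwise potential oriented so rows index i
            return pairwise[(i, j)] if (i, j) in pairwise else pairwise[(j, i)].T

        msg = {(i, j): np.ones(nodes[j]) / nodes[j]
               for a, b in edges for i, j in ((a, b), (b, a))}

        def new_message(i, j):
            b = unary[i].copy()
            for k in nbrs[i]:
                if k != j:
                    b = b * msg[(k, i)]
            m = pot(i, j).T @ b
            return m / m.sum()

        heap = []
        for i, j in msg:
            r = np.abs(new_message(i, j) - msg[(i, j)]).max()
            heapq.heappush(heap, (-r * weight[j], (i, j)))

        for _ in range(max_updates):
            if not heap:
                break
            _, (i, j) = heapq.heappop(heap)
            m = new_message(i, j)
            if np.abs(m - msg[(i, j)]).max() < tol:
                continue  # stale heap entry or converged message
            msg[(i, j)] = m
            for k in nbrs[j]:  # j's other outgoing messages are now stale
                if k != i:
                    r = np.abs(new_message(j, k) - msg[(j, k)]).max()
                    heapq.heappush(heap, (-r * weight[k], (j, k)))

        beliefs = {}
        for q in query:
            b = unary[q].copy()
            for k in nbrs[q]:
                b = b * msg[(k, q)]
            beliefs[q] = b / b.sum()
        return beliefs

    # Tiny 4-node chain, query at node 0, soft evidence at node 3:
    card = {0: 2, 1: 2, 2: 2, 3: 2}
    E = [(0, 1), (1, 2), (2, 3)]
    un = {v: np.ones(2) for v in card}
    un[3] = np.array([0.9, 0.1])
    pw = {e: np.array([[2.0, 1.0], [1.0, 2.0]]) for e in E}
    print(query_weighted_residual_bp(card, E, un, pw, query={0}))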
Finally, we observe that the standard approach of learning a single densely connected probabilistic graphical model and running approximate inference at test time compounds approximation errors from both learning and inference. Instead, we propose a framework for learning the structure and parameters of query-specific tractable models at test time and running exact inference, so that approximations are needed only during learning. Although in general the query variables may have complex dependencies that no tractable model can capture well, for a particular evidence assignment many of those dependencies may not manifest themselves, and a well-approximating tractable model may exist. By learning a separate tractable query-specific model for every test point, our approach has the potential to take advantage of existing algorithms for exact inference while avoiding the representational constraints of committing to a single fixed tractable model for all evidence assignments.
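For intuition, the Chow-Liu algorithm (Chow & Liu, 1968) is a classical learner of tree-structured, and hence tractable, models; a query-specific learner in the spirit of this thesis would refit such a structure for each evidence assignment. A minimal Chow-Liu sketch (ours, as a stand-in for the thesis's test-time learner; the evidence-filtering step and all names are our assumptions):

    import numpy as np

    def pairwise_mi(s, i, j):
        """Empirical mutual information between binary columns i and j."""
        mi = 0.0
        for a in (0, 1):
            for b in (0, 1):
                pab = np.mean((s[:, i] == a) & (s[:, j] == b))
                pa, pb = np.mean(s[:, i] == a), np.mean(s[:, j] == b)
                if pab > 0:
                    mi += pab * np.log(pab / (pa * pb))
        return mi

    def chow_liu_tree(s):
        """Maximum-weight spanning tree under pairwise mutual information,
        built with Kruskal's algorithm. The resulting tree-structured model
        supports exact inference in time linear in the number of variables."""
        d = s.shape[1]
        scored = sorted(((pairwise_mi(s, i, j), i, j)
                         for i in range(d) for j in range(i + 1, d)),
                        reverse=True)
        parent = list(range(d))
        def find(v):  # union-find with path compression
            while parent[v] != v:
                parent[v] = parent[parent[v]]
                v = parent[v]
            return v
        tree = []
        for mi, i, j in scored:  # heaviest edges first
            ri, rj = find(i), find(j)
            if ri != rj:
                parent[ri] = rj
                tree.append((i, j))
        return tree

    rng = np.random.default_rng(1)
    a = rng.integers(0, 2, size=(500, 1))
    b = a ^ (rng.random((500, 1)) < 0.1).astype(int)  # b tracks a
    c = b ^ (rng.random((500, 1)) < 0.1).astype(int)  # c tracks b
    data = np.hstack([a, b, c])
    print(chow_liu_tree(data))                    # expect chain edges (0,1), (1,2)
    # A query-specific refit would use only data consistent with the
    # evidence, e.g. (hypothetically) rows where variable 2 equals 1:
    print(chow_liu_tree(data[data[:, 2] == 1]))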
Committee: Carlos Guestrin, Chair
Drew Bagnell
Eric Xing
Pedro Domingos, University of Washington