Computing Conditional Probabilities in Large Domains by Maximizing Renyi’s Quadratic Entropy
Abstract
In this dissertation we discuss methods for efficiently approximating conditional probabilities in large domains by maximizing the entropy of the distribution given a set of constraints. The constraints are constructed from conditional probabilities, typically of low order, that can be accurately computed from the training data. By appropriately choosing the constraints, maximum entropy methods can balance the tradeoffs in errors due to bias and variance. Standard maximum entropy techniques are too computationally inefficient for use in large domains in which the set of variables being conditioned upon varies. Instead of using the standard measure of entropy first proposed by Shannon, we use a measure that lies within the family of Rényi's entropies. If we allow our probability estimates to occasionally lie outside the range from 0 to 1, we can efficiently maximize Rényi's quadratic entropy relative to the constraints using a set of linear equations. We develop two algorithms, the inverse probability method and the recurrent linear network, for maximizing Rényi's quadratic entropy without bounds. The algorithms produce identical results; however, depending on the type of problem, one method may be more computationally efficient than the other. We also propose an extension to the algorithms for partially enforcing the constraints based on our confidence in them. Our algorithms are tested on several applications, including collaborative filtering, image retrieval, and language modeling.
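The reduction to a set of linear equations can be illustrated directly: maximizing Rényi's quadratic entropy H2(p) = -log Σ p_i² under linear constraints, with the 0-to-1 bounds dropped, is equivalent to finding the minimum-norm vector satisfying those constraints, which a Lagrange-multiplier argument solves in closed form. The sketch below shows this generic reduction only; the function name and toy constraints are illustrative, and it is not a reconstruction of the thesis's inverse probability method or recurrent linear network.

```python
import numpy as np

def max_renyi_quadratic_entropy(A, b):
    """Maximize H2(p) = -log(sum_i p_i**2) subject to A @ p = b,
    with no 0..1 bounds on p (as in the unbounded setting above).

    Since -log is monotone, this equals minimizing sum_i p_i**2, whose
    Lagrangian solution is the minimum-norm point on the constraint set:
    p = A^T (A A^T)^{-1} b.
    """
    A = np.asarray(A, dtype=float)
    b = np.asarray(b, dtype=float)
    # Solve (A A^T) lam = b for the Lagrange multipliers, then p = A^T lam.
    lam = np.linalg.solve(A @ A.T, b)
    return A.T @ lam

if __name__ == "__main__":
    # Hypothetical toy example: 4 outcomes, constrained by normalization
    # sum(p) = 1 and one low-order marginal P(x in {0, 1}) = 0.7.
    A = np.array([[1.0, 1.0, 1.0, 1.0],
                  [1.0, 1.0, 0.0, 0.0]])
    b = np.array([1.0, 0.7])
    p = max_renyi_quadratic_entropy(A, b)
    print(p)      # estimates may fall outside [0, 1] in general
    print(A @ p)  # reproduces the constraint values [1.0, 0.7]
```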
BibTeX
@phdthesis{Zitnick-2003-8647,
author = {Charles Zitnick},
title = {Computing Conditional Probabilities in Large Domains by Maximizing Renyi's Quadratic Entropy},
year = {2003},
month = {May},
school = {Carnegie Mellon University},
address = {Pittsburgh, PA},
number = {CMU-RI-TR-03-20},
keywords = {maximum entropy, collaborative filtering, image retrieval, language modeling, conditional probabilities},
}