Model Inference and Pattern Discovery by Minimal Representation Method - Robotics Institute Carnegie Mellon University

Jakub Segen and Arthur C. Sanderson
Tech. Report, CMU-RI-TR-81-02, Robotics Institute, Carnegie Mellon University, July, 1981

Abstract

Inference of statistical models and discovery of patterns in random data sets are problems common to many fields of investigation. In particular, in the observation and control of processes where the physical mechanisms are too complex or too poorly understood to provide a model structure a priori, the choice of model structure and model size becomes a key element in the analysis. This paper describes an unsupervised technique for the ranking of model structures and the choice of model size based on the expression [-log likelihood + model size (in bits)]. This criterion is shown to be equivalent to seeking a parsimonious representation for data, and its derivation is motivated through a Bayesian argument. Limiting properties of the criterion are discussed, along with applications to choosing the number of clusters, the dimension of a linear predictor, the degree of a polynomial approximation, and the order of a Markov chain.
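
To make the criterion concrete, the following is a minimal sketch of one of the listed applications, choosing the order of a Markov chain by minimizing [-log likelihood + model size (in bits)]. It is an illustration, not the report's implementation. The function names are hypothetical, and the cost of 0.5 * log2(n) bits per free parameter is an assumption borrowed from common minimum-description-length practice; the abstract does not say how model size is encoded.

```python
import math
import random
from collections import Counter


def description_length(seq, order, alphabet_size, start):
    """Return (-log2 likelihood) + (model size in bits) for a Markov chain.

    Symbols seq[start:] are predicted from the `order` symbols before them,
    so every candidate order is scored on the same set of symbols.
    Requires start >= order.
    """
    context_counts = Counter()
    transition_counts = Counter()
    for i in range(start, len(seq)):
        context = tuple(seq[i - order:i])  # empty tuple when order == 0
        context_counts[context] += 1
        transition_counts[(context, seq[i])] += 1

    # Negative log-likelihood in bits under maximum-likelihood
    # estimates of the transition probabilities.
    neg_log_lik = -sum(
        count * math.log2(count / context_counts[context])
        for (context, _symbol), count in transition_counts.items()
    )

    # ASSUMPTION (not from the report): each free parameter costs
    # 0.5 * log2(n) bits, where n is the number of predicted symbols.
    n = len(seq) - start
    n_params = (alphabet_size ** order) * (alphabet_size - 1)
    model_bits = n_params * 0.5 * math.log2(n)
    return neg_log_lik + model_bits


def select_order(seq, max_order, alphabet_size):
    """Score orders 0..max_order and return (best_order, scores)."""
    scores = {
        k: description_length(seq, k, alphabet_size, start=max_order)
        for k in range(max_order + 1)
    }
    best = min(scores, key=scores.get)
    return best, scores


def sample_first_order_chain(length, stay_prob, seed):
    """Generate a binary first-order chain that keeps its state with stay_prob."""
    rng = random.Random(seed)
    state, seq = 0, []
    for _ in range(length):
        seq.append(state)
        if rng.random() >= stay_prob:
            state = 1 - state
    return seq


sequence = sample_first_order_chain(length=5000, stay_prob=0.9, seed=0)
best_order, scores = select_order(sequence, max_order=4, alphabet_size=2)
for k in sorted(scores):
    print(f"order {k}: {scores[k]:.1f} bits")
print("selected order:", best_order)
```

Higher orders always fit the observed sequence at least as well, so the likelihood term alone would favor the largest model. The model-size term grows with the number of transition probabilities and offsets that gain, which is the trade-off the criterion is meant to capture.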

BibTeX

@techreport{Segen-1981-15104,
author = {Jakub Segen and Arthur C. Sanderson},
title = {Model Inference and Pattern Discovery by Minimal Representation Method},
year = {1981},
month = {July},
institution = {Carnegie Mellon University},
address = {Pittsburgh, PA},
number = {CMU-RI-TR-81-02},
}