Finding natural clusters having minimum description length - Robotics Institute Carnegie Mellon University

Finding natural clusters having minimum description length

R. S. Wallace and Takeo Kanade
Conference Paper, Proceedings of 10th International Conference on Pattern Recognition (ICPR '90), Vol. 1, pp. 438 - 442, June, 1990

Abstract

A two-step procedure that finds natural clusters in geometric point data is described. The first step computes a hierarchical cluster tree minimizing an entropy objective function. The second step recursively explores the tree for a level clustering having minimum description length. Together, these two steps find natural clusters without requiring a user to specify threshold parameters or so-called magic numbers. In particular, the method automatically determines the number of clusters in the input data. The first step exploits a new hierarchical clustering procedure called numerical iterative hierarchical clustering (NIHC). The output of NIHC is a cluster tree. The second step in the procedure searches the tree for a minimum-description-length (MDL) level clustering. The MDL formulation, equivalent to maximizing the posterior probability, is suited to the clustering problem because it defines a natural prior distribution.

BibTeX

@conference{Wallace-1990-13121,
author = {R. S. Wallace and Takeo Kanade},
title = {Finding natural clusters having minimum description length},
booktitle = {Proceedings of 10th International Conference on Pattern Recognition (ICPR '90)},
year = {1990},
month = {June},
volume = {1},
pages = {438 - 442},
}