Finding natural clusters having minimum description length
Abstract
A two-step procedure that finds natural clusters in geometric point data is described. The first step computes a hierarchical cluster tree minimizing an entropy objective function. The second step recursively explores the tree for a level clustering having minimum description length. Together, these two steps find natural clusters without requiring a user to specify threshold parameters or so-called magic numbers. In particular, the method automatically determines the number of clusters in the input data. The first step exploits a new hierarchical clustering procedure called numerical iterative hierarchical clustering (NIHC). The output of NIHC is a cluster tree. The second step in the procedure searches the tree for a minimum-description-length (MDL) level clustering. The MDL formulation, equivalent to maximizing the posterior probability, is suited to the clustering problem because it defines a natural prior distribution.
BibTeX
@conference{Wallace-1990-13121,author = {R. S. Wallace and Takeo Kanade},
title = {Finding natural clusters having minimum description length},
booktitle = {Proceedings of 10th International Conference on Pattern Recognition (ICPR '90)},
year = {1990},
month = {June},
volume = {1},
pages = {438 - 442},
}