Activate Activate Activate
contact  
Hello. Sign in to personalize your visit. New user? Register now.  

In
By author

Monthly
288 pp. per issue, 6 x 9,
illustrated
Founded: 1989
ISSN 0899-7667
E-ISSN 1530-888X
2008 ISI Impact Factor: 2.378

Neural Computation

December 2004, Vol. 16, No. 12, Pages 2483-2506
Posted Online March 13, 2006.
(doi:10.1162/0899766042321751)
© 2004 Massachusetts Institute of Technology
How Many Clusters? An Information-Theoretic Perspective

Susanne Still

Department of Physics and Lewis-Sigler Institute for Integrative Genomics, Princeton University, Princeton, NJ 08544, U.S.A.

William Bialek

Department of Physics and Lewis-Sigler Institute for Integrative Genomics, Princeton University, Princeton, NJ 08544, U.S.A.

PDF (576.196 KB) PDF Plus (602.206 KB)

Clustering provides a common means of identifying structure in complex data, and there is renewed interest in clustering as a tool for the analysis of large data sets in many fields. A natural question is how many clusters are appropriate for the description of a given system. Traditional approaches to this problem are based on either a framework in which clusters of a particular shape are assumed as a model of the system or on a two-step procedure in which a clustering criterion determines the optimal assignments for a given number of clusters and a separate criterion measures the goodness of the classification to determine the number of clusters. In a statistical mechanics approach, clustering can be seen as a trade-off between energy- and entropy-like terms, with lower temperature driving the proliferation of clusters to provide a more detailed description of the data. For finite data sets, we expect that there is a limit to the meaningful structure that can be resolved and therefore a minimum temperature beyond which we will capture sampling noise. This suggests that correcting the clustering criterion for the bias that arises due to sampling errors will allow us to find a clustering solution at a temperature that is optimal in the sense that we capture maximal meaningful structure—without having to define an external criterion for the goodness or stability of the clustering. We show that in a general information-theoretic framework, the finite size of a data set determines an optimal temperature, and we introduce a method for finding the maximal number of clusters that can be resolved from the data in the hard clustering limit.

Cited by

Xu-Lei Yang, Qing Song, Yi-Lei Wu, Ai-Ze Cao. (2009) A novel pruning approach for robust data clustering. Neural Computing and Applications 18:7, 759-768
Online publication date: 1-Nov-2009.
CrossRef
Mina Zarei, Keivan Aghababaei Samani, Gholam Reza Omidi. (2009) Complex eigenvectors of network matrices give better insight into the community structure. Journal of Statistical Mechanics: Theory and Experiment 2009:10, P10018
Online publication date: 1-Nov-2009.
CrossRef
S. Still. (2009) Information-theoretic approach to interactive learning. EPL (Europhysics Letters) 85:2, 28005
Online publication date: 1-Feb-2009.
CrossRef
Li-Fei CHEN. (2008) A Hierarchical Method for Determining the Number of Clusters. Journal of Software 19:1, 62-72
Online publication date: 30-Jul-2008.
CrossRef
Thomas A. Runkler. (2008) Wasp swarm optimization of the c-means clustering model. International Journal of Intelligent Systems 23:3, 269-285
Online publication date: 1-Apr-2008.
CrossRef
Ulrike Luxburg. (2007) A tutorial on spectral clustering. Statistics and Computing 17:4, 395-416
Online publication date: 10-Oct-2007.
CrossRef
Qing Song. (2005) A Robust Information Clustering Algorithm. Neural Computation 17:12, 2672-2698
Online publication date: 1-Dec-2005.
Abstract | PDF (306 KB) | PDF Plus (509 KB) 

Technology Partner - Atypon Systems, Inc.
  CrossRef member COUNTER member