MIT CogNet, The Brain Sciences ConnectionFrom the MIT Press, Link to Online Catalog
SPARC Communities
Subscriber : Stanford University Libraries » LOG IN

space

Powered By Google 
Advanced Search

 

Model Selection in Clustering by Uniform Convergence Bounds

 Joachim M. Buhmann and Marcus Held
  
 

Abstract:
Unsupervised learning algorithms are designed to extract structure from data samples. Reliable and robust inference requires a guarantee that extracted structures are typical for the data source, i.e., similar structures have to be infered from a second sample set of the same data source. The overfitting phenomenon in maximum entropy based annealing algorithms is exemplarily studied for a class of histogram clustering models. Bernstein's inequality for large deviations is used to determine the maximal achievable approximation quality parameterized by a minimal temperature. Monte Carlo simulations support the proposed model selection criterion by finite temperature annealing.

 
 


© 2010 The MIT Press
MIT Logo