Abstract:
Cluster analysis is a fundamental principle in exploratory
data analysis, providing the user with a description of the group
structure of given data. A key problem in this context is the
interpretation and visualization of clustering solutions in
high-dimensional or abstract data spaces. In particular, fuzzy or
probabilistic descriptions of the group structure, which are essential
for capturing inter-cluster relations, are difficult to assess by simple
inspection of the probabilistic assignment variables. We present a
novel approach for the visualization of probabilistic group
structure based on a statistical model of the object assignments
that have been observed or estimated by a probabilistic clustering
procedure. The objects or data points are embedded in a
low-dimensional Euclidean space by approximating the observed data
statistics with a Gaussian mixture model. The algorithm provides a
new approach to the visualization of the inherent structure for a
broad variety of data types, e.g., histogram data, proximity data,
and co-occurrence data. To demonstrate the power of the approach,
histograms of textured images are visualized as a large-scale data
mining application.