Abstract:
Active data clustering is a novel technique for clustering
proximity data which utilizes principles from sequential experiment
design in order to interleave data generation and data analysis.
The proposed active data sampling strategy is based on the expected
value of information, a concept rooted in statistical decision
theory. This is considered an important step towards the
analysis of large-scale data sets, because it offers a way to
overcome the inherent data sparseness of proximity data. We present
applications to unsupervised texture segmentation in computer
vision and information retrieval in document databases.