Abstract:
In many discrimination problems a large amount of data is
available, but only a small fraction of it is labeled. This provides a
strong motivation to improve or develop methods for
semi-supervised learning. In this paper, boosting is generalized
to this task within the optimization framework of MarginBoost. We
extend the margin definition to unlabeled data and develop the
gradient descent algorithm that corresponds to the resulting
margin cost function. This meta-learning scheme can be applied to
any base classifier able to benefit from unlabeled data. We
propose here to apply it to mixture models trained with an
Expectation-Maximization algorithm. Promising results are
presented on benchmarks with varying proportions of labeled data.
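
As an illustration of what a margin cost over both labeled and unlabeled data might look like, the following minimal sketch assumes a combined classifier output g(x), the usual labeled margin y * g(x), an unlabeled margin taken as |g(x)|, and an exponential cost exp(-margin). The abstract does not specify any of these choices, so they are assumptions made for the example, and all function names are hypothetical. The normalized weights are proportional to the magnitude of the cost derivative with respect to each margin, which is the quantity a gradient descent boosting step would use to emphasize examples.

```python
import math

def labeled_margin(y, g):
    # Standard margin for a labeled example with y in {-1, +1}.
    return y * g

def unlabeled_margin(g):
    # Assumed extension to unlabeled data: confidence of the prediction.
    return abs(g)

def margin_cost(labeled, unlabeled_scores):
    """Exponential margin cost (an assumed choice of cost function).

    labeled: list of (y, g(x)) pairs; unlabeled_scores: list of g(x) values.
    """
    cost = sum(math.exp(-labeled_margin(y, g)) for y, g in labeled)
    cost += sum(math.exp(-unlabeled_margin(g)) for g in unlabeled_scores)
    return cost

def example_weights(labeled, unlabeled_scores):
    """Weights proportional to |d cost / d margin|, normalized to sum to 1.

    Labeled examples come first, followed by unlabeled ones.
    """
    raw = [math.exp(-labeled_margin(y, g)) for y, g in labeled]
    raw += [math.exp(-unlabeled_margin(g)) for g in unlabeled_scores]
    total = sum(raw)
    return [w / total for w in raw]

# Usage: one misclassified and one well-classified labeled point,
# plus one uncertain and one confident unlabeled point.
weights = example_weights([(1, -1.0), (1, 1.0)], [0.0, 2.0])
```

Under these assumptions, misclassified labeled points and unlabeled points near the decision boundary receive the largest weights, so the next base classifier is pushed to focus on them.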