Abstract:
We describe a new incremental algorithm for training linear
threshold functions. Our algorithm can be viewed as an
approximation to the algorithm that repeatedly chooses the
hyperplane that classifies previously seen examples correctly with
the maximum margin. It is known that such a maximum-margin
hypothesis can be computed by minimizing the length of the weight
vector subject to a number of linear constraints. Our algorithm
works by maintaining a relatively simple relaxation of these
constraints that can be efficiently updated. We prove a mistake
bound for our algorithm that is the same as that proved for the
perceptron algorithm. Our analysis implies that the more
computationally intensive maximum-margin algorithm also satisfies
this mistake bound; this is the first worst-case performance
guarantee for this algorithm. We demonstrate that our algorithm can
also be used with kernel functions to efficiently classify in a
very high-dimensional space. We describe some experiments using our
new algorithm as a batch algorithm to recognize handwritten digits.
Its computational complexity and simplicity are similar to those of
the perceptron algorithm, but its generalization performance is much better.
Compared to Vapnik's maximum-margin classifier, our algorithm used
in batch mode is much simpler to implement and much more efficient
in computation time, but its generalization performance is not as
good.
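
For concreteness, the optimization behind the maximum-margin hypothesis, and one simple instance of the kind of relaxation referred to above, can be written as follows; the two-constraint form shown here is an illustrative assumption, and the paper's relaxation may differ in detail. Given examples $(x_1, y_1), \ldots, (x_t, y_t)$ with $y_i \in \{-1, +1\}$, the maximum-margin hypothesis solves

\[
\min_{w} \; \|w\|^2 \quad \text{subject to} \quad y_i (w \cdot x_i) \ge 1, \quad i = 1, \ldots, t,
\]

and a simple relaxation collapses the old constraints into a single constraint on the previous weight vector $w_t$:

\[
\min_{w} \; \|w\|^2 \quad \text{subject to} \quad w \cdot w_t \ge \|w_t\|^2, \quad y_t (w \cdot x_t) \ge 1.
\]

Because the solution of the relaxed problem is a linear combination of $w_t$ and $x_t$, the update can be computed in closed form in constant time per update.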
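Since the relaxed update keeps the weight vector in the span of the examples that triggered updates, it depends on the data only through inner products and can therefore be run with a kernel. The sketch below is a minimal, illustrative kernelized implementation of that idea; the class name, the RBF kernel choice, and the degeneracy tolerance are assumptions for the example, and the closed-form coefficients come from solving the two-constraint relaxation shown above, which need not match the paper's exact update.

```python
import numpy as np

def rbf_kernel(a, b, gamma=0.5):
    """Illustrative kernel choice; any positive-definite kernel works."""
    return np.exp(-gamma * np.sum((a - b) ** 2))

class KernelRelaxedMaxMargin:
    """Sketch of an incremental large-margin learner kept in kernel form.

    The weight vector is never formed explicitly; it is represented as
    w = sum_i alpha_i * x_i over the examples that triggered an update,
    so predictions need only kernel evaluations.
    """

    def __init__(self, kernel=rbf_kernel):
        self.kernel = kernel
        self.support = []   # stored examples x_i
        self.alpha = []     # their coefficients alpha_i

    def _score(self, x):
        # w . phi(x) = sum_i alpha_i * K(x_i, x)
        return sum(a * self.kernel(s, x)
                   for a, s in zip(self.alpha, self.support))

    def _norm_sq(self):
        # ||w||^2 = sum_{i,j} alpha_i * alpha_j * K(x_i, x_j)
        n = len(self.support)
        return sum(self.alpha[i] * self.alpha[j]
                   * self.kernel(self.support[i], self.support[j])
                   for i in range(n) for j in range(n))

    def predict(self, x):
        return 1 if self._score(x) >= 0 else -1

    def update(self, x, y):
        """On a margin violation, move w to the shortest vector satisfying
        w . w_old >= ||w_old||^2 and y * (w . x) >= 1 (the relaxation above)."""
        p = self._score(x)          # w_old . x
        if y * p >= 1:
            return                  # margin constraint already satisfied
        if not self.support:
            # First update: shortest w with y * (w . x) >= 1 is y * x / ||x||^2
            self.support.append(x)
            self.alpha.append(y / self.kernel(x, x))
            return
        W = self._norm_sq()         # ||w_old||^2
        X = self.kernel(x, x)       # ||x||^2 in feature space
        D = W * X - p * p           # determinant of the 2x2 system
        if D <= 1e-12:
            return                  # x (nearly) parallel to w_old; skip degenerate case
        a = (W * X - y * p) / D     # coefficient on w_old
        b = W * (y - p) / D         # coefficient on x
        self.alpha = [a * c for c in self.alpha]
        self.support.append(x)
        self.alpha.append(b)
```

In batch mode, as in the handwritten-digit experiments mentioned above, one would simply cycle update over the training set for a few passes and then use predict on new examples.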