Abstract:
Training a Support Vector Machine (SVM) requires the solution
of a very large quadratic programming (QP) problem. This paper
proposes a new algorithm for training SVMs: Sequential Minimal
Optimization, or SMO. SMO breaks the large QP problem into a series
of the smallest possible QP problems, which can be solved
analytically. SMO thus avoids numerical QP optimization in the inner
loop. Using code that exploits the sparseness of the input data
speeds up SMO even
further. Because large matrix computation is avoided, SMO scales
somewhere between linear and quadratic in the training set size for
various test problems, while a standard projected conjugate
gradient (PCG) chunking algorithm scales somewhere between linear
and cubic in the training set size. SMO's computation time is
dominated by SVM evaluation; hence, SMO is fastest for linear SVMs
and sparse data sets. For the MNIST database, SMO is as fast as PCG
chunking, while for the UCI Adult database and linear SVMs, SMO can
be more than 1000 times faster than the PCG chunking
algorithm.
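
To make the "smallest possible QP problem" concrete: the smallest
subproblem involves just two Lagrange multipliers, and its optimum has
a closed form. The Python sketch below illustrates that analytic
two-multiplier step under stated assumptions; it is not the paper's
code, and the names (smo_pair_update, the kernel values k11, k12, k22,
and the errors E1, E2) are hypothetical.

    def smo_pair_update(alpha1, alpha2, y1, y2, E1, E2,
                        k11, k12, k22, C):
        """One analytic SMO step: jointly optimize two Lagrange
        multipliers while holding all others fixed (illustrative
        sketch). E_i is the SVM output error f(x_i) - y_i; k.. are
        kernel values K(x_i, x_j); C is the box constraint."""
        # Bounds L, H keep the pair on the equality-constraint line
        # inside the box [0, C].
        if y1 != y2:
            L = max(0.0, alpha2 - alpha1)
            H = min(C, C + alpha2 - alpha1)
        else:
            L = max(0.0, alpha1 + alpha2 - C)
            H = min(C, alpha1 + alpha2)
        if L == H:
            return alpha1, alpha2  # no room to move
        eta = k11 + k22 - 2.0 * k12  # curvature along the line
        if eta <= 0.0:
            # Degenerate case; the full algorithm evaluates the
            # objective at the segment endpoints instead.
            return alpha1, alpha2
        # Unconstrained minimum along the line, clipped to [L, H].
        a2 = alpha2 + y2 * (E1 - E2) / eta
        a2 = min(H, max(L, a2))
        # The equality constraint forces a compensating change in
        # the first multiplier.
        a1 = alpha1 + y1 * y2 * (alpha2 - a2)
        return a1, a2

Because each such step is closed-form, the inner loop needs no
numerical QP solver and no large kernel matrix in memory, which is the
source of the scaling behavior claimed above.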