Abstract:
We propose the following general method for scaling learning
algorithms to arbitrarily large data sets. Consider the model
$M_{\vecn}$ learned by the algorithm using $n_i$ examples in step
$i$
($\vecn = (n_1, \ldots, n_m)$), and the model $M_\infty$ that
would be learned using infinite examples. Upper-bound the loss
$L(M_{\vecn}, M_\infty)$ between them as a function of $\vecn$,
and then minimize the algorithm's time complexity $f(\vecn)$
subject to the constraint that $L(M_\infty, M_{\vecn})$ be at
most $\epsilon$ with probability at least $1 - \delta$. We apply this
method to the EM algorithm for mixtures of Gaussians. Preliminary
experiments on a series of large data sets provide evidence of
the potential of this approach.
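The $(\epsilon, \delta)$ constraint above can be made concrete with a standard concentration bound. The sketch below (a minimal illustration, not the paper's method; the function name and the use of Hoeffding's inequality are assumptions) computes the smallest number of examples $n$ for which an empirical mean is within $\epsilon$ of its true value with probability at least $1 - \delta$:

```python
import math

def min_samples(eps, delta, value_range=1.0):
    """Smallest n such that, by Hoeffding's inequality, the empirical
    mean of n i.i.d. samples bounded in [0, value_range] deviates from
    the true mean by more than eps with probability at most delta.

    Solves value_range * sqrt(ln(2/delta) / (2n)) <= eps for n.
    """
    return math.ceil((value_range ** 2) / (2 * eps ** 2) * math.log(2 / delta))

# E.g., examples needed to estimate one bounded quantity to within
# eps = 0.01 with 95% confidence:
print(min_samples(eps=0.01, delta=0.05))  # → 18445
```

Minimizing the algorithm's time complexity $f(\vecn)$ subject to such bounds (one per estimated quantity, per step) then yields the per-step sample sizes $n_i$.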