Abstract:
Gaussian mixtures (or so-called radial basis function
networks) for density estimation provide a natural counterpart to
sigmoidal neural networks for function fitting and approximation.
In both cases, it is possible to give simple expressions for the
iterative improvement of performance as components of the network
are introduced one at a time. In particular, for mixture density
estimation we show that a $k$-component mixture estimated by maximum
likelihood (or by an iterative likelihood improvement that we
introduce) achieves log-likelihood within order $1/k$ of the
log-likelihood achievable by any convex combination.
Consequences for approximation and estimation using
Kullback-Leibler risk are also given. A Minimum Description Length
principle selects the number of components $k$ that
minimizes the risk bound.
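The greedy construction behind the $1/k$ bound adds one component at a time, re-optimizing only the new component and its mixing weight rather than refitting the whole mixture. The sketch below illustrates that idea under simplifying assumptions and is not the paper's algorithm: it fixes the component standard deviation `sigma`, searches candidate means over a grid, and optimizes the mixing weight `alpha` numerically with SciPy; `greedy_mixture` and all parameter names are hypothetical.

```python
import numpy as np
from scipy.optimize import minimize_scalar
from scipy.stats import norm


def greedy_mixture(x, k, sigma=1.0, n_grid=50):
    """Greedy one-component-at-a-time mixture fit (illustrative sketch).

    At each step the current mixture f is blended with one new Gaussian
    component phi as (1 - alpha) * f + alpha * phi, picking the component
    mean from a grid and the weight alpha by a 1-D likelihood search.
    """
    grid = np.linspace(x.min(), x.max(), n_grid)  # candidate means (assumption)
    f = None                                      # current mixture density at the data
    means, weights = [], []
    for _ in range(k):
        best = None  # (log-likelihood, mean, alpha)
        for mu in grid:
            phi = norm.pdf(x, loc=mu, scale=sigma)
            if f is None:
                # first component: the mixture is just phi itself
                ll, alpha = np.log(phi).sum(), 1.0
            else:
                # choose alpha in (0, 1) to maximize the blended log-likelihood
                res = minimize_scalar(
                    lambda a: -np.log((1 - a) * f + a * phi).sum(),
                    bounds=(1e-6, 1 - 1e-6), method="bounded")
                ll, alpha = -res.fun, res.x
            if best is None or ll > best[0]:
                best = (ll, mu, alpha)
        _, mu, alpha = best
        phi = norm.pdf(x, loc=mu, scale=sigma)
        f = phi if f is None else (1 - alpha) * f + alpha * phi
        # downweight all earlier components by (1 - alpha), append the new one
        weights = [w * (1 - alpha) for w in weights] + [alpha]
        means.append(mu)
    return np.array(means), np.array(weights)
```

For example, `greedy_mixture(x, k=4)` on bimodal data places successive components near the two modes, and each added component can only improve the training log-likelihood, mirroring the iterative improvement the abstract describes.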