Abstract:
Singularities are ubiquitous in the parameter space of
hierarchical models such as multilayer perceptrons. At
singularities, the Fisher information matrix degenerates, and the
Cramér-Rao paradigm no longer holds, implying that
classical model selection criteria such as AIC and MDL cannot be
applied. It is important to study the relation between the
generalization error and the training error at singularities. The
present paper demonstrates a method of analyzing these errors
both for the maximum likelihood estimator and the Bayesian
predictive distribution in terms of Gaussian random fields, by
using simple models.