Abstract:
The curse of dimensionality is severe when modeling
high-dimensional discrete data: the number of possible combinations
of the variables explodes exponentially. In this paper we propose a
new architecture for modeling high-dimensional data that requires
resources (parameters and computations) that grow only as the
square of the number of variables, using a multi-layer neural
network to represent the joint distribution of the variables as the
product of conditional distributions. The neural network can be
interpreted as a graphical model without hidden random variables,
but in which the conditional distributions are tied through the
hidden units. The connectivity of the neural network can be pruned
by using dependency tests between the variables. Experiments on
modeling the distribution of several discrete data sets show
statistically significant improvements over other methods such as
naive Bayes and comparable Bayesian networks, and show that
further gains can be obtained by pruning the
network.
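The core idea in the abstract, writing the joint distribution as a product of conditionals whose parameters are tied through shared hidden units, can be sketched as follows. This is a minimal illustrative toy (untrained random weights, binary variables), not the authors' exact architecture; the class name, masking scheme, and layer sizes are assumptions chosen for brevity.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

class ChainRuleModel:
    """Toy model of a joint over n binary variables as a product of
    conditionals P(x_i | x_1..x_{i-1}), with all conditionals sharing
    one hidden layer (the "tying" mentioned in the abstract).
    Weights here are random placeholders, not trained parameters."""

    def __init__(self, n_vars, n_hidden):
        self.n = n_vars
        self.W = rng.normal(scale=0.1, size=(n_hidden, n_vars))
        self.V = rng.normal(scale=0.1, size=(n_vars, n_hidden))
        self.b = np.zeros(n_hidden)
        self.c = np.zeros(n_vars)

    def cond_prob(self, x, i):
        """P(x_i = 1 | x_1..x_{i-1}): inputs with index >= i are masked
        out so the i-th conditional only sees preceding variables."""
        masked = np.where(np.arange(self.n) < i, x, 0.0)
        h = sigmoid(self.W @ masked + self.b)
        return sigmoid(self.V[i] @ h + self.c[i])

    def log_prob(self, x):
        """log P(x) = sum_i log P(x_i | x_<i), by the chain rule."""
        lp = 0.0
        for i in range(self.n):
            p = self.cond_prob(x, i)
            lp += np.log(p) if x[i] == 1 else np.log(1.0 - p)
        return lp

model = ChainRuleModel(n_vars=4, n_hidden=3)
# Each conditional is properly normalized, so the probabilities of
# all 2^4 configurations must sum to 1 (up to floating-point error).
total = sum(np.exp(model.log_prob(np.array(bits)))
            for bits in np.ndindex(*(2,) * 4))
print(total)
```

With a hidden layer whose width scales with the number of variables, the parameter count of such a model grows as the square of the number of variables, which is the resource scaling claimed in the abstract.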