Abstract:
Among the many techniques applied to automated natural
language processing, neural networks have met with varying
degrees of success. Researchers have used topologies ranging
from fully connected to very finely designed and controlled
(see, for example, Jain, 1991). How to determine the optimal
topology for a given task remains an open question.
In this poster I will report the results of empirical
experiments that use genetic algorithms to discover an optimal
network topology for a natural language task. The language
consists of over 500 sentences divided into three levels of
complexity. At the highest level of complexity are sentences such
as <mary saw that the duck swam in the river>. A sentence
is presented to the network one word at a time by activating
the input unit corresponding to its lexical value. The network
incrementally generates a description of the input and
demonstrates its understanding by correctly identifying: (1) each
word in the sentence, e.g. <swam> is a past tense movement
verb; (2) each phrase in the sentence, e.g. <the duck> is a
noun phrase; and (3) the relationships among phrases, e.g.
<mary> is the agent of <saw>. The network is trained
with 20% of the corpus, and performance is measured by testing
the network on the complete language.
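As a concrete illustration, the sketch below shows one
plausible way to present a sentence incrementally as one-hot
lexical activations driving a recurrent hidden state. The
vocabulary, layer sizes, and update rule are illustrative
assumptions, not the architecture actually used in these
experiments.

import numpy as np

vocab = ["mary", "saw", "that", "the", "duck", "swam", "in", "river"]
word_index = {w: i for i, w in enumerate(vocab)}

def one_hot(word):
    # activate the single input unit corresponding to the
    # word's lexical value
    v = np.zeros(len(vocab))
    v[word_index[word]] = 1.0
    return v

rng = np.random.default_rng(0)
hidden = np.zeros(16)                                # hidden state
W_in = rng.normal(scale=0.1, size=(16, len(vocab)))  # input-to-hidden
W_rec = rng.normal(scale=0.1, size=(16, 16))         # hidden-to-hidden

for word in "mary saw that the duck swam in the river".split():
    hidden = np.tanh(W_in @ one_hot(word) + W_rec @ hidden)
    # output layers (omitted) would read word, phrase, and role
    # labels off the hidden state at each step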
The key factor in measuring performance is generalization. A
network with too many connections will simply memorize the
training data, degrading its performance on novel data
(Caudill, 1990). My experiments confirm that a fully
connected network is not optimal for the task described above.
To increase generalization performance, some researchers have
opted to prune weights from the network after it has been
trained. Most pruning algorithms in common use delete one
weight per iteration and continue until some performance
threshold is reached. These methods assume that greedy
deletion of weights is optimal, which does not hold in all
cases.
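For comparison, here is a minimal sketch of the greedy loop
such pruning methods follow. The magnitude-based deletion rule
and the evaluate function are assumptions standing in for
whatever criterion a particular algorithm uses.

import numpy as np

def greedy_prune(weights, evaluate, threshold):
    # Greedily zero out the smallest-magnitude surviving weight
    # on each iteration; stop, and undo the last deletion, once
    # measured performance falls below `threshold`.
    w = weights.copy()
    while True:
        live = np.flatnonzero(w)
        if live.size == 0:
            return w
        victim = live[np.argmin(np.abs(w[live]))]
        saved = w[victim]
        w[victim] = 0.0
        if evaluate(w) < threshold:
            w[victim] = saved   # this deletion went too far
            return w

Each deletion is chosen locally, so a pair of weights that are
redundant only in combination may never be removed together,
one way greedy deletion can miss a better topology.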
Unlike greedy pruning algorithms, the genetic algorithm
randomly alters the number of connections and penalizes
networks according to how many connections they use. This
penalty is varied from one experiment to the next to determine
how the number of weights affects the network's ability to
generalize. The algorithm then combines the topologies of the
better-performing networks in an attempt to produce still
better ones. These experiments will yield a better measure of
how the number of connections influences generalization
performance.
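The sketch below outlines this genetic search over topologies
under stated assumptions: each genome is a binary connection
mask, train_and_score stands in for training the masked
network and measuring its generalization accuracy, and the
penalty weight lam is the quantity varied across experiments.

import random

N_CONN = 64   # possible connections in the fully connected net

def fitness(mask, train_and_score, lam):
    # generalization accuracy minus a per-connection penalty
    return train_and_score(mask) - lam * sum(mask)

def mutate(mask, rate=0.02):
    # randomly add or delete individual connections
    return [b ^ (random.random() < rate) for b in mask]

def crossover(a, b):
    # combine two parent topologies connection by connection
    return [random.choice(pair) for pair in zip(a, b)]

def evolve(train_and_score, lam, pop_size=20, generations=50):
    pop = [[random.randint(0, 1) for _ in range(N_CONN)]
           for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=lambda m: fitness(m, train_and_score, lam),
                 reverse=True)
        parents = pop[: pop_size // 2]   # keep the better networks
        children = [mutate(crossover(random.choice(parents),
                                     random.choice(parents)))
                    for _ in range(pop_size - len(parents))]
        pop = parents + children
    return pop[0]

Because mutation can both add and delete connections, the
search is not committed to any single deletion order, which is
precisely what distinguishes it from the greedy methods above.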
References:
Caudill, M. (1990). "Using Neural Nets: Diagnostic Expert
Nets, Pruning Trained Networks." AI Expert, September 1990,
p. 45.
Jain, A. (1991). "Parsing Complex Sentences with Structured
Connectionist Networks." Neural Computation, 3, 110-120.