Abstract:
In this study, the neural basis of language learning is
considered. It is proposed that the knowledge required for
language comprehension is characterized by a manifold and vector
field defined on the space of possible word sequences, and that the
computational goal of language learning is to converge on an
estimate of these structures. Furthermore, this learning may be
implemented by the predictive coding strategy of the recurrent
circuitry in the neocortex. To test this hypothesis, a simulation
of a biologically plausible neural system was performed: a
continuous-time recurrent neural network was trained
on a 10-million-word corpus of English text to predict word
sequences. Analysis revealed that the network's state space becomes
topologically organized on the basis of semantic similarity:
sentences and texts that are semantically similar are clustered in
compact neighborhoods and can be discriminated by a simple linear
function. In conclusion, a
dynamical systems approach to sentence comprehension is proposed
whereby the semantic interpretation of a sentence is identified
with an attractor in a metric state space, and learning is a
process of converging to an estimate of the geometry of this
space.
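
As a rough illustration only, and not the simulation reported here,
the sketch below shows the kind of setup the abstract describes: an
Euler-discretized continuous-time recurrent network trained on
next-word prediction, with a simple linear probe applied to its
hidden states. All names (CTRNNLanguageModel, train_step,
linear_probe), hyperparameters, and the PyTorch framing are
assumptions made for the example.

import torch
import torch.nn as nn

class CTRNNLanguageModel(nn.Module):
    # Euler-discretized continuous-time RNN:
    #   tau * dh/dt = -h + tanh(W_in x + W_rec h)
    def __init__(self, vocab_size, embed_dim=32, hidden_dim=64, dt=0.1, tau=1.0):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.w_in = nn.Linear(embed_dim, hidden_dim)
        self.w_rec = nn.Linear(hidden_dim, hidden_dim)
        self.readout = nn.Linear(hidden_dim, vocab_size)
        self.alpha = dt / tau  # Euler integration step size

    def forward(self, tokens):
        # tokens: (batch, time) integer word indices
        x = self.embed(tokens)
        h = torch.zeros(tokens.size(0), self.w_rec.out_features, device=tokens.device)
        logits, states = [], []
        for t in range(tokens.size(1)):
            h = h + self.alpha * (-h + torch.tanh(self.w_in(x[:, t]) + self.w_rec(h)))
            states.append(h)
            logits.append(self.readout(h))
        # (batch, time, vocab) next-word logits and (batch, time, hidden) state trajectory
        return torch.stack(logits, dim=1), torch.stack(states, dim=1)

def train_step(model, optimizer, tokens):
    # Next-word prediction: the state at position t is scored against the word at t+1.
    logits, _ = model(tokens[:, :-1])
    loss = nn.functional.cross_entropy(
        logits.reshape(-1, logits.size(-1)), tokens[:, 1:].reshape(-1))
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()

def linear_probe(final_states, labels, num_classes, epochs=200):
    # Fit a single linear layer to discriminate semantic categories from
    # sentence-final hidden states (detached so only the probe is trained).
    final_states = final_states.detach()
    probe = nn.Linear(final_states.size(1), num_classes)
    opt = torch.optim.Adam(probe.parameters(), lr=1e-2)
    for _ in range(epochs):
        loss = nn.functional.cross_entropy(probe(final_states), labels)
        opt.zero_grad()
        loss.backward()
        opt.step()
    return (probe(final_states).argmax(dim=1) == labels).float().mean().item()

In this sketch, the probe's accuracy on sentence-final hidden states
is one way to quantify whether semantically similar sentences occupy
linearly separable regions of the network's state space.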