Abstract:
This paper presents a novel approach to the unsupervised
learning of syntactic analyses of natural language text. Most
previous work has focused on maximizing likelihood according to
generative PCFG models. In contrast, we employ a simpler
probabilistic model over trees based directly on constituent
identity and linear context, and use an EM-like iterative
procedure to induce structure. This method produces much
higher-quality analyses, giving the best published results on the ATIS
dataset.