Computational Linguistics

Paola Merlo, Editor
June 2005, Vol. 31, No. 2, Pages 173-185
(doi: 10.1162/0891201054223986)
© 2005 Association for Computational Linguistics
A General Technique to Train Language Models on Language Models

We show that under certain conditions, a language model can be trained on the basis of a second language model. The main instance of the technique trains a finite automaton on the basis of a probabilistic context-free grammar, such that the Kullback-Leibler distance between grammar and trained automaton is provably minimal. This is a substantial generalization of an existing algorithm to train an n-gram model on the basis of a probabilistic context-free grammar.
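The paper's algorithm computes the minimizing model exactly; as a rough illustration of the n-gram instance, the bigram model that minimizes the Kullback-Leibler distance to a PCFG is the one whose probabilities match the grammar's expected relative bigram frequencies. The sketch below approximates those expectations by Monte Carlo sampling from a toy PCFG (the grammar, sampler, and sample size are illustrative assumptions, not the paper's construction, which needs no sampling):

```python
import random
from collections import defaultdict

# Toy PCFG (hypothetical example): S -> S S (0.3) | a (0.4) | b (0.3).
# Rule probabilities are subcritical, so sampled derivations terminate.
PCFG = {
    "S": [(["S", "S"], 0.3), (["a"], 0.4), (["b"], 0.3)],
}

def sample(symbol="S"):
    """Sample a terminal string (list of tokens) from the PCFG."""
    if symbol not in PCFG:          # symbols without rules are terminals
        return [symbol]
    rules = [rhs for rhs, _ in PCFG[symbol]]
    probs = [p for _, p in PCFG[symbol]]
    rhs = random.choices(rules, weights=probs)[0]
    out = []
    for sym in rhs:
        out.extend(sample(sym))
    return out

def train_bigram(n_samples=10000, seed=0):
    """Estimate the KL-minimizing bigram model by sampling: count bigrams
    (with boundary markers) over strings drawn from the PCFG, then
    normalize counts into conditional probabilities."""
    random.seed(seed)
    counts = defaultdict(lambda: defaultdict(int))
    for _ in range(n_samples):
        s = ["<s>"] + sample() + ["</s>"]
        for prev, cur in zip(s, s[1:]):
            counts[prev][cur] += 1
    model = {}
    for prev, nxt in counts.items():
        total = sum(nxt.values())
        model[prev] = {w: c / total for w, c in nxt.items()}
    return model

model = train_bigram()
```

In contrast to this sampling sketch, the technique described in the article computes the required expected frequencies directly from the grammar, so the resulting automaton is provably, not just approximately, KL-minimal.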