Paola Merlo, Editor
March 2013, Vol. 39, No. 1, Pages 87-119
(doi: 10.1162/COLI_a_00136)
Data-Driven Parsing using Probabilistic Linear Context-Free Rewriting Systems
This paper presents the first efficient implementation of a weighted deductive CYK parser for Probabilistic Linear Context-Free Rewriting Systems (PLCFRSs). LCFRS, an extension of CFG, can describe discontinuities in a straightforward way and is therefore a natural candidate to be used for data-driven parsing. To speed up parsing, we use different context-summary estimates of parse items, some of them allowing for A* parsing. We evaluate our parser with grammars extracted from the German NeGra treebank. Our experiments show that data-driven LCFRS parsing is feasible and yields output of competitive quality.
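To illustrate how an LCFRS rule can describe a discontinuity, the following is a minimal sketch of combining child spans under a rule's yield function. It is not the parser presented in the paper. The names `apply_yield`, `Span`, and `Var`, the half-open span encoding, and the toy rule are assumptions made for illustration only; weights and the deductive agenda are omitted.

```python
from typing import List, Optional, Tuple

Span = Tuple[int, int]  # half-open [start, end) over token positions
Var = Tuple[int, int]   # (child index, component index within that child)


def apply_yield(
    yield_fn: List[List[Var]], children: List[List[Span]]
) -> Optional[List[Span]]:
    """Combine child spans according to a (hypothetical) yield function.

    Each argument of the yield function is a sequence of variables that must
    cover adjacent spans in the input. Returns the parent's span tuple, or
    None if some pair of consecutive variables is not adjacent.
    """
    result: List[Span] = []
    for arg in yield_fn:
        first_child, first_comp = arg[0]
        start, end = children[first_child][first_comp]
        for child, comp in arg[1:]:
            s, e = children[child][comp]
            if s != end:  # components must be contiguous in the input
                return None
            end = e
        result.append((start, end))
    return result


# Toy rule S(x1 y1 x2) -> VP(x1, x2) V(y1): the VP has two components
# that wrap around the single component of V.
rule = [[(0, 0), (1, 0), (0, 1)]]
vp = [(0, 1), (2, 4)]  # discontinuous constituent with a gap at position 1
v = [(1, 2)]           # fills the gap
parent = apply_yield(rule, [vp, v])
```

In this sketch the discontinuous VP is represented as a tuple of two spans, and the rule is applicable only when the V span exactly fills the gap between them. A CFG rule corresponds to the special case in which every item carries a single span.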