LFG Proceedings
CSLI Publications

Corpus-based learning of OT constraint rankings for large-scale LFG grammars

Martin Forst, Jonas Kuhn and Christian Rohrer

Abstract

We discuss a two-stage disambiguation technique for linguistically precise broad-coverage grammars. The first stage is a pre-filter triggered by linguistic configurations (optimality marks) that the grammar writer specifies; the second stage is a log-linear probability model trained on corpus data. This set-up is used in the Parallel Grammar (ParGram) project, which develops Lexical Functional Grammars for various languages. The present paper is the first study to explore how the pre-filter can be tuned empirically: by learning a relative ranking of the optimality marks from corpus data, identifying problematic marks, and relaxing the filter in various ways.
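
To make the two-stage set-up concrete, the following is a minimal sketch in Python, not the authors' implementation. It assumes that every optimality mark is a dispreference mark, that the marks are strictly ranked, and that each candidate analysis carries a bag of marks and a sparse feature dictionary. The mark names, feature names, weights, and function names are all hypothetical.

```python
import math
from collections import Counter

# Hypothetical ranking of optimality marks, strongest dispreference first.
RANKING = ["MarkA", "MarkB", "MarkC"]

def violation_profile(marks, ranking=RANKING):
    """Count violations per mark, ordered by rank (marks not in the ranking are ignored)."""
    counts = Counter(marks)
    return tuple(counts[m] for m in ranking)

def ot_prefilter(candidates, ranking=RANKING):
    """Stage 1: keep the candidates whose profile is lexicographically minimal."""
    best = min(violation_profile(c["marks"], ranking) for c in candidates)
    return [c for c in candidates if violation_profile(c["marks"], ranking) == best]

def loglinear_probs(candidates, weights):
    """Stage 2: p(c) = exp(w . f(c)) / Z, normalised over the given candidates."""
    scores = [
        sum(weights.get(name, 0.0) * value for name, value in c["features"].items())
        for c in candidates
    ]
    top = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - top) for s in scores]
    z = sum(exps)
    return [e / z for e in exps]

def disambiguate(candidates, weights, ranking=RANKING):
    """Run the pre-filter, then return the most probable survivor and its probability."""
    survivors = ot_prefilter(candidates, ranking)
    probs = loglinear_probs(survivors, weights)
    best_index = max(range(len(survivors)), key=probs.__getitem__)
    return survivors[best_index], probs[best_index]

# Toy data, invented purely for illustration.
candidates = [
    {"id": "a1", "marks": ["MarkB"], "features": {"f1": 1.0}},
    {"id": "a2", "marks": ["MarkA"], "features": {"f1": 3.0}},
    {"id": "a3", "marks": ["MarkB"], "features": {"f2": 1.0}},
]
weights = {"f1": 0.5, "f2": 1.5}

best, prob = disambiguate(candidates, weights)
print(best["id"], round(prob, 3))
```

In this toy example the pre-filter discards `a2` because it violates the highest-ranked mark, even though the log-linear model alone would have scored it highest. That is the kind of interaction that makes the ranking worth tuning. Under the same assumptions, one simple way to relax the filter is to pass a shorter `ranking` that omits a problematic mark, so that the mark no longer eliminates candidates before the second stage.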
