The ParGram Project: Workshop and Demo
Abstract
Proceedings of LFG02; CSLI Publications On-line
The Parallel Grammar Project (ParGram) is a long-standing consortium of researchers developing LFG grammars for written input in a number of languages. The grammars are written "in parallel" (hence the name of the project), based on shared linguistic assumptions about the nature of the grammars that are produced. The project has several goals: on the theoretical side, to test the universality of LFG theory and to examine and rectify any limitations in the theory's coverage; on the practical side, to produce resources for applications. In developing the grammars, we rely on XLE, a grammar development platform incorporating high-performance algorithms for parsing, generating, and debugging LFG grammars. Word-level analysis is performed by finite-state morphological analyzers, which function as a separate module of the grammar.
The ParGram project began in 1994 as a collaboration between NLTT/Palo Alto Research Center, the University of Stuttgart, and MLTT/Xerox Research Center Europe. Originally, grammars for three languages were developed: an English grammar at PARC, a German grammar at the University of Stuttgart, and a French grammar at XRCE. The original partners contributed to the development of the XLE platform for large grammars and applications, to solidifying the underlying grammatical assumptions and conventions used in writing the grammars, and to the integration of morphological analyzers. After the move of the French grammar to PARC in 2000, several additional partners were added to the project; currently, the project encompasses grammars for six languages. The English and French grammars are being developed at PARC, and the German grammar is being developed at the University of Stuttgart. Additionally, a Norwegian grammar is under development at the University of Bergen, a Japanese grammar at the Corporate Research Center, Fuji Xerox, Japan, and a Hindi/Urdu grammar at UMIST.
Grammars developed by the ParGram project have been incorporated into a number of other research projects. Among them are the PARTRANS project, which uses the grammars in translation; the COMET project, which explores statistical disambiguation using the Wall Street Journal corpus and the English grammar; and the TIGER project, which uses the German grammar in the semi-automatic creation of a treebank of German newspaper text.
Recommended ParGram-Specific References:
- Butt, Miriam, Helge Dyvik, Tracy Holloway King, Hiroshi Masuichi, and Christian Rohrer. 2002. The Parallel Grammar Project. In Proceedings of the COLING 2002 Workshop on Grammar Engineering and Evaluation.
- Butt, Miriam, Tracy Holloway King, Maria-Eugenia Nino, and Frederique Segond. 1999. A Grammar Writer's Cookbook. Stanford: CSLI Publications.
Selected References:
- Brants, Sabine, Stefanie Dipper, Silvia Hansen, Wolfgang Lezius and George Smith. 2002. The TIGER Treebank. In Proceedings of the Workshop on Treebanks and Linguistic Theories, Sozopol.
- Butt, Miriam and Tracy Holloway King. 2001. Non-Nominative Subjects in Urdu: A Computational Analysis. In Proceedings of the International Symposium on Non-nominative Subjects, ILCAA, Tokyo.
- Butt, Miriam, Stefanie Dipper, Anette Frank, and Tracy Holloway King. 1999. Writing Large-scale Parallel Grammars for English, French, and German. In Miriam Butt and Tracy Holloway King, editors, Proceedings of the LFG99 Conference. CSLI Publications.
- Butt, Miriam, Christian Fortmann, and Christian Rohrer. 1996. Syntactic Analyses for Parallel Grammars: Auxiliaries and Genitive NPs. In Proceedings of COLING 96, Copenhagen.
- Butt, Miriam, Maria-Eugenia Nino, and Frederique Segond. 1996. Multilingual Processing of Auxiliaries within LFG. In Proceedings of KONVENS 1996, Bielefeld. Mouton de Gruyter. 111-122.
- Dipper, Stefanie. 2000. Grammar-based Corpus Annotation. In Anne Abeillé, Thorsten Brants, and Hans Uszkoreit, editors, Proceedings of the Second Workshop on Linguistically Interpreted Corpora (LINC), Luxembourg, pp. 56-64.
- Frank, Anette, Tracy Holloway King, Jonas Kuhn, and John Maxwell. 2001. Optimality Theory style constraint ranking in large-scale LFG grammars. In Peter Sells, editor, Formal and Empirical Issues in Optimality Theory. Stanford: CSLI Publications.
- Kaplan, Ronald M. and Miriam Butt. 2002. The Morphology-Syntax Interface in LFG. In Miriam Butt and Tracy Holloway King, editors, Proceedings of the LFG02 Conference. CSLI Publications.
- Kaplan, Ronald M. and John T. Maxwell. 1996. LFG grammar writer's workbench. ftp://ftp.parc.xerox.com/pub/lfg/lfgmanual.ps
- King, Tracy Holloway, Stefanie Dipper, Anette Frank, Jonas Kuhn, and John Maxwell. 2000. Ambiguity Management in Grammar Writing. In E. Hinrichs, D. Meurers, and S. Wintner, editors, Proceedings of the Workshop on Linguistic Theory and Grammar Implementation, ESSLLI-2000, Birmingham, UK, pp. 5-19.
- Kuhn, Jonas. 1998. Towards data-intensive testing of a broad-coverage LFG grammar. In Proceedings of KONVENS 98, Bonn. Peter Lang. 43-56.
- Kuhn, Jonas. 2000. Processing Optimality-theoretic Syntax by Interleaved Chart Parsing and Generation. In Proceedings of the 38th Annual Meeting of the Association for Computational Linguistics (ACL-2000), Hong Kong, pp. 360-367.
- Zinsmeister, Heike, Jonas Kuhn, and Stefanie Dipper. 2002. Utilizing LFG Parses for Treebank Annotation. In Miriam Butt and Tracy Holloway King, editors, Proceedings of the LFG02 Conference. CSLI Publications.