Transition-Based Techniques for Non-Projective Dependency Parsing

Main Article Content

Marco Kuhlmann
Joakim Nivre

Abstract

We present an empirical evaluation of three methods for the treatment of non-projective structures in transition-based dependency parsing: pseudo-projective parsing, non-adjacent arc transitions, and online reordering. We compare both the theoretical coverage and the empirical performance of these methods using data from Czech, English and German. The results show that although online reordering is the only method with complete theoretical coverage, all three techniques exhibit high precision but somewhat lower recall on non-projective dependencies and can all improve overall parsing accuracy provided that non-projective dependencies are frequent enough. We also find that the use of non-adjacent arc transitions may lead to a drop in accuracy on projective dependencies in the presence of long-distance non-projective dependencies, an effect that is not found for the two other techniques.

Article Details

Section
Articles

References

Attardi, Giuseppe. 2006. Experiments with a multilanguage non-projective dependency parser. In Proceedings of the Tenth Conference on Computational Natural Language Learning (CoNLL), pages 166–170.

Brants, Sabine, Stefanie Dipper, Silvia Hansen,Wolfgang Lezius, and George Smith. 2002. TIGER treebank. In Proceeedings of the First Workshop on Treebanks and Linguistic Theories, pages 24–42.

Briscoe, Edward and John Carroll. 1993. Generalised probabilistic LR parsing of natural language (corpora) with unification-based grammars. Computational Linguistics 19:25–59.

Buchholz, Sabine and Erwin Marsi. 2006. CoNLL-X shared task on multilingual dependency parsing. In Proceedings of the Tenth Conference on Computational Natural Language Learning (CoNLL), pages 149–164.

Chang, Chih-Chung and Chih-Jen Lin. 2001. LIBSVM: A Library for Support Vector Machines. Software available at http://www.csie.ntu.edu.tw/~cjlin/libsvm.

Eryigit, Gülsen, Joakim Nivre, and Kemal Oflazer. 2008. Dependency parsing of Turkish. Computational Linguistics 34.

Foth, Kilian, Michael Daum, and Wolfgang Menzel. 2004. A broad-coverage parser for German based on defeasible constraints. In Proceedings of KONVENS 2004 , pages 45–52.

Hajic, Jan, Eva Hajicová, Petr Pajas, Jarmila Panevová, and Petr Sgall. 2001. Prague Dependency Treebank 1.0. Linguistic Data Consortium, 2001T10.

Hajic, Jan, Jarmila Panevová, Eva Hajicová, Petr Sgall, Petr Pajas, Jan Štepánek, Jirí Havelka, and Marie Mikulová. 2006. Prague Dependency Treebank 2.0. Linguistic Data Consortium, 2006T01.

Hajic, Jan, Massimiliano Ciaramita, Richard Johansson, Daisuke Kawahara, Maria Antònia Martí, Lluís Màrquez, Adam Meyers, Joakim Nivre, Sebastian Padó, Jan Štepánek, Pavel Stranák, Mihai Surdeanu, Nianwen Xue, and Yi Zhang. 2009. The CoNLL-2009 shared task: Syntactic and semantic dependencies in multiple languages. In Proceedings of the Thirteenth Conference on Computational Natural Language Learning (CoNLL 2009): Shared Task, pages 1–18.

Kudo, Taku and Yuji Matsumoto. 2002. Japanese dependency analysis using cascaded chunking. In Proceedings of the 6th Conference on Computational Language Learning (CoNLL), pages 63–69.

Marcus, Mitchell P., Beatrice Santorini, and Mary Ann Marcinkiewicz. 1993. Building a large annotated corpus of english: The Penn Treebank. Computational Linguistics 19(2):313–330.

Martins, Andre, Noah Smith, and Eric Xing. 2009. Concise integer linear programming formulations for dependency parsing. In Proceedings of the 47th Annual Meeting of the ACL and the Fourth International Joint Conference on Natural Language Processing of the AFNLP, pages 342–350.

McDonald, Ryan and Fernando Pereira. 2006. Online learning of approximate dependency parsing algorithms. In Proceedings of the Eleventh Conference of the European Chapter of the Association for Computational Linguistics (EACL), pages 81–88. Trento, Italy.

McDonald, Ryan, Fernando Pereira, Kiril Ribarov, and Jan Hajic. 2005. Non-projective dependency parsing using spanning tree algorithms. In Proceedings of the Human Language Technology Conference (HLT) and the Conference on Empirical Methods in Natural Language Processing (EMNLP), pages 523–530. Vancouver, Canada.

Nakagawa, Tetsuji. 2007. Multilingual dependency parsing using global features. In Proceedings of the CoNLL Shared Task Session of EMNLP-CoNLL 2007, pages 952–956.

Nivre, Joakim. 2008. Algorithms for deterministic incremental dependency parsing. Computational Linguistics 34(4):513–553.

Nivre, Joakim. 2009. Non-projective dependency parsing in expected linear time. In Proceedings of the 47th Annual Meeting of the ACL and the Fourth International Joint Conference on Natural Language Processing of the AFNLP, pages 351–359.

Nivre, Joakim, Johan Hall, Sandra Kübler, Ryan McDonald, Jens Nilsson, Sebastian Riedel, and Deniz Yuret. 2007. The CoNLL 2007 shared task on dependency parsing. In Proceedings of the CoNLL Shared Task Session of EMNLP-CoNLL 2007 , pages 915–932.

Nivre, Joakim, Johan Hall, and Jens Nilsson. 2004. Memory-based dependency parsing. In Proceedings of the Eighth Conference on Computational Natural Language Learning, pages 49–56.

Nivre, Joakim, Johan Hall, Jens Nilsson, Gülsen Eryigit, and Svetoslav Marinov. 2006. Labeled pseudo-projective dependency parsing with support vector machines. In Proceedings of the Tenth Conference on Computational Natural Language Learning (CoNLL), pages 221–225.

Nivre, Joakim, Marco Kuhlmann, and Johan Hall. 2009. An improved oracle for dependency parsing with online reordering. In Proceedings of the Eleventh International Conference on Parsing Technologies, pages 73–76. Paris, France.

Nivre, Joakim and Jens Nilsson. 2005. Pseudo-projective dependency parsing. In Proceedings of the 43rd Annual Meeting of the Association for Computational Linguistics (ACL), pages 99–106. Ann Arbor, USA.

Ratnaparkhi, Adwait. 1997. A linear observed time statistical parser based on maximum entropy models. In Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP), pages 1–10.

Sagae, Kenji and Jun’ichi Tsujii. 2008. Shift-reduce dependency DAG parsing. In Proceedings of the 46th Annual Meeting of the Association for Computational Linguistics (ACL), pages 753–760.

Schneider, Gerold. 2008. Hybrid Long-Distance Functional Dependency Parsing. Ph.D. thesis, Universität Zürich, Zürich, Switzerland.

Titov, Ivan and James Henderson. 2007. A latent variable model for generative dependency parsing. In Proceedings of the 10th International Conference on Parsing Technologies (IWPT), pages 144–155.

Yamada, Hiroyasu and Yuji Matsumoto. 2003. Statistical dependency analysis with support vector machines. In Proceedings of the Eighth International Workshop on Parsing Technologies (IWPT), pages 195–206.

Zhang, Yue and Stephen Clark. 2008. A tale of two parsers: Investigating and combining graph-based and transition-based dependency parsing. In Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP), pages 562–571.