Dependency Parsing Task

Task description

The task will be organized into two subtasks:

  1. Dependency Parsing for Information Extraction (DPIE): the main task, focusing on standard dependency parsing of Italian texts, with evaluation tracks that test both the performance of parsing systems and their suitability for Information Extraction tasks;
  2. Cross-language Dependency Parsing (CLAP): a pilot task focusing on cross-lingual transfer parsing. Following the experiments described in McDonald et al. (2013), participants in this subtask are asked to apply parsers trained on the “Italian Stanford Dependency Treebank” (universal version) to test sets in other languages, which are not necessarily typologically related to Italian.

Source data

The data sets for the two subtasks will be based on a new resource derived from CoNLL-compliant releases of two dependency treebanks developed for Italian, namely the Turin University Treebank (TUT) and the ISST-TANL Treebank.
The newly developed “Italian Stanford Dependency Treebank (ISDT)” is roughly twice as large as the resources used in previous EVALITA campaigns and is standard-compliant in both its representation format (CoNLL) and its annotation scheme (Stanford Dependency Scheme).
For Italian, the data sets for both subtasks will be extracted from an updated version of the ISDT resource that is in line with the most recent developments of the Stanford Dependency scheme. ISDT is available at the following address: http://medialab.di.unipi.it/wiki/ISDT. For the other languages, the data sets will be extracted from Version 2.0 of “The Universal Dependency Treebank Project”, made available by Google at the following address: https://code.google.com/p/uni-dep-tb/.
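
As a rough illustration of what a CoNLL-style file looks like to a program, the following Python sketch reads tab-separated tokens into sentences. It assumes the ten-column CoNLL-X layout (ID, FORM, LEMMA, CPOSTAG, POSTAG, FEATS, HEAD, DEPREL, PHEAD, PDEPREL); the exact columns of the released data sets may differ. The function name read_conll, the sample sentence, and its tags are invented for this example and are not taken from ISDT.

```python
from typing import Dict, Iterator, List

# Column names of the CoNLL-X format (an assumption; the released files may differ).
CONLL_X_FIELDS = ["id", "form", "lemma", "cpostag", "postag",
                  "feats", "head", "deprel", "phead", "pdeprel"]


def read_conll(text: str) -> Iterator[List[Dict[str, str]]]:
    """Yield sentences as lists of token dicts; blank lines separate sentences."""
    sentence: List[Dict[str, str]] = []
    for line in text.splitlines():
        if not line.strip():
            # A blank line closes the current sentence, if any.
            if sentence:
                yield sentence
                sentence = []
            continue
        columns = line.split("\t")
        sentence.append(dict(zip(CONLL_X_FIELDS, columns)))
    # Emit the last sentence when the input does not end with a blank line.
    if sentence:
        yield sentence


# Invented three-token sentence with illustrative tags, not taken from ISDT.
SAMPLE = (
    "1\tIl\til\tR\tRD\t_\t2\tdet\t_\t_\n"
    "2\tgatto\tgatto\tS\tS\t_\t3\tnsubj\t_\t_\n"
    "3\tdorme\tdormire\tV\tV\t_\t0\troot\t_\t_\n"
)

sentences = list(read_conll(SAMPLE))
# Each arc is (word form, index of its head, dependency label); head "0" marks the root.
arcs = [(tok["form"], tok["head"], tok["deprel"]) for tok in sentences[0]]
print(arcs)
```

Keeping every field as a string avoids guessing at value types; a real loader would probably convert the ID and HEAD columns to integers after checking how the released files encode them.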

References

  • C. Bosco, V. Lombardo, L. Lesmo, and D. Vassallo. 2000. Building a treebank for Italian: a data-driven annotation schema. In Proceedings of LREC’00, Athens, Greece.
  • C. Bosco, S. Montemagni, A. Mazzei, V. Lombardo, F. Dell’Orletta, and A. Lenci. 2009. Evalita’09 parsing task: comparing dependency parsers and treebanks. In Proceedings of Evalita’09, Reggio Emilia, Italy.
  • C. Bosco, S. Montemagni, A. Mazzei, V. Lombardo, F. Dell’Orletta, A. Lenci, L. Lesmo, G. Attardi, M. Simi, A. Lavelli, J. Hall, J. Nilsson, and J. Nivre. 2010. Comparing the influence of different treebank annotations on dependency parsing performance. In Proceedings of LREC 2010, Malta.
  • C. Bosco, S. Montemagni, and M. Simi. 2012. Harmonization and Merging of two Italian Dependency Treebanks. Workshop on Merging of Language Resources, in Proceedings of LREC 2012, Istanbul, May 2012, pp. 23-30.
  • Ryan McDonald, Joakim Nivre, Yoav Goldberg, Yvonne Quirmbach-Brundage, Dipanjan Das, Kuzman Ganchev, Keith Hall, Slav Petrov, Hao Zhang, Oscar Tackstrom, Claudia Bedini, Nuria Bertomeu Castello, Jungmee Lee. 2013. Universal dependency annotation for multilingual parsing. In Proceedings of ACL.
  • S. Montemagni, F. Barsotti, M. Battista, N. Calzolari, O. Corazzari, A. Lenci, A. Zampolli, F. Fanciulli, M. Massetani, R. Raffaelli, R. Basili, M. T. Pazienza, D. Saracino, F. Zanzotto, N. Mana, F. Pianesi, and R. Delmonte. 2003. Building the Italian Syntactic-Semantic Treebank. In A. Abeillé, editor, Building and Using syntactically annotated corpora. Kluwer.

Organizers

  • Cristina Bosco (University of Torino)
  • Felice Dell'Orletta (Istituto di Linguistica Computazionale "Antonio Zampolli" - CNR, Pisa)
  • Simonetta Montemagni (Istituto di Linguistica Computazionale "Antonio Zampolli" - CNR, Pisa)
  • Manuela Sanguinetti (University of Torino)
  • Maria Simi (University of Pisa)

Collaborators:

  • Roberta Montefusco (University of Pisa)

Contact: evalita_dpie[at]ilc.cnr.it


Note that these pages will be updated continually, so please check them from time to time.
Go to the form to register your declaration of interest in one or more tasks.