ISDT (Italian Stanford Dependency Treebank) is a resource annotated according to the Stanford Dependencies scheme, obtained through a semi-automatic conversion of MIDT. MIDT was in turn obtained by merging two existing Italian treebanks: TUT and ISST-TANL.

The Stanford annotation scheme was adapted to the specific features of the Italian language; see [4] for a discussion.

ISDT Specifications

ISDT Resources

The corpus composition is the same as for MIDT, for a total of approximately 200,500 tokens.

ISDT Downloads

ISDT version 1.0

ISDT version 2.0 was released as part of Evalita 2014 (Evaluation of NLP and Speech Tools for Italian).

References

  1. M.C. de Marneffe and C. Manning. 2008. The Stanford typed dependencies representation. In Coling 2008: Proceedings of the Workshop on Cross-Framework and Cross-Domain Parser Evaluation, pages 1–8, Stroudsburg, PA, USA. Association for Computational Linguistics.
  2. Marie-Catherine de Marneffe and Christopher D. Manning. 2008. Stanford typed dependencies manual. September 2008, revised for the Stanford Parser v. 3.2 in June 2013.
  3. Daniel Cer, Marie-Catherine de Marneffe, Daniel Jurafsky, and Christopher D. Manning. 2010. Parsing to Stanford Dependencies: Trade-offs between speed and accuracy. In 7th International Conference on Language Resources and Evaluation (LREC 2010).
  4. Cristina Bosco, Simonetta Montemagni, and Maria Simi. 2013. Converting Italian Treebanks: Towards an Italian Stanford Dependency Treebank. In the 7th Linguistic Annotation Workshop & Interoperability with Discourse, ACL workshop, Sofia, August 2013.
  5. M. de Marneffe, M. Connor, N. Silveira, S. R. Bowman, T. Dozat, and C. D. Manning. 2013. More constructions, more genres: Extending Stanford Dependencies. In Proceedings of the Second International Conference on Dependency Linguistics (DepLing 2013), pages 187–196, Prague, August 27–30, 2013. Charles University in Prague, Matfyzpress.