Lexical Morphological Annotations

From Medialab

The Tanl POS Tagset is based on the ISST morpho-syntactic tagging, itself derived from the ILC/PAROLE tagset.

The Tanl Tagset conforms to current international standards, as it complies with the EAGLES recommendations.

The tagset consists of 28 categories, grouped into 14 base morpho-syntactic categories (Abbreviation, Adjective, Adverb, Article, Conjunction, Determiner, Interjection, Noun, Numeral, Preposition, Pronoun, Punctuation, Verb and Residual).
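The grouping of fine-grained tags under base categories can be represented as a simple lookup table. The following is a minimal sketch, not part of the Tanl tools: it covers only the tags that this page names explicitly (the full inventory of 28 tags is not listed here), and the names `BASE_CATEGORY` and `base_category` are hypothetical.

```python
# Fine-grained tag -> base category.
# Assumption: only tags spelled out on this page are included;
# the remaining tags of the 28-tag inventory are deliberately left out.
BASE_CATEGORY = {
    "B": "Adverb", "BN": "Adverb",
    "CC": "Conjunction", "CS": "Conjunction",
    "E": "Preposition", "EA": "Preposition",
    "PD": "Pronoun", "PI": "Pronoun", "PP": "Pronoun", "PE": "Pronoun",
    "PR": "Pronoun", "PQ": "Pronoun", "PC": "Pronoun",
    "FS": "Punctuation", "FC": "Punctuation",
    "FF": "Punctuation", "FB": "Punctuation",
    "VA": "Verb", "VM": "Verb", "V": "Verb",
}


def base_category(tag):
    """Return the base category of a tag, or None if the tag is not covered."""
    return BASE_CATEGORY.get(tag)
```

For example, `base_category("BN")` yields `"Adverb"`, while a tag missing from the table yields `None` rather than a guess.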

Abbreviation

Adjective

Adverb

The class of Adverb (tagged as B) includes a distinct subcategory for negation adverbs, tagged as BN.

Article

Conjunction

The category is split into:

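  1. CC: coordinating conjunctions, e.g. "i libri e i quaderni" (the books and the notebooks), "vengo ma non rimango" (I am coming but I am not staying);
  2. CS: subordinating conjunctions, e.g. "quando ho finito vengo" (when I have finished I will come), "mentre scrivevo ho finito l’inchiostro" (while I was writing I ran out of ink).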

Determiner

Interjection

Noun

Numeral

Preposition

The category is split into:

  1. E: simple prepositions
  2. EA: articulated prepositions

Pronoun

The category is split into:

  1. PD: demonstrative pronoun
  2. PI: indefinite pronoun
  3. PP: possessive pronoun
  4. PE: personal pronoun
  5. PR: relative pronoun
  6. PQ: interrogative pronoun
  7. PC: clitic pronoun

A separate category for clitic pronouns is useful because their distribution pattern differs from that of all other classes of pronouns.

Predeterminer

The category of Predeterminer (T) is used in contexts like "tutti gli studenti" (all the students), where "tutti" is tagged as T rather than DI, because the combination of a DI with an article would be anomalous.

Punctuation

  1. FS: sentence delimiter: [ . ! ? ]
  2. FC: clause delimiter: [ ; : ]
  3. FF: comma: [ , ]
  4. FB: balanced punctuation: [ ( ) “ ” ‘ ’ - - ].
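The four punctuation classes above map directly onto a character lookup. The following is a minimal sketch under the assumption that each punctuation token is a single character taken from the lists above. `PUNCT_TAGS` and `punctuation_tag` are hypothetical names, and the sketch ignores context (for instance, a period that closes an abbreviation rather than a sentence).

```python
# Tag -> characters, copied from the four classes listed above.
PUNCT_TAGS = {
    "FS": ".!?",       # sentence delimiters
    "FC": ";:",        # clause delimiters
    "FF": ",",         # comma
    "FB": "()“”‘’-",   # balanced punctuation
}

# Invert the table so that each character points to its tag.
CHAR_TO_TAG = {ch: tag for tag, chars in PUNCT_TAGS.items() for ch in chars}


def punctuation_tag(token):
    """Return the punctuation tag of a one-character token, or None."""
    return CHAR_TO_TAG.get(token)
```

A token that is not in any of the four lists returns `None`, leaving the decision to the rest of the tagger.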

Residual

This category is used for cases that cannot be handled by the other categories.

Verb

Verbs are split into the following categories:

  1. VA: auxiliary verbs (essere, avere, or venire in passive constructions);
  2. VM: modal verbs (volere, potere, dovere, solere);
  3. V: main verbs.
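Because the auxiliary and modal classes are defined by short closed lists of lemmas, a lexicon lookup can narrow down the candidate tags for a verb. The sketch below is illustrative only and is not the Tanl tagger: `candidate_verb_tags` is a hypothetical helper, and it returns candidates rather than a final tag because the choice between VA and V depends on how the verb is used in the sentence. Keeping V as a candidate for the modal lemmas is an assumption, since this page does not say how a modal lemma used as a full verb is tagged.

```python
# Closed lemma lists taken from the verb categories above.
AUXILIARY_LEMMAS = {"essere", "avere", "venire"}
MODAL_LEMMAS = {"volere", "potere", "dovere", "solere"}


def candidate_verb_tags(lemma):
    """Return the possible verb tags for a lemma, most specific first.

    The final tag must be chosen from context; this only narrows the options.
    """
    lemma = lemma.lower()
    if lemma in AUXILIARY_LEMMAS:
        return ["VA", "V"]  # auxiliary use or main-verb use
    if lemma in MODAL_LEMMAS:
        return ["VM", "V"]  # assumption: V kept as a fallback candidate
    return ["V"]
```

For a lemma outside both lists, such as "mangiare", the only candidate is V.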