German POS and Lemma

(partly in German)

Model: STTS 2.0 (Westpfahl et al.)

The guidelines can be found here: Westpfahl_Schmidt_Jonietz_Borlinghaus_STTS_2_0_2017.pdf

Decisions POS tag

This section lists data-specific decisions, together with selected cases that are already specified in STTS 2.0 and are highlighted here:

  • Following Rehbein 2013, we add the tag EMO for emoticons and emoji to the STTS 2.0 tagset
  • "F16" receives the tag NE
  • one-word greetings and farewells such as hi, hallo, tschüss are interjections (NGIRR)
  • speaker codes, anonymised street names, etc. are proper nouns (NE)
  • names that were anonymised by the speaker, e.g., "Frau XX" or "XY Straße" receive the tag XY (non-word)
  • if it is not possible to decide on a POS tag, e.g., due to unfinished utterances, the event stays empty
  • conventionalised abbreviations (e.g., "d.h.") receive the POS tag ADV (guidelines p.13)
  • "also" receives the tag SEDM or ADV depending on the context:
    • "also"/SEDM in the pre-prefield, e.g., "also/SEDM ich heiße..."
    • "also"/ADV:
      • adverbial connector, e.g., "also/ADV ging ich die Straße entlang"
      • connector signaling a specification (without verb), e.g., "...eine Familie, also/ADV Frau, Mann, Kind"
      • connector signaling a correction, e.g., "derweil ist dann ein Auto gekommen äh entgegen also entlanggekommen"
  • "wie" in "wie folgt" as KOKOM (see guidelines p.44 for other uses)
  • "als"
    • "als"/KOUS if it introduces a subordinate clause
    • "als"/KOKOM in prototypical cases such as "ich bin größer als du", here also in "ich möchte als Zeuge aussagen"
  • if "natürlich" can be replaced by "selbstverständlich", it receives the tag ADV
  • the interrogative adverbs "wo, wie, worüber, warum" can be used as interrogatives or as relative pronouns. In both cases, they receive the POS tag PWAV (STTS, p.26). Examples:
    • "auf dem Mittelstreifen, wo/PWAV der Unfall passiert ist"
    • "ich weiß nicht, wo/PWAV du bist"
    • "wo/PWAV bist du"
  • "was, welche" can appear
    • as interrogative pronouns, also in embedded contexts
      • substitutively: "Ich weiß nicht, was/PWS du gemacht hast"
      • attributively: "Welche/PWAT Farbe hat der Hut?"
    • as interrogative pronouns with a relative use after verbs of saying or perceiving (verba dicendi/sentiendi)
      • "Er erzählt, was er gesehen hat"
    • as relative pronoun (PRELS) if the antecedent is mentioned previously
      • "das Kind, welches/PRELS sich auf der anderen Seite befand"
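
The token/TAG notation used in the examples above can also be read mechanically. The following is a minimal Python sketch, assuming whitespace-separated tokens and upper-case tag names; the function name parse_tagged is hypothetical and not part of any project tooling.

```python
def parse_tagged(text):
    """Split an inline-annotated example into (token, tag) pairs.

    Assumption: tokens are separated by whitespace and a tag is an
    upper-case string after the last slash. Tokens without such a tag
    get None. Punctuation stays attached to its token.
    """
    pairs = []
    for item in text.split():
        token, sep, tag = item.rpartition("/")
        if sep and token and tag.isupper():
            pairs.append((token, tag))
        else:
            pairs.append((item, None))
    return pairs


if __name__ == "__main__":
    for token, tag in parse_tagged("wo/PWAV bist du"):
        print(token, tag)
```

Untagged tokens are returned with None rather than a guessed tag, in line with the decision that an event stays empty when no POS tag can be decided.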

Further examples

token                     POS tag
aufgrund von              ADV APPR
aufgrund (des Unfalls)    APPR
bis später                APPR ADJD
gegenüber von             ADV APPR
gegenüber dem Auto        APPR ART NN
nichts weiter             PIS PTKMWL
weder noch                KON KON
zwar                      ADV
...                       $.
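
The contrast between "aufgrund von" / "gegenüber von" (ADV APPR) and "aufgrund" / "gegenüber" before a noun phrase (APPR) depends only on the following token, so it can be sketched as a small rule. This is an illustrative sketch covering only the patterns listed above; the names VON_SENSITIVE and tag_von_sensitive are hypothetical, and other uses of these words are not handled.

```python
# Tokens that, in the examples above, are ADV before "von" and APPR otherwise.
VON_SENSITIVE = {"aufgrund", "gegenüber"}


def tag_von_sensitive(tokens):
    """Return one tag (or None) per token for the patterns listed above.

    Only "aufgrund"/"gegenüber" and a directly following "von" are tagged;
    every other token is left as None for the annotator.
    """
    tags = [None] * len(tokens)
    for i, tok in enumerate(tokens):
        if tok.lower() not in VON_SENSITIVE:
            continue
        nxt = tokens[i + 1].lower() if i + 1 < len(tokens) else None
        if nxt == "von":
            tags[i] = "ADV"
            tags[i + 1] = "APPR"
        else:
            tags[i] = "APPR"
    return tags


if __name__ == "__main__":
    print(tag_von_sensitive(["gegenüber", "dem", "Auto"]))
```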

Decisions lemma

  • lemma represents the shortest converging form
  • nominalisations are kept as they are (Verletzte, Folgendes, Fahrer, etc.), so that POS and lemma match (e.g., norm: "das Spielen", pos_lang: NN, lemma: Spielen)
  • speaker codes stay as they are
  • the lemma of merged forms of articles and prepositions is the preposition: norm:"aufm", lemma:"auf"; norm:"mitm", lemma:"mit"; norm:"zum", lemma:"zu"
  • dates are represented by @card@
  • cardinal numbers are kept on the lemma layer as they appear on the norm layer, e.g., "zwei", "16"
  • reflexive pronouns are lemmatised to their corresponding personal pronouns (e.g., "sich" becomes er|sie|es)
  • ordinal numbers stay as they are on norm layer
  • different forms of one lexeme that arise from gender and case marking are reduced to the shortest converging form (see table below); EXCEPTION: NN denoting persons keep the gender form they have on the norm layer, e.g., "Augenzeugin" and "Augenzeuge"
  • "der", "die", "das" are always reduced to "d", regardless of whether they are used as article, relative pronoun, or demonstrative pronoun
  • forms in plural get the singular form on lemma (e.g., norm: Einkäufe, lemma: Einkauf)

different forms                                             lemma
all, alle, alles, aller                                     all
andere, anderer, anderes                                    ander
eine, einer, ein                                            ein
der, die, das                                               d
diese, dieser, dieses (attributive demonstrative pronoun)   diese
dieser, dies, dieses (substitutive demonstrative pronoun)   dies
Folgendes, Folgende, Folgender                              Folgende
jener, jenes, jene                                          jene
mein, meiner, meine, meins                                  mein
weit, weiter, weitere, weiterer, weiteres                   weit
welche, welcher, welches                                    welch
vordere, vorderer, vorderes (ADJA)                          vordere
zweit, zweite, zweiter, zweites                             zweit
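
Several of the lemma decisions above are plain lookups and can be sketched in a few lines of Python. This is a minimal sketch, not the project's actual tooling: the names LEMMA_GROUPS, CONTRACTIONS, and lemmatise are hypothetical, only a subset of the table is included, and lower-casing the norm form for lookup is an assumption. POS-dependent rows ("diese" vs. "dies"), plural reduction, and dates are deliberately left out.

```python
# Lemma -> forms, copied from a subset of the table above.
LEMMA_GROUPS = {
    "all": ["all", "alle", "alles", "aller"],
    "ander": ["andere", "anderer", "anderes"],
    "ein": ["eine", "einer", "ein"],
    "d": ["der", "die", "das"],
    "jene": ["jener", "jenes", "jene"],
    "mein": ["mein", "meiner", "meine", "meins"],
    "welch": ["welche", "welcher", "welches"],
}

# Invert the groups into a form -> lemma lookup.
FORM_TO_LEMMA = {
    form: lemma for lemma, forms in LEMMA_GROUPS.items() for form in forms
}

# Merged preposition + article forms: the lemma is the preposition.
CONTRACTIONS = {"aufm": "auf", "mitm": "mit", "zum": "zu"}


def lemmatise(norm):
    """Return the lemma for a norm-layer token under the rules sketched here.

    Tokens not covered by any rule (cardinal numbers, speaker codes,
    person nouns such as "Augenzeugin") are returned unchanged.
    """
    key = norm.lower()  # assumption: lookup ignores capitalisation
    if key in CONTRACTIONS:
        return CONTRACTIONS[key]
    if key == "sich":
        return "er|sie|es"
    return FORM_TO_LEMMA.get(key, norm)


if __name__ == "__main__":
    for form in ["alles", "aufm", "sich", "16", "Augenzeugin"]:
        print(form, lemmatise(form))
```

Returning uncovered tokens unchanged keeps the sketch conservative: anything that needs POS information or morphological analysis is left for manual annotation.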