German POS and Lemma
(partly in German)
Model: STTS 2.0 (Westpfahl et al.)
The guidelines can be found here: Westpfahl_Schmidt_Jonietz_Borlinghaus_STTS_2_0_2017.pdf
Decisions POS tag
Here you find some data-specific decisions, as well as some cases that are specified in STTS 2.0 and highlighted here:
- Following Rehbein 2013, we add the tag EMO for emoticons and emoji to the STTS 2.0 tagset
- F16 as NE
- one-word greetings and farewells such as hi, hallo, tschüss are interjections (NGIRR)
- speaker-codes, anonymised streetnames, etc. are proper nouns (NE)
- names that were anonymised by the speaker, e.g., "Frau XX" or "XY Straße" receive the tag XY (non-word)
- if it is not possible to decide on a POS tag, e.g., due to unfinished utterances, the event stays empty
- conventionalised abbreviations (e.g., "d.h.") receive the POS tag ADV (guidelines p.13)
- "also" receives the tag SEDM or ADV depending on the context:
  - "also"/SEDM in the pre-prefield, e.g., "also/SEDM ich heiße..."
  - "also"/ADV as an adverbial connector, e.g., "also/ADV ging ich die Straße entlang"; as a connector signalling a specification (without a verb), e.g., "...eine Familie, also/ADV Frau, Mann, Kind"; or as a correction, e.g., "derweil ist dann ein Auto gekommen äh entgegen also entlanggekommen"
- "wie" in "wie folgt" as KOKOM (see guidelines p.44 for other uses)
- "als":
  - "als"/KOUS if it introduces a subordinate clause
  - "als"/KOKOM in prototypical cases such as "ich bin größer als du"; here also in "ich möchte als Zeuge aussagen"
- if "natürlich" can be replaced by "selbstverständlich" it receives the tag ADV
- the interrogative adverbs "wo, wie, worüber, warum" can be used as interrogatives or can serve as relative pronouns. In both cases, they get the POS tag PWAV (STTS, p. 26). Examples:
  - "auf dem Mittelstreifen, wo/PWAV der Unfall passiert ist"
  - "ich weiß nicht, wo/PWAV du bist"
  - "wo/PWAV bist du"
- "was, welche" can appear
  - as interrogative pronouns, also in embedded contexts
    - substitutively: "Ich weiß nicht, was/PWS du gemacht hast"
    - attributively: "Welche/PWAT Farbe hat der Hut?"
  - as interrogative pronouns with a relative use after verba dicendi/sentiendi
    - "Er erzählt, was er gesehen hat"
  - as relative pronouns (PRELS) if the antecedent is mentioned previously
    - "das Kind, welches/PRELS sich auf der anderen Seite befand"
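
A few of the decisions above do not depend on context (EMO for emoticons and emoji, NGIRR for one-word greetings, ADV for conventionalised abbreviations), so they could be applied automatically before manual annotation. The following is only a minimal sketch of such a pre-tagging step: the function names, the word lists, the emoticon pattern and the emoji heuristic are assumptions made for illustration and are not part of STTS 2.0 or of these guidelines. Context-dependent cases such as "also", "als" or "wie" are deliberately left untagged.

```python
import re
import unicodedata

# Hypothetical word lists: they contain only the examples named in the
# decisions above and would need to be extended for real data.
GREETINGS = {"hi", "hallo", "tschüss"}   # one-word greetings/farewells -> NGIRR
ABBREVIATIONS = {"d.h."}                 # conventionalised abbreviations -> ADV

# Rough pattern for ASCII emoticons such as :) ;-) :D (an assumption).
EMOTICON = re.compile(r"[:;=8][-^o]?[)(\]\[DPp/\\|]+")


def is_emoji(token):
    """Heuristic: the token consists of 'Symbol, other' code points,
    optionally combined with variation selectors or zero-width joiners."""
    has_symbol = any(unicodedata.category(ch) == "So" for ch in token)
    only_symbolic = all(
        unicodedata.category(ch) == "So" or ch in "\ufe0f\u200d" for ch in token
    )
    return has_symbol and only_symbolic


def pre_tag(token):
    """Return a POS tag for context-free cases, or None if the decision
    has to be made by the annotator."""
    if is_emoji(token) or EMOTICON.fullmatch(token):
        return "EMO"
    lowered = token.lower()
    if lowered in GREETINGS:
        return "NGIRR"
    if lowered in ABBREVIATIONS:
        return "ADV"
    return None
```

Returning `None` for everything else mirrors the decision that an event stays empty when no POS tag can be assigned with certainty.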
Further examples
token | POS tag |
---|---|
/aufgrund /von | /ADV /APPR |
/aufgrund (des Unfalls) | /APPR |
/bis /später | /APPR /ADJD |
/gegenüber /von | /ADV /APPR |
/gegenüber /dem /Auto | /APPR /ART /NN |
/nichts /weiter | /PIS /PTKMWL |
/weder /noch | /KON /KON |
zwar | ADV |
... | $. |
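
The contrast between "aufgrund von" (ADV APPR) and "aufgrund des Unfalls" (APPR), and likewise for "gegenüber", can be read as a rule on the following token. The sketch below covers only these two patterns from the table; the function name and the word set are hypothetical, and other uses of these words (for example, "gegenüber" standing alone) are not handled and would still need a manual decision.

```python
# Words that the table tags ADV before "von" and APPR otherwise (assumed set).
ADV_BEFORE_VON = {"aufgrund", "gegenüber"}


def tag_adv_or_appr(tokens):
    """Tag 'aufgrund'/'gegenüber' as ADV if directly followed by 'von',
    otherwise as APPR. All other tokens are left untagged (None)."""
    tags = []
    for i, token in enumerate(tokens):
        if token.lower() in ADV_BEFORE_VON:
            follows_von = i + 1 < len(tokens) and tokens[i + 1].lower() == "von"
            tags.append("ADV" if follows_von else "APPR")
        else:
            tags.append(None)
    return tags
```

For example, `tag_adv_or_appr(["gegenüber", "dem", "Auto"])` tags only the first token, as APPR, and leaves the article and noun to the regular tagging process.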
Decisions lemma:
- the lemma is the shortest converging form, so that POS and lemma match
- nominalisations are kept (Verletzte, Folgendes, Fahrer, etc.), e.g., norm: "das Spielen", pos_lang: NN, lemma: Spielen
- speaker codes stay as they are
- the lemma of merged forms of articles and prepositions is the preposition: norm:"aufm", lemma:"auf"; norm:"mitm", lemma:"mit"; norm:"zum", lemma:"zu"
- dates are represented by @card@
- cardinal numbers stay on lemma as they are on norm layer, e.g., "zwei", "16"
- reflexive pronouns are lemmatised to their corresponding personal pronouns (e.g., norm: "sich", lemma: "er|sie|es")
- ordinal numbers stay as they are on norm layer
- forms of one lexeme that differ only in gender and case marking are reduced to the shortest converging form (see table below); EXCEPTION: NN denoting persons stay in the same gender form as on the norm layer, e.g., "Augenzeugin" and "Augenzeuge"
- "der", "die", "das" are always reduced to "d", regardless of whether they are used as article, relative pronoun or demonstrative pronoun
- forms in plural get the singular form on lemma (e.g., norm: Einkäufe, lemma: Einkauf)
different forms | lemma |
---|---|
all, alle, alles, aller | all |
andere, anderer, anderes | ander |
eine, einer, ein | ein |
der, die, das | d |
diese, dieser, dieses (attributive demonstrative pronouns) | diese |
dieser, dies, dieses (substitutive demonstrative pronouns) | dies |
Folgendes, Folgende, Folgender | Folgende |
jener, jenes, jene | jene |
mein, meiner, meine, meins | mein |
weit, weiter, weitere, weiterer, weiteres | weit |
welche, welcher, welches | welch |
vordere, vorderer, vorderes (ADJA) | vordere |
zweit, zweite, zweiter, zweites | zweit |
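
Several of the lemma decisions are plain lookups and can be sketched as code. The example below is a minimal illustration under stated assumptions: the function name `lemmatise`, the selection of table rows and the fallback behaviour are hypothetical, and the date pattern (DD.MM.YYYY) is an assumed format, since the guidelines do not specify how dates are written on the norm layer. Rows whose lemma depends on the POS tag (e.g., "dieser" as "diese" or "dies") are left out.

```python
import re

# Merged preposition + article forms: the lemma is the preposition.
MERGED_FORMS = {"aufm": "auf", "mitm": "mit", "zum": "zu"}

# A subset of the table above (lowercased norm form -> lemma).
FORM_TO_LEMMA = {
    "all": "all", "alle": "all", "alles": "all", "aller": "all",
    "andere": "ander", "anderer": "ander", "anderes": "ander",
    "eine": "ein", "einer": "ein", "ein": "ein",
    "der": "d", "die": "d", "das": "d",
    "mein": "mein", "meiner": "mein", "meine": "mein", "meins": "mein",
    "welche": "welch", "welcher": "welch", "welches": "welch",
}

# Assumed date format such as 12.03.2019 or 12.03. (not given in the guidelines).
DATE = re.compile(r"\d{1,2}\.\d{1,2}\.(?:\d{4}|\d{2})?")


def lemmatise(norm):
    """Apply the lookup-based lemma decisions; return the norm form
    unchanged when no rule applies (e.g., cardinal numbers, speaker codes)."""
    if DATE.fullmatch(norm):
        return "@card@"
    lowered = norm.lower()
    if lowered in MERGED_FORMS:
        return MERGED_FORMS[lowered]
    if lowered == "sich":
        return "er|sie|es"  # reflexive pronoun -> personal pronouns
    if lowered in FORM_TO_LEMMA:
        return FORM_TO_LEMMA[lowered]
    return norm
```

Falling back to the norm form fits the decisions that cardinal numbers, ordinal numbers and speaker codes stay as they are; plural reduction and nominalisations would still require a full lemmatiser or manual annotation.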