Annotation Step 3: POS and Lemma

If you see the following error message when opening an EXMARaLDA file:

Tier ... is not stratified. Please choose a method for stratifying the tier:

choose "Stratify by deletion".

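Lemmatization maps every inflected form to a prototype, which is either a single inflected form chosen from the paradigm or the uninflected form. This prototype is the lemma. For example, the English forms "walks", "walked", and "walking" all share the lemma "walk".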

Part of speech (POS) is the baseline for many further annotations.
We therefore need correct annotations.
We will measure inter-annotator agreement.
POS tagging is closely connected to lemmatization and is therefore subject to the same restrictions and parameters of variation.
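
The guidelines do not say which agreement measure will be used. As an illustration only, the sketch below computes Cohen's kappa, a common chance-corrected measure for two annotators, over two hypothetical POS tag sequences for the same tokens. The function name and the example tags are assumptions and not part of the project's tooling.

```python
from collections import Counter


def cohens_kappa(tags_a, tags_b):
    """Cohen's kappa for two annotators' tags over the same token sequence."""
    if len(tags_a) != len(tags_b) or not tags_a:
        raise ValueError("need two non-empty tag sequences of equal length")
    n = len(tags_a)
    # Observed agreement: share of tokens that received the same tag.
    observed = sum(a == b for a, b in zip(tags_a, tags_b)) / n
    # Expected chance agreement from each annotator's tag distribution.
    freq_a = Counter(tags_a)
    freq_b = Counter(tags_b)
    expected = sum(freq_a[tag] * freq_b[tag] for tag in freq_a) / (n * n)
    if expected == 1.0:
        # Both annotators used one and the same tag throughout.
        return 1.0
    return (observed - expected) / (1 - expected)


# Hypothetical annotations of six tokens (UD POS tags).
annotator_1 = ["NOUN", "VERB", "DET", "NOUN", "INTJ", "PROPN"]
annotator_2 = ["NOUN", "VERB", "DET", "PROPN", "INTJ", "PROPN"]
kappa = cohens_kappa(annotator_1, annotator_2)
print(round(kappa, 3))
```

Kappa is 1 for perfect agreement and close to 0 when the annotators agree no more often than chance would predict.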

Always trust the guidelines more than your “grammatical intuition”, but in cases of doubt consult both.

Language-specific tagsets: you may have to find new rules for phenomena that have not been described yet (please document them!).
UD: stick strictly to the UD guidelines for your language, and do not decide by what seems more logical to you.

English	German	Greek	Russian	Turkish
British National Corpus Part of Speech Tagset	STTS 2.0	Universal POS tags	MyStem Morphology	MULTILIT

Universal Dependencies POS-tags, lemmas, and features

English	German	Greek	Russian	Turkish
correct BNC-POS, lemma, features (?)	correct STTS-POS, lemma, features (?)	correct UD-POS, UD features, lemma	correct MyStem-POS, lemma, features, and UD-POS	correct MULTILIT-POS, lemma, features (?), and UD-POS (?)

English	German	Greek	Russian	Turkish
derivable	derivable	needs manual correction	needs manual correction (?)	derivable (?)
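
Assuming that "derivable" in the table above means that the UD POS tag can be computed from the language-specific tag, the derivation could be sketched as a simple lookup. The table below is a small, illustrative excerpt of commonly used correspondences between classic STTS tags and UD POS tags. It is not the project's official mapping, it may differ for STTS 2.0, and the function name is hypothetical.

```python
# Partial, illustrative STTS -> UD POS table (an assumption, not the official mapping).
STTS_TO_UPOS = {
    "NN": "NOUN",     # common noun
    "NE": "PROPN",    # proper noun
    "ADJA": "ADJ",    # attributive adjective
    "ART": "DET",     # article
    "APPR": "ADP",    # preposition
    "ADV": "ADV",     # adverb
    "CARD": "NUM",    # cardinal number
    "KON": "CCONJ",   # coordinating conjunction
    "ITJ": "INTJ",    # interjection
    "VVFIN": "VERB",  # finite full verb
}


def derive_upos(stts_tag):
    """Return the UD POS tag for an STTS tag, or None if a human must decide."""
    return STTS_TO_UPOS.get(stts_tag)


# Hypothetical tokens as (word form, STTS tag); "???" stands for an unmapped tag.
tokens = [("Hallo", "ITJ"), ("Peter", "NE"), ("kommt", "VVFIN"),
          ("heute", "ADV"), ("hm", "???")]
derived = [(form, tag, derive_upos(tag)) for form, tag in tokens]
needs_review = [form for form, _, upos in derived if upos is None]
print(derived)
print(needs_review)
```

Returning None for unmapped tags, instead of guessing, keeps ambiguous cases visible for manual correction.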

The lemma "F16" is tagged as a proper noun, using the respective tag from the tagset you are working with (e.g. "PROPN" from the Universal POS tags).
All kinds of greetings should be treated as interjections and tagged with the respective tag from the tagset you are working with (e.g. "INTJ" from the Universal POS tags), unless your language-specific tagset has a dedicated tag for greetings.
For the lemmatization of informal greetings, you can simply copy the word form from the norm/dipl layer.
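
The three rules above could be sketched as a small check applied to each token. This is a minimal illustration, not part of the project's tooling: the function name, the parameters, and the short greeting list are hypothetical, and the default tags assume the Universal POS tagset.

```python
# Hypothetical greeting list for illustration; the guidelines do not define one.
EXAMPLE_GREETINGS = {"hi", "hello", "hey", "bye"}


def apply_special_rules(norm_form, pos, lemma,
                        proper_noun_tag="PROPN", greeting_tag="INTJ"):
    """Apply the F16 and greeting rules to one token and return (pos, lemma)."""
    if lemma == "F16":
        # Rule 1: the lemma "F16" is tagged as a proper noun.
        return proper_noun_tag, lemma
    if norm_form.lower() in EXAMPLE_GREETINGS:
        # Rules 2 and 3: greetings are interjections (or get the tagset's own
        # greeting tag); the lemma is copied from the norm-layer word form.
        return greeting_tag, norm_form
    # All other tokens are left unchanged.
    return pos, lemma


print(apply_special_rules("F16", "NOUN", "F16"))
print(apply_special_rules("Hey", "NOUN", "hey"))
print(apply_special_rules("house", "NOUN", "house"))
```

For a tagset with its own greeting tag, pass that tag as greeting_tag instead of the default.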

RUEG Corpus Documentation