Transcription Decisions Turkish
Basics
Format
- create a TextGrid in Praat
- import the TextGrid into EXMARaLDA
Tiers
- speaker tier (e.g. TUmo01MT; type: transcription)
- optional tier for segmentation in Intonation Phrases (IP)
- Normalization in EXMARaLDA
Segmentation
- According to Communication Units (CU); see Communication_unit__P4_10.12.2018.pdf
- No punctuation
Anonymisation
- Replace the name of the participant with the respective code (e.g. TUmo01MT)
- If full names or surnames of friends are mentioned, replace them with the participant code + _P (e.g. TUmo02FT_P)
- Places that could lead to the identification of a participant are reduced to their initial followed by xxx and a category in curly brackets (e.g. Atatürk okulunda = Axxx{schoolname} okulunda, Kızılay caddesi = Kxxx{streetname} caddesi)
- If a phone number is mentioned, anonymise it as {phonenumber}
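The place-name convention can be sketched in a few lines of Python. This is only an illustration, not part of the project tooling: the function names `anonymise_place` and `anonymise_utterance` are hypothetical, and the list of names to mask is assumed to be supplied by the transcriber.

```python
def anonymise_place(name, category):
    """Keep the initial, mask the rest with xxx, append the category in braces."""
    return f"{name[0]}xxx{{{category}}}"


def anonymise_utterance(text, places):
    """Replace every listed place name in a transcribed utterance.

    `places` maps a place name to its category label (hypothetical input).
    Plain substring replacement is used, so it is an assumption that a listed
    name never occurs inside a longer, unrelated word.
    """
    for name, category in places.items():
        text = text.replace(name, anonymise_place(name, category))
    return text


print(anonymise_utterance("Atatürk okulunda", {"Atatürk": "schoolname"}))
# Axxx{schoolname} okulunda
print(anonymise_utterance("Kızılay caddesi", {"Kızılay": "streetname"}))
# Kxxx{streetname} caddesi
```

Suffixes stay attached to the following word in both examples, so only the name itself is masked.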
Transcription
'Unwanted' material (if applicable)
- If this is not possible, mark those passages as:
<Q> communication with elicitor </Q>
Merged forms
- Merged forms are transcribed as they are articulated, but with an equal sign linking the merged elements
- Examples from TUmo10MT_isT: n=apıyorsun (= ne yapıyorsun), TUmo11MT_isT: n=aber (= ne haber)
Tag Questions
- tag questions (de mi) do not constitute a separate CU
Reduced syllables
- reduced syllables are transcribed as articulated
- Examples: bi tane (= bir tane), gidiyo (= gidiyor), yakıyosun (= yakıyorsun), içbiri (= hiçbiri)
- Use / to mark unfinished words, e.g. “Çarb/ çarptı derken oldu bitti”
Accents and dialects
- pronounced sounds are transcribed as articulated (e.g. gardaşım (= kardeşim)), but sounds which are not typical for Turkish are not represented.
Pauses
- 0.2 - 1 sec: (-)
- 1-3 secs: (--)
- More than 3 secs: the measured duration in parentheses, e.g. (5.5)
- Word-internal pauses are marked as follows: top(-)la, with no space between the parts.
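As an illustration of the pause thresholds, the following Python sketch maps a measured duration to its symbol. The function names are hypothetical. Two points are assumptions because the guidelines do not state them: pauses shorter than 0.2 sec are left unmarked, and a pause of exactly 1 sec or exactly 3 secs falls into the shorter category.

```python
def pause_symbol(seconds):
    """Return the pause annotation for a measured duration in seconds."""
    if seconds < 0.2:
        return ""                  # assumption: too short to annotate
    if seconds <= 1.0:
        return "(-)"               # 0.2 - 1 sec
    if seconds <= 3.0:
        return "(--)"              # 1 - 3 secs
    return f"({seconds:.1f})"      # longer pauses carry the measured value


def word_internal_pause(first_part, second_part, seconds):
    """Join two parts of one word around a pause, without spaces."""
    return f"{first_part}{pause_symbol(seconds)}{second_part}"


print(pause_symbol(0.5))                       # (-)
print(pause_symbol(2.0))                       # (--)
print(pause_symbol(5.5))                       # (5.5)
print(word_internal_pause("top", "la", 0.4))   # top(-)la
```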
Long vowels & consonants
- vowels pronounced longer than normal (under 2 secs) are marked with : (e.g. canı:m)
- vowels that are pronounced extremely long (2 secs and more) are marked with :: (e.g. canı::m)
- also possible for consonants (e.g. tamam:)
- doubled syllables are marked with % (e.g. ba%ay)
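The lengthening markers can be sketched as follows. The helper `mark_lengthening` is hypothetical, and the position of the lengthened sound is assumed to be given as a character index by the transcriber.

```python
def mark_lengthening(word, index, seconds):
    """Insert : or :: after the character at `index`.

    Sounds held for 2 seconds or more get ::, shorter lengthening gets :.
    """
    marker = "::" if seconds >= 2.0 else ":"
    return word[: index + 1] + marker + word[index + 1 :]


print(mark_lengthening("canım", 3, 0.8))   # canı:m
print(mark_lengthening("canım", 3, 2.4))   # canı::m
print(mark_lengthening("tamam", 4, 0.6))   # tamam:
```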
Non-verbal material
- non-verbal events such as a participant laughing or coughing are noted in square brackets on the speaker tier, e.g. [laughing], [whispering], [clears throat], [sighs], [sniffs], [snaps fingers]
- if participants speak and laugh at the same time, it is noted as: [[laughing]speech]
Uninterpretable material
- uninterpretable material is marked as (UNK) on the speaker tier
- if longer than 2 secs, the duration is added: (UNK, 2.1)
- assumed content in brackets, each token separated: (assumed) (content)
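A short Python sketch of these two notations, with hypothetical function names. It assumes that a stretch of exactly 2 secs is still written as plain (UNK), which the guidelines do not state explicitly.

```python
def unk_mark(seconds):
    """Annotate uninterpretable material, adding the duration above 2 secs."""
    if seconds > 2.0:
        return f"(UNK, {seconds:.1f})"
    return "(UNK)"


def assumed(text):
    """Wrap each token of assumed content in its own pair of parentheses."""
    return " ".join(f"({token})" for token in text.split())


print(unk_mark(1.3))                # (UNK)
print(unk_mark(2.1))                # (UNK, 2.1)
print(assumed("assumed content"))   # (assumed) (content)
```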
Hesitation markers / Interjections / Reception markers
- e (short "e") ee (long "ee") ı (short "ı") ııı (long "ııı")
- thinking: "hmm, eem, ımm"
- agreement: "hıhı"
- negation: "ı ıh"
- disappointment: "tüh"
Foreign language material
- Original spelling is kept.
Proper/Brand names
- Keep conventionalized spelling (e.g. Renault = renault)
Numerals
- Numbers are spelled out (e.g. 155 = yüz elli beş)
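Spelling out numerals can be checked with a small helper. The sketch below is illustrative only: `spell_number` is a hypothetical name, and the range is limited to 0 to 999999 as a simplifying assumption. It follows standard Turkish usage, where 100 is yüz and 1000 is bin without a preceding bir.

```python
ONES = ["", "bir", "iki", "üç", "dört", "beş", "altı", "yedi", "sekiz", "dokuz"]
TENS = ["", "on", "yirmi", "otuz", "kırk", "elli", "altmış", "yetmiş", "seksen", "doksan"]


def _below_thousand(n):
    """Return the words for 0..999 as a list (empty list for 0)."""
    parts = []
    hundreds, rest = divmod(n, 100)
    if hundreds == 1:
        parts.append("yüz")                      # 100 is "yüz", not "bir yüz"
    elif hundreds > 1:
        parts.extend([ONES[hundreds], "yüz"])
    tens, ones = divmod(rest, 10)
    if tens:
        parts.append(TENS[tens])
    if ones:
        parts.append(ONES[ones])
    return parts


def spell_number(n):
    """Spell out an integer between 0 and 999999 in Turkish."""
    if not 0 <= n < 1_000_000:
        raise ValueError("only 0..999999 is covered by this sketch")
    if n == 0:
        return "sıfır"
    thousands, rest = divmod(n, 1000)
    parts = []
    if thousands == 1:
        parts.append("bin")                      # 1000 is "bin", not "bir bin"
    elif thousands > 1:
        parts.extend(_below_thousand(thousands) + ["bin"])
    parts.extend(_below_thousand(rest))
    return " ".join(parts)


print(spell_number(155))    # yüz elli beş
print(spell_number(2018))   # iki bin on sekiz
```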
Table of symbols
Symbols | Meaning |
---|---|
<Q> araştırmacıyla iletişim </Q> | instances of questions concerning the procedure and/or verbal interventions of elicitors |
(-) | pauses 0.2 - 1sec |
(--) | pauses 1-3secs |
(3.2) | pauses longer than 3secs |
(UNK) | uninterpretable material |
(UNK, 2.2) | uninterpretable material longer than 2secs |
(assumption) | assumed material |
[gülüşmeler/fısıldaşmalar] | non-verbal material |
[[gülüşme]konuşma] | non-verbal & verbal event |
: | unusually long vowel or consonant (under 2secs) |
:: | unusually long vowel or consonant (2secs and more) |
= | merged forms |
/ | interruption of a word |
% | doubled syllables |
{...} | specification of an anonymised place |
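Because several symbols in the table come in pairs, a simple consistency check can catch unclosed brackets before a transcript is imported. The following Python sketch is an assumption about how such a check might look and is not part of the described workflow. It covers only (), [] and {}, not the <Q> ... </Q> tags.

```python
CLOSING_TO_OPENING = {")": "(", "]": "[", "}": "{"}


def brackets_balanced(line):
    """Return True if every (, [ and { in the line is closed in the right order."""
    stack = []
    for char in line:
        if char in "([{":
            stack.append(char)
        elif char in CLOSING_TO_OPENING:
            # A closing bracket must match the most recent opening bracket.
            if not stack or stack.pop() != CLOSING_TO_OPENING[char]:
                return False
    return not stack


print(brackets_balanced("[[laughing]speech]"))     # True
print(brackets_balanced("top(-)la (UNK, 2.1)"))    # True
print(brackets_balanced("[whispering, [sighs]"))   # False
```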