Transcription Decisions Turkish

Basics

Format

  • create a TextGrid in Praat
  • import the TextGrid into EXMARaLDA

Tiers

  • speaker tier (e.g. TUmo01MT; type: transcription)
  • optional tier for segmentation into Intonation Phrases (IP)
  • Normalization in EXMARaLDA

Segmentation

Anonymisation

  • Replace the name of the participant with the respective code (e.g. TUmo01MT)
  • If full names or surnames of friends are mentioned, replace them with the participant code + _P (e.g. TUmo02FT_P)
  • Places that could lead to the identification of a participant are replaced by their initial letter + xxx + a specification in curly brackets (e.g. Atatürk okulunda = Axxx{schoolname} okulunda, Kızılay caddesi = Kxxx{streetname} caddesi)
  • If a phone number is mentioned, anonymise it as {phonenumber}
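
The masking pattern in the examples above can be sketched in Python. This is only an illustration: the function names anonymise_place and friend_code are hypothetical, and the rule "initial letter + xxx + {specification}" is read off the two examples rather than taken from a formal definition.

```python
def anonymise_place(name: str, kind: str) -> str:
    """Mask a place name as initial letter + 'xxx' + {kind}.

    Assumption: the mask always keeps exactly the first letter,
    as in Atatürk -> Axxx{schoolname}.
    """
    name = name.strip()
    if not name:
        raise ValueError("empty place name")
    return f"{name[0]}xxx{{{kind}}}"


def friend_code(participant_code: str) -> str:
    """Code used for a named friend: participant code + '_P'."""
    return f"{participant_code}_P"


print(anonymise_place("Atatürk", "schoolname"))  # Axxx{schoolname}
print(anonymise_place("Kızılay", "streetname"))  # Kxxx{streetname}
print(friend_code("TUmo02FT"))                   # TUmo02FT_P
```

Case suffixes and following words (okulunda, caddesi) stay untouched; only the identifying name itself is passed to the helper.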

Transcription

'Unwanted' material (if applicable)

  • If this is not possible, mark those passages as: <Q> communication with elicitor </Q>

Merged forms

  • Merged forms are transcribed as they are articulated, but with an equal sign linking the merged elements
  • Examples from TUmo10MT_isT: n=apıyorsun (= ne yapıyorsun), TUmo11MT_isT: n=aber (= ne haber)

Tag Questions

  • tag questions (de mi) do not constitute a separate CU

Reduced syllables

  • reduced syllables are transcribed as articulated
  • Examples: bi tane (= bir tane), gidiyo (= gidiyor), yakıyosun (= yakıyorsun), içbiri (= hiçbiri)
  • Use / to mark unfinished words, e.g. “Çarb/ çarptı derken oldu bitti”

Accents and dialects

  • pronounced sounds are transcribed as articulated (e.g. gardaşım (= kardeşim)), but sounds which are not typical for Turkish are not represented.

Pauses

  • 0.2 - 1 sec: (-)
  • 1 - 3 secs: (--)
  • More than 3 secs: the measured duration in brackets, e.g. (5.5)
  • Word-internal pauses are marked as follows: top(-)la, with no space between the parts.
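
As a rough sketch, the thresholds above can be turned into a small Python helper. The name pause_marker is hypothetical. The handling of the exact boundary values (1 sec counts as short, 3 secs as medium) and the decision to leave pauses under 0.2 sec unmarked are assumptions, since the list does not spell them out.

```python
def pause_marker(seconds: float) -> str:
    """Return the pause notation for a measured pause duration."""
    if seconds < 0.2:
        return ""          # assumption: very short pauses are not marked
    if seconds <= 1.0:
        return "(-)"       # 0.2 - 1 sec
    if seconds <= 3.0:
        return "(--)"      # 1 - 3 secs
    return f"({seconds:.1f})"  # longer pauses: measured value


print(pause_marker(0.5))                 # (-)
print(pause_marker(2.0))                 # (--)
print(pause_marker(5.5))                 # (5.5)
print("top" + pause_marker(0.5) + "la")  # top(-)la, word-internal, no spaces
```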

Long vowels & consonants

  • vowels pronounced longer than normal (under 2 secs) are marked with : (e.g. canı:m)
  • vowels that are pronounced extremely long (2 secs and more) are marked with :: (e.g. canı::m)
  • the same marking is possible for consonants (e.g. tamam:)
  • doubled vowel syllables are marked with % (e.g. ba%ay)
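
The lengthening marks can be illustrated with two small Python helpers. Both names (mark_length, strip_length) are hypothetical, and stripping the marks to recover the plain word is an assumption about what a later normalisation step might do, not something this guideline prescribes.

```python
def mark_length(word: str, index: int, seconds: float) -> str:
    """Insert ':' or '::' after the lengthened sound at position `index`.

    Under 2 secs -> ':', 2 secs and more -> '::'.
    """
    marker = "::" if seconds >= 2.0 else ":"
    return word[: index + 1] + marker + word[index + 1 :]


def strip_length(token: str) -> str:
    """Remove lengthening marks (assumed normalisation helper)."""
    return token.replace(":", "")


print(mark_length("canım", 3, 1.2))  # canı:m
print(mark_length("canım", 3, 2.4))  # canı::m
print(mark_length("tamam", 4, 0.8))  # tamam:
print(strip_length("canı::m"))       # canım
```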

Non-verbal material

  • non-verbal events such as a participant laughing or coughing are noted in square brackets on the speaker tier, e.g. [laughing], [whispering], [clears throat], [sighs], [sniffs], [snapsfingers]

  • if participants speak and laugh at the same time, it is noted as: [[laughing]speech]

Uninterpretable material

  • uninterpretable material is marked as (UNK) on the speaker tier
  • if it lasts longer than 2 secs, the duration is added: (UNK, 2.1)
  • assumed content in brackets, each token separated: (assumed) (content)
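
A minimal Python sketch of these three conventions follows. The helper names unk_marker and assumed are hypothetical, and treating exactly 2 secs as "not longer than 2 secs" is an assumption.

```python
def unk_marker(seconds: float) -> str:
    """Notation for uninterpretable material of a given duration."""
    if seconds > 2.0:
        return f"(UNK, {seconds:.1f})"  # duration added when longer than 2 secs
    return "(UNK)"


def assumed(words: str) -> str:
    """Wrap each token of assumed content in its own brackets."""
    return " ".join(f"({w})" for w in words.split())


print(unk_marker(1.0))             # (UNK)
print(unk_marker(2.1))             # (UNK, 2.1)
print(assumed("assumed content"))  # (assumed) (content)
```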

Hesitation markers / Interjections / Reception markers

  • e (short "e") ee (long "ee") ı (short "ı") ııı (long "ııı")
  • thinking: "hmm, eem, ımm"
  • agreement: "hıhı"
  • negation: "ı ıh"
  • disappointment: "tüh"

Foreign language material

  • original spelling will be kept.

Proper/Brand names

  • Keep conventionalized spelling (e.g. Renault = renault)

Numerals

  • Numbers are spelled out (e.g. 155 = yüz elli beş)

Table of symbols

Symbol                              Meaning
<Q> araştırmacıyla iletişim </Q>    questions concerning the procedure and/or verbal interventions of elicitors
(-)                                 pauses of 0.2 - 1 sec
(--)                                pauses of 1 - 3 secs
(3.2)                               pauses longer than 3 secs
(UNK)                               uninterpretable material
(UNK, 2.2)                          uninterpretable material longer than 2 secs
(assumption)                        assumed material
[gülüşmeler/fısıldaşmalar]          non-verbal material
[[gülüşme]konuşma]                  non-verbal & verbal event
:                                   unusually long vowel or consonant (under 2 secs)
::                                  unusually long vowel or consonant (2 secs and more)
=                                   merged forms
/                                   interruption of a word
%                                   doubled syllables
{...}                               specification of an anonymised place
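
For checking transcripts automatically, a subset of the symbols in this table can be located with a regular expression. The following Python sketch is an assumption-laden illustration, not part of the guidelines: the pattern and the name find_symbols are hypothetical, and it covers only pauses, (UNK), anonymised places, merged forms and unfinished words.

```python
import re

# One named group per symbol type; (UNK ...) is tried before pauses.
SYMBOLS = re.compile(
    r"(?P<unk>\(UNK(?:, \d+\.\d+)?\))"
    r"|(?P<pause>\((?:--?|\d+\.\d+)\))"
    r"|(?P<anonymised>\w+\{[^{}]+\})"
    r"|(?P<merged>\w+=\w+)"
    r"|(?P<unfinished>\w+/)"
)


def find_symbols(line: str) -> list[tuple[str, str]]:
    """Return (symbol type, matched text) pairs in order of appearance."""
    return [(m.lastgroup, m.group()) for m in SYMBOLS.finditer(line)]


line = "Çarb/ çarptı (-) n=aber (UNK, 2.2) Axxx{schoolname} okulunda (3.2)"
for kind, text in find_symbols(line):
    print(kind, text)
```

Non-verbal events, lengthening marks and assumed content are left out on purpose; nested brackets such as [[laughing]speech] would need a small parser rather than a single pattern.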