Illustration of DIMA in comparison to German intonation annotation according to GToBI, ToGI, and KIM as supplementary material to the ICPhS-2019 paper.

Figure 1: Annotation example comparing DIMA (tiers 3 – phrase, 4 – tone, and 5 – prominence) with GToBI (tier 6), ToGI (tier 7), and KIM (tier 8); the phrase is "Without hesitation, Petra takes them (the curtains) down and carries them to the washing machine".

Comparison of phrase boundaries

Unlike all other systems, which combine boundary information with tonal information, DIMA has a separate phrase tier. The annotation example in Figure 1 illustrates an utterance that, according to the DIMA annotation on tier 3, perceptually comprises two phrases with a strong boundary, annotated as '%' on the phrase tier. The first of these phrases in turn consists of two smaller phrases separated by a weak boundary, annotated as '-'. All systems have final boundary tones, and most also have initial boundary tones; in GToBI, initial boundary tones are optional, which is why none are shown in the example. The DIMA phrase tier marks phrase boundaries independently of tone, on the basis of different perceptual boundary cues.
What is not shown here is that DIMA, KIM and IPrA mark register changes.

Comparison of prominences

Some systems, including DIMA, mark prominence on a separate tier. However, prominence and tone are mixed in the phonological models, whereas prominence marking in DIMA is independent of tone. The perceptual prominence annotation on tier 5 shows three strong prominences ('2') in the first phrase and two prominences in the second phrase: one weak prominence ('1') and one strong prominence ('2').

The fact that prominence is annotated independently of tone in DIMA is an advantage for the annotation process, because the annotator does not have to decide on prominence and accent type at once. The annotator only judges the perceived degree of prominence relative to a strong level: a weak prominence, a strong prominence, or an extra-strong prominence (in case of emphasis, but not necessarily focus).

Comparison of tones

With the exception of IViE and IPrA, which have both a phonetic and a phonological tier, all systems have one tonal tier. The assumptions about the structure of pitch accent types differ between the systems, though. GToBI has a preference for right-headed pitch accents, ToGI has a preference for left-headed pitch accents, and pitch events in KIM are based on local minima and maxima and their timing relation. DIMA breaks up complex pitch accent types into accentual and non-accentual tones, which are shown on the tonal layer (tier 4).

In the example, the first phrase starts with an H tone at the boundary, as it is perceptually high. The strong prominence on the phrase-initial syllable is tonally marked as a high accentual tone (H*). Since the pitch is perceptually slightly falling, the weak boundary is tonally annotated as downstepped high (!H).

The second phrase starts at a default low pitch level, annotated as 'L' in the same boundary label as the tone that ends the previous phrase, as there is only one boundary label. The second phrase contains two normal prominences, which are perceptually high and thus each labelled with an accentual high tone (H*). Pitch drops after the first accentual high tone; this drop is labelled with a non-accentual low tone (L). Pitch stays high after the second accentual high tone, and thus the boundary receives an H tone label.

The next phrase again starts at a default low pitch level, annotated as L in the same boundary label. The first prominence is perceived as low and is caused by an F0 event, hence the annotation as a low accentual tone (L*). After the accentual low tone, pitch rises slightly, ending in a non-accentual high tone (H). The final prominence is perceived as downstepped high and is therefore annotated as an accentual downstepped high tone (!H*). Pitch then drops to low before the boundary, which is labelled with a non-accentual low tone (L). Finally, the boundary itself is low (L).
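The tone sequence walked through above can be written down as plain data, which makes it easy to check against the prominence tier. The following is only a minimal sketch: the Python names (`TONE_TIER`, `PROMINENCE_TIER`, `is_accentual`, `accentual_tones`) and the grouping into three stretches are illustrative assumptions, not part of DIMA or of any annotation tool, and the two tones that share a single boundary label in the figure are listed here as separate entries.

```python
# Tone labels of the Figure 1 example, in the order described in the text.
# Assumption: tones sharing one boundary label are split into separate
# entries (phrase-final tone in one stretch, phrase-initial tone in the next).
TONE_TIER = [
    ["H", "H*", "!H"],                  # up to the weak boundary
    ["L", "H*", "L", "H*", "H"],        # up to the strong boundary
    ["L", "L*", "H", "!H*", "L", "L"],  # final phrase
]

# Prominence labels of tier 5 in utterance order: 1 = weak, 2 = strong.
PROMINENCE_TIER = [2, 2, 2, 1, 2]


def is_accentual(tone: str) -> bool:
    """Accentual tones carry a star; non-accentual tones do not."""
    return tone.endswith("*")


def accentual_tones(tier: list[list[str]]) -> list[str]:
    """Flatten the tier and keep only the accentual (starred) tones."""
    return [tone for stretch in tier for tone in stretch if is_accentual(tone)]


if __name__ == "__main__":
    starred = accentual_tones(TONE_TIER)
    # Pair each accentual tone with the prominence label in the same position.
    for tone, level in zip(starred, PROMINENCE_TIER):
        print(tone, level)
```

In this example every prominence lines up with exactly one accentual tone, so both lists have five entries. The sketch keeps the two tiers as separate lists to mirror the point made above: prominence and tone are annotated independently and only related afterwards.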