Intonation: buildings and bricks

Uploaded On 2019-12-10



Presentation Transcript

Intonation: buildings and bricks
Francesco Cangemi
Universität Zürich & Universität zu Köln
fcangemi@uni-koeln.de

I. Theory
  A. Architectures (09:30 – 10:45)
    1. Prosody and intonation
    2. Abstractionist assumptions
    3. Exemplarist challenges
    4. Prosodic detail
  B. Primitives (11:00 – 12:30)
    1. Partial Topics in Italian
    2. Intonational meaning
    3. Challenging primitives
      i. Contour warping
      ii. Individual behaviour
II. Practice
  C. Praat scripting
    1. Basics (14:00 – 15:00)
    2. Plotting (15:15 – 16:30)
      i. Synthesis (one example)
      ii. Analysis (many items)


Prosody and arbitrariness
Both highly universal and language-specific.
Physio- and psychologically motivated.
Phylogenies, ontogenesis.
Acquisition, learning, pathology.
[Figure labels: Long, Short; A, B; First, Last]

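Hier soir, avant de s'endormir, François fumait une dernière cigarette, en relisant le cours d'allemand qu'il avait préparé pour ses élèves de terminale. Puis, il écrasa sa gauloise dans un cendrier, et éteignit la lumière. Un moment plus tard, une odeur de brûlé le réveilla. La pièce était envahie de fumée, et François s'aperçut avec effroi que les rideaux de la fenêtre avaient pris feu. [Louis et alii, 2001]
(English: Last night, before falling asleep, François was smoking a last cigarette while rereading the German lesson he had prepared for his final-year students. Then he stubbed out his Gauloise in an ashtray and switched off the light. A moment later, a smell of burning woke him. The room was filled with smoke, and François realized with horror that the window curtains had caught fire.)
The workmen from Boston were leaving.
PROSODY: first to appear (learning), last to disappear (pathology).
[Figure labels: F, M]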

Prosody and function
PROSODY
Message encoding and decoding
Disambiguation of (surface) syntactic structures: (I’ll move on) (Saturday) vs (I’ll move) (on Saturday)

PROSODY
Message encoding and decoding
Disambiguation of (surface) syntactic structures
Lexical access [Christophe et alii 2004]
Le livre racontait l’histoire… (The book told the story…)
[d’un grand chat grincheux] [qui avait mordu un facteur] (of a big grumpy cat that had bitten a postman): CHAGRIN
[d’un grand chat drogué] [qui dormait tout le temps] (of a big drugged cat that slept all the time): *CHAD

PROSODY
Message encoding and decoding
Disambiguation of (surface) syntactic structures
Lexical access
Information structure: modality, contrastivity, givenness. E.g., It.: Michele viene con me (Michael comes with me)
Management of interaction, e.g. backchannels [Benus et alii 2007]


Architectures
FUNCTION: syntax, lexicon, information structure, interaction
FORM
SUBSTANCE: f0, amplitude, duration, voice quality

Architectures
SUBSTANCE, FORM (/kæt/), FUNCTION

Architectures (intonation)
Intonation refers to the use of suprasegmental phonetic features to convey post-lexical or sentence-level pragmatic meanings in a linguistically structured way. [Ladd 1996]
[Slide labels: FORM, FUNCTION, SUBSTANCE]

Architectures (prosody)
(Prosody is a) branch of linguistics devoted to the factual description (phonetic aspects) and the formal analysis (phonological aspects) of the systematic elements in the phonic expression which are not coextensive to phonemes, such as accents, tones, intonation and quantity, whose actual manifestations in speech production are associated with variations in the physical parameters of f0, duration and intensity, which represent prosody’s objective parameters. These variations are perceived by the listener as changes in pitch, length and loudness, which are prosody’s subjective parameters. The prosodic elements play at (lexical prosody) or above (post-lexical prosody) the word level a bundle of grammatical, para-grammatical and extra-grammatical functions, related to ‘what is said’, ‘how it is said’ and to speaker identity. These functions prove crucial in signalling the structure of utterances and of discourse, and in guiding their semantic and pragmatic interpretation. [Di Cristo 2004]
[Slide labels: FORM, FUNCTION, SUBSTANCE]


An alternative architecture
…these models introduce a phonological level of description that is intermediate between (abstract) function and (concrete) phonetic form … it is our experience that one always get better results if one can do without such an intermediate level, i.e., if one can establish a direct link between (syntactic/semantic) function and phonetic form
“the unfortunate notion of pitch accent”
[Batliner and Möbius 2005]

An alternative architecture
[Slide labels: substance, function, FORM]
Exemplar: auditory properties (f0, F1, F2, F3, dur) and category labels (word, sex, speaker) [Johnson 1997] [K. Schweitzer 2012]

Categorization
Monothetic approach (classical view): singly necessary and jointly sufficient features, e.g. bug /bʌg/ vs bun /bʌn/

Prototypic (probabilistic view): categories center around members sharing many features.
Polythetic (Familienähnlichkeit, family resemblance): elements of a class share more or fewer features.

Episodic (exemplar view): online categories, comparing probe to stored exemplars.
[Figure labels: A, B, C]
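The episodic view can be made concrete with a small sketch. It assumes, purely for illustration, that each stored exemplar is a pair of auditory properties plus a category label, and that similarity decays exponentially with distance; the exemplar values, the `sensitivity` parameter and the function names are hypothetical and are not taken from the slides.

```python
import math

# Hypothetical stored exemplars: (auditory properties, category label).
# The two numbers stand in for any auditory dimensions (e.g. f0, duration).
EXEMPLARS = [
    ((1.0, 1.0), "A"),
    ((1.2, 0.9), "A"),
    ((3.0, 3.1), "B"),
    ((2.9, 3.3), "B"),
    ((5.0, 1.0), "C"),
]


def similarity(probe, stored, sensitivity=1.0):
    """Similarity decays exponentially with Euclidean distance (an assumption)."""
    return math.exp(-sensitivity * math.dist(probe, stored))


def categorize(probe, exemplars=EXEMPLARS, sensitivity=1.0):
    """Return the label whose stored exemplars are most activated by the probe."""
    activation = {}
    for properties, label in exemplars:
        activation[label] = activation.get(label, 0.0) + similarity(
            probe, properties, sensitivity
        )
    return max(activation, key=activation.get)


print(categorize((1.1, 1.0)))
print(categorize((3.2, 3.0)))
```

No category definition is stored anywhere in this sketch: the category emerges online from the comparison between the probe and the stored traces, which is the point of contrast with the monothetic and prototypic views.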


Phonetic detail
Exemplar-based models: traces for words are stored without reducing phonetic information to an abstract phonological representation.
Phonetic detail: systematically produced and perceived phonetic information which is not included in abstract phonological representations.

Production: true prefix, discolour /dɪs kʌlə/, vs pseudo-prefix, discover /dɪs kʌvə/; spectrotemporal patterns [R. Smith et al. 2012]
Perception: intelligibility in noise [Baker et al. 2007]
Frequency [Bybee 2001]: mammary [ˈmæməɹi], artillery [ɑːˈtɪləɹi], memory [ˈmɛm əɹi], every [ˈɛv.ɹi]

Exemplar prosody
Exemplar-based models: traces for words are stored without reducing phonetic information to an abstract phonological representation
+ Functional approaches to prosody: no intermediate phonological level, but only a direct link between function and phonetics
= Exemplar-based prosody: words are stored along with their f0 contours
Exemplar: set of category labels, set of auditory properties [K. Schweitzer 2012]

Words are stored along with their f0 contour
Memory requirements
Feature analysis
A: I hear you’ll soon be a doctor in chemistry.
B: a. Me?! b. A doctor in chemistry?!
Tones: H L H (a); H L H (b). Adapted from [Ladd 1996]
Restrictive view of phonetic detail: systematically produced and perceived phonetic information which is not YET included in abstract phonological representations

Prosodic detail
Abstract forms (e.g. pitch accents) in AM theory are phonetically very underspecified.
PROSODIC CUES: f0, duration, amplitude, spectral features; events, interpolations
If systematically produced and perceived phonetic information is found to cue functional contrasts, phonological representations might be enriched either in inventory or grammar.


PROSODIC CUES: f0, duration, amplitude, spectral features; events, interpolations
Partial Topic constructions seem to have a distinctive rise shape.
Interpolations rather than events?!

Partial Topics in Neapolitan Italian
Narrowing down the Discourse Topic; non-exhaustive answer [Büring 1997]
A: How do your friends like their coffee?
B: Milena drinks it black (as for the others, I wouldn’t know)
milena lo vuole amaro
milena it-OBJ want-3SG unsweetened

Reading task with contextualizing paragraph
3 sentences × 2 contexts × 7 speakers × 5 repetitions = 210 items
Two-sample two-tailed t-tests
Both with prosodic break after N
L is acceleration peak (d″)
Measures: alignment, scaling, curve index [Dombrowski & Niebuhr 2005; Cangemi 2009]
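Locating L at the acceleration peak can be illustrated with a minimal sketch, assuming a uniformly sampled f0 curve and a plain central second difference as the estimate of d″. The function name and the sample values are hypothetical; the slides do not specify how the derivative or the curve index were actually computed.

```python
def acceleration_peak(times, f0):
    """Return the time at which the discrete second derivative of f0 is largest.

    Assumes uniformly spaced samples; the endpoints are skipped because the
    central second difference is undefined there.
    """
    if len(times) != len(f0) or len(f0) < 3:
        raise ValueError("need at least three paired samples")
    step = times[1] - times[0]
    best_index, best_value = None, float("-inf")
    for i in range(1, len(f0) - 1):
        second_diff = (f0[i - 1] - 2 * f0[i] + f0[i + 1]) / step ** 2
        if second_diff > best_value:
            best_index, best_value = i, second_diff
    return times[best_index]


# Hypothetical contour: a flat stretch followed by a rise (times in ms, f0 in Hz).
times_ms = [0, 10, 20, 30, 40, 50, 60]
f0_hz = [100, 100, 100, 100, 110, 120, 130]
print(acceleration_peak(times_ms, f0_hz))
```

On real data the f0 track would normally be smoothed first, since second differences amplify measurement noise.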

[Figure: results for scaling and alignment of L and H (events) and for curve index (interpolations); axis labels: start of…, end of…; p<0.001]

Other contrasts…
la mamma vuole vedere la bina
the mother want-3SG see-INF the bina
Petrone and D’Imperio (2008): PRENUCLEAR instead of nuclear; FALLS instead of rises; MODALITY instead of S-Topic (narrow focus question/statement)

Other languages…
Dombrowski and Niebuhr (2005): utterance-FINAL instead of initial; CONVERSATIONAL functions
Appointment scheduling task
Turn-yielding (left), “Ostermontag” (Easter Monday), vs turn-holding (right), “Anfang Dezember” (early December)


Enriching forms…
If systematically produced and perceived phonetic information is found to cue functional contrasts, phonological representations might be enriched either in inventory or grammar.
Petrone and D’Imperio (2008): modality contrast via a novel tonal contrast for a new prosodic domain, the Accentual Phrase
Dombrowski and Niebuhr (2005): turn-taking contrast via a split of early valleys according to concavity/convexity (rise shapes)

…but what about function?
In the segmental domain, linguistic categories are expected to relate both to differences in sounds and articulations and to differences in semantic interpretation. For example, we say that [p] is different from [b] because they are pronounced differently, and because [pit] means something different than [bit] does. [Pierrehumbert and Hirschberg 1990]
This functional point of view has given way to more formal criteria such as economy of description [Batliner and Möbius 2005]
SUBSTANCE, FORM (/kæt/), FUNCTION


Bricks and buildings
Constitutional difference between the primitive-architecture relationship for segmental and suprasegmental phonology!
FORM /bʌn/: SUBSTANCE labial closure, no rounding, nasality, …; FUNCTION – animate, + edible, + organic
FORM /bʌg/: SUBSTANCE …, labial closure, no rounding, velar closure; FUNCTION + animate, + edible, + organic
NO DIRECT LINKS!

Then why suggest a direct link between substance and function for prosody? (Batliner, Möbius)
Even intonational phonologists working with gestalt-like wholes! (Dombrowski, Niebuhr)
SUBSTANCE: valley, early, concave, …; FUNCTION: – turn yielding
SUBSTANCE: valley, early, convex, …; FUNCTION: + turn yielding

Intonational meaning
Segments and tones: epilinguistically salient lexical contrasts
Intonation: theory-dependent post-lexical contrasts
Even worse, theories of post-lexical contrasts often implicitly use unanalyzed introspection or one of many intonation theories to frame and validate their proposals [Jackendoff 1972]
SEMIOTIC PRIMITIVES: there is no consensus on an «atomic level» for intonation… not even as controversial as the phoneme!
Extra tones in more complex grammar vs shapes as part of holistic forms; Sentence Partial Topic vs Question Narrow Focus!


Challenging intonational primitives
Some slow steps in rethinking the (unfortunate) notion of pitch accent, addressing two issues:
Individual variability in production (soon perception)
Time-evolving and multiparametric phonetic representations
[Names on slide: D’Imperio, Niebuhr, Gili Fivela, Lou Boves, Michele Gubian]


Manipulating f0 contours
So, interpolations between targets might be relevant.
Still, when resynthesizing stimuli, we often make linear stylizations (Praat & PSOLA).
How can we retain (or even better, explore) phonetic information reduction in resynthesizing stimuli?
Gubian, Cangemi, Boves (2010), Automatic and Data Driven Pitch Contour Manipulation with Functional Data Analysis, Speech Prosody, Chicago
[Figure axes: time, F0]
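What a linear stylization keeps and discards can be sketched as follows. The sketch assumes that a stylized contour is fully determined by a handful of (time, f0) targets joined by straight lines; the target values and the function name are hypothetical and only illustrate the idea.

```python
def stylize(targets, times):
    """Linearly interpolate f0 between (time, f0) targets at the given times.

    Targets must have distinct times; outside the first and last target the
    contour is held flat.
    """
    targets = sorted(targets)
    contour = []
    for t in times:
        if t <= targets[0][0]:
            contour.append(targets[0][1])
        elif t >= targets[-1][0]:
            contour.append(targets[-1][1])
        else:
            for (t0, f0), (t1, f1) in zip(targets, targets[1:]):
                if t0 <= t <= t1:
                    contour.append(f0 + (f1 - f0) * (t - t0) / (t1 - t0))
                    break
    return contour


# Hypothetical targets (ms, Hz): low plateau, rise to a peak, fall.
targets = [(0, 100), (200, 100), (400, 180), (600, 120)]
print(stylize(targets, range(0, 601, 100)))
```

Whatever the original rise looked like between 200 and 400 ms (concave, convex, S-shaped), the stylized version is a straight line, so any contrast carried by the shape of the interpolation is lost in the resynthesized stimulus.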

Functional Data Analysis
A data-driven approach

Data-driven resynthesis procedure
Provide only category labels
Test on small dataset
Test dataset: 2 male speakers, read speech; 3 sentences, 5 repetitions, 2 modalities (statement, question); 57 utterances
Timing of stressed syllables; equal syllabic template
milena lo vuole amaro
milena it-OBJ want-3SG unsweetened

Data Preparation
Sampled f0 curves have to be turned into functions.
A basis of functions (B-splines) expresses each original curve.
Decide how much detail to retain (smoothing).
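Expressing a curve in a B-spline basis can be sketched in a few lines. The example below assumes a clamped cubic basis on a normalized time axis and uses made-up coefficients; it only evaluates a curve from its coefficients and does not show the fitting step (in practice the coefficients are estimated from the sampled f0 values, typically with a roughness penalty that controls the amount of smoothing).

```python
def bspline_basis(i, degree, knots, t):
    """Value at t of the i-th B-spline basis function (Cox-de Boor recursion)."""
    if degree == 0:
        return 1.0 if knots[i] <= t < knots[i + 1] else 0.0
    left_den = knots[i + degree] - knots[i]
    right_den = knots[i + degree + 1] - knots[i + 1]
    left = 0.0
    if left_den != 0:
        left = (t - knots[i]) / left_den * bspline_basis(i, degree - 1, knots, t)
    right = 0.0
    if right_den != 0:
        right = ((knots[i + degree + 1] - t) / right_den
                 * bspline_basis(i + 1, degree - 1, knots, t))
    return left + right


def evaluate(coefficients, degree, knots, t):
    """Curve value at t: the coefficient-weighted sum of basis functions."""
    return sum(c * bspline_basis(i, degree, knots, t)
               for i, c in enumerate(coefficients))


DEGREE = 3
# Clamped knot vector on normalized time [0, 1): 7 cubic basis functions.
KNOTS = [0, 0, 0, 0, 0.25, 0.5, 0.75, 1, 1, 1, 1]
COEFFICIENTS = [110, 112, 120, 150, 170, 140, 115]  # hypothetical values in Hz

for t in (0.0, 0.25, 0.5, 0.75, 0.99):
    print(t, round(evaluate(COEFFICIENTS, DEGREE, KNOTS, t), 1))
```

With this representation a whole contour is reduced to seven numbers; using fewer basis functions (or a stronger penalty) retains less detail, which is the smoothing decision mentioned on the slide.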

Data Preparation (2)
Landmark registration: align points in time that are deemed as having the same meaning across the dataset.
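A minimal sketch of landmark registration follows, assuming a piecewise-linear time warp that moves each curve's landmarks onto a shared set of reference landmarks. The curves, landmark positions and function names are hypothetical; FDA toolkits usually use smooth warping functions rather than the straight segments used here.

```python
def interpolate(xs, ys, x):
    """Piecewise-linear interpolation of (xs, ys) at x; xs must be increasing."""
    if x <= xs[0]:
        return ys[0]
    if x >= xs[-1]:
        return ys[-1]
    for i in range(len(xs) - 1):
        if xs[i] <= x <= xs[i + 1]:
            return ys[i] + (ys[i + 1] - ys[i]) * (x - xs[i]) / (xs[i + 1] - xs[i])


def register(times, values, landmarks, reference_landmarks, grid):
    """Resample one curve so that its landmarks fall on the reference landmarks."""
    registered = []
    for tau in grid:
        # Map registered time back to this curve's own time axis.
        original_time = interpolate(reference_landmarks, landmarks, tau)
        registered.append(interpolate(times, values, original_time))
    return registered


times = [0, 100, 200, 300, 400]             # ms
curve_a = [100, 120, 160, 120, 100]         # hypothetical: peak at 200 ms
curve_b = [100, 160, 140, 120, 100]         # hypothetical: peak at 100 ms
reference = [0, 150, 400]                   # start, shared landmark, end
grid = [0, 75, 150, 275, 400]

print(register(times, curve_a, [0, 200, 400], reference, grid))
print(register(times, curve_b, [0, 100, 400], reference, grid))
```

After registration both peaks sit at 150 ms on the common axis, so the remaining variation between the curves concerns their shape and height rather than the timing of the landmark.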

[Figure: scatter plot of salary against age (25 to 65), with principal components PC1 and PC2]

Analysis (functional PCA)
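The analysis step can be approximated with ordinary PCA on curves sampled on a shared time grid. The sketch below is a discretized stand-in for functional PCA, not the procedure used in the study: it extracts only the first component, uses power iteration instead of a full eigendecomposition, and runs on made-up curves.

```python
import math


def mean_curve(curves):
    """Pointwise mean of curves sampled on the same time grid."""
    return [sum(column) / len(curves) for column in zip(*curves)]


def first_component(curves, iterations=200):
    """Return (mean, pc1, scores) using power iteration on the centered curves."""
    mean = mean_curve(curves)
    centered = [[v - m for v, m in zip(curve, mean)] for curve in curves]
    # Non-uniform starting vector; it must not be orthogonal to PC1.
    vector = [float(j + 1) for j in range(len(mean))]
    for _ in range(iterations):
        # Multiply by the (unnormalized) covariance without building the matrix.
        projections = [sum(c * v for c, v in zip(row, vector)) for row in centered]
        vector = [sum(p * row[j] for p, row in zip(projections, centered))
                  for j in range(len(mean))]
        norm = math.sqrt(sum(v * v for v in vector))
        vector = [v / norm for v in vector]
    scores = [sum(c * v for c, v in zip(row, vector)) for row in centered]
    return mean, vector, scores


# Hypothetical contours differing only in the height of a central peak (Hz).
curves = [
    [100, 110, 120, 110, 100],
    [100, 120, 140, 120, 100],
    [100, 130, 160, 130, 100],
]
mean, pc1, scores = first_component(curves)
print([round(v, 3) for v in pc1])
print([round(s, 2) for s in scores])
```

Each curve is then summarized by its score on PC1 (and, in a fuller analysis, on further components), which is what makes the later resynthesis step a matter of choosing a few coefficients.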

Resynthesis (procedure)
mean(t) + 1.65 × PC1(t) − 0.46 × PC2(t)
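The resynthesis formula amounts to adding weighted principal component curves to the mean curve. A minimal sketch, using the two coefficients shown on the slide (1.65 and −0.46) but entirely hypothetical mean and component curves:

```python
def resynthesize(mean, components, scores):
    """Return mean(t) + sum over k of scores[k] * components[k](t) on a shared grid."""
    curve = list(mean)
    for score, component in zip(scores, components):
        curve = [value + score * c for value, c in zip(curve, component)]
    return curve


# Hypothetical mean and component curves sampled at five time points (Hz).
mean = [100, 120, 140, 120, 100]
pc1 = [0, 10, 20, 10, 0]      # stands in for peak height variation
pc2 = [10, 5, 0, -5, -10]     # stands in for overall tilt
new_contour = resynthesize(mean, [pc1, pc2], [1.65, -0.46])
print([round(v, 1) for v in new_contour])
```

Choosing scores near the centre of a category's distribution would give prototypical stimuli, while scores between two categories would give ambiguous ones; the resulting contour would then be imposed on a recording with a resynthesis tool such as Praat's PSOLA.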

Resynthesis (output)

Discussion
Explore both dense and sparse areas: prototypical or ambiguous stimuli
Sensitive to long-range relationships (non-local cue multiplicity)
Can be expanded with other parameters (intensity, durations)
Data driven, but still using some knowledge: landmark registration (stressed syllables)! What happens with other patterns?
And how to interpret the PC coefficients?
Excellent results by manipulating rich phonetic specifications, along with minimalistic phonological knowledge… 1 ≠ 0!!!


Individual variability
Recent growing interest in speaker-specific and listener-specific behaviours (using different cues for the same contrast)
Exemplar-based architecture: no ad hoc solution needed, listeners simply store even more exemplars (but: production-perception loop)
Abstractionist architecture: phonological forms (e.g. pitch accents) must be flexible enough to mirror possibly diverging phonetic specifications
Niebuhr, D’Imperio, Gili Fivela, Cangemi (2011), Are there “Shapers” and “Aligners”? Individual differences in signalling pitch accent category, ICPhS, Hong Kong

Individual speakers employ different production strategies towards a similar acoustic/perceptual goal. The relative strength of perceptually trading acoustic cues [Johnson 1997] can differ from speaker to speaker, at least for segmental variability.
But what about prosodic variability? What about tonal alignment?
Strict segmental anchoring [Ladd et al. 1999]: tonal targets are timed to occur at a fixed distance from segmental boundaries (irrespective of rate changes and/or syllabic structure differences) for a specific pitch accent type.
Does this hold for all speakers? Do some speakers use other cues? Can interpolations play a role? Is there a trade-off relationship?

Standard German
Assertiveness vs surprise: H+L* vs H*, early peak vs medial peak [Kohler 1991, Grice and Baumann 2002]
10 A-B dialogues, 35 speakers, nuclear accented CV: syllable
Peak alignment (to vowel onset: A); shape index (duration rise/fall: B/C)
[Figure: schematic f0 contours over time (t) on a CV: syllable for H+L* and H*, with intervals A, B and C marked]
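The two measures defined on this slide can be written out as a short sketch. It assumes that A is the signed distance from vowel onset to the f0 peak and that B and C are the durations of the rise and the fall; how the start of the rise and the end of the fall were located in the study is not stated here, so those time points are simply taken as given. All numbers are hypothetical.

```python
def peak_alignment(peak_time, vowel_onset):
    """A: signed distance from the accented vowel onset to the f0 peak."""
    return peak_time - vowel_onset


def shape_index(rise_start, peak_time, fall_end):
    """B/C: rise duration divided by fall duration."""
    rise = peak_time - rise_start
    fall = fall_end - peak_time
    if fall <= 0:
        raise ValueError("fall must have positive duration")
    return rise / fall


# Hypothetical measurements in ms.
early = {"vowel_onset": 500, "rise_start": 340, "peak_time": 470, "fall_end": 570}
medial = {"vowel_onset": 500, "rise_start": 490, "peak_time": 570, "fall_end": 670}

for name, m in (("early peak", early), ("medial peak", medial)):
    a = peak_alignment(m["peak_time"], m["vowel_onset"])
    bc = shape_index(m["rise_start"], m["peak_time"], m["fall_end"])
    print(name, a, round(bc, 2))
```

A negative alignment value means the peak precedes the vowel onset; a shape index above 1 means a slow rise followed by a faster fall.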

Trade-off between peak alignment (x) and shape index (y): highly significant negative correlation (r = -0.72; p < 0.001)
N = 525

        ALIGN.        SHAPE
        μ      σ      μ      σ
H+L*    -29    55     1.31   0.33
H*      68     78     0.83   0.25
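A trade-off of this kind is quantified with a Pearson correlation between the two measures. The sketch below shows the computation on a handful of made-up (alignment, shape index) pairs; it does not reproduce the reported r = -0.72, which comes from the study's 525 items.

```python
import math


def pearson_r(xs, ys):
    """Pearson product-moment correlation between two equally long sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical items: later peaks tend to go with lower shape indices.
alignment_ms = [-60, -30, 0, 40, 70, 110]
shape = [1.5, 1.3, 1.2, 1.0, 0.8, 0.9]
print(round(pearson_r(alignment_ms, shape), 2))
```

A negative coefficient means that items with later peaks tend to have lower shape indices, i.e. the two cues covary rather than being set independently.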

Neapolitan Italian
Nuclear, narrow focus: declarative vs interrogative, L+H* vs L*+H [D’Imperio 2002, Petrone 2008]
[Figure: peak alignment (y-axis) by speaker (x-axis) for L+H* (grey) and H* (white)]

Purely perceptual mechanisms? Unsystematizable idiolectal variation? Sociolects? Sound change?
Perceptual testing is needed… and underway.
Results are consistent with the literature (alignment differences across pitch accent type) when data are pooled across speakers.
When broken down by speakers, data rather reveal a continuum… with quite different behaviours at the two ends!

Thank you!


Listener-specific perception of speaker-specific productions?
Francesco Cangemi
fcangemi@uni-koeln.de

The problem
Speaker-specific production, meaning use of different cues for the same contrast; clustering of speaker types
Mean differences across conditions: assertiveness vs surprise (labelled in ToBI as H+L* vs H*, in KIM as early vs medial peak)
[Figure: peak alignment (gray) and shape index (black); «aligner» speakers (top), «shaper» speakers (bottom)]
Niebuhr et alii, 2011

The problem
…but also listener-specific perception, and even when using the same cues, e.g. switch points in categorization curves
and crucially, interaction between the two: some listeners perceive the productions of some particular speakers differently than other listeners do
Moreover, listeners’ performances might be related to their own behaviour as speakers

The problem
[Figure: speakers (S1, S2, S3, …) and listeners (L1, L2, L3, …); note on slide: “and as speakers: S3!”]

A test-case
Four focus types in German
4 target words, 7 repetitions, 560 items
Mücke and Grice, forthcoming

A test-case
Production: 5 speakers reading with four contextualizing paragraphs
Phonetics: acoustic (f0, dur) and articulatory (lipap)
Phonology: GToBI labelling
Perception: 20 listeners, identification through context matching
Do individual listeners react differently to the different speakers?

Reshaping the questions
Production: do speakers use different cues? Do they use different sets of cues? Do they use (sets of) cues in a more or less robust way? Are there more and less robust speakers?
Perception: are there more and less reliable listeners? Is it the case for some particular cues? Or rather for less robust productions?
Interaction: are some particular speakers more or less clear for some particular listeners?

Some results
Production: acoustics
Speakers: AH, DM, HN, MB, WP
Peak alignment: broad * narrow * contrast; broad * narrow/contrast; broad * narrow/contrast; broad * narrow * contrast
Peak height: broad * narrow * contrast; broad * narrow/contrast; broad * narrow/contrast; broad * narrow * contrast
Duration of target word: broad/narrow * contrast; broad/narrow * contrast; broad * narrow * contrast; broad * contrast; broad * narrow * contrast
Number of prenuclear accents: broad/narrow * contrast; broad * narrow/contrast; broad * narrow/contrast; broad/narrow * contrast
Duration of first word: broad * contrast; broad * narrow/contrast; broad * narrow/contrast
Expectations towards perception? AH > HN > DM > WP > MB?

Some results
Production: articulation
Mücke and Grice, forthcoming

Some results
Perception
Across listeners: BB > ... > KS2, ~30%
Across speakers: AH > DM > HN > WP > MB, ~10%
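Rankings across listeners and across speakers come from aggregating identification responses along one factor at a time, while a speaker-listener interaction only shows up when the responses are broken down by dyad. The sketch below illustrates this with a tiny invented response table; the speaker and listener codes are borrowed from the slides, but the responses are not real data.

```python
from collections import defaultdict


def proportion_correct(responses, key):
    """Proportion of correct responses for each value of key(response)."""
    totals = defaultdict(lambda: [0, 0])  # value -> [hits, trials]
    for response in responses:
        cell = totals[key(response)]
        cell[0] += response["correct"]
        cell[1] += 1
    return {k: hits / n for k, (hits, n) in totals.items()}


# Invented responses: 1 = context identified correctly, 0 = not.
responses = [
    {"speaker": "AH", "listener": "BB", "correct": 1},
    {"speaker": "AH", "listener": "BB", "correct": 1},
    {"speaker": "AH", "listener": "KS2", "correct": 1},
    {"speaker": "AH", "listener": "KS2", "correct": 0},
    {"speaker": "MB", "listener": "BB", "correct": 1},
    {"speaker": "MB", "listener": "BB", "correct": 1},
    {"speaker": "MB", "listener": "KS2", "correct": 0},
    {"speaker": "MB", "listener": "KS2", "correct": 0},
]

by_speaker = proportion_correct(responses, lambda r: r["speaker"])
by_listener = proportion_correct(responses, lambda r: r["listener"])
by_dyad = proportion_correct(responses, lambda r: (r["speaker"], r["listener"]))
print(by_speaker)
print(by_listener)
print(by_dyad)
```

In this toy table listener BB is at ceiling for both speakers, while KS2's accuracy depends on the speaker, which is the kind of pattern a speaker-by-listener interaction term is meant to capture.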

Some results
Interactions
Logit linear model predicting correct answers (contrastive), Speaker and Listener as fixed factors
Significantly different from model including interactions: Pr(Chi) = 0.002 **
Similarly significant results with mixed models, using intercepts and slopes for speaker-listener dyads (thanks Roger Mundry and SSSPP summer school!)

New questions
Method:
Listeners also performing the reading task? (Do they use strategies which are similar to those of the speakers they rated more consistently?)
Speakers also performing the identification task? (Do they identify more reliably the productions of speakers who use strategies similar to their own?)
And if yes: should they also rate themselves?!

New questions
Assumptions:
Can we really postulate that some speakers or listeners are more performative than others? (“universal donor and universal recipient”)
Does this conflict with viewing similarity of production as an advantage for perception? (Empedocles’ “like is known by like”)
Implications:
From speaker/listener-specific behaviour, through speaker/listener group behaviour, to sound change? With listener-speaker mismatch? (Ohala, 1981)