224s.09.lec5

224s.09.lec5 - CS 224S / LINGUIST 281 Speech Recognition,...


CS 224S / LINGUIST 281
Speech Recognition, Synthesis, and Dialogue
Dan Jurafsky

Lecture 5: Prosodic Processing for TTS

IP Notice: many of the slides in the first half come from two lectures of Jennifer Venditti on intonation (thanks!); lots of other info in these slides comes from Alan Black's and Richard Sproat's lecture notes.

Outline
I. Linguistic Background
   1. What is Prosody?
   2. Thinking about F0
   3. Intonational Prominence: Pitch Accents
   4. Intonational Boundaries/Phrasing
   5. Intonational Tunes
II. Producing Intonation in TTS
   1. Predicting Accents
   2. Predicting Boundaries
   3. Predicting Duration
   4. Generating F0
Advanced: The TOBI Prosodic Transcription Theory

Part I: Linguistic Background

I.1 Defining Intonation
Ladd (1996), Intonational Phonology
The use of suprasegmental phonetic features
   (Suprasegmental = above and beyond the segment/phone)
   F0
   Intensity (energy)
   Duration
to convey sentence-level pragmatic meanings
   (I.e. meanings that apply to phrases or utterances as a whole, not lexical stress, not lexical tone.)

Three aspects of prosody
Prominence: some syllables/words are more prominent than others
Structure/boundaries: sentences have prosodic structure
   Some words group naturally together
   Others have a noticeable break or disjuncture between them
Tune: the intonational melody of an utterance.
From Ladd (1996)

Prosodic Prominence: Pitch Accents
A: What types of foods are a good source of vitamins?
B1: Legumes are a good source of VITAMINS.
B2: LEGUMES are a good source of vitamins.
Prominent syllables are:
   Louder
   Longer
   Have higher F0 and/or sharper changes in F0 (higher F0 velocity)
Slide from Jennifer Venditti

Prosodic Boundaries
I met Mary and Elena's mother at the mall yesterday.
I met Mary and Elena's mother at the mall yesterday.
French [bread and cheese]
[French bread] and [cheese]
Slide from Jennifer Venditti

Prosodic Tunes
Legumes are a good source of vitamins.
Are legumes a good source of vitamins?
Slide from Jennifer Venditti

Part I.2 Thinking about F0

Graphic representation of F0
[Figure: F0 contour for "legumes are a good source of VITAMINS"; x-axis: time; y-axis: F0 (in Hertz), 50 to 400]
Slide from Jennifer Venditti

The ripples
[Figure: the same F0 contour for "legumes are a good source of VITAMINS", with the segments [t], [s], [s] marked; y-axis: F0 (in Hertz), 50 to 400]
F0 is not defined for consonants without vocal fold vibration....
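
The point that F0 is undefined where the vocal folds do not vibrate, and the earlier remark about F0 velocity, can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and not part of the slides: the contour values, the 100 frames-per-second rate, the use of None for unvoiced frames, and the helper names voiced_runs and max_f0_velocity.

```python
from typing import List, Optional, Tuple

FRAME_RATE_HZ = 100.0  # assumed: one F0 estimate every 10 ms (not from the slides)


def voiced_runs(f0: List[Optional[float]]) -> List[Tuple[int, int]]:
    """Return (start, end) index pairs, end exclusive, of consecutive voiced frames.

    Unvoiced frames (e.g. during [t] or [s]) are stored as None because
    F0 is undefined there.
    """
    runs: List[Tuple[int, int]] = []
    start: Optional[int] = None
    for i, value in enumerate(f0):
        if value is not None and start is None:
            start = i                      # a voiced stretch begins
        elif value is None and start is not None:
            runs.append((start, i))        # a voiced stretch ends
            start = None
    if start is not None:
        runs.append((start, len(f0)))      # contour ended while still voiced
    return runs


def max_f0_velocity(f0: List[Optional[float]],
                    frame_rate_hz: float = FRAME_RATE_HZ) -> float:
    """Largest absolute F0 change (Hz per second) between adjacent voiced frames.

    Differences are never taken across an unvoiced gap.
    """
    best = 0.0
    for start, end in voiced_runs(f0):
        for i in range(start + 1, end):
            best = max(best, abs(f0[i] - f0[i - 1]) * frame_rate_hz)
    return best


# Hypothetical contour in Hz; None marks frames with no vocal fold vibration.
contour = [None, 180.0, 185.0, 200.0, None, None, 210.0, 190.0, None]
runs = voiced_runs(contour)
velocity = max_f0_velocity(contour)
```

Treating unvoiced frames as missing values, instead of as zero, keeps the gaps from showing up as spurious sharp F0 movements when the contour is analysed.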

This note was uploaded on 04/21/2011 for the course CS 224 taught by Professor De during the Spring '11 term at Kentucky.
