Lecture 3_winter_2012_6tp

1 Digital Speech Processing: Lecture 3
Acoustic Theory of Speech Production

2 Topics to be Covered
• Sound production mechanisms of the human vocal tract
• Sounds of language => phonemes
• Conversion of text to sounds via letter-to-sound rules and dictionary lookup
• Location/properties of sounds in the acoustic waveform
• Location/properties of sounds in spectrograms
• Articulatory properties of speech sounds: place and manner of articulation

3 Topics to be Covered
• sounds of speech
  – acoustic phonetics
  – place and manner of articulation
• sound propagation in the human vocal tract
• transmission line analogies
• time-varying linear system approaches
• source models

4 Basic Speech Processes
• idea → sentences → words → sounds → waveform → waveform → sounds → words → sentences → idea
  Idea: it’s getting late, I should go to lunch, I should call Al and see if he wants to join me for lunch today
  Words: Hi Al, did you eat yet?
  Sounds: /h/ /ay/ - /ae/ /l/ - /d/ /ih/ /d/ - /y/ /u/ - /iy/ /t/ - /y/ /ε/ /t/
  Coarticulated Sounds: /h-ay-l/ - /d-ih-j-uh/ - /iy-t-j-ε-t/ (hial-dija-eajet)
• remarkably, humans can decode these sounds and determine the meaning that was intended, at least at the idea/concept level (perhaps not completely at the word or sound level); often machines can also do the same task
  – speech coding: waveform → (model) → waveform
  – speech synthesis: words → waveform
  – speech recognition: waveform → words/sentences
  – speech understanding: waveform → idea

5 Basics
• speech is composed of a sequence of sounds
• sounds (and transitions between them) serve as a symbolic representation of information to be shared between humans (or humans and machines)
• arrangement of sounds is governed by rules of language (constraints on sound sequences, word sequences, etc.); /spl/ exists, /sbk/ doesn’t exist
• linguistics is the study of the rules of language
• phonetics is the study of the sounds of speech
• can exploit knowledge about the structure of sounds and language, and how it is encoded in the signal, to do speech analysis, speech coding, speech synthesis, speech recognition, speaker recognition, etc.

6 Human Vocal Apparatus
[Figure: mid-sagittal plane X-ray of human vocal apparatus]
• vocal tract: dotted lines in figure; begins at the glottis (the vocal cords) and ends at the lips
  – consists of the pharynx (the connection from the esophagus to the mouth) and the mouth itself (the oral cavity)
  – average male vocal tract length is 17.5 cm
  – cross sectional area, determined by positions of the tongue, lips, jaw and velum, varies from zero (complete closure) to 20 sq cm
• nasal tract: begins at the velum and ends at the nostrils
• velum: a trapdoor-like mechanism at the back of the mouth cavity; lowers to couple the nasal tract to the vocal tract to produce the nasal sounds like /m/ (mom), /n/ (night), /ng/ (sing)

Vocal Tract MRI Sequences
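As a rough illustration of why the 17.5 cm vocal tract length matters acoustically, the sketch below treats the tract as an idealized uniform, lossless tube that is closed at the glottis and open at the lips. This simplification is an assumption and is not stated in the slides above; the speed of sound (35,000 cm/s) and the function name `tube_resonances` are likewise illustrative choices. Under those assumptions the tube resonates at odd multiples of c / (4L).

```python
# Hedged sketch: resonances of an idealized uniform, lossless tube that is
# closed at one end (glottis) and open at the other (lips).
# The tube model and the speed of sound are assumptions, not slide content.

SPEED_OF_SOUND_CM_PER_S = 35_000.0  # assumed value for warm, moist air
VOCAL_TRACT_LENGTH_CM = 17.5        # average male length quoted in the slides


def tube_resonances(length_cm, n_resonances=3, c=SPEED_OF_SOUND_CM_PER_S):
    """Return the first n resonances in Hz: F_k = (2k - 1) * c / (4 * L)."""
    if length_cm <= 0:
        raise ValueError("length_cm must be positive")
    return [
        (2 * k - 1) * c / (4.0 * length_cm)
        for k in range(1, n_resonances + 1)
    ]


if __name__ == "__main__":
    for k, freq in enumerate(tube_resonances(VOCAL_TRACT_LENGTH_CM), start=1):
        print(f"F{k} = {freq:.0f} Hz")
```

With these assumed values the first three resonances come out at 500, 1500 and 2500 Hz. A real vocal tract has a non-uniform, time-varying cross-sectional area, so actual resonances shift away from these values depending on the positions of the tongue, lips, jaw and velum.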
7 MRI of Speech (Prof. Shri Narayanan, USC)

8 Real Time MRI – Shri Narayanan, USC

This note was uploaded on 12/29/2011 for the course ECE 259 taught by Professor L. Rabiner during the Fall '08 term at UCSB.
