Lecture 3_fall_2010



Digital Speech Processing: Lecture 3
Acoustic Theory of Speech Production

Topics to be Covered
• Sound production mechanisms of the human vocal tract
• Sounds of language => phonemes
• Conversion of text to sounds via letter-to-sound rules and dictionary lookup
• Location of sounds in the acoustic waveform
• Location of sounds in spectrograms
• Articulatory properties of speech sounds: place and manner of articulation

Topics to be Covered
• sounds of speech
– acoustic phonetics
– place and manner of articulation
• sound propagation in the human vocal tract
• transmission line analogies
• time-varying linear system approaches
• source models

Basic Speech Processes
• idea -> sentences -> words -> sounds -> waveform -> waveform -> sounds -> words -> sentences -> idea
– Idea: it’s getting late, I should go to lunch, I should call Al and see if he wants to join me for lunch today
– Words: Hi Al, did you eat yet?
– Sounds: /h/ /ay/-/ae/ /l/-/d/ /ih/ /d/-/y/ /u/-/iy/ /t/-/y/ /ε/ /t/
– Coarticulated Sounds: /h-ay-l/-/d-ih-j-uh/-/iy-t-j-ε-t/ (hial-dijaeajet)
• remarkably, humans can decode these sounds and determine the meaning that was intended, at least at the idea/concept level (perhaps not completely at the word or sound level); often machines can also do the same task
– speech coding: waveform -> (model) -> waveform
– speech synthesis: words -> waveform
– speech recognition: waveform -> words/sentences
– speech understanding: waveform -> idea

Basics
• speech is composed of a sequence of sounds
• sounds (and transitions between them) serve as a symbolic representation of information to be shared between humans (or humans and machines)
• arrangement of sounds is governed by rules of language (constraints on sound sequences, word sequences, etc.): /spl/ exists, /sbk/ doesn’t exist
• linguistics is the study of the rules of language
• phonetics is the study of the sounds of speech
=> can exploit knowledge about the structure of sounds and language, and how it is encoded in the signal, to do speech analysis, speech coding, speech synthesis, speech recognition, speaker recognition, etc.
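
The dictionary-lookup route from text to sounds, listed among the topics above, can be sketched in a few lines. This is a minimal illustration, not the method used in the course: the lexicon is a toy one whose entries follow the dictionary-based transcriptions given in this lecture, and the names `LEXICON`, `transcribe`, and `format_transcription` are hypothetical. Words missing from the lexicon are returned as `None`; a real system would fall back to letter-to-sound rules at that point.

```python
# Toy pronunciation lexicon using ARPABET-style symbols.
# Assumption: entries mirror the dictionary-based transcriptions in this
# lecture; a real system would load a full pronouncing dictionary.
LEXICON = {
    "my": ["M", "AY"],
    "name": ["N", "EY", "M"],
    "is": ["IH", "Z"],
    "larry": ["L", "AE", "R", "IY"],
    "how": ["H", "AW"],
    "old": ["OW", "L", "D"],
    "are": ["AA", "R"],
    "you": ["Y", "UW"],
}


def transcribe(sentence):
    """Return one phoneme list per word, or None for words not in the lexicon."""
    words = [w.strip(".,?!'\"").lower() for w in sentence.split()]
    return [LEXICON.get(w) for w in words if w]


def format_transcription(phoneme_lists):
    """Render phonemes as /X/ symbols, with '-' marking word boundaries."""
    rendered = []
    for phonemes in phoneme_lists:
        if phonemes is None:
            rendered.append("<unknown>")  # letter-to-sound rules would go here
        else:
            rendered.append(" ".join("/%s/" % p for p in phonemes))
    return "-".join(rendered)


if __name__ == "__main__":
    print(format_transcription(transcribe("My name is Larry")))
    print(format_transcription(transcribe("Hi Al, how are you?")))
```

The lookup produces ideal, word-by-word pronunciations only; it does not model the coarticulation across word boundaries shown in the "Hi Al, did you eat yet?" example.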

Human Vocal Apparatus
• vocal tract: dotted lines in figure; begins at the glottis (the vocal cords) and ends at the lips
• consists of the pharynx (the connection from the esophagus to the mouth) and the mouth itself (the oral cavity)
• average male vocal tract length is 17.5 cm
• cross-sectional area, determined by positions of the tongue, lips, jaw and velum, varies from zero (complete closure) to 20 sq cm
• nasal tract: begins at the velum and ends at the nostrils
• velum: a trapdoor-like mechanism at the back of the mouth cavity; lowers to couple the nasal tract to the vocal tract to produce the nasal sounds like /m/ (mom), /n/ (night), /ng/ (sing)
[Figure: mid-sagittal plane X-ray of human vocal apparatus]
[Demo: Vocal Tract MRI Sequences]

MRI of Speech (Prof. Shri Narayanan, USC)

Real Time MRI – Shri Narayanan, USC

Schematic View of Vocal Tract
Speech Production Mechanism:
• air enters the lungs via normal breathing and no speech is produced (generally) on in-take
• as air is expelled from the lungs, via the trachea or windpipe, the tensed vocal cords within the larynx are caused to vibrate (Bernoulli oscillation) by the air flow
• air is chopped up into quasi-periodic pulses which are modulated in frequency (spectrally shaped) in passing through the pharynx (the throat cavity), the mouth cavity, and possibly the nasal cavity; the positions of the various articulators (jaw, tongue, velum, lips, mouth) determine the sound that is produced
[Demo: Acoustic Tube Models]

Tube Models
[Figures: two slides]

Vocal Cords
The vocal cords (folds) form a relaxation oscillator. Air pressure builds up and blows them apart.
Air flows through the orifice and pressure drops, allowing the vocal cords to close. Then the cycle is repeated.

Vocal Cord Views and Operation
[Figure panels: Bernoulli Oscillation; Tensed Vocal Cords – Ready to Vibrate; Lax Vocal Cords – Open for Breathing]

Glottal Flow
Glottal volume velocity and resulting sound pressure at the mouth for the first 30 msec of a voiced sound
• 15 msec buildup to periodicity => pitch detection issues at beginning and end of voicing; also voiced-unvoiced uncertainty for 15 msec

Artificial Larynx
[Demo: Artificial Larynx]

Schematic Production Mechanism
• lungs and associated muscles act as the source of air for exciting the vocal mechanism
• muscle force pushes air out of the lungs (like a piston pushing air up within a cylinder) through bronchi and trachea
• if vocal cords are tensed, air flow causes them to vibrate, producing voiced or quasi-periodic speech sounds (musical notes)
• if vocal cords are relaxed, air flow continues through vocal tract until it hits a constriction in the tract, causing it to become turbulent, thereby producing unvoiced sounds (like /s/, /sh/), or it hits a point of total closure in the vocal tract, building up pressure until the closure is opened and the pressure is suddenly and abruptly released, causing a brief transient sound, like at the beginning of /p/, /t/, or /k/
[Figure: schematic representation of physiological mechanisms of speech production]

Abstractions of Physical Model
excitation (voiced, unvoiced, mixed) -> Time-Varying Filter -> speech

The Speech Signal
• speech is a sequence of ever changing sounds
• sound properties are highly dependent on context (i.e., the sounds which occur before and after the current sound)
• the state of
the vocal cords, the positions, shapes and sizes of the various articulators all change slowly over time, thereby producing the desired speech sounds
=> need to determine the physical properties of speech by observing and measuring the speech waveform (as well as signals derived from the speech waveform, e.g., the signal spectrum)

Speech Waveforms and Spectra
• 100 msec/line; 0.5 sec for utterance
• S: silence, background, no speech
• U: unvoiced, no vocal cord vibration (aspiration, unvoiced sounds)
• V: voiced, quasi-periodic speech
• speech is a slowly time varying signal over 5-100 msec intervals
• over longer intervals (100 msec-5 sec), the speech characteristics change as rapidly as 10-20 times/second => no well-defined or exact regions where individual sounds begin and end

Speech Sounds
• “Should we chase”
– /sh/ sound
– /ould/ sounds
– /we/ sounds
– /ch/ sound
– /a/ sound
– /s/ sound
• hard to distinguish weak sounds from silence
• hard to segment with high precision => don’t do it when it can be avoided
[Demo: COOL EDIT, ’should’, ‘every’]

Estimate of Pitch Period - I
[Figure labels: IY TH IY V Z UW HH]

Estimate of Pitch Period - II
[Figure labels: R AA B R EH F Z N]

Source-System Model of Speech Production

Making Speech “Visible” in 1947

Spectrogram Properties
Speech Spectrogram: sound intensity versus time and frequency
• wideband spectrogram: spectral analysis on 15 msec sections of waveform using a broad (125 Hz) bandwidth analysis filter, with new analyses every 1 msec
– spectral intensity resolves individual periods of the speech and shows vertical striations during voiced regions
• narrowband
spectrogram: spectral analysis on 50 msec sections of waveform using a narrow (40 Hz) bandwidth analysis filter, with new analyses every 1 msec
– narrowband spectrogram resolves individual pitch harmonics and shows horizontal striations during voiced regions

Wideband and Narrowband Spectrograms

Sound Spectrogram
Wav Surfer: www.wavsurfer.com (wavsurfer demo: ’s5’, ‘s5_synthetic’)
VoiceBox: www.ee.ic.ac.uk/hp/staff/dmb/voicebox/voicebox.htm (voicebox demo: ’s5’, ‘s5_synthetic’)
COLEA UI: www.utdallas.edu/~loizou/speech/colea.htm (COLEA demo: ’should’, ‘every’)
HMM Toolkit: www.ai.mit.edu/~murphyk/Software/HMM/hmm.html#hmm

Speech Sentence Waveform

Speech Wideband Spectrogram

Acoustic Theory of Speech Production

Sound Source for Voiced Sounds

Sound Source for Unvoiced Sounds

Parametrization of Spectra
• human vocal tract is essentially a tube of varying cross-sectional area, or can be approximated as a concatenation of tubes of varying cross-sectional areas
• acoustic theory shows that the transfer function of energy from the excitation source to the output can be described in terms of the natural frequencies or resonances of the tube
• resonances are known as formants or formant frequencies for speech, and they represent the frequencies that pass the most acoustic energy from the source to the output
• typically there are 3 significant formants below about 3500 Hz
• formants are a highly efficient, compact representation of speech

Spectrogram and Formants
Key Issue: reliability in estimating formants from spectral data

Waveform and Spectrogram

Acoustic Theory Summary
• basic speech processes: from ideas to speech (production), from speech to ideas (perception)
• basic vocal production mechanisms: vocal tract, nasal tract, velum
• source of sound flow at the glottis; output of sound flow at the lips and nose
• speech waveforms and properties: voiced, unvoiced, silence, pitch
• speech spectrograms and properties: wideband spectrograms, narrowband spectrograms, formants

English Speech Sounds
ARPABET representation
• 48 sounds
• 18 vowels/diphthongs
• 4 vowel-like consonants
• 21 standard consonants
• 4 syllabic sounds
• 1 glottal stop

Phonemes: Link Between Orthography and Speech
Orthography -> sequence of sounds
• Larry -> /l/ /ae/ /r/ /iy/ (/L/ /AE/ /R/ /IY/)
Speech Waveform -> sequence of sounds
• based on acoustic properties (temporal) of phonemes
Spectrogram -> sequence of sounds
• based on acoustic properties (spectral) of phonemes
The bottom line is that we use a phonetic code as an intermediate representation of language, from either orthography or from waveforms or spectrograms; now we have to learn how to recognize sounds within speech utterances

Phonetic Transcriptions
• based on ideal (dictionary-based) pronunciations of all words in sentence
– ‘My name is Larry’: /M/ /AY/-/N/ /EY/ /M/-/IH/ /Z/-/L/ /AE/ /R/ /IY/
– ‘How old are you’: /H/ /AW/-/OW/ /L/ /D/-/AA/ /R/-/Y/ /UW/
– ‘Speech processing is fun’: /S/ /P/ /IY/ /CH/-/P/ /R/ /AH/ /S/ /EH/ /S/ /IH/ /NG/-/IH/ /Z/-/F/ /AH/ /N/
• word ambiguity abounds
– ‘lives’: /L/ /IH/ /V/ /Z/ (he lives here) versus /L/ /AY/ /V/ /Z/ (a cat has nine lives)
– ‘record’: /R/ /EH/ /K/ /ER/ /D/ (he holds the world record) versus /R/
/IY/ /K/ /AW/ /D/ (please record my favorite show tonight)

She had your dark suit in…
[Figure labels: SH IY, HH AE D, Y AXR, D AA R K, S UW T, IH N]

“Wideband” Spectrogram
[Figure labels: SH IY, HH AE D, Y AXR, D AA R K, S UW T, IH N]

Reduced Set of English Sounds
• 39 sounds
– 11 vowels (front, mid, back), classification based on tongue hump position
– 4 diphthongs (vowel-like combinations)
– 4 semi-vowels (liquids and glides)
– 3 nasal consonants
– 6 voiced and unvoiced stop consonants
– 8 voiced and unvoiced fricative consonants
– 2 affricate consonants
– 1 whispered sound
• look at each class of sounds to characterize their acoustic and spectral properties

Phoneme Classification Chart
[Figure labels: Vocal Cords Vibrating; Noise-Like Excitation]

Vowels
• longest duration sounds, least context sensitive
• can be held indefinitely in singing and other musical works (opera)
• carry very little linguistic information (some languages don’t display vowels in text: Hebrew, Arabic)
Text 1: all vowels deleted
Th_y n_t_d s_gn_f_c_nt _mpr_v_m_nts _n th_ c_mp_ny’s _m_g_, s_p_rv_s__n _nd m_n_g_m_nt.
Text 2: all consonants deleted
A__i_u_e_ _o_a__ _a_ __a_e_ e__e__ia___ __e _a_e, _i__ __e __i_e_ o_ o__u_a_io_a_ e___o_ee_ __i_____ _e__ea_i__.

Vowels and Consonants
Text 1: all vowels deleted
Th_y n_t_d s_gn_f_c_nt _mpr_v_m_nts _n th_ c_mp_ny’s _m_g_, s_p_rv_s__n _nd m_n_g_m_nt.
(They noted significant improvements in the company’s image, supervision and management.)
Text 2: all consonants deleted
A__i_u_e_ _o_a__ _a_ __a_e_ e__e__ia___ __e _a_e, _i__ __e __i_e_ o_ o__u_a_io_a_ e___o_ee_ __i_____ _e__ea_i__.
(Attitudes toward pay stayed essentially the same, with the scores of occupational employees slightly decreasing)

More Textual Examples
Text (all vowels deleted):
_n th_ n_xt f_w d_c_d_s, _dv_nc_s _n c_mm_n_c_t_ _ns w_ll r_d_c_lly ch_ng_ th_ w_y w_ l_v_ _nd w_rk.
(In the next few decades, advances in communications will radically change the way we live and work.)
Text (all consonants deleted):
_ _e _o_ _e_ _ o_ _oi_ _ _o _o_ _ _i_ _ _ _a_ _e _ _o_ _o_ _u_i_ _ …
(The concept of going to work will change from commuting…)

Vowels
• produced using fixed vocal tract shape
• sustained sounds
• vocal cords are vibrating => voiced sounds
• cross-sectional area of vocal tract determines vowel resonance frequencies and vowel sound quality
• tongue position (height, forward/back position) most important in determining vowel sound
• usually relatively long in duration (can be held during singing) and are spectrally well formed

Vowel Production

Vowel Articulatory Shapes
• tongue hump position (front, mid, back)
• tongue hump height (high, mid, low)
• /IY/, /IH/, /AE/, /EH/ => front => high resonances
• /AA/, /AH/, /AO/ => mid => energy balance
• /UH/, /UW/, /OW/ => back => low frequency resonances

Vowel Waveforms & Spectrograms
Synthetic versions of the 10 vowels

Vowel Formants
Clear pattern of variability of vowel pronunciation among men, women and children
Strong overlap for different vowel sounds by different
talkers => no unique identification of vowel strictly from resonances => need context to define vowel sound

The Vowel Triangle
Centroids of common vowels form clear triangular pattern in F1-F2 space
[Demo: iy-ih-eh-ae-uh]

Canonic Vowel Spectra
[Figure: spectra for IY, AA, UW; 100 Hz fundamental; panel labels 10 Hz, 33 Hz, 100 Hz]

Canonic Vowel Spectra
[Figure: spectra for IY, AA, UW; 100 Hz fundamental versus 300 Hz fundamental]

Eliminating Vowels and Consonants

Diphthongs
• Gliding speech sound that starts at or near the articulatory position for one vowel and moves to or toward the position for another vowel
– /AY/ in buy
– /AW/ in down
– /EY/ in bait
– /OY/ in boy
– /OW/ in boat (usually classified as vowel, not diphthong)
– /Y/ in you (usually classified as glide)

Distinctive Features
Classify non-vowel/non-diphthong sounds in terms of distinctive features
– place of articulation
  • Bilabial (lips): p, b, m, w
  • Labiodental (between lips and front of teeth): f, v
  • Dental (teeth): th, dh
  • Alveolar (front of palate): t, d, s, z, n, l
  • Palatal (middle of palate): sh, zh, r
  • Velar (at velum): k, g, ng
  • Pharyngeal (at end of pharynx): h
– manner of articulation
  • Glide (smooth motion): w, l, r, y
  • Nasal (lowered velum): m, n, ng
  • Stop (constricted vocal tract): p, t, k, b, d, g
  • Fricative (turbulent source): f, th, s, sh, v, dh, z, zh, h
  • Voicing (voiced source): b, d, g, v, dh, z, zh, m, n, ng, w, l, r
  • Mixed source (both voicing and unvoiced): j, ch
  • Whispered: h

Places of Articulation

Semivowels (Liquids and Glides)
• vowel-like in nature (called semivowels for this reason)
• voiced sounds (w-l-r-y)
• acoustic characteristics of these sounds are strongly influenced by context, unlike most vowel sounds which are much less influenced by context
Manner: glides
Place: bilabial (w), alveolar (l), palatal (r)
[Demo: uh-{w,l,r,y}-a]

Nasal Consonants
• The nasal
consonants consist of /M/, /N/, and /NG/
– nasals produced using glottal excitation => voiced sounds
– vocal tract totally constricted at some point along the tract
– velum lowered so sound is radiated at nostrils
– constricted oral cavity serves as a resonant cavity that traps acoustic energy at certain natural frequencies (anti-resonances or zeros of transmission)
– /M/ is produced with a constriction at the lips => low frequency zero
– /N/ is produced with a constriction just behind the teeth => higher frequency zero
– /NG/ is produced with a constriction just forward of the velum => even higher frequency zero
Manner: nasal
Place: bilabial (m), alveolar (n), velar (ng)
[Demo: uh-{m,n,ng}-a]

Nasal Production

Nasal Sounds
[Figure: UH M AA, UH N AA; hole in spectrum]

Nasal Spectrograms

Unvoiced Fricatives
• Consonant sounds /F/, /TH/, /S/, /SH/
– produced by exciting vocal tract by steady air flow which becomes turbulent in region of a constriction in the vocal tract
  • /F/ constriction near the lips
  • /TH/ constriction near the teeth
  • /S/ constriction near the middle of the vocal tract
  • /SH/ constriction near the back of the vocal tract
– noise source at constriction => vocal tract is separated into two cavities
– sound radiated from lips: front cavity
– back cavity traps energy and produces antiresonances (zeros of transmission)
Manner: fricative
Place: labiodental (f), dental (th), alveolar (s), palatal (sh)
[Demo: uh-{f,th,s,sh}-a]

Unvoiced Fricative Production

Unvoiced Fricatives
[Figure: UH F AA, UH S AA, UH SH AA]

Unvoiced Fricative Spectrograms

Voiced Fricatives
• Sounds /V/, /DH/, /Z/, /ZH/
– place of constriction same as for unvoiced counterparts
– two sources of excitation: vocal cords vibrating, producing semi-periodic puffs of air to excite the tract; the resulting air flow
becomes turbulent at the constriction, giving a noise-like component in addition to the voiced-like component
Manner: fricative
Place: labiodental (v), dental (dh), alveolar (z), palatal (zh)
[Demo: uh-{v,dh,z,zh}-a]

Voiced Fricatives
[Figure: UH V AA, UH ZH AA]

Voiced and Unvoiced Stop Consonants
• sounds: /B/, /D/, /G/ (voiced stop consonants) and /P/, /T/, /K/ (unvoiced stop consonants)
– voiced stops are transient sounds produced by building up pressure behind a total constriction in the oral tract and then suddenly releasing the pressure, resulting in a pop-like sound
  • /B/ constriction at lips
  • /D/ constriction at back of teeth
  • /G/ constriction at velum
– no sound is radiated from the lips during constriction => sometimes sound is radiated from the throat during constriction (leakage through tract walls) allowing vocal cords to vibrate in spite of total constriction
– stop sounds strongly influenced by surrounding sounds
– unvoiced stops have no vocal cord vibration during period of closure => brief period of frication (due to sudden turbulence of escaping air) and aspiration (steady air flow from the glottis) before voiced excitation begins
Manner: stop
Place: bilabial (b, p), alveolar (d, t), velar (g, k)
[Demo: uh-{b,d,g}-a]

Stop Consonant Production

Voiced Stop Consonant
[Figure: UH B AA]

Unvoiced Stop Consonants
[Figure: UH P AA, UH T AA; stop gap]
[Demos: uh-{p,t,k}-a, uh-{j,ch,h}-a]

Stop Consonant Waveforms and Spectrograms
[Demos: uh-{p,t,k}-a, uh-{j,ch,h}-a]

Distinctive Phoneme Features
• the brain recognizes sounds by doing a distinctive feature analysis from the information going to the brain
• the distinctive features are somewhat insensitive to noise, background, reverberation => they are robust and reliable

Distinctive Features
• place and manner of articulation completely define the consonant sounds, making speech perception robust to a range of external factors

Vowel and diphthong key words:
IY-beat, IH-bit, EH-bet, AE-bat, AA-bob, ER-bird, AH-but, AO-bought, UW-boot, UH-book, OW-boat, AW-down, AY-buy, OY-boy, EY-bait

Review Exercises
Write the transcription of the sentence “Good friends are hard to find”
G-UH-D F-R-EH-N-D-Z AA-R HH-AA-R-D T-UH (UW) F-AY-N-D

Review Exercises
file: enjoy 10k, sampling rate: 10000, starting sample: 1, number of samples: 8079
Enjoy: EH-N-JH-OY
[Figure: waveform, 2000 samples per row; x-axis: sample number; row labels: samples offset; segment labels EH, N, JH, OY]

Review Exercises
file: simple10k, sampling rate: 10000, starting sample: 1, number of samples: 7152
Simple: S-IH-M-P-(AX-L | EL)
[Figure: waveform, 2000 samples per row; x-axis: sample number; row labels: samples offset; segment labels S, IH, M, P, AX-L | EL]

This is a test (16 kHz sampling rate)
[Figure labels: TH-IH-S, IH-Z, UH, T-EH-S-T]

Ultimate Exercise: Identify Words From Spectrogram
Word Choices: that, and, was, by, people, little, simple, between, very, enjoy, only, other, company, those
/was/: this word can be identified by the voiced initial portion with very low first and second formants (sounds like UW or W), followed by the AA sound and ending with the Z (S) sound.

Ultimate Exercise: Identify Words From Spectrogram
Word Choices: that, and, was, by, people, little, simple, between, very, enjoy, only, other, company, those
/enjoy/: this word can be identified by the two-syllable nature, with the nasal sound N at the end of the first syllable, and the fricative JH at the beginning of the second syllable, with the characteristic OY diphthong at the end of the word
/company/: this word can be identified by the three-syllable nature, with the initial stop consonant K, the first syllable ending in the nasal M, followed by the stop P, and with the second syllable ending with the nasal N followed by an IY vowel-like sound
/simple/: this word can be identified by the two-syllable nature, with a strong initial fricative S beginning the first syllable and the nasal M ending the first syllable, and with the stop consonant P beginning the second syllable

Summary
• sounds of the English language: phonemes, syllables, words
• phonetic transcriptions of words and sentences: coarticulation across word boundaries
• vowels and consonants: their roles, articulatory shapes, waveforms, spectrograms, formants
• distinctive feature representations of speech
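
As a rough numerical check on two figures quoted earlier in the lecture (a 17.5 cm average male vocal tract, and about 3 significant formants below 3500 Hz), the sketch below computes the resonances of an idealized tube. The idealization is an assumption that the slides do not state: a uniform, lossless tube closed at the glottis and open at the lips, whose resonances fall at odd multiples of c/(4L). The speed of sound (350 m/s) and the function name `uniform_tube_resonances` are likewise assumptions made for illustration.

```python
def uniform_tube_resonances(length_m, count=3, speed_of_sound=350.0):
    """Resonances (Hz) of a uniform lossless tube closed at one end.

    Assumes the quarter-wavelength model F_k = (2k - 1) * c / (4 * L),
    with k = 1, 2, ..., count. Real vocal tracts have a varying
    cross-sectional area, so actual formants move away from these values.
    """
    return [(2 * k - 1) * speed_of_sound / (4.0 * length_m)
            for k in range(1, count + 1)]


if __name__ == "__main__":
    # 17.5 cm is the average male vocal tract length quoted in the lecture.
    for k, f in enumerate(uniform_tube_resonances(0.175), start=1):
        print("F%d is roughly %.0f Hz" % (k, f))
```

Under these assumptions the first three resonances come out near 500, 1500, and 2500 Hz, which is consistent with three formants lying below about 3500 Hz. Changing the tube shape, as the articulators do for each vowel, is what shifts the formants away from this neutral pattern.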

This note was uploaded on 12/29/2011 for the course ECE 259 taught by Professor Rabiner,l during the Fall '08 term at UCSB.
