11: SPEECH ANALYSIS

Introduction

In the previous two labs we found that we could analyze a complex sound with a particular frequency using the power spectrum. For sounds made up of a harmonic series, we gave precise definitions in terms of the power spectrum:

Pitch: the frequency of the fundamental of the harmonic series (f1).
Timbre: the relative heights of the peaks in the power spectrum (compared to the fundamental).
Loudness: the overall heights of the peaks in the power spectrum.

In this lab we will try to better understand the acoustical features of speech. Speech sounds, called “phonemes,” are classified as either vowels or consonants.

Frequency analysis of a vowel sound always reveals a clear harmonic spectrum. Its pitch (fundamental frequency) varies from individual to individual; on average, the pitch of female speakers is twice as high as that of male speakers. The pitch is determined by the vibration of the vocal cords.

The features that distinguish specific vowels are called formants. Formants are vocal tract resonances whose frequencies depend on factors such as tongue height and tongue advancement. The first two formants are particularly important in speech recognition. The frequency of the first formant increases as we open our mouth wider and lower the tongue. The frequency of the second formant increases as we advance our tongue (see positions of formants for selected vowels in the attached figures). Formant frequencies vary by no more than 15% between female and male speakers.

Most consonants do not have harmonic frequency spectra. The features that distinguish consonants are periods of silence, voice bars, noise, and the consonant’s effects on the frequency spectra of adjacent vowels. Consonants are classified by: (i) manner of articulation, (ii) place of articulation, and (iii) whether they are voiced or unvoiced.
The consonant types, classified by manner of articulation, include:

1. plosive or stop (p, b, t, d, k, g) – produced by blocking the flow of air somewhere in the vocal tract
2. fricative (f, s, sh, h, v, th, z) – produced by constricting the air flow to produce turbulence
3. nasal (m, n, ng) – produced by lowering the soft palate
4. liquid (l, r) – generated by raising the tip of the tongue
5. semi-vowel (w, y) – always followed by a vowel

A. Looking at a power spectrum of a spoken sound

1) The power spectrum is quite complicated; nevertheless, we can spot some of the same features as those found in the spectrum of a simpler sound (such as a triangle wave):

a) Identify on the graph the position of the peak corresponding to the fundamental (hint: it is the biggest peak on the graph); label it as "1." Note that the horizontal position of the peak shows the frequency corresponding to the peak. The fundamental peak is about halfway between 0 and 440 Hz; therefore, a good estimate for the frequency of the fundamental would be 220 Hz....
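
As a rough illustration of step a) and of the pitch, timbre, and loudness definitions from the Introduction, the sketch below builds a synthetic harmonic tone, computes its power spectrum, and reads the three quantities off the peaks. It is not the lab's analysis software. The sample rate, the 220 Hz fundamental, the harmonic amplitudes, and the helper names `power_spectrum` and `find_peaks` are all assumptions chosen for illustration, and a direct DFT from the Python standard library stands in for whatever FFT tool is used in the lab. Taking the tallest peak as the fundamental follows the hint above; in recorded speech the fundamental is not always the tallest peak.

```python
import cmath
import math


def power_spectrum(samples, sample_rate):
    """Return (frequencies, powers) for DFT bins 0..N/2 using a direct DFT."""
    n = len(samples)
    # Precompute the N complex roots of unity once and index into them.
    twiddle = [cmath.exp(-2j * math.pi * k / n) for k in range(n)]
    freqs, powers = [], []
    for k in range(n // 2 + 1):
        x_k = sum(s * twiddle[(k * i) % n] for i, s in enumerate(samples))
        freqs.append(k * sample_rate / n)
        powers.append(abs(x_k) ** 2)
    return freqs, powers


def find_peaks(freqs, powers, floor_ratio=1e-3):
    """Return (frequency, power) for local maxima above a fraction of the largest power."""
    floor = floor_ratio * max(powers)
    return [
        (freqs[k], powers[k])
        for k in range(1, len(powers) - 1)
        if powers[k] > floor
        and powers[k] > powers[k - 1]
        and powers[k] >= powers[k + 1]
    ]


# Assumed test signal: three harmonics of a 220 Hz fundamental.
SAMPLE_RATE = 4096             # samples per second (assumption)
N = 1024                       # 0.25 s of sound, so bins are 4 Hz apart
F1 = 220.0                     # fundamental frequency in Hz (assumption)
AMPLITUDES = [1.0, 0.5, 0.25]  # amplitudes of harmonics 1, 2, 3 (assumption)

samples = [
    sum(a * math.sin(2 * math.pi * F1 * (h + 1) * i / SAMPLE_RATE)
        for h, a in enumerate(AMPLITUDES))
    for i in range(N)
]

freqs, powers = power_spectrum(samples, SAMPLE_RATE)
peaks = find_peaks(freqs, powers)

# Pitch: frequency of the fundamental, taken here as the tallest peak.
pitch_hz, fundamental_power = max(peaks, key=lambda p: p[1])
# Timbre: peak heights relative to the fundamental.
timbre = [(f, p / fundamental_power) for f, p in peaks]
# Loudness: overall height of the peaks (summed here as one simple measure).
loudness = sum(p for _, p in peaks)

print("pitch (Hz):", pitch_hz)
print("relative peak heights:", timbre)
```

Because the tone lasts a whole number of cycles, each harmonic falls exactly on one spectrum bin. A recorded vowel would show broader peaks, so the same peak-picking would give an estimate of the fundamental rather than an exact value.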
This note was uploaded on 01/20/2012 for the course P 109 taught by Professor Staff during the Fall '08 term at Indiana.