1
Digital Speech Processing: Lecture 9
Short-Time Fourier Analysis Methods: Introduction
2
General Discrete-Time Model of Speech Production
• Voiced speech: A_V · P(z) G(z) V(z) R(z)
• Unvoiced speech: A_N · N(z) V(z) R(z)
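A minimal Python sketch of the two excitation paths in this model, assuming NumPy. The sampling rate, pitch period, gains, and the single 500 Hz resonance standing in for V(z) are illustrative assumptions, not values from the lecture. G(z) and R(z) are taken as 1 to keep the sketch short, and all_pole_filter is a hypothetical helper written for this example.

```python
import numpy as np

def all_pole_filter(x, a):
    """Hypothetical helper: y[n] = x[n] - a[1]*y[n-1] - a[2]*y[n-2] - ... (a[0] = 1)."""
    y = np.zeros(len(x))
    for n in range(len(x)):
        acc = x[n]
        for k in range(1, len(a)):
            if n - k >= 0:
                acc -= a[k] * y[n - k]
        y[n] = acc
    return y

fs = 8000           # sampling rate in Hz (assumed)
n_samples = 800     # 100 ms of signal
pitch_period = 80   # samples between pitch pulses, i.e. 100 Hz pitch (assumed)

# Stand-in for V(z): one resonance at 500 Hz with roughly 100 Hz bandwidth.
r = np.exp(-np.pi * 100 / fs)       # pole radius
theta = 2 * np.pi * 500 / fs        # pole angle
a = np.array([1.0, -2 * r * np.cos(theta), r * r])

# Voiced path: gain A_V times an impulse train, then the vocal tract filter.
A_V = 1.0
p = np.zeros(n_samples)
p[::pitch_period] = 1.0
voiced = all_pole_filter(A_V * p, a)

# Unvoiced path: gain A_N times white noise, then the same filter.
A_N = 0.1
rng = np.random.default_rng(0)
unvoiced = all_pole_filter(A_N * rng.standard_normal(n_samples), a)
```

Once the start-up transient has decayed, the voiced output repeats every pitch_period samples, while the unvoiced output stays noise-like with the same spectral shaping.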
3
Short-Time Fourier Analysis
• represent the signal by a sum of sinusoids or complex exponentials, as this leads to convenient solutions to problems (formant estimation, pitch period estimation, analysis-by-synthesis methods) and to insight into the signal itself
• such Fourier representations provide
– a convenient means to determine the response of linear systems to a sum of sinusoids
– clear evidence of signal properties that are obscured in the original signal
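A small numerical check of the first point above, assuming NumPy: a complex exponential passed through a linear time-invariant system comes out as the same exponential scaled by the frequency response at that frequency. The three-tap impulse response h and the frequency w0 are arbitrary choices for illustration.

```python
import numpy as np

h = np.array([0.5, 0.3, 0.2])       # toy FIR impulse response (assumed)
w0 = 2 * np.pi * 0.1                # radian frequency of the input (assumed)
n = np.arange(64)

x = np.exp(1j * w0 * n)             # complex exponential input
y = np.convolve(x, h)[:len(n)]      # system output, truncated to the input length

# Frequency response evaluated at w0: H(e^{jw0}) = sum_k h[k] e^{-j w0 k}
H = np.sum(h * np.exp(-1j * w0 * np.arange(len(h))))

# After the first len(h) - 1 start-up samples, y[n] = H * x[n].
start = len(h) - 1
max_err = np.max(np.abs(y[start:] - H * x[start:]))
```

By linearity, the response to a sum of sinusoids is the sum of the individually scaled sinusoids.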
4
Why STFT for Speech Signals
• steady-state sounds, like vowels, are produced by periodic excitation of a linear system => the speech spectrum is the product of the excitation spectrum and the vocal tract frequency response
• speech is a time-varying signal => need more sophisticated analysis to reflect time-varying properties
– changes occur at syllabic rates (~10 times/sec)
– over fixed time intervals of 10-30 msec, properties of most speech signals are relatively constant (when is this not the case?)
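The two points on this slide can be illustrated with a short NumPy sketch. The sampling rate, the toy impulse response, and the helper name frame_length_samples are assumptions made for this example. The first part converts the 10-30 msec interval into frame lengths in samples. The second part checks that the spectrum of a filtered excitation equals the product of the excitation spectrum and the filter's frequency response.

```python
import numpy as np

def frame_length_samples(fs, frame_ms):
    """Hypothetical helper: number of samples in a frame of frame_ms milliseconds."""
    return int(round(fs * frame_ms / 1000))

fs = 8000  # sampling rate in Hz (assumed)
frame_lengths = {ms: frame_length_samples(fs, ms) for ms in (10, 20, 30)}

# Periodic excitation (impulse train) and a toy linear system.
excitation = np.zeros(400)
excitation[::80] = 1.0
h = 0.9 ** np.arange(50)                  # toy impulse response, not a real vocal tract
speech = np.convolve(excitation, h)       # linear convolution, length 400 + 50 - 1 = 449

# Zero-pad all DFTs to a length that covers the full convolution output.
nfft = 512
S = np.fft.rfft(speech, nfft)
E = np.fft.rfft(excitation, nfft)
H = np.fft.rfft(h, nfft)

max_err = np.max(np.abs(S - E * H))       # should be at rounding-error level
```

The product relation holds exactly here because the whole signal is transformed with enough zero padding. Within a short analysis frame it holds only approximately.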
5
Overview of Lecture
• define the time-varying Fourier transform (STFT) analysis method
• define the synthesis method from the time-varying FT (filter-bank summation, overlap addition)
• show how the time-varying FT can be viewed in terms of a bank-of-filters model
• computation methods based on using the FFT
• application to vocoders, spectrum displays, formant estimation, pitch period estimation
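As a rough preview of the analysis and overlap-addition ideas listed above, here is a minimal NumPy sketch. It is one common FFT-based formulation, not necessarily the exact definitions developed in the lecture. The window type (Hamming), frame length, hop size, and FFT size are assumptions. Dividing by the summed window is a normalization chosen for this sketch so that reconstruction is exact for any overlap.

```python
import numpy as np

def stft(x, window, hop, nfft):
    """Windowed frames of x, each transformed with a zero-padded real FFT."""
    L = len(window)
    n_frames = 1 + (len(x) - L) // hop
    frames = np.stack([x[i * hop : i * hop + L] * window for i in range(n_frames)])
    return np.fft.rfft(frames, n=nfft, axis=1)

def overlap_add(X, window, hop, nfft, length):
    """Inverse FFT of each frame, overlap-add, then divide by the summed window."""
    L = len(window)
    y = np.zeros(length)
    wsum = np.zeros(length)
    frames = np.fft.irfft(X, n=nfft, axis=1)[:, :L]   # keep the L windowed samples
    for i, frame in enumerate(frames):
        y[i * hop : i * hop + L] += frame
        wsum[i * hop : i * hop + L] += window
    covered = wsum > 1e-8                             # skip samples no frame touches
    y[covered] /= wsum[covered]
    return y

fs = 8000                     # sampling rate in Hz (assumed)
L, hop, nfft = 160, 80, 256   # 20 msec frames, 50% overlap (assumed)
window = np.hamming(L)

rng = np.random.default_rng(0)
x = rng.standard_normal(800)  # stand-in for a speech signal

X = stft(x, window, hop, nfft)             # shape: (number of frames, nfft // 2 + 1)
y = overlap_add(X, window, hop, nfft, len(x))
```

With these settings every sample of x is covered by at least one frame, so y matches x to rounding error. A Hamming window is used because it is nonzero at its endpoints, which keeps the window-sum division well defined at the signal edges.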
6
Frequency Domain Processing
• Coding:
– transform, subband, homomorphic, channel vocoders
• Restoration/Enhancement/Modification:
– noise and reverberation removal, helium restoration, time-scale modifications (speed-up and slow-down of speech)
