Purdue University: ECE438 - Digital Signal Processing with Applications

ECE438 - Laboratory 9: Speech Processing (Week 2)

October 6, 2010

Questions or comments concerning this laboratory should be directed to Prof. Charles A. Bouman, School of Electrical and Computer Engineering, Purdue University, West Lafayette IN 47907; (765) 494-0340; firstname.lastname@example.org

1 Introduction

This is the second part of a two-week experiment. During the first week we discussed basic properties of speech signals and performed some simple analyses in the time and frequency domains. This week, we will introduce a system model for speech production. We will cover some background on linear predictive coding, and the final exercise will bring all the prior material together in a speech coding exercise.

1.1 A Speech Model

[Figure 1: Discrete-Time Speech Production Model. A DT impulse train with period $T_p$ (voiced sounds) or white noise (unvoiced sounds) provides the excitation $x(n)$, which is scaled by the gain $G$ and passed through the vocal tract, modeled as an LTI all-pole filter $V(z)$, to produce the speech signal $s(n)$.]

From a signal processing standpoint, it is very useful to think of speech production in terms of a model, as in Figure 1. The model shown is the simplest of its kind, but it includes all the principal components. The excitations for voiced and unvoiced speech are represented by an impulse train and a white noise generator, respectively. The pitch of voiced speech is controlled by the spacing between impulses, $T_p$, and the amplitude (volume) of the excitation is controlled by the gain factor $G$.

As the acoustical excitation travels from its source (the vocal cords, or a constriction), the shape of the vocal tract alters the spectral content of the signal. The most prominent effect is the formation of resonances, which intensify the signal energy at certain frequencies (called formants).
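The two excitation sources of Figure 1 can be sketched in a few lines. This is only an illustration, assuming Python with NumPy, which the lab does not prescribe; the function names, the 80-sample period, and the fixed random seed are hypothetical choices made for the example.

```python
import numpy as np

def impulse_train(num_samples, period):
    """Voiced excitation: a unit impulse every `period` samples (period = T_p)."""
    x = np.zeros(num_samples)
    x[::period] = 1.0
    return x

def white_noise(num_samples, seed=0):
    """Unvoiced excitation: zero-mean, unit-variance Gaussian white noise."""
    rng = np.random.default_rng(seed)  # fixed seed only for repeatability
    return rng.standard_normal(num_samples)

def excitation(num_samples, gain, voiced, period=80, seed=0):
    """Return G * x(n), choosing the source by the voiced/unvoiced switch."""
    if voiced:
        x = impulse_train(num_samples, period)
    else:
        x = white_noise(num_samples, seed)
    return gain * x

# Example: at an assumed 8 kHz sampling rate, a period of 80 samples
# corresponds to a pitch of 8000 / 80 = 100 Hz.
voiced_frame = excitation(240, gain=1.0, voiced=True, period=80)
unvoiced_frame = excitation(240, gain=0.1, voiced=False)
```

Shortening `period` raises the pitch of the voiced excitation, and `gain` scales the volume of either source, matching the roles of $T_p$ and $G$ in the model.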
As we learned in the Digital Filter Design lab, the amplification of certain frequencies may be achieved with a linear filter by an appropriate placement of poles in the transfer function. This is why our speech model uses an all-pole LTI filter. A more accurate model might include a few zeros in the transfer function, but if the order of the filter is chosen appropriately, the all-pole model is sufficient. The primary reason for using the all-pole model is the distinct computational advantage in calculating the filter coefficients, as will be discussed shortly.

Recall that the transfer function of an all-pole filter has the form

$$V(z) = \frac{1}{1 - \sum_{k=1}^{P} a_k z^{-k}} \tag{1}$$

where $P$ is the order of the filter. This is an IIR filter that may be implemented with a recursive difference equation. With the input $G \cdot x(n)$, the speech signal $s(n)$ may be written as

$$s(n) = \sum_{k=1}^{P} a_k s(n-k) + G \cdot x(n) \tag{2}$$

Keep in mind that the filter coefficients will change continuously as the shape of the vocal tract changes, but speech segments of an appropriately small length may be approximated by a time-invariant model.
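The recursion in equation (2) can be written out directly. The following is a minimal sketch, assuming Python with NumPy and zero initial conditions; the function name `all_pole_filter` and the example pole radius and angle are hypothetical, not part of the lab.

```python
import numpy as np

def all_pole_filter(a, gx):
    """Apply s(n) = sum_{k=1}^{P} a[k-1] * s(n-k) + gx(n).

    `a` holds a_1 ... a_P and `gx` is the scaled excitation G * x(n).
    Samples before n = 0 are taken to be zero (an assumption).
    """
    gx = np.asarray(gx, dtype=float)
    P = len(a)
    s = np.zeros(len(gx))
    for n in range(len(gx)):
        acc = gx[n]
        for k in range(1, P + 1):
            if n - k >= 0:
                acc += a[k - 1] * s[n - k]
        s[n] = acc
    return s

# Example: one resonance (a single formant) from a complex-conjugate pole
# pair at radius r and angle theta. Expanding
# (1 - r e^{j theta} z^-1)(1 - r e^{-j theta} z^-1) and matching it to the
# denominator of equation (1) gives a_1 = 2 r cos(theta), a_2 = -r^2.
r, theta = 0.95, np.pi / 4           # illustrative values only
a = [2 * r * np.cos(theta), -r ** 2]

gx = np.zeros(240)
gx[::80] = 1.0                       # G * x(n): impulse train, T_p = 80, G = 1
s = all_pole_filter(a, gx)
```

The explicit double loop mirrors equation (2) term by term rather than aiming for speed. With SciPy available, the same output should come from `scipy.signal.lfilter([1.0], [1.0, -a[0], -a[1]], gx)`, since the denominator of equation (1) carries the coefficients with a negative sign.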