Ch2-Speech_Coding-old

February 11, 2012, Veton Kpuska, slide 85: Use of VQ in Speech


- Voicing decision
- Gain

Resulting in 1300 parameters/s. Compared to telephone quality signal:
- 4000 Hz bandwidth
- 8000 samples/s (8 bits per sample)
- 1300 parameters/s < 8000 samples/s

February 11, 2012, Veton Kpuska

Basic Linear Prediction Coder (LPC)

Behavior of prediction coefficients is difficult to characterize:
- Large dynamic range (large variance)
- Quantization errors can lead to an unstable system function at synthesis (poles may move outside the unit circle).

Instead of prediction coefficients a_i use:
- Corresponding poles b_i
- Partial Correlation Coefficients k_i (PARCOR)
- Reflection Coefficients r_i, or
- Other equivalent representation.

Alternative equivalent representations:
- Have a limited dynamic range
- Can be easily enforced to give stability because |b_i| < 1 and |k_i| < 1.

Many ways to code linear prediction parameters:
- Ideally, optimal quantization uses the Max quantizer based on known or estimated pdf's of each parameter.

Example of 7200 bps coding:
1. Voiced/Unvoiced Decision: 1 bit (on or off)
2. Pitch (if voiced): 6 bits (uniform)
3. Gain: 5 bits (nonuniform)
4. Each Pole b_i: 10 bits (nonuniform)
   - 5 bits for bandwidth
   - 5 bits for center frequency
   - Total of 6 poles

Per frame: 1+6+5+6x10 = 72 bits. At 100 frames/s this gives 72 x 100 = 7200 bps.
Quality is limited by the simple impulse/noise excitation model.

Improvements are possible by replacing the poles with PARCOR coefficients:
- Higher order PARCOR coefficients have pdf's closer to Gaussian, centered around zero, which calls for nonuniform quantization.
- Companding is effective with PARCOR: the transformed pdf's are close to uniform.
- The original PARCOR coefficients do not have good spectral sensitivity (the change in spectrum caused by a change in the spectral parameters, which one wants to minimize).
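As a quick check on the 7200 bps example given earlier in this section, the bit budget can be reproduced in a few lines. This is only a sketch: the function names `bits_per_frame` and `bit_rate` are hypothetical, while the bit allocations, the frame rate, and the 8000 samples/s at 8 bits per sample reference figure are the ones quoted in the text.

```python
# Hypothetical helper names; all numeric values come from the text above.

def bits_per_frame(voicing_bits, pitch_bits, gain_bits, n_params, bits_per_param):
    """Total bits needed to code one analysis frame."""
    return voicing_bits + pitch_bits + gain_bits + n_params * bits_per_param


def bit_rate(frame_bits, frames_per_second):
    """Bit rate in bits per second for a fixed frame rate."""
    return frame_bits * frames_per_second


# Pole-based coding: 1 voicing bit, 6 pitch bits, 5 gain bits, 6 poles x 10 bits.
pole_frame_bits = bits_per_frame(1, 6, 5, n_params=6, bits_per_param=10)
pole_rate = bit_rate(pole_frame_bits, frames_per_second=100)

# Reference: telephone-quality waveform at 8000 samples/s, 8 bits per sample.
pcm_rate = 8000 * 8

print(pole_frame_bits, pole_rate, pcm_rate)  # 72 7200 64000
```

Under these figures the parametric coder needs roughly one ninth of the bit rate of the sampled waveform.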
An empirical finding is that a more desirable transformation in this sense is the logarithm of the vocal tract area function ratio:

$$\frac{A_{i+1}}{A_i} = \frac{1-k_i}{1+k_i}$$

$$g_i = T[k_i] = \log\left(\frac{1-k_i}{1+k_i}\right) = \log\left(\frac{A_{i+1}}{A_i}\right)$$

Parameters g_i:
- Have a pdf close to uniform
- Have smaller spectral sensitivity than PARCOR: the all-pole spectrum changes less with a change in g_i than with a change in k_i.
- Note that the spectrum changes less with a change in k_i than with a change in pole positions.

Typically these parameters can be coded at 5-6 bits each (a significant improvement over 10 bits):
- 100 frames/s
- Order 6 of the predictor (6 poles)
- (1+6+5+6x6)x100 bps = 4800 bps
- Same quality as 7200 bps by coding pole positions for telephone bandwidth speech.

A government standard for secure communications at 2.4 kbps used this basic LPC scheme at 50 frames per second for about a decade.

Demand for higher quality standards opened up research on two primary problems with speech coders based on all-pole linear prediction analysis:
1. Inadequacy of the basic source/filter speech production model
2. Restrictions of onedimen...
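The transformation g_i = T[k_i] above, followed by uniform quantization, can be sketched as below. This is a minimal illustration and not the coder from the slides: the function names, the example PARCOR values, and the quantizer range `g_max = 4.0` are assumptions, and natural logarithm is assumed for `log`. Only the formula itself and the 6 bits per parameter come from the text.

```python
import math


def parcor_to_lar(k):
    """Log area ratio g = log((1 - k) / (1 + k)), defined for |k| < 1."""
    if not -1.0 < k < 1.0:
        raise ValueError("PARCOR coefficient must satisfy |k| < 1")
    return math.log((1.0 - k) / (1.0 + k))


def lar_to_parcor(g):
    """Inverse mapping: k = (1 - e^g) / (1 + e^g) = -tanh(g / 2)."""
    return -math.tanh(g / 2.0)


def quantize_uniform(g, bits, g_max):
    """Uniform quantizer on [-g_max, g_max]; returns the cell midpoint.

    g_max is an assumed design range, not a value from the text.
    """
    levels = 2 ** bits
    step = 2.0 * g_max / levels
    g_clipped = max(-g_max, min(g_max, g))
    index = min(levels - 1, int((g_clipped + g_max) / step))
    return -g_max + (index + 0.5) * step


# Hypothetical PARCOR values for an order-6 predictor.
parcor = [0.9, -0.5, 0.3, -0.2, 0.1, 0.05]
decoded = [lar_to_parcor(quantize_uniform(parcor_to_lar(k), bits=6, g_max=4.0))
           for k in parcor]
print(decoded)
```

Because the inverse mapping is a hyperbolic tangent, every decoded coefficient has magnitude below 1 no matter how coarse the quantizer is, so the stability condition |k_i| < 1 survives quantization.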

This note was uploaded on 02/10/2012 for the course ECE 3552 taught by Professor Staff during the Fall '10 term at FIT.
