Ch2-Speech_Coding-old

# Mixed Excitation LPC (Veton Kpuska, February 11, 2012)


Two polynomials of order $p+1$ are created from the $p$th-order inverse filter $A(z)$ according to:

$$P(z) = A(z) + z^{-(p+1)} A(z^{-1})$$

$$Q(z) = A(z) - z^{-(p+1)} A(z^{-1})$$

- Line spectral frequencies (LSFs) can be coded efficiently, and the stability of the resulting synthesis filter can be guaranteed when they are quantized.
- They have better quantization and interpolation properties than the corresponding PARCOR coefficients.
- A disadvantage is that solving for the roots of $P(z)$ and $Q(z)$ can be more computationally intensive than computing the PARCOR coefficients.
- The polynomial $A(z)$ is easily recovered from the LSFs (Exercise 12.18).

## Code-Excited Linear Prediction (CELP)

Core ideas of CELP:

- Use of long-term as well as short-term linear prediction models for speech synthesis.
- Avoiding the strict voiced/unvoiced classification of the LPC coder.
- Incorporation of an excitation codebook, which is searched during encoding to locate the best excitation sequence.

The name "Code-Excited" LP comes from the excitation codebook, which contains the "code" used to "excite" the synthesis filters.

- On each frame, a codeword is chosen from a codebook of residuals so as to minimize the mean-squared error between the synthesized and original speech waveforms.
- The length of a codeword sequence is determined by the analysis frame length. For a 10 ms frame interval split into 2 inner frames of 5 ms each, a codeword sequence is 40 samples long at an 8000 Hz sampling rate.
- The residual and long-term predictor are estimated with twice the time resolution (a 5 ms frame) of the short-term predictor (a 10 ms frame), because the excitation is more nonstationary than the vocal tract.

Two approaches to forming the codebook:

Deterministic codebook: formed by applying the k-means clustering algorithm to a large set of residual training vectors.
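The per-frame codeword selection described above can be illustrated with a minimal sketch. Several things here are assumptions made only for illustration and are not taken from the slides: the sign convention $A(z) = 1 - \sum_k a_k z^{-k}$, the hypothetical first-order predictor `a = [0.9]`, the 64-entry random codebook used as a stand-in, the per-codeword optimal gain, and the omission of the long-term predictor and any perceptual weighting. Only the 40-sample codeword length (5 ms at 8000 Hz) comes from the text.

```python
import random

FRAME_LEN = 40  # 5 ms inner frame at 8000 Hz, as stated in the text


def synthesize(excitation, a):
    """All-pole synthesis filter 1/A(z) with zero initial state.

    Assumed convention: A(z) = 1 - sum_k a[k-1] * z^-k.
    """
    out = []
    for n, e in enumerate(excitation):
        s = e
        for k, ak in enumerate(a):
            if n - k - 1 >= 0:
                s += ak * out[n - k - 1]
        out.append(s)
    return out


def search_codebook(target, codebook, a):
    """Return (index, gain, squared_error) of the best-matching codeword."""
    best = None
    for i, code in enumerate(codebook):
        y = synthesize(code, a)
        energy = sum(v * v for v in y)
        if energy == 0.0:
            continue  # an all-zero codeword cannot match anything
        # Gain that minimizes the squared error for this codeword.
        gain = sum(t * v for t, v in zip(target, y)) / energy
        err = sum((t - gain * v) ** 2 for t, v in zip(target, y))
        if best is None or err < best[2]:
            best = (i, gain, err)
    return best


# Hypothetical setup: random stand-in codebook and a first-order predictor.
rng = random.Random(0)
codebook = [[rng.gauss(0.0, 1.0) for _ in range(FRAME_LEN)] for _ in range(64)]
a = [0.9]

# Build a target from a known codeword so the search result can be checked.
target = [2.5 * v for v in synthesize(codebook[17], a)]
index, gain, err = search_codebook(target, codebook, a)
print(index, round(gain, 6))
```

Because the target is built from codeword 17 scaled by 2.5, the search should recover that index and gain with a near-zero error. A real coder would search against actual speech and would typically include the long-term predictor in the synthesis path.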
Stochastic codebook:

- Channel mismatch.
- The histogram of the residual from the long-term predictor follows roughly a Gaussian pdf. This is a valid assumption except for plosives and voiced/unvoiced transitions.
- The cumulative distributions are nearly identical to those of white Gaussian random variables.
- An alternative codebook is therefore constructed from white Gaussian random variables with unit variance.

## CELP Coders

A variety of government and international standard coders:

- The 1990s government standard for secure communications at 4.8 kbps and 4000 Hz bandwidth (Fed-Std-1016) uses a CELP coder:
  - Short-time predictor: 30 ms frame interval, coded with 34 bits per frame.
  - 10th-order vocal tract spectrum from prediction coefficients, transformed to LSFs and coded with nonuniform quantization.
  - Short-term and long-term predictors are estimated in open-loop form.
  - Residual codewords are determined in closed-loop form.
- Three bit rates:
  - 9.6 kbps (multipulse)
  - 4.8 kbps (CELP)
  - 2.4 kbps (LPC)
- Current international standards use CELP-based coding:
  - G.729
  - G.723.1
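As a quick check on the Fed-Std-1016 figures above, the frame bit budget can be worked out from the stated numbers. This is a small arithmetic sketch, assuming the 30 ms frame and the 34-bit short-time predictor both refer to the 4.8 kbps coder; the variable names are illustrative, and the source does not say how the remaining bits are allocated.

```python
BIT_RATE_BPS = 4800  # 4.8 kbps coder rate, from the text
FRAME_MS = 30        # short-time predictor frame interval, from the text
LSF_BITS = 34        # bits per frame for the short-time predictor, from the text

# Total bits available in one 30 ms frame at 4.8 kbps.
bits_per_frame = BIT_RATE_BPS * FRAME_MS // 1000

# Share of the bit rate spent on the short-time predictor (LSFs).
lsf_rate_bps = LSF_BITS * 1000 / FRAME_MS

# Bits left for everything else (breakdown not given in the source).
remaining_bits = bits_per_frame - LSF_BITS

print(bits_per_frame, round(lsf_rate_bps, 1), remaining_bits)
```

Under these assumptions each frame carries 144 bits, of which the spectral envelope takes 34, or roughly 1.13 kbps of the 4.8 kbps total.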

## This note was uploaded on 02/10/2012 for the course ECE 3552 taught by Professor Staff during the Fall '10 term at FIT.
