...sional scalar quantization techniques to account for possible parameter correlation.

February 11, 2012 Veton Kpuska 93

A VQ LPC Coder

VQ-based LPC PARCOR coder. K-means algorithm.

A VQ LPC Coder
Use the VQ LPC coder to achieve the same speech quality at a lower bit rate:
- 10-bit codebook (1024 codewords)
- 800 bps, versus 2400 bps for scalar quantization
- 44.4 frames/s
- About 440 bits per second to code the PARCOR coefficients
- 8 bits per frame for pitch, gain, and voicing
- 1 bit per second for frame synchronization

A VQ LPC Coder

Maintain the 2400 bps bit rate with a higher quality of speech coding (early 1980s):
- 22-bit codebook: 2^22 = 4,194,304 (about 4.2 million) codewords.

Problems:
1. Intractable solution due to computational requirements (large VQ search) and memory (large codebook size).
2. VQ-based spectrum characterized by a "wobble" due to the LPC-based spectrum being quantized:
   - The spectral representation near a cell boundary "wobbles" to and from neighboring cells.
   - Insufficient number of codebooks.

Emphasis changed from improved VQ of the spectrum to better excitation models, and ultimately to a return to VQ on the excitation.

Mixed Excitation LPC (MELP)

Multiband voicing decision (introduced as a concept in Section 12.5.2, not covered in the slides).

Addresses shortcomings of conventional linear prediction analysis/synthesis:
- Realistic excitation signal
- Time-varying vocal tract formant bandwidths
- Production principles of the "anomalous" voice

Mixed Excitation LPC (MELP)

Model and MELP unique components:
Model: Different mixtures of impulses and noise are generated in different frequency bands (4 to 10 bands). The impulse train and noise in the MELP model are each passed through time-varying spectral shaping filters and are added together to form a fullband signal.

Unique components:
1. An auditory-based approach to multiband voicing estimation for the mixed impulse/noise excitation.
2. Aperiodic impulses due to pitch jitter, the creaky voice, and the diplophonic voice.
3. Time-varying resonance bandwidth within a pitch period, accounting for nonlinear source/system interaction and introducing the truncation effects.
4. A more accurate shape of the glottal flow velocity source.

Mixed Excitation LPC (MELP)

A 2.4 kbps coder has been implemented based on the MELP model and has been selected as a government standard for secure telephone communications.

The original version of MELP uses:
- 34 bits for scalar quantization of the LPC coefficients (specifically, the line spectral frequencies, LSFs)
- 8 bits for gain
- 7 bits for pitch and overall voicing (uses an autocorrelation technique on the lowpass-filtered LPC residual)
- 5 bits for multiband voicing
- 1 bit for the jittery-state (aperiodic) flag

54 bits per 22.5 ms frame gives 2.4 kbps. In the actual 2.4 kbps standard, greater efficiency is achieved with vector quantization of the LSF coefficients.

Mixed Excitation LPC (MELP)

Line Spectral Frequencies (LSFs)

More efficient parameter set for coding the all-pole model of linear prediction. The LSFs for a pth-order all-pole model are defined as fo...
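The bit-rate figures quoted on the VQ LPC and MELP slides can be checked with a few lines of arithmetic. The sketch below is only an illustration: the function names and default arguments are hypothetical, and the numbers plugged in are the ones stated in the slides (44.4 frames/s, 10-bit and 22-bit codebooks, 54 bits per 22.5 ms frame).

```python
# Bit-budget check for the figures quoted in the slides.
# All function and parameter names here are illustrative assumptions.

FRAMES_PER_SECOND = 44.4  # frame rate quoted for the VQ LPC coder


def vq_lpc_bits_per_second(codebook_bits=10, side_info_bits=8,
                           sync_bits_per_second=1):
    """Total bit rate: VQ index + pitch/gain/voicing per frame, plus sync."""
    spectrum = codebook_bits * FRAMES_PER_SECOND    # PARCOR codebook index
    side_info = side_info_bits * FRAMES_PER_SECOND  # pitch, gain, voicing
    return spectrum + side_info + sync_bits_per_second


def codebook_size(index_bits):
    """Number of codewords addressable with an index of the given width."""
    return 2 ** index_bits


def frame_bit_rate(bits_per_frame, frame_ms):
    """Bit rate in bits per second for a fixed-length frame."""
    return bits_per_frame * 1000 / frame_ms


if __name__ == "__main__":
    print(codebook_size(10))              # 1024 codewords
    print(codebook_size(22))              # 4194304 codewords
    print(vq_lpc_bits_per_second())       # close to 800 bps
    print(frame_bit_rate(54, 22.5))       # MELP: 54 bits every 22.5 ms
```

The 22-bit case shows why the early-1980s approach was considered intractable: a full search would compare each frame against more than four million stored codewords.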
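The VQ LPC coder slides mention a codebook and the K-means algorithm without showing either. A minimal sketch of K-means codebook training followed by full-search encoding is given below, under several assumptions that are not in the slides: plain squared Euclidean distance stands in for whatever spectral distortion measure a real coder would use, the toy 2-D vectors stand in for PARCOR coefficient vectors, and the names `train_codebook` and `nearest_codeword` are hypothetical.

```python
import random


def squared_distance(u, v):
    """Squared Euclidean distance (an assumed stand-in distortion measure)."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def nearest_codeword(vector, codebook):
    """Full search: index of the codeword closest to the input vector."""
    return min(range(len(codebook)),
               key=lambda i: squared_distance(vector, codebook[i]))


def train_codebook(training_vectors, index_bits, iterations=20, seed=0):
    """K-means training of a codebook with 2**index_bits codewords."""
    rng = random.Random(seed)
    # Initialise the codewords with randomly chosen training vectors.
    codebook = [list(v) for v in rng.sample(training_vectors, 2 ** index_bits)]
    for _ in range(iterations):
        # Assignment step: put each training vector in its nearest cell.
        cells = [[] for _ in codebook]
        for v in training_vectors:
            cells[nearest_codeword(v, codebook)].append(v)
        # Update step: move each codeword to the centroid of its cell.
        for i, cell in enumerate(cells):
            if cell:  # an empty cell keeps its previous codeword
                dim = len(cell[0])
                codebook[i] = [sum(v[d] for v in cell) / len(cell)
                               for d in range(dim)]
    return codebook


if __name__ == "__main__":
    # Toy data: two well-separated clusters, coded with a 1-bit codebook.
    data = [[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10]]
    codebook = train_codebook(data, index_bits=1)
    for v in data:
        print(v, "->", nearest_codeword(v, codebook))
```

Only the codeword index is transmitted per frame, which is where the 10 bits per frame on the slides come from. The cost of `nearest_codeword` grows linearly with the codebook size, which illustrates the "large VQ search" problem listed for the 22-bit codebook.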
This note was uploaded on 02/10/2012 for the course ECE 3552 taught by Professor Staff during the Fall '10 term at FIT.