Code - CODE.PPT(4/15/2002) 6.1 CODE.PPT(4/15/2002) 6.2...

Page 6. Speech Coding
E.4.14 – Speech Processing

CODE.PPT(4/15/2002) 6.1
Lecture 8: Speech Coding
– Objectives of Speech Coding
– Quality versus bit rate
– Uniform Quantization
  • Quantisation Noise
– Non-uniform Quantisation
– Adaptive Quantization
[Block diagram: s(n) → Encode → Decode → ŝ(n)]

CODE.PPT(4/15/2002) 6.2
Speech Coding
Speech Coding Objectives:
– High perceived quality
– High measured intelligibility
– Low bit rate (bits per second of speech)
– Low computational requirement (MIPS)
– Robustness to successive encode/decode cycles
– Robustness to transmission errors
Objectives for real-time only:
– Low coding/decoding delay (ms)
– Work with non-speech signals (e.g. touch tone)
CODE.PPT(4/15/2002) 6.3
Subjective Assessment
– MOS (Mean Opinion Score): a panel of test listeners rates the quality from 1 to 5: 1 = bad, 2 = poor, 3 = fair, 4 = good, 5 = excellent.
– DRT (Diagnostic Rhyme Test): listeners choose between pairs of rhyming words (e.g. veal/feel).
– DAM (Diagnostic Acceptability Measure): trained listeners judge various factors, e.g. muffledness, buzziness, intelligibility.

[Figure: Quality versus data rate (8 kHz sample rate). Vertical axis: Mean Opinion Score, 2 (poor) to 5 (excellent). Horizontal axis: bits per second, 2k to 128k. Labels in the plot: PCM16, A/μ law 1972, ADPCM 1984, LD-CELP 1994, CELP 1996, GSM 1987, GSM2 1994, CELP 1991, MELP 1996, LPC10 1984, and a "music:" annotation.]

CODE.PPT(4/15/2002) 6.4
Objective Assessment
Not terribly closely related to subjective quality.

Segmental SNR: the average value of $10\log_{10}(E_{\text{speech}}/E_{\text{error}})$ evaluated in 20 ms frames. No good for coding schemes that can introduce delays, as a one-sample time shift causes a huge decrease in SNR.

Spectral distances: based on the power spectra of the original and coded-decoded speech signals, $P(\omega)$ and $Q(\omega)$. $\langle\ldots\rangle$ denotes the average over the frequency range $0 \ldots 2\pi$.

Itakura Distance:
$$d_{I}(P,Q) = \log\left\langle \frac{P(\omega)}{Q(\omega)} \right\rangle - \left\langle \log\frac{P(\omega)}{Q(\omega)} \right\rangle$$

Itakura-Saito Distance:
$$d_{IS}(P,Q) = \left\langle \frac{P(\omega)}{Q(\omega)} \right\rangle - \left\langle \log\frac{P(\omega)}{Q(\omega)} \right\rangle - 1$$

Both distances are evaluated over 20 ms frames and averaged. If you do LPC analysis on the original and coded speech, both these distances may also be calculated directly from the LPC coefficients. Neither is a true distance metric since they are asymmetrical:
$$d(P,Q) \neq d(Q,P)$$
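The objective measures on slide 6.4 can be tried out numerically. The following is a minimal sketch, not the lecturer's code: the function names, the 440 Hz test tone, the 0.01 quantiser step and the three-point example spectra are all assumptions made for illustration, and the average over frequency is approximated by a plain mean over equally spaced spectral samples.

```python
import math


def segmental_snr_db(original, decoded, frame_len=160):
    """Mean over frames of 10*log10(E_speech / E_error).

    frame_len=160 is 20 ms at an assumed 8 kHz sample rate.
    """
    frame_snrs = []
    for start in range(0, len(original) - frame_len + 1, frame_len):
        ref = original[start:start + frame_len]
        out = decoded[start:start + frame_len]
        e_speech = sum(x * x for x in ref)
        e_error = sum((x - y) ** 2 for x, y in zip(ref, out))
        # Assumption: frames with zero energy or zero error are skipped.
        if e_speech > 0.0 and e_error > 0.0:
            frame_snrs.append(10.0 * math.log10(e_speech / e_error))
    if not frame_snrs:
        raise ValueError("no usable frames")
    return sum(frame_snrs) / len(frame_snrs)


def _mean(values):
    return sum(values) / len(values)


def itakura(p, q):
    """log<P/Q> - <log(P/Q)> for power spectra sampled on the same grid."""
    ratios = [pi / qi for pi, qi in zip(p, q)]
    return math.log(_mean(ratios)) - _mean([math.log(r) for r in ratios])


def itakura_saito(p, q):
    """<P/Q> - <log(P/Q)> - 1 for power spectra sampled on the same grid."""
    ratios = [pi / qi for pi, qi in zip(p, q)]
    return _mean(ratios) - _mean([math.log(r) for r in ratios]) - 1.0


# Hypothetical test signal: 0.2 s of a 440 Hz tone sampled at 8 kHz.
tone = [math.sin(2 * math.pi * 440 * n / 8000) for n in range(1600)]
quantised = [round(x / 0.01) * 0.01 for x in tone]  # small rounding error
delayed = [0.0] + tone[:-1]                          # one-sample shift only

print("quantised:", segmental_snr_db(tone, quantised), "dB")
print("delayed:  ", segmental_snr_db(tone, delayed), "dB")

# Hypothetical three-point spectra, used only to show the asymmetry.
p, q = [1.0, 2.0, 8.0], [1.0, 1.0, 1.0]
print("Itakura:      ", itakura(p, q), itakura(q, p))
print("Itakura-Saito:", itakura_saito(p, q), itakura_saito(q, p))
```

The delayed copy is a perfect reproduction apart from the one-sample shift, yet its segmental SNR comes out far lower than that of the quantised copy, which is the weakness the slide points out. Swapping the arguments of either spectral distance changes the result, which illustrates why neither is a true metric.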
CODE.PPT(4/15/2002) 6.5
Quantization Errors
[Figure panels: Input & Output Signals; Quantisation Error]

Coding/decoding introduces a quantisation error of up to ±½w.

If input values are uniformly distributed within the bin, the mean square quantisation error is:
$$\int_{-w/2}^{+w/2} \frac{x^{2}}{w}\,dx = \left[\frac{x^{3}}{3w}\right]_{-w/2}^{+w/2} = \frac{w^{2}}{12} \;\Rightarrow\; \text{rms error} = \frac{w}{\sqrt{12}} = 0.289\,w$$

If the quantisation levels are uniformly spaced, w is the separation between adjacent output values and is called a Least Significant Bit (LSB).

CODE.PPT(4/15/2002) 6.6
Linear PCM – Pulse Code Modulation
With n bits per sample:
$$\text{SNR} = 20\log_{10}\frac{\text{rms of maximal sine wave}}{\text{rms of quantisation noise}} = 20\log_{10}\frac{0.354 \times 2^{n}\,w}{0.289\,w} = 20\log_{10}(1.22) + 20\,n\log_{10}(2) = 1.76 + 6.02\,n\ \text{dB}$$
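Both results on slides 6.5 and 6.6 can be checked by simulation. The sketch below is an illustration under stated assumptions rather than part of the lecture: it uses a mid-rise uniform quantiser with a full-scale range of -1 to +1, a hypothetical test-tone frequency of 0.01234567 cycles per sample, and helper names chosen here.

```python
import math
import random


def quantise(x, w):
    """Mid-rise uniform quantiser: output levels sit at odd multiples of w/2."""
    return w * (math.floor(x / w) + 0.5)


def rms(values):
    return math.sqrt(sum(v * v for v in values) / len(values))


def uniform_rms_error_ratio(w, n_samples=50000, seed=0):
    """rms quantisation error divided by w, for uniformly distributed input."""
    rng = random.Random(seed)  # fixed seed so the result is repeatable
    xs = [rng.uniform(-1.0, 1.0) for _ in range(n_samples)]
    return rms([quantise(x, w) - x for x in xs]) / w


def measured_pcm_snr_db(n_bits, n_samples=20000):
    """SNR of a full-scale sine wave after n_bits uniform quantisation."""
    w = 2.0 / 2 ** n_bits  # range -1..+1 divided into 2^n steps of width w
    # Assumed test tone: peak-to-peak 2 = 2^n * w, so rms = 0.354 * 2^n * w.
    signal = [math.sin(2 * math.pi * 0.01234567 * k) for k in range(n_samples)]
    error = [quantise(s, w) - s for s in signal]
    return 20.0 * math.log10(rms(signal) / rms(error))


print("rms error / w:", uniform_rms_error_ratio(0.1), "(theory 0.289)")
for n in (8, 12, 16):
    print(n, "bits: measured", measured_pcm_snr_db(n),
          "dB, formula", 1.76 + 6.02 * n, "dB")
```

The measured values should land close to 0.289 w and 1.76 + 6.02 n dB. Small deviations are expected because the error of a quantised sine wave is only approximately uniformly distributed.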

This note was uploaded on 01/22/2011 for the course MICD 410 taught by Professor Romelsudah during the Spring '10 term at Oxford University.
