Digital Speech Processing: Lecture 16
Speech Coding Methods Based on Speech Waveform Representations and Speech Models: Adaptive and Differential Coding
Speech Waveform Coding: Summary of Part 1
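1. Probability density functions for speech samples
Gamma:
$$p(x) = \left[\frac{\sqrt{3}}{8\pi\sigma_x |x|}\right]^{1/2} e^{-\frac{\sqrt{3}\,|x|}{2\sigma_x}}, \qquad p(0) = \infty$$
Laplacian:
$$p(x) = \frac{1}{\sqrt{2}\,\sigma_x}\, e^{-\frac{\sqrt{2}\,|x|}{\sigma_x}}, \qquad p(0) = \frac{1}{\sqrt{2}\,\sigma_x}$$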
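As a quick check on the two density models, the sketch below evaluates both and numerically confirms that the Laplacian form has variance $\sigma_x^2$. The function names and the midpoint-rule integration are illustrative choices, not part of the lecture.

```python
import math

def laplacian_pdf(x, sigma):
    """Laplacian density with zero mean and standard deviation sigma."""
    return math.exp(-math.sqrt(2.0) * abs(x) / sigma) / (math.sqrt(2.0) * sigma)

def gamma_pdf(x, sigma):
    """Gamma density from the summary; it is unbounded at x = 0."""
    if x == 0:
        return math.inf
    scale = math.sqrt(3.0) / (8.0 * math.pi * sigma * abs(x))
    return math.sqrt(scale) * math.exp(-math.sqrt(3.0) * abs(x) / (2.0 * sigma))

def laplacian_variance(sigma, half_width=40.0, steps=200_000):
    """Midpoint-rule estimate of the integral of x^2 p(x) over +/- half_width*sigma."""
    lo = -half_width * sigma
    dx = 2.0 * half_width * sigma / steps
    total = 0.0
    for k in range(steps):
        x = lo + (k + 0.5) * dx
        total += x * x * laplacian_pdf(x, sigma) * dx
    return total

print(laplacian_pdf(0.0, 1.0))   # peak value 1/(sqrt(2)*sigma)
print(laplacian_variance(2.0))   # should be close to sigma^2 = 4
```

Both densities are sharply peaked at zero, which is why a uniform quantizer spends many of its levels on amplitudes that rarely occur.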
2. Coding paradigms
• uniform
– divide the interval from $+X_{max}$ to $-X_{max}$ into $2^B$ intervals of length $\Delta = 2X_{max}/2^B$ for a $B$-bit quantizer
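A minimal sketch of such a $B$-bit uniform quantizer follows. It assumes a mid-rise characteristic (reconstruction at the cell midpoint) with clipping at $\pm X_{max}$; the summary fixes only the step size, so these details and the function name are assumptions.

```python
import math

def uniform_quantize(x, x_max, bits):
    """Mid-rise uniform quantizer with 2**bits cells spanning [-x_max, +x_max]."""
    levels = 2 ** bits
    delta = 2.0 * x_max / levels                 # step size from the summary
    index = math.floor(x / delta)                # cell index before clipping
    index = max(-levels // 2, min(levels // 2 - 1, index))  # clip to the range
    return (index + 0.5) * delta                 # reconstruct at the cell midpoint

print(uniform_quantize(0.1, 1.0, 3))    # inside the range
print(uniform_quantize(5.0, 1.0, 3))    # clipped to the top cell
```

Inside the range the error magnitude never exceeds $\Delta/2$; outside it the quantizer clips and the error grows without bound.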
[Figure: uniform quantizer characteristic with step size $\Delta$ over the range $\pm X_{max}$, drawn for $X_{max} = 4\sigma_x$]
$$\hat{x}[n] = x[n] + e[n]$$
$$SNR = 6B + 4.77 - 20\log_{10}\left[\frac{X_{max}}{\sigma_x}\right]$$
– sensitivity to $X_{max}/\sigma_x$ ($\sigma_x$ varies a lot!!!)
– not great use of bits for actual speech densities!

X_max/σ_x    20 log10(X_max/σ_x) [dB]    SNR (uniform, B=8) [dB]
    2                 6.02                       46.75
    4                12.04                       40.73
    8                18.06                       34.71
   16                24.08                       28.69
   32                30.10                       22.67
   64                36.12                       16.65
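The tabulated values follow directly from the rule $SNR = 6B + 4.77 - 20\log_{10}(X_{max}/\sigma_x)$ with $B = 8$. The short sketch below recomputes them; the function name is illustrative.

```python
import math

def uniform_snr_db(bits, xmax_over_sigma):
    """SNR rule for a uniform quantizer: 6B + 4.77 - 20 log10(X_max / sigma_x)."""
    return 6.0 * bits + 4.77 - 20.0 * math.log10(xmax_over_sigma)

for ratio in (2, 4, 8, 16, 32, 64):
    loading_db = 20.0 * math.log10(ratio)
    print(f"{ratio:3d}  {loading_db:6.2f}  {uniform_snr_db(8, ratio):6.2f}")
```

Each doubling of $X_{max}/\sigma_x$ costs about 6 dB, the same as giving up one bit.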
– $X_{max}$ (or equivalently $\sigma_x$) varies a lot across sounds, speakers, environments
– need to adapt coder ($\Delta[n]$) to time varying $\sigma_x$ or $X_{max}$
– key question is how to adapt
[Figure: quantization error density $p(e)$, uniform with height $1/\Delta$ on $-\Delta/2 \le e \le \Delta/2$]
– 30 dB loss as $X_{max}/\sigma_x$ varies over a 32:1 range
• pseudologarithmic (constant percentage error)
– compress $x[n]$ by pseudologarithmic compander
– quantize the companded $x[n]$ uniformly
– expand the quantized signal
$$y[n] = F[x[n]] = X_{max}\,\frac{\log\left[1 + \mu\,\frac{|x[n]|}{X_{max}}\right]}{\log(1 + \mu)} \cdot \mathrm{sign}(x[n])$$
• large $|x[n]|$:
$$|y[n]| \approx \frac{X_{max}}{\log\mu} \cdot \log\left[\frac{\mu\,|x[n]|}{X_{max}}\right]$$
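The compress, quantize, expand chain can be sketched as follows. The default $\mu = 255$ and the mid-rise uniform quantizer with clipping are assumptions made for illustration; the summary itself only gives the compression law.

```python
import math

def mu_law_compress(x, x_max, mu=255.0):
    """y = X_max * log(1 + mu*|x|/X_max) / log(1 + mu) * sign(x)."""
    magnitude = x_max * math.log(1.0 + mu * abs(x) / x_max) / math.log(1.0 + mu)
    return math.copysign(magnitude, x)

def mu_law_expand(y, x_max, mu=255.0):
    """Inverse of mu_law_compress."""
    magnitude = (x_max / mu) * ((1.0 + mu) ** (abs(y) / x_max) - 1.0)
    return math.copysign(magnitude, y)

def mu_law_quantize(x, x_max, bits, mu=255.0):
    """Compress, quantize uniformly (mid-rise, clipped), then expand."""
    levels = 2 ** bits
    delta = 2.0 * x_max / levels
    y = mu_law_compress(x, x_max, mu)
    index = max(-levels // 2, min(levels // 2 - 1, math.floor(y / delta)))
    return mu_law_expand((index + 0.5) * delta, x_max, mu)

for x in (0.5, 0.05):
    xq = mu_law_quantize(x, 1.0, 8)
    print(x, xq, abs(xq - x) / x)   # relative error stays at a similar level
```

The relative error is roughly the same for both amplitudes, which is the "constant percentage error" property; it breaks down once $\mu|x[n]|/X_{max}$ is no longer large.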
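$$SNR(dB) = 6B + 4.77 - 20\log_{10}\left[\ln(1 + \mu)\right] - 10\log_{10}\left[1 + \left(\frac{X_{max}}{\mu\sigma_x}\right)^2 + \sqrt{2}\left(\frac{X_{max}}{\mu\sigma_x}\right)\right]$$
– insensitive to $X_{max}/\sigma_x$ over a wide range for large $\mu$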
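To see how weak the dependence on $X_{max}/\sigma_x$ becomes, the sketch below evaluates the $\mu$-law SNR expression over the same 32:1 range used for the uniform quantizer. The choice $\mu = 255$ and $B = 8$ is an illustrative assumption.

```python
import math

def mu_law_snr_db(bits, xmax_over_sigma, mu):
    """SNR rule for mu-law companding followed by a uniform B-bit quantizer."""
    r = xmax_over_sigma / mu
    return (6.0 * bits + 4.77
            - 20.0 * math.log10(math.log(1.0 + mu))
            - 10.0 * math.log10(1.0 + r * r + math.sqrt(2.0) * r))

for ratio in (2, 4, 8, 16, 32, 64):
    print(f"{ratio:3d}  {mu_law_snr_db(8, ratio, 255.0):6.2f}")

spread = mu_law_snr_db(8, 2, 255.0) - mu_law_snr_db(8, 64, 255.0)
print(f"spread over 32:1 range: {spread:.2f} dB")
```

Under these assumptions the SNR changes by under 2 dB across the range, compared with about 30 dB for the uniform quantizer. The price is the fixed $20\log_{10}[\ln(1+\mu)]$ penalty paid at every signal level.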
• maximum SNR coding: match signal quantization intervals to model probability distribution (Gamma, Laplacian)
– interesting, at least theoretically
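The summary does not name a design procedure for matching levels to a density. One common approach is a Lloyd-style iteration, sketched below on synthetic Laplacian samples; the sample generator, the seed, the 3-bit size, and the uniform starting point with $X_{max} = 4\sigma_x$ are all assumptions for illustration.

```python
import math
import random

def nearest_index(x, levels):
    """Index of the reconstruction level closest to x."""
    return min(range(len(levels)), key=lambda k: abs(x - levels[k]))

def mean_squared_error(samples, levels):
    """Average squared quantization error over the samples."""
    return sum((x - levels[nearest_index(x, levels)]) ** 2 for x in samples) / len(samples)

def lloyd_design(samples, levels, iterations=20):
    """Alternate nearest-level assignment and centroid update (Lloyd iteration)."""
    levels = list(levels)
    for _ in range(iterations):
        sums = [0.0] * len(levels)
        counts = [0] * len(levels)
        for x in samples:
            k = nearest_index(x, levels)
            sums[k] += x
            counts[k] += 1
        # Move each level to the mean of its cell; leave empty cells unchanged.
        levels = [s / c if c else old for s, c, old in zip(sums, counts, levels)]
    return levels

def laplacian_samples(count, sigma=1.0, seed=0):
    """Hypothetical test data: Laplacian samples with standard deviation sigma."""
    rng = random.Random(seed)
    rate = math.sqrt(2.0) / sigma
    return [rng.choice((-1.0, 1.0)) * rng.expovariate(rate) for _ in range(count)]

samples = laplacian_samples(2000)
uniform_levels = [(k + 0.5) - 4.0 for k in range(8)]   # 3 bits, X_max = 4*sigma
designed_levels = lloyd_design(samples, uniform_levels)
print(mean_squared_error(samples, uniform_levels))
print(mean_squared_error(samples, designed_levels))
```

On the training samples each iteration can only lower (or keep) the mean squared error, so the designed levels are never worse than the uniform starting point. They end up packed more densely near zero, where the Laplacian density is concentrated.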
Adaptive Quantization
• linear quantization => SNR depends on $\sigma_x$ being constant (this is clearly not the case)
• instantaneous companding => SNR only weakly dependent on $X_{max}/\sigma_x$ for large $\mu$-law compression (100-500)
• optimum SNR => minimize $\sigma_e^2$ when $\sigma_x^2$ is known, non-uniform distribution of quantization levels
Quantization dilemma: want to choose quantization step size large enough to accommodate maximum peak-to-peak range of $x[n]$; at the same time need to make the quantization step size small so as to minimize the quantization error
– the nonstationary nature of speech (variability across sounds, speakers, backgrounds) compounds this problem greatly
Solutions to Quantization Dilemma
• Solution 1: let $\Delta$ vary to match the variance of the input signal => $\Delta[n]$
• Solution 2: use a variable gain, $G[n]$, followed by a fixed quantizer step size, $\Delta$ => keep signal variance of $y[n] = G[n]\,x[n]$ constant
Case 1: $\Delta[n]$ proportional to $\sigma_x$ => quantization levels and ranges would be linearly scaled to match $\sigma_x^2$ => need to reliably estimate $\sigma_x^2$
Case 2: $G[n]$ proportional to $1/\sigma_x$ to give $\sigma_y^2 \approx$ constant
• need reliable estimate of $\sigma_x^2$ for both types of adaptive quantization
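The two solutions can be compared in a small sketch. It assumes a block-wise RMS estimate of $\sigma_x$, a loading of $X_{max} = 4\sigma_x$, and a mid-rise quantizer with clipping; none of these choices is fixed by the text above, and the function names are hypothetical.

```python
import math

def block_sigma(block, floor=1e-8):
    """RMS estimate of sigma_x over one block (an assumed estimator)."""
    return max(math.sqrt(sum(v * v for v in block) / len(block)), floor)

def quantize(x, delta, bits):
    """Mid-rise uniform quantizer with 2**bits cells of width delta."""
    levels = 2 ** bits
    index = max(-levels // 2, min(levels // 2 - 1, math.floor(x / delta)))
    return (index + 0.5) * delta

def adapt_step(block, bits, load=4.0):
    """Solution 1: step size proportional to the estimated sigma_x."""
    delta = 2.0 * load * block_sigma(block) / 2 ** bits
    return [quantize(v, delta, bits) for v in block]

def adapt_gain(block, bits, load=4.0):
    """Solution 2: gain proportional to 1/sigma_x, then a fixed step size."""
    gain = 1.0 / block_sigma(block)
    delta = 2.0 * load / 2 ** bits          # fixed step for a unit-variance input
    return [quantize(gain * v, delta, bits) / gain for v in block]

loud = [0.3, -0.8, 0.5, -0.1, 0.9, -0.4]
quiet = [0.01 * v for v in loud]
print(adapt_step(loud, 4))
print(adapt_gain(loud, 4))
print(adapt_step(quiet, 4))   # same pattern as the loud block, scaled by 0.01
```

With the same variance estimate the two solutions produce the same reconstruction, and a block scaled down by 100 is quantized with the same relative accuracy. Both depend entirely on how well $\sigma_x^2$ is estimated.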
Types of Adaptive Quantization
• instantaneous: amplitude changes reflect sample
 Fall '08
 Rabiner,L