Digital Speech Processing—
Lecture 16
Speech Coding Methods
Based on Speech Waveform
Representations and
Speech Models—Adaptive
and Differential Coding
Speech Waveform Coding: Summary of Part 1
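1. Probability density function for speech samples

Gamma:
$$p(x) = \left[\frac{\sqrt{3}}{8\pi\sigma_x |x|}\right]^{1/2} e^{-\frac{\sqrt{3}\,|x|}{2\sigma_x}}, \qquad p(0) = \infty$$

Laplacian:
$$p(x) = \frac{1}{\sqrt{2}\,\sigma_x}\, e^{-\frac{\sqrt{2}\,|x|}{\sigma_x}}, \qquad p(0) = \frac{1}{\sqrt{2}\,\sigma_x}$$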
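As a rough illustration of the two speech-sample densities, the sketch below evaluates them with NumPy and numerically checks that the Laplacian has unit area and variance $\sigma_x^2$. The function names and the integration grid are assumptions made for this example, not part of the lecture.

```python
import numpy as np

def laplacian_pdf(x, sigma):
    """Laplacian density with standard deviation sigma."""
    x = np.asarray(x, dtype=float)
    return np.exp(-np.sqrt(2.0) * np.abs(x) / sigma) / (np.sqrt(2.0) * sigma)

def gamma_pdf(x, sigma):
    """Gamma density with standard deviation sigma; infinite at x = 0."""
    ax = np.abs(np.asarray(x, dtype=float))
    with np.errstate(divide="ignore"):          # allow the division by zero at x = 0
        scale = np.sqrt(np.sqrt(3.0) / (8.0 * np.pi * sigma * ax))
    return scale * np.exp(-np.sqrt(3.0) * ax / (2.0 * sigma))

# Simple Riemann-sum check of the Laplacian (grid limits are an arbitrary choice).
sigma = 1.0
grid = np.linspace(-20.0, 20.0, 400001)
dx = grid[1] - grid[0]
p = laplacian_pdf(grid, sigma)
area = np.sum(p) * dx                 # should be close to 1
variance = np.sum(grid**2 * p) * dx   # should be close to sigma**2
```

The Gamma density is not integrated here because of its singularity at the origin; a simple grid sum would need special handling there.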
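2. Coding paradigms
• uniform
– divide interval from $+X_{max}$ to $-X_{max}$ into $2^B$ intervals of length $\Delta = 2X_{max}/2^B$ for a $B$-bit quantizer

[Figure: uniform quantizer intervals of width $\Delta$ extending to $+X_{max}$, with $X_{max} = 4\sigma_x$]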
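The uniform paradigm can be sketched in a few lines of Python. This is a minimal sketch that assumes a mid-rise layout (reconstruction at interval midpoints) and clipping of out-of-range samples; the slides specify neither, and the function name is hypothetical.

```python
import numpy as np

def uniform_quantize(x, x_max, bits):
    """B-bit uniform quantizer over [-x_max, +x_max] (mid-rise layout assumed)."""
    delta = 2.0 * x_max / (2 ** bits)           # step size: 2*X_max / 2^B
    x = np.asarray(x, dtype=float)
    half = 2 ** (bits - 1)
    # Interval index; samples beyond +/-X_max are clipped to the outermost intervals.
    idx = np.clip(np.floor(x / delta), -half, half - 1)
    return (idx + 0.5) * delta, delta           # reconstruct at interval midpoints

# Example: 3-bit quantizer with X_max = 4, so delta = 1.
x_hat, delta = uniform_quantize([0.1, -0.1, 3.9, 10.0], x_max=4.0, bits=3)
```

For samples inside the range the error magnitude never exceeds $\Delta/2$; the last sample in the example lies outside the range and is clipped.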
Quantized signal and SNR for the uniform quantizer:

$$\hat{x}[n] = x[n] + e[n]$$

$$SNR = 6B + 4.77 - 20\log_{10}\left[\frac{X_{max}}{\sigma_x}\right]$$

• sensitivity to $X_{max}/\sigma_x$ ($\sigma_x$ varies a lot!!!)
• not great use of bits for actual speech densities!

X_max/σ_x    20 log10[X_max/σ_x]    SNR (uniform) (B=8)
    2               6.02                  46.75
    4              12.04                  40.73
    8              18.06                  34.71
   16              24.08                  28.69
   32              30.10                  22.67
   64              36.12                  16.65

[Figure: pdf of the quantization error, $p(e)$, uniform with height $1/\Delta$ over $-\Delta/2 \le e \le \Delta/2$]

• 30 dB loss as $X_{max}/\sigma_x$ varies over a 32:1 range
• $X_{max}$ (or equivalently $\sigma_x$) varies a lot across sounds, speakers, environments
• need to adapt coder ($\Delta[n]$) to time varying $\sigma_x$ or $X_{max}$
• key question is how to adapt
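The SNR column of the table follows directly from the uniform-quantizer formula with $B = 8$. The short sketch below recomputes it; the function name is an assumption for illustration.

```python
import math

def uniform_snr_db(bits, xmax_over_sigma):
    """SNR in dB of a B-bit uniform quantizer: 6B + 4.77 - 20 log10(X_max/sigma_x)."""
    return 6.0 * bits + 4.77 - 20.0 * math.log10(xmax_over_sigma)

ratios = (2, 4, 8, 16, 32, 64)
table = [(r, 20.0 * math.log10(r), uniform_snr_db(8, r)) for r in ratios]
for ratio, load_db, snr in table:
    print(f"{ratio:4d}  {load_db:6.2f}  {snr:6.2f}")

# Loss between the smallest and largest ratio (a 32:1 range).
loss_db = uniform_snr_db(8, 2) - uniform_snr_db(8, 64)
```

The loss over the 32:1 range equals $20\log_{10}32 \approx 30.1$ dB, which is the 30 dB figure quoted above.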
• pseudologarithmic (constant percentage error)
– compress $x[n]$ by pseudologarithmic compander
– quantize the companded $x[n]$ uniformly
– expand the quantized signal

$$y[n] = F\big[x[n]\big] = X_{max}\,\frac{\log\left[1 + \mu\,\dfrac{|x[n]|}{X_{max}}\right]}{\log(1+\mu)} \cdot \mathrm{sign}(x[n])$$

• large $\mu\,|x[n]|$:

$$|y[n]| \approx \frac{X_{max}}{\log \mu} \cdot \log\left[\frac{\mu\,|x[n]|}{X_{max}}\right]$$
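The compress and expand steps of the pseudologarithmic ($\mu$-law) compander can be sketched as follows. The function names and the default $\mu = 255$ are assumptions for this example; the expander is simply the algebraic inverse of the compression formula.

```python
import numpy as np

def mu_law_compress(x, x_max, mu=255.0):
    """y = X_max * log(1 + mu*|x|/X_max) / log(1 + mu) * sign(x)."""
    x = np.asarray(x, dtype=float)
    return x_max * np.log1p(mu * np.abs(x) / x_max) / np.log1p(mu) * np.sign(x)

def mu_law_expand(y, x_max, mu=255.0):
    """Inverse of mu_law_compress: solve the compression formula for x."""
    y = np.asarray(y, dtype=float)
    return (x_max / mu) * np.expm1(np.abs(y) * np.log1p(mu) / x_max) * np.sign(y)

# Round trip without quantization; in a coder, y would be quantized uniformly
# between the compress and expand steps.
x = np.array([-1.0, -0.01, 0.0, 0.001, 0.5, 1.0])
y = mu_law_compress(x, x_max=1.0)
x_back = mu_law_expand(y, x_max=1.0)
```

Small amplitudes are stretched toward larger values of $y$, so a uniform quantizer applied to $y$ gives them proportionally finer resolution.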
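$$SNR(dB) = 6B + 4.77 - 20\log_{10}\left[\ln(1+\mu)\right] - 10\log_{10}\left[1 + \left(\frac{X_{max}}{\mu\sigma_x}\right)^2 + \sqrt{2}\left(\frac{X_{max}}{\mu\sigma_x}\right)\right]$$

• insensitive to $X_{max}/\sigma_x$ over a wide range for large $\mu$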
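To see how weakly the companded SNR depends on $X_{max}/\sigma_x$, the formula can be evaluated over the same 32:1 range used for the uniform quantizer. This is a sketch; the function name and the choice $\mu = 255$, $B = 8$ are assumptions.

```python
import math

def mu_law_snr_db(bits, xmax_over_sigma, mu):
    """SNR in dB of a B-bit mu-law quantizer, per the formula above."""
    r = xmax_over_sigma / mu
    return (6.0 * bits + 4.77
            - 20.0 * math.log10(math.log(1.0 + mu))
            - 10.0 * math.log10(1.0 + r * r + math.sqrt(2.0) * r))

ratios = (2, 4, 8, 16, 32, 64)
snrs = [mu_law_snr_db(8, r, mu=255.0) for r in ratios]
spread_db = max(snrs) - min(snrs)   # variation over the 32:1 range
```

With these values the spread stays under 2 dB, compared with about 30 dB for the uniform quantizer over the same range.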
• maximum SNR coding: match signal quantization intervals to model probability distribution (Gamma, Laplacian)
• interesting, at least theoretically
Adaptive Quantization
• linear quantization => SNR depends on $\sigma_x$ being constant (this is clearly not the case)
• instantaneous companding => SNR only weakly dependent on $X_{max}/\sigma_x$ for large $\mu$-law compression (100-500)
• optimum SNR => minimize $\sigma_e^2$ when $\sigma_x^2$ is known, non-uniform distribution of quantization levels

Quantization dilemma: want to choose quantization step size large enough to accommodate maximum peak-to-peak range of $x[n]$; at the same time need to make the quantization step size small so as to minimize the quantization error
– the nonstationary nature of speech (variability across sounds, speakers, backgrounds) compounds this problem greatly
Solutions to Quantization Dilemma

Adaptive Quantization:
• Solution 1: let $\Delta$ vary to match the variance of the input signal => $\Delta[n]$
• Solution 2: use a variable gain, $G[n]$, followed by a fixed quantizer step size, $\Delta$ => keep signal variance of $y[n] = G[n]\,x[n]$ constant

Case 1: $\Delta[n]$ proportional to $\sigma_x$ => quantization levels and ranges would be linearly scaled to match $\sigma_x^2$ => need to reliably estimate $\sigma_x^2$
Case 2: $G[n]$ proportional to $1/\sigma_x$ to give $\sigma_y^2 \approx$ constant
• need reliable estimate of $\sigma_x^2$ for both types of adaptive quantization
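A minimal sketch of Case 1, where $\Delta[n]$ is proportional to an estimate of $\sigma_x$. The slides do not say here how $\sigma_x^2$ is estimated, so the exponentially weighted short-time energy, the smoothing constant `alpha`, the loading factor `load` (so that $X_{max} = \text{load}\cdot\sigma_x$), and the mid-rise layout are all assumptions for illustration.

```python
import numpy as np

def estimate_sigma(x, alpha=0.99, floor=1e-6):
    """Running estimate of sigma_x from exponentially weighted energy (assumed estimator)."""
    x = np.asarray(x, dtype=float)
    var = np.empty_like(x)
    acc = float(x[0]) ** 2                    # start from the first sample's energy (x non-empty)
    for n, sample in enumerate(x):
        acc = alpha * acc + (1.0 - alpha) * sample * sample
        var[n] = acc
    return np.maximum(np.sqrt(var), floor)    # floor avoids a zero step size

def adaptive_quantize(x, bits=4, load=4.0, alpha=0.99):
    """Case 1: step size delta[n] proportional to the estimated sigma_x[n]."""
    x = np.asarray(x, dtype=float)
    sigma = estimate_sigma(x, alpha)
    delta = 2.0 * load * sigma / (2 ** bits)  # delta[n] = 2 * X_max[n] / 2^B
    half = 2 ** (bits - 1)
    idx = np.clip(np.floor(x / delta), -half, half - 1)
    return (idx + 0.5) * delta, delta

# The step size tracks the signal level: scaling the input scales delta[n].
x = np.sin(0.1 * np.arange(200) + 1.0)
x_hat, delta = adaptive_quantize(x)
x_hat_loud, delta_loud = adaptive_quantize(10.0 * x)
```

Case 2 is the same idea seen from the other side: dividing $x[n]$ by the estimated $\sigma_x$ (a gain $G[n] \propto 1/\sigma_x$) and using a fixed $\Delta$ gives the same quantization intervals. Because the estimate here is computed from the input itself, a decoder would need $\Delta[n]$ or $G[n]$ as side information.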
Types of Adaptive Quantization
• instantaneous: amplitude changes reflect sample
 Fall '08
 Rabiner,L
