CS 493: Algorithms for Massive Data Sets
Huffman and Arithmetic Coding
Date: Thursday, 2/7/2002
Scribe: Chi Zhang
1  Review of last class
For a message set S, each message s ∈ S has probability p(s). The entropy of S is given as

    H(S) = Σ_{s ∈ S} p(s) log₂(1/p(s)),

and log₂(1/p(s)) is the self-information of s.
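As a quick illustration, the entropy formula can be computed directly (a minimal Python sketch; the `entropy` helper name is ours, not from the lecture):

```python
import math

def entropy(probs):
    """H(S) = sum over s in S of p(s) * log2(1/p(s))."""
    return sum(p * math.log2(1.0 / p) for p in probs if p > 0)

# A fair coin carries 1 bit of self-information per toss:
print(entropy([0.5, 0.5]))                 # 1.0
# Four equally likely messages need 2 bits each:
print(entropy([0.25, 0.25, 0.25, 0.25]))   # 2.0
```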
For any uniquely decodable code C for S, the average code length satisfies l_a(C) ≥ H(S).
Theorem 1.1  Given S, there exists a code C for S such that l_a(C) ≤ H(S) + 1.
Proof.  Define l(s) = ⌈log₂(1/p(s))⌉. Then

    Σ_{s ∈ S} 2^{−l(s)} = Σ_{s ∈ S} 2^{−⌈log₂(1/p(s))⌉} ≤ Σ_{s ∈ S} p(s) = 1.

By the Kraft-McMillan inequality, there exists a prefix code C′ with these code lengths, and

    l_a(C′) = Σ_{s ∈ S} p(s) l(s)
            = Σ_{s ∈ S} p(s) ⌈log₂(1/p(s))⌉
            ≤ Σ_{s ∈ S} p(s) (log₂(1/p(s)) + 1)
            = H(S) + 1.
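The construction in the proof can be checked numerically: compute the lengths l(s) = ⌈log₂(1/p(s))⌉, verify that the Kraft-McMillan sum is at most 1, and confirm l_a ≤ H(S) + 1. A sketch, using an example distribution of our own choosing:

```python
import math

probs = [0.5, 0.25, 0.125, 0.125]                          # example distribution (ours)
lengths = [math.ceil(math.log2(1.0 / p)) for p in probs]   # l(s) from the proof

# Kraft-McMillan sum: <= 1 guarantees a prefix code with these lengths exists.
kraft = sum(2.0 ** -l for l in lengths)
avg_len = sum(p * l for p, l in zip(probs, lengths))       # l_a(C')
H = sum(p * math.log2(1.0 / p) for p in probs)             # H(S)

print(lengths, kraft, avg_len, H)   # [1, 2, 3, 3] 1.0 1.75 1.75
assert kraft <= 1.0 and avg_len <= H + 1
```

For this dyadic distribution the bound is tight in the best sense: l_a(C′) = H(S) exactly.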
2  Huffman Code
Given a set of messages with probabilities p₁ ≤ p₂ ≤ ... ≤ p_n, the Huffman code tree is constructed by recursively combining subtrees:

1. Begin with n trees, each consisting of a single node corresponding to one message, with weight p_i.
2. Repeat until only one tree remains:
   • pick the two subtrees with the smallest weights;
   • combine them by adding a new node as root and making the two trees its children. The weight of the new tree is the sum of the weights of the two subtrees.

With a heap, each combining step takes O(log n) time, so the total time is O(n log n).
Lemma 2.1  Suppose C is the optimal code for S, and p₁, p₂ and l₁, l₂ are the probabilities and code lengths of messages s₁ and s₂, respectively. Then p₁ > p₂ ⇒ l₁ ≤ l₂.
Proof.  Suppose p₁ > p₂ and l₁ > l₂. Swap the code words for s₁ and s₂ to get a new code C′. Its average length is

    l_a(C′) = l_a(C) + p₁(l₂ − l₁) + p₂(l₁ − l₂) = l_a(C) + (p₁ − p₂)(l₂ − l₁) < l_a(C),

since p₁ − p₂ > 0 and l₂ − l₁ < 0. This contradicts the optimality of code C.
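The algebra in the swap argument is easy to sanity-check with concrete numbers (all values below are made up for illustration):

```python
# Hypothetical values satisfying the proof's hypotheses p1 > p2, l1 > l2:
p1, p2 = 0.4, 0.1
l1, l2 = 3, 2
la_C = 2.0   # placeholder average length of the original code C

# Average length after swapping the codewords of s1 and s2:
la_Cprime = la_C + p1 * (l2 - l1) + p2 * (l1 - l2)
factored = la_C + (p1 - p2) * (l2 - l1)

print(la_Cprime, factored)   # 1.7 1.7 -- the two forms agree, and 1.7 < 2.0
assert abs(la_Cprime - factored) < 1e-12
assert la_Cprime < la_C
```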
Lemma 2.2  Without loss of generality, the two messages of smallest probability occur as siblings in the code tree for an optimal code.
Proof.  Given the code tree for an optimal code, we show that it can always be modified, without increasing the average code length, so that the two smallest-probability nodes are siblings. By Lemma 2.1, the smallest-probability node must occur at the largest depth in the code tree, and the sibling of this node is at the same depth. The sibling can then be swapped with the second-smallest-probability node to obtain a code tree of the desired structure. This transformation does not increase the average code length: it moves the second-smallest probability to a codeword at least as long, and a probability at least as large to the shorter codeword.