Introduction to Information Theory (67548)
December 30, 2008
Assignment 2  Solution
Lecturer: Prof. Michael Werman
Due:
Note: Unless specified otherwise, all entropies and logarithms should be taken with base 2.
Problem 1 AEP and Source Coding
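1. The number of sequences with three or fewer x's is
\[ \binom{100}{0} + \binom{100}{1} + \binom{100}{2} + \binom{100}{3} = 1 + 100 + 4950 + 161700 = 166751. \]
Using binary codewords of length $n$ provides us with $2^n$ possible codewords. The smallest $n$ such that $2^n \geq 166751$ is 18, since $2^{17} = 131072 < 166751 \leq 262144 = 2^{18}$.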
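The count and the codeword length above can be checked with a minimal Python sketch. The function names `count_sequences` and `min_codeword_bits` are illustrative and not part of the assignment; the sketch assumes a two-letter alphabet, as in the rest of the problem.

```python
from math import comb

def count_sequences(n_letters: int, max_x: int) -> int:
    """Number of length-n_letters sequences over {x, y} with at most max_x x's."""
    return sum(comb(n_letters, k) for k in range(max_x + 1))

def min_codeword_bits(num_sequences: int) -> int:
    """Smallest n such that 2**n >= num_sequences (for num_sequences >= 1)."""
    return (num_sequences - 1).bit_length()

total = count_sequences(100, 3)
bits = min_codeword_bits(total)
print(total, bits)
```

Using integer arithmetic (`bit_length`) rather than a floating-point logarithm avoids rounding issues when the count is close to a power of two.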
2. This is the probability that, out of 100 letters, 4 or more are x's. This is simply one minus the probability that there will be at most 3 x's. Recalling the formula for a binomial distribution, we have that this is equal to
\[ 1 - \binom{100}{0} p^0 (1-p)^{100} - \binom{100}{1} p^1 (1-p)^{99} - \binom{100}{2} p^2 (1-p)^{98} - \binom{100}{3} p^3 (1-p)^{97}, \]
where $p = 0.005$. This is approximately equal to 0.0017.
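The binomial expression above can be evaluated numerically. The following is a small sketch, assuming independent letters with $p = 0.005$; the helper name `binomial_tail` is hypothetical.

```python
from math import comb

def binomial_tail(n: int, p: float, k_min: int) -> float:
    """Pr(at least k_min successes in n independent trials with success prob p)."""
    # Sum the probabilities of 0, 1, ..., k_min - 1 successes, then take the complement.
    at_most = sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(k_min))
    return 1.0 - at_most

prob = binomial_tail(100, 0.005, 4)
print(f"{prob:.4f}")
```

The printed value should agree with the figure quoted above to the precision shown.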
3. Let $X_1, \ldots, X_{100}$ denote random variables such that $X_i = 1$ if letter $i$ in the sequence is 'x', and 0 otherwise. Note that $E[X_i] = 0.005$, and, since $X_i^2 = X_i$,
\[ \mathrm{Var}(X_i) = E[X_i^2] - (E[X_i])^2 = E[X_i] - (E[X_i])^2 = 0.004975. \]
Therefore $E[X_1 + \ldots + X_{100}] = 100 E[X_1] = 0.5$, and, because the letters are independent,
\[ \mathrm{Var}(X_1 + \ldots + X_{100}) = 100 \mathrm{Var}(X_1) = 100 (E[X_1^2] - E^2[X_1]) = 0.4975. \]
Applying Chebyshev's inequality, we have that
\[ \Pr(X_1 + \ldots + X_{100} \geq 4) \leq \Pr(|(X_1 + \ldots + X_{100}) - 0.5| \geq 3.5) \leq \frac{0.4975}{3.5^2} \approx 0.041. \]
We see that this bound is much looser than the exact probability we have computed in the previous question.
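The Chebyshev computation can be reproduced in a few lines. This is only a sketch under the same assumptions as the text (100 independent letters, each an 'x' with probability 0.005); `chebyshev_bound` is an illustrative name.

```python
def chebyshev_bound(variance: float, deviation: float) -> float:
    """Chebyshev: Pr(|S - E[S]| >= deviation) <= variance / deviation**2."""
    return variance / deviation**2

n, p = 100, 0.005
mean = n * p                 # E[X_1 + ... + X_100]
variance = n * p * (1 - p)   # Var of a sum of n independent Bernoulli(p) variables
bound = chebyshev_bound(variance, 4 - mean)
print(f"{bound:.3f}")
```

The resulting bound is roughly 24 times the exact value of about 0.0017 from the previous question, which illustrates how loose Chebyshev's inequality is here.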
4. (a) The most efficient code is simply to code, say, 'x' with the bit 0 and 'y' with the bit 1. The expected code length per letter is of course 1.
(b) For the letter 'y', the respective codeword should be of length $\lceil -\log_2(0.995) \rceil = 1$, and for the letter 'x' the length should be $\lceil -\log_2(0.005) \rceil = 8$. The expected code length is $0.005 \cdot 8 + 0.995 \cdot 1 = 1.035$, which is worse than the simple code of the previous question, due to the needless use of a long codeword for the letter 'x'.
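The codeword lengths and the expected length in (b) can be checked as follows. This is a minimal sketch that assumes the length rule $\lceil -\log_2 p \rceil$ per symbol; the names `shannon_lengths` and `expected_length` are hypothetical.

```python
from math import ceil, log2

def shannon_lengths(probs: dict[str, float]) -> dict[str, int]:
    """Codeword length ceil(-log2 p) for each symbol."""
    return {sym: ceil(-log2(p)) for sym, p in probs.items()}

def expected_length(probs: dict[str, float], lengths: dict[str, int]) -> float:
    """Average codeword length per letter under the given distribution."""
    return sum(probs[sym] * lengths[sym] for sym in probs)

probs = {"x": 0.005, "y": 0.995}
lengths = shannon_lengths(probs)
avg = expected_length(probs, lengths)
print(lengths, round(avg, 3))
```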
(c) We have seen in class that when we use such a code by aggregating $n$ letters together, the resulting code has expected codeword length per letter of between $H(X)$ and $H(X) + 1/n$, where $H(X)$ is the entropy of the source. In our case, the entropy of the source is $H((0.995, 0.005)) \approx 0.045$ bits, and