reason for this is that the value of $\epsilon$ that we have chosen is too large. By choosing a smaller $\epsilon$ and a larger $n$ in the definition of $A_\epsilon^{(n)}$, we can make the probability of error as small as we want, as long as the rate of the code is less than $I(X;Y) - 3\epsilon$.
Also note that the decoding procedure described in the problem is not optimal. The optimal decoding procedure is maximum likelihood, i.e., to choose the codeword that is closest to the received sequence. It is possible to calculate the average probability of error for a random code for which the decoding is based on an approximation to maximum likelihood decoding, where we decode a received sequence to the unique codeword that differs from the received sequence in $\le 4$ bits, and declare an error otherwise. The only difference from the jointly typical decoding described above is in the case when the codeword is equal to the received sequence.
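The approximate decoding rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names and the toy codebook are hypothetical, and the radius of 4 is the threshold quoted in the text.

```python
from typing import Optional, Sequence


def hamming_distance(a: Sequence[int], b: Sequence[int]) -> int:
    """Number of positions in which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return sum(x != y for x, y in zip(a, b))


def bounded_distance_decode(
    received: Sequence[int],
    codebook: Sequence[Sequence[int]],
    radius: int = 4,
) -> Optional[int]:
    """Return the index of the unique codeword within `radius` bit flips
    of `received`. Return None (declare an error) if no codeword or more
    than one codeword lies within that radius."""
    candidates = [
        index
        for index, codeword in enumerate(codebook)
        if hamming_distance(received, codeword) <= radius
    ]
    return candidates[0] if len(candidates) == 1 else None


# Hypothetical toy codebook: two codewords of length 10.
toy_codebook = [(0,) * 10, (1,) * 10]
print(bounded_distance_decode((1, 0, 0, 0, 0, 0, 0, 0, 0, 0), toy_codebook))
```

Unlike true maximum likelihood decoding, this rule refuses to decode when the nearest codeword is more than `radius` bits away or when two codewords are both within the radius.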

——————————————————————————————————————————————————
Problem 7.13, Cover and Thomas
Channel Capacity.
Calculate the channel capacity of the following channels with probability transition matrices:
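a. $\mathcal{X} = \mathcal{Y} = \{0, 1, 2\}$.

$$p(y \mid x) = \begin{bmatrix} 1/3 & 1/3 & 1/3 \\ 1/3 & 1/3 & 1/3 \\ 1/3 & 1/3 & 1/3 \end{bmatrix}$$

With a uniform input distribution, $p(x) = 1/3$, the joint distribution is

$$p(x, y) = p(y \mid x) \cdot p(x) = \begin{bmatrix} 1/9 & 1/9 & 1/9 \\ 1/9 & 1/9 & 1/9 \\ 1/9 & 1/9 & 1/9 \end{bmatrix}$$

so that $p(y) = 1/3$ and

$$I(X;Y) = \sum_{x \in \mathcal{X}} \sum_{y \in \mathcal{Y}} p(x, y) \cdot \log \frac{p(x, y)}{p(x)\,p(y)} = 0.$$

In fact, all rows of $p(y \mid x)$ are identical, so $Y$ is independent of $X$ and $I(X;Y) = 0$ for every input distribution. Hence,

$$C = \max_{p(x)} I(X;Y) = 0.$$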
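As a numerical cross-check of part (a), the double sum defining $I(X;Y)$ can be evaluated directly. The sketch below is a minimal illustration; the function name `mutual_information` and the list-of-rows representation of the channel are assumptions made for this example.

```python
import math


def mutual_information(p_x, channel):
    """I(X;Y) in bits, where channel[x][y] = p(y|x) and p_x[x] = p(x)."""
    n_out = len(channel[0])
    # Output marginal p(y) = sum_x p(x) p(y|x).
    p_y = [
        sum(p_x[x] * channel[x][y] for x in range(len(p_x)))
        for y in range(n_out)
    ]
    total = 0.0
    for x, px in enumerate(p_x):
        for y in range(n_out):
            joint = px * channel[x][y]
            if joint > 0:  # terms with p(x, y) = 0 contribute nothing
                total += joint * math.log2(joint / (px * p_y[y]))
    return total


channel_a = [[1 / 3] * 3 for _ in range(3)]
print(mutual_information([1 / 3, 1 / 3, 1 / 3], channel_a))
print(mutual_information([0.7, 0.2, 0.1], channel_a))
```

Both calls should return zero up to floating-point rounding, which matches the observation that the output of this channel carries no information about the input, whatever the input distribution.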
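b. $\mathcal{X} = \mathcal{Y} = \{0, 1, 2\}$.

$$p(y \mid x) = \begin{bmatrix} 1/2 & 1/2 & 0 \\ 0 & 1/2 & 1/2 \\ 1/2 & 0 & 1/2 \end{bmatrix}$$

With a uniform input distribution, $p(x) = 1/3$, the joint distribution is

$$p(x, y) = p(y \mid x) \cdot p(x) = \begin{bmatrix} 1/6 & 1/6 & 0 \\ 0 & 1/6 & 1/6 \\ 1/6 & 0 & 1/6 \end{bmatrix}$$

so that $p(y) = 1/3$ and

$$I(X;Y) = \sum_{x \in \mathcal{X}} \sum_{y \in \mathcal{Y}} p(x, y) \cdot \log \frac{p(x, y)}{p(x)\,p(y)} = \log 3 - 1 = 0.5850 \text{ bits}.$$

The channel is symmetric (the rows are permutations of each other, and so are the columns), so the uniform input distribution achieves capacity. Hence,

$$C = \max_{p(x)} I(X;Y) = 0.5850 \text{ bits}.$$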
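The value for part (b) can also be checked without relying on the symmetry argument. The sketch below uses the Blahut-Arimoto iteration, which is not the method of the solution above but a standard numerical alternative for maximizing $I(X;Y)$ over $p(x)$. The function names and the iteration count are assumptions chosen for this example.

```python
import math


def divergences_bits(channel, p_x):
    """For each input x, D(p(.|x) || p(y)) in bits, with p(y) induced by p_x."""
    n_out = len(channel[0])
    p_y = [sum(px * row[y] for px, row in zip(p_x, channel)) for y in range(n_out)]
    return [
        sum(w * math.log2(w / p_y[y]) for y, w in enumerate(row) if w > 0)
        for row in channel
    ]


def blahut_arimoto(channel, iterations=200):
    """Return (approximate capacity in bits, approximate optimal p(x)).

    Starts from the uniform input distribution, so every p(x) stays
    positive and the logarithms above are well defined.
    """
    n_in = len(channel)
    p_x = [1.0 / n_in] * n_in
    for _ in range(iterations):
        d = divergences_bits(channel, p_x)
        # Multiplicative update: p(x) <- p(x) * 2^D(x), then normalize.
        weights = [px * 2.0 ** dx for px, dx in zip(p_x, d)]
        total = sum(weights)
        p_x = [w / total for w in weights]
    d = divergences_bits(channel, p_x)
    # I(X;Y) = sum_x p(x) D(p(.|x) || p(y)) at the final input distribution.
    capacity = sum(px * dx for px, dx in zip(p_x, d))
    return capacity, p_x


channel_b = [[0.5, 0.5, 0.0], [0.0, 0.5, 0.5], [0.5, 0.0, 0.5]]
capacity, best_input = blahut_arimoto(channel_b)
print(capacity, best_input)
```

For this channel the uniform distribution is a fixed point of the update, so the iteration should stay at $p(x) = 1/3$ and report a capacity of about $\log 3 - 1$ bits.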
c. $\mathcal{X} = \mathcal{Y} = \{0, 1, 2, 3\}$.

$$p(y \mid x) = \begin{bmatrix} p & 1-p & 0 & 0 \\ 1-p & p & 0 & 0 \\ 0 & 0 & q & 1-q \\ 0 & 0 & 1-q & q \end{bmatrix}$$

$$C = \max_{p(x)} I(X;Y)$$

$$I(X;Y) = H(Y) - H(Y \mid X) = H(X) - H(X \mid Y)$$
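Let $\alpha$ be the total probability of inputs 0 and 1. Within each pair of inputs the channel is a binary symmetric channel, so we take the input distribution to be uniform within each pair: $p(x) = \left(\frac{\alpha}{2}, \frac{\alpha}{2}, \frac{1-\alpha}{2}, \frac{1-\alpha}{2}\right)$. Writing $H(p)$ for the binary entropy function,

$$\begin{aligned}
H(Y \mid X) &= \sum_{i=1}^{4} p(x_i)\, H(Y \mid x_i) \\
&= \frac{\alpha}{2} H(p) + \frac{\alpha}{2} H(p) + \frac{1-\alpha}{2} H(q) + \frac{1-\alpha}{2} H(q) \\
&= \alpha H(p) + (1-\alpha) H(q)
\end{aligned}$$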
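$$\begin{aligned}
H(Y) = {} & -2 \cdot \left[\frac{\alpha}{2} p + \frac{\alpha}{2}(1-p)\right] \log \left[\frac{\alpha}{2} p + \frac{\alpha}{2}(1-p)\right] \\
& -2 \cdot \left[\frac{1-\alpha}{2} q + \frac{1-\alpha}{2}(1-q)\right] \log \left[\frac{1-\alpha}{2} q + \frac{1-\alpha}{2}(1-q)\right]
\end{aligned}$$

The two brackets simplify to $\frac{\alpha}{2}$ and $\frac{1-\alpha}{2}$, so

$$H(Y) = -\alpha \log \frac{\alpha}{2} - (1-\alpha) \log \frac{1-\alpha}{2} = 1 + H(\alpha)$$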
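We need to maximize

$$I(X;Y) = 1 + H(\alpha) - \left[\alpha H(p) + (1-\alpha) H(q)\right]$$

over $\alpha$. This expression is concave in $\alpha$, so the maximum is at the stationary point:

$$\frac{\partial I(X;Y)}{\partial \alpha} = \frac{\partial H(\alpha)}{\partial \alpha} - H(p) + H(q)$$

$$0 = \log \frac{1-\alpha}{\alpha} - H(p) + H(q)$$

so $\frac{1-\alpha}{\alpha} = 2^{H(p) - H(q)}$ and

$$\alpha = \frac{2^{H(q)}}{2^{H(p)} + 2^{H(q)}}$$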
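The stationary point found above can be sanity-checked numerically by comparing the closed-form $\alpha$ with a brute-force search over a grid. This is a minimal sketch: the function names, the grid resolution, and the sample values $p = 0.1$, $q = 0.3$ are assumptions chosen for illustration.

```python
import math


def binary_entropy(t):
    """Binary entropy H(t) in bits, with H(0) = H(1) = 0."""
    if t <= 0.0 or t >= 1.0:
        return 0.0
    return -t * math.log2(t) - (1 - t) * math.log2(1 - t)


def information_of_alpha(alpha, p, q):
    """I(X;Y) = 1 + H(alpha) - [alpha H(p) + (1 - alpha) H(q)] in bits."""
    return (
        1.0
        + binary_entropy(alpha)
        - (alpha * binary_entropy(p) + (1 - alpha) * binary_entropy(q))
    )


def optimal_alpha(p, q):
    """Closed-form maximizer: 2^H(q) / (2^H(p) + 2^H(q))."""
    hp, hq = binary_entropy(p), binary_entropy(q)
    return 2.0 ** hq / (2.0 ** hp + 2.0 ** hq)


def grid_search_alpha(p, q, steps=10_000):
    """Brute-force maximizer of I over the interior grid {1/steps, ..., (steps-1)/steps}."""
    grid = (k / steps for k in range(1, steps))
    return max(grid, key=lambda a: information_of_alpha(a, p, q))


print(optimal_alpha(0.1, 0.3), grid_search_alpha(0.1, 0.3))
```

The two printed values should agree to within the grid spacing. Note that more weight goes to the pair of inputs with the smaller binary entropy, i.e. the less noisy sub-channel.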
Let $D = 2^{H(p)} + 2^{H(q)}$. Then

$$\begin{aligned}
C &= 1 + H\!\left(\frac{2^{H(q)}}{D}\right) - \frac{2^{H(q)}}{D} H(p) - \frac{2^{H(p)}}{D} H(q) \\
&= 1 - \frac{2^{H(q)}}{D} \log \frac{2^{H(q)}}{D} - \frac{2^{H(p)}}{D} \log \frac{2^{H(p)}}{D} - \frac{2^{H(q)}}{D} H(p) - \frac{2^{H(p)}}{D} H(q) \\
&= 1 - H(q) - H(p) + \log D \\
&= 1 - H(q) - H(p) + \log\left(2^{H(p)} + 2^{H(q)}\right)
\end{aligned}$$
$$C = \log\left(2^{1 - H(q)} + 2^{1 - H(p)}\right)$$

or, equivalently,

$$2^{C} = 2^{1 - H(q)} + 2^{1 - H(p)}$$
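As a check on the final expression, the sketch below evaluates $I(X;Y)$ directly from the $4 \times 4$ transition matrix at the optimizing input distribution and compares it with the closed form $\log\left(2^{1-H(p)} + 2^{1-H(q)}\right)$. The function names and the sample values of $p$ and $q$ are assumptions made for this illustration.

```python
import math


def binary_entropy(t):
    """Binary entropy H(t) in bits, with H(0) = H(1) = 0."""
    if t <= 0.0 or t >= 1.0:
        return 0.0
    return -t * math.log2(t) - (1 - t) * math.log2(1 - t)


def capacity_closed_form(p, q):
    """C = log2(2^(1 - H(p)) + 2^(1 - H(q))) in bits."""
    return math.log2(
        2.0 ** (1 - binary_entropy(p)) + 2.0 ** (1 - binary_entropy(q))
    )


def information_at_optimum(p, q):
    """I(X;Y) in bits, computed from the joint distribution of the
    part (c) channel at p(x) = (a/2, a/2, (1-a)/2, (1-a)/2) with
    a = 2^H(q) / (2^H(p) + 2^H(q))."""
    hp, hq = binary_entropy(p), binary_entropy(q)
    alpha = 2.0 ** hq / (2.0 ** hp + 2.0 ** hq)
    p_x = [alpha / 2, alpha / 2, (1 - alpha) / 2, (1 - alpha) / 2]
    channel = [
        [p, 1 - p, 0.0, 0.0],
        [1 - p, p, 0.0, 0.0],
        [0.0, 0.0, q, 1 - q],
        [0.0, 0.0, 1 - q, q],
    ]
    p_y = [sum(p_x[x] * channel[x][y] for x in range(4)) for y in range(4)]
    total = 0.0
    for x in range(4):
        for y in range(4):
            joint = p_x[x] * channel[x][y]
            if joint > 0:  # zero-probability pairs contribute nothing
                total += joint * math.log2(joint / (p_x[x] * p_y[y]))
    return total


print(capacity_closed_form(0.1, 0.3), information_at_optimum(0.1, 0.3))
```

Two limiting cases are easy to verify by hand: with $p = q = 0$ the channel is a noiseless permutation of four symbols and $C = \log 4 = 2$ bits, while with $p = q = 1/2$ only the choice of input pair survives and $C = \log 2 = 1$ bit.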
