Discrete Mathematics and Probability Theory
Spring 2010
Alistair Sinclair
Note 8
Error Correcting Codes
Erasure Errors
We will consider two situations in which we wish to transmit information on an unreliable channel. The
first is exemplified by the internet, where the information (say a file) is broken up into packets, and the
unreliability is manifest in the fact that some of the packets are lost during transmission, as shown below:
Suppose that the message consists of $n$ packets and suppose that at most $k$ packets are lost during
transmission. We will show how to encode the initial message consisting of $n$ packets into a redundant
encoding consisting of $n+k$ packets such that the recipient can reconstruct the message from any $n$
received packets.
Note that in this setting the packets are labeled and thus the recipient knows exactly which packets were
dropped during transmission.
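To make the erasure setting concrete, here is a minimal Python sketch of such a channel. The function name `erasure_channel`, the dictionary representation (label mapped to contents), and the random choice of which packets to drop are all assumptions made for illustration; the notes only require that at most $k$ packets are lost and that the survivors keep their labels.

```python
import random

def erasure_channel(packets, k, rng=random):
    """Hypothetical channel model: drop at most k packets, keep labels on the rest.

    `packets` maps a packet label (1, 2, ...) to its contents.
    """
    num_lost = rng.randint(0, min(k, len(packets)))
    lost = set(rng.sample(sorted(packets), num_lost))
    return {label: data for label, data in packets.items() if label not in lost}

sent = {i: f"payload-{i}" for i in range(1, 7)}
received = erasure_channel(sent, k=2)

# Because packets are labeled, the recipient can list exactly what is missing.
missing = sorted(set(sent) - set(received))
```

The point of the sketch is only that the recipient can compute `missing` directly from the labels, which is what distinguishes erasures from other kinds of errors.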
We can assume without loss of generality that the contents of each packet is a number modulo $q$, where
$q$ is a prime. For example, the contents of the packet might be a 32-bit string and can therefore be
regarded as a number between $0$ and $2^{32}-1$; then we could choose $q$ to be any prime larger than
$2^{32}$. The properties of polynomials over $GF(q)$ (i.e., with coefficients and values reduced modulo
$q$) are perfectly suited to solve this problem and are the backbone of this error-correcting scheme. To
see this, let us denote the message to be sent by $m_1, \ldots, m_n$ and make the following crucial
observations:
1) There is a unique polynomial $P(x)$ of degree at most $n-1$ such that $P(i) = m_i$ for
$1 \le i \le n$ (i.e., $P(x)$ contains all of the information about the message, and evaluating $P(i)$
gives the contents of the $i$-th packet).
2) The message to be sent is now $m_1 = P(1), \ldots, m_n = P(n)$. We can generate additional packets by
evaluating $P(x)$ at points $n+j$ for $1 \le j \le k$ (remember, our transmitted message must be
redundant, i.e., it must contain more packets than the original message to account for the lost packets).
Thus the transmitted message is $c_1 = P(1), c_2 = P(2), \ldots, c_{n+k} = P(n+k)$. Since we are working
modulo $q$, we must make sure that $n+k \le q$, so that the evaluation points are distinct modulo $q$,
but this condition does not impose a serious constraint since $q$ is very large.
3) We can uniquely reconstruct $P(x)$ from its values at any $n$ distinct points, since it has degree at
most $n-1$. This means that $P(x)$ can be reconstructed from any $n$ of the transmitted packets.
Evaluating this reconstructed polynomial $P(x)$ at $x = 1, \ldots, n$ yields the original message
$m_1, \ldots, m_n$.
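The three observations can be sketched in a few lines of Python. This is an illustrative sketch, not the notes' own implementation: the names `interpolate_at`, `encode`, and `decode` are hypothetical, and the polynomial is evaluated through Lagrange interpolation, which is one standard way to work with the unique polynomial through $n$ points. It assumes $q$ is prime and Python 3.8 or later (for `pow(den, -1, q)`, the modular inverse).

```python
def interpolate_at(points, x, q):
    """Evaluate at x the unique polynomial of degree < len(points) passing
    through `points` (a list of (x_i, y_i) pairs), working modulo the prime q."""
    total = 0
    for xi, yi in points:
        num, den = 1, 1
        for xj, _ in points:
            if xj != xi:
                num = num * (x - xj) % q
                den = den * (xi - xj) % q
        # Dividing by den means multiplying by its inverse modulo q.
        total = (total + yi * num * pow(den, -1, q)) % q
    return total

def encode(message, k, q):
    """Observations 1 and 2: packets c_i = P(i) for i = 1, ..., n + k."""
    n = len(message)
    if n + k > q:
        raise ValueError("need n + k <= q")
    points = list(enumerate(message, start=1))  # (i, m_i), so P(i) = m_i
    return {i: interpolate_at(points, i, q) for i in range(1, n + k + 1)}

def decode(received, n, q):
    """Observation 3: recover m_1, ..., m_n from any n labeled packets."""
    if len(received) < n:
        raise ValueError("too many packets were lost")
    points = list(received.items())[:n]  # any n surviving packets suffice
    return [interpolate_at(points, i, q) for i in range(1, n + 1)]

# Usage: n = 4, k = 2, arithmetic modulo 7; packets 2 and 5 are lost.
codeword = encode([3, 1, 5, 0], k=2, q=7)
survivors = {i: c for i, c in codeword.items() if i not in (2, 5)}
assert decode(survivors, n=4, q=7) == [3, 1, 5, 0]
```

Note that `encode` never writes out the coefficients of $P(x)$; it only needs the values of $P$ at the new points, which interpolation supplies directly.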
Example
Suppose Alice wants to send Bob a message of $n = 4$ packets and she wants to guard against $k = 2$ lost
packets. Then, assuming the packets can be coded up as integers between 0 and 6, Alice can work over
$GF(7)$ (since $7 \ge n+k = 6$). Suppose the message that Alice wants to send to Bob is $m_1 = 3$,
$m_2 = 1$, $m_3 = 5$, and $m_4 = 0$. The unique polynomial of degree $n-1 = 3$ described by these 4
points is