Discrete Mathematics for CS
Spring 2008
David Wagner
Note 11
Error Correcting Codes
Erasure Errors
We will consider two situations in which we wish to transmit information on an unreliable channel. The
first is exemplified by the internet, where the information (say a file) is broken up into fixed-length packets,
and the unreliability is manifest in the fact that some of the packets are lost during transmission, as shown
below:
Suppose that, in the absence of packet loss, it would take $n$ packets to send the entire message, but in
practice up to $k$ packets may be lost during transmission. We will show how to encode the initial message
consisting of $n$ packets into a redundant encoding consisting of $n + k$ packets such that the recipient can
reconstruct the message from any $n$ received packets. We will assume that the packets are labelled and thus
the recipient knows exactly which packets were dropped during transmission.
In our scheme, the content of each packet is a number modulo $q$, where $q$ is a prime. The properties of
polynomials over $GF(q)$ (i.e., with coefficients and values reduced modulo $q$) are perfectly suited to solve
this problem and are the backbone of this error-correcting scheme. To see this, let us denote the message to
be sent by $m_1, \ldots, m_n$ and make the following crucial observations:
1) There is a unique polynomial $P(x)$ of degree at most $n - 1$ such that $P(i) = m_i$ for $i = 1, \ldots, n$ (i.e., $P(x)$ contains
all of the information about the message, and evaluating $P(i)$ gives the contents of the $i$-th packet). Therefore
we can consider the message to be given by the polynomial $P(x)$.
2) The message to be sent is now $m_1 = P(1), \ldots, m_n = P(n)$. We can generate additional packets by evaluating
$P(x)$ at the points $n + j$ for $j = 1, \ldots, k$ (remember, our transmitted message must be redundant, i.e., it must contain
more packets than the original message to account for the lost packets). Thus the transmitted message is
$c_1 = P(1), c_2 = P(2), \ldots, c_{n+k} = P(n + k)$. Since we are working modulo $q$, we must make sure that $n + k \le q$,
so that the $n + k$ evaluation points are distinct modulo $q$,
but this condition does not impose a serious constraint since $q$ is typically very large.
3) We can uniquely reconstruct $P(x)$ from its values at any $n$ distinct points, since it has degree at most $n - 1$. This
means that $P(x)$ can be reconstructed from any $n$ of the transmitted packets. Evaluating this reconstructed
polynomial $P(x)$ at $x = 1, \ldots, n$ yields the original message $m_1, \ldots, m_n$.
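The three observations above can be sketched in code. The following Python is only an illustration, not part of the original notes: the function names `interpolate_at`, `encode`, and `decode` are hypothetical, and Lagrange interpolation is one assumed way to evaluate the unique polynomial through a set of points (the observations do not prescribe a particular interpolation method). It assumes $q$ is prime, message values lie in $\{0, \ldots, q-1\}$, and packets are labelled by their index.

```python
def interpolate_at(points, x, q):
    """Evaluate at x, modulo the prime q, the unique polynomial of degree
    at most len(points) - 1 passing through the given (xi, yi) points.
    Uses Lagrange interpolation (an assumed choice for this sketch)."""
    total = 0
    for i, (xi, yi) in enumerate(points):
        num, den = 1, 1
        for j, (xj, _) in enumerate(points):
            if j != i:
                num = num * (x - xj) % q
                den = den * (xi - xj) % q
        # den is nonzero mod q because the xi are distinct mod q, so
        # Fermat's little theorem gives its inverse as den^(q-2) mod q.
        total = (total + yi * num * pow(den, q - 2, q)) % q
    return total


def encode(message, k, q):
    """Observation 2: return the n + k packets c_1, ..., c_{n+k}."""
    n = len(message)
    if n + k > q:
        raise ValueError("need n + k <= q so evaluation points are distinct mod q")
    points = list(enumerate(message, start=1))  # (i, m_i) for i = 1..n
    return [interpolate_at(points, x, q) for x in range(1, n + k + 1)]


def decode(received, n, q):
    """Observation 3: recover m_1, ..., m_n from any n surviving packets.
    `received` maps a packet's index (1-based) to its value."""
    points = list(received.items())[:n]
    if len(points) < n:
        raise ValueError("fewer than n packets received; cannot reconstruct")
    return [interpolate_at(points, x, q) for x in range(1, n + 1)]
```

A caller would pass the original packets to `encode`, transmit the result, and hand whichever labelled packets arrive to `decode`. Because the polynomial is re-evaluated point by point, this sketch never computes the coefficients of $P(x)$ explicitly.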
Example
Suppose Alice wants to send Bob a message of $n = 4$ packets and she wants to guard against $k = 2$ lost
packets. Then assuming the packets can be coded up as integers between 0 and 6, Alice can work over
$GF(7)$ (since $7 \ge n + k = 6$). Suppose the message that Alice wants to send to Bob is $m_1 = 3$, $m_2 = 1$, $m_3 = 5$,
and $m_4 = 0$. The unique degree $n - 1 = 3$ polynomial described by these 4 points is $P(x) = x^3 + 4x^2 + 5$.
(You may want to verify that