Algorithms
Non-Lecture E: Tail Inequalities
If you hold a cat by the tail
you learn things you cannot learn any other way.
— Mark Twain
E Tail Inequalities
The simple recursive structure of skip lists made it relatively easy to derive an upper bound on the expected worst-case search time, by way of a stronger high-probability upper bound on the worst-case search time. We can prove similar results for treaps, but because of the more complex recursive structure, we need slightly more sophisticated probabilistic tools. These tools are usually called tail inequalities; intuitively, they bound the probability that a random variable with a bell-shaped distribution takes a value in the tails of the distribution, far away from the mean.
E.1 Markov’s Inequality
Perhaps the simplest tail inequality was named after the Russian mathematician Andrey Markov; however,
in strict accordance with Stigler’s Law of Eponymy, it first appeared in the works of Markov’s probability
teacher, Pafnuty Chebyshev.¹
Markov’s Inequality. Let $X$ be a non-negative integer random variable. For any $t > 0$, we have $\Pr[X \ge t] \le \mathrm{E}[X]/t$.
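Proof: The inequality follows from the definition of expectation by simple algebraic manipulation. Assume first that $t$ is a positive integer.

\begin{align*}
\mathrm{E}[X] &= \sum_{k=0}^{\infty} k \cdot \Pr[X = k] && \text{[definition of $\mathrm{E}[X]$]} \\
&= \sum_{k=1}^{\infty} \Pr[X \ge k] && \text{[algebra]} \\
&\ge \sum_{k=1}^{t} \Pr[X \ge k] && \text{[since $t < \infty$]} \\
&\ge \sum_{k=1}^{t} \Pr[X \ge t] && \text{[since $k \le t$]} \\
&= t \cdot \Pr[X \ge t] && \text{[algebra]}
\end{align*}

If $t$ is not an integer, then $\Pr[X \ge t] = \Pr[X \ge \lceil t \rceil] \le \mathrm{E}[X]/\lceil t \rceil \le \mathrm{E}[X]/t$, because $X$ takes only integer values.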
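The first step labeled "algebra" rewrites the expectation as a sum of tail probabilities. One way to see it, sketched here with an auxiliary index $j$ that is introduced only for illustration, is to write each $k$ as a sum of $k$ ones and swap the order of summation (which is safe because every term is non-negative):

```latex
\begin{align*}
\sum_{k=0}^{\infty} k \cdot \Pr[X = k]
  &= \sum_{k=1}^{\infty} \sum_{j=1}^{k} \Pr[X = k] \\
  &= \sum_{j=1}^{\infty} \sum_{k=j}^{\infty} \Pr[X = k] \\
  &= \sum_{j=1}^{\infty} \Pr[X \ge j].
\end{align*}
```

The $k = 0$ term contributes nothing to the expectation, so the tail sum starts at $j = 1$.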
Unfortunately, the bounds that Markov’s inequality implies (at least directly) are often very weak, even useless. (For example, Markov’s inequality implies that with high probability, every node in an $n$-node treap has depth $O(n^2 \log n)$. Well, duh!) To get stronger bounds, we need to exploit some additional structure in our random variables.
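To get a feel for how loose the bound can be, the following minimal sketch compares the exact tail of a simple distribution with the Markov bound. The choice of distribution (the number of heads in 20 fair coin flips) and all function names are assumptions made for illustration; they do not come from the notes.

```python
from fractions import Fraction
from math import comb

def binomial_pmf(n):
    """Exact distribution of the number of heads in n fair coin flips."""
    return [Fraction(comb(n, k), 2**n) for k in range(n + 1)]

def expectation(pmf):
    """E[X] for a distribution given as pmf[k] = Pr[X = k]."""
    return sum(k * p for k, p in enumerate(pmf))

def tail(pmf, t):
    """Exact Pr[X >= t]."""
    return sum(p for k, p in enumerate(pmf) if k >= t)

def markov_bound(pmf, t):
    """Markov's upper bound E[X] / t on Pr[X >= t]."""
    return expectation(pmf) / t

n = 20  # hypothetical example size
pmf = binomial_pmf(n)
for t in (10, 15, 20):
    # Columns: threshold, exact tail probability, Markov bound
    print(t, float(tail(pmf, t)), float(markov_bound(pmf, t)))
```

At $t = n = 20$ the exact tail probability is $2^{-20}$, while Markov’s inequality only guarantees that it is at most $1/2$. The bound is valid for every threshold, but it ignores everything about the distribution except its mean.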
¹ The closely related tail bound traditionally called Chebyshev’s inequality was actually discovered by the French statistician Irénée-Jules Bienaymé, a friend and colleague of Chebyshev’s.
E.2 Sums of Indicator Variables
Random variables $X_1, X_2, \ldots, X_n$ are said to be mutually independent if and only if
\[
\Pr\left[ \bigwedge_{i=1}^{n} (X_i = x_i) \right] = \prod_{i=1}^{n} \Pr[X_i = x_i]
\]
for all possible values $x_1, x_2, \ldots, x_n$. For example, different flips of the same fair coin are mutually independent, but the number of heads and the number of tails in a sequence of $n$ coin flips are not independent (since they must add to $n$). Mutual independence of the $X_i$’s implies that the expectation of the product of the $X_i$’s is equal to the product of the expectations:
\[
\mathrm{E}\left[ \prod_{i=1}^{n} X_i \right] = \prod_{i=1}^{n} \mathrm{E}[X_i].
\]
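Both coin-flip examples can be checked by brute force on a small case. The sketch below enumerates all outcomes of three fair coin flips; the helper `expect` and the choice $n = 3$ are assumptions made for illustration, not part of the notes.

```python
from fractions import Fraction
from itertools import product
from math import prod

def expect(f, n):
    """Exact E[f(w)], where w ranges over all 2^n equally likely
    outcomes of n fair coin flips (1 = heads, 0 = tails)."""
    outcomes = list(product((0, 1), repeat=n))
    return sum(Fraction(f(w)) for w in outcomes) / len(outcomes)

n = 3  # hypothetical example size

# Individual flips are mutually independent, so
# E[X1 * X2 * X3] should equal E[X1] * E[X2] * E[X3].
e_product = expect(lambda w: prod(w), n)
product_of_e = prod(expect(lambda w, i=i: w[i], n) for i in range(n))

# The numbers of heads and tails always sum to n, so they are dependent,
# and here the product identity fails.
e_heads = expect(lambda w: sum(w), n)
e_tails = expect(lambda w: n - sum(w), n)
e_heads_times_tails = expect(lambda w: sum(w) * (n - sum(w)), n)

print(e_product, product_of_e)
print(e_heads_times_tails, e_heads * e_tails)
```

For the three individual flips both sides equal $1/8$. For the head and tail counts, the expectation of the product is $3/2$ while the product of the expectations is $9/4$; since independence would force these to agree, the mismatch confirms that the two counts are not independent.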
 Spring '09
 A
 Probability theory, Chernoff
