CS61B: Lecture 39
Wednesday, April 29, 2009
RANDOMIZED ANALYSIS
===================
Randomized analysis, like amortized analysis, is a mathematically rigorous way
of saying, "The worst-case running time of this operation is slow, but nobody
cares, because on average it runs fast."  Unlike amortized analysis, the
"average" is taken over an infinite number of runs of the program.

Randomized algorithms make decisions based on rolls of the dice.  The random
numbers actually help to keep the running time low.  A randomized algorithm can
occasionally run more slowly than expected, but the probability that it will
run _asymptotically_ slower is extremely low.

The randomized algorithms we’ve studied are quicksort and quickselect.  Hash
tables can also be modeled probabilistically, though.
Expectation
-----------
Suppose a method x() flips a coin.  If the coin comes up heads, x() takes one
second to execute.  If it comes up tails, x() takes three seconds.  Let X be
the exact running time of one call to x().  With probability 0.5, X is 1, and
with probability 0.5, X is 3.  For obvious reasons, X is called a
_random_variable_.

The _expected_ value of X is the average value X assumes in an infinite
sequence of coin flips,

  E[X] = 0.5 * 1 + 0.5 * 3 = 2 seconds expected time.
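The calculation above can be checked with a short simulation.  What follows is
only a sketch: the class name CoinFlipTiming and its methods are hypothetical
names chosen for illustration, and the "running time" is simply returned as a
number of seconds rather than measured.  The sample average should approach
E[X] as the number of trials grows.

```java
import java.util.Random;

class CoinFlipTiming {
    // Exact expectation: sum over outcomes of (probability * value).
    static double exactExpectedSeconds() {
        return 0.5 * 1 + 0.5 * 3;
    }

    // Models one call to x(): 1 second on heads, 3 seconds on tails.
    // (Hypothetical stand-in; it returns the time instead of sleeping.)
    static int simulateX(Random rng) {
        return rng.nextBoolean() ? 1 : 3;
    }

    // Average of X over many simulated calls.
    static double estimateExpectedSeconds(int trials, long seed) {
        Random rng = new Random(seed);
        long total = 0;
        for (int i = 0; i < trials; i++) {
            total += simulateX(rng);
        }
        return (double) total / trials;
    }

    public static void main(String[] args) {
        System.out.println("exact E[X]     = " + exactExpectedSeconds());
        System.out.println("estimated E[X] = "
                           + estimateExpectedSeconds(1_000_000, 61L));
    }
}
```

A finite simulation only approximates the "infinite sequence of coin flips" in
the definition, so the estimate will be close to 2 but usually not equal to it.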
Suppose we run the code sequence

  x();     // takes time X
  x();     // takes time Y

and let Y be the running time of the _second_ call.  The total running time is
T = X + Y.  (Y and T are also random variables.)  What is the expected total
running time E[T]?
The main idea from probability we need is called _linearity_of_expectation_,
which says that expected running times sum linearly.

  E[X + Y] = E[X] + E[Y]
           = 2 + 2
           = 4 seconds expected time.
The interesting thing is that linearity of expectation holds true whether or
not X and Y are _independent_.  Independence means that the first coin flip has
no effect on the outcome of the second.  If X and Y are independent, the code
will take four seconds on average.  But what if they’re not?  Suppose the
second coin flip always matches the first, so we always get two heads or two
tails.  Then the code takes either two seconds or six seconds, each with
probability 0.5, so it still takes four seconds on average.  If the second coin
flip is always the opposite of the first, so we always get one head and one
tail, the code takes exactly four seconds every time, which is again four
seconds on average.
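The three coin-flip scenarios can be verified by enumerating the outcomes and
weighting each by its probability.  This is a sketch, not part of the lecture's
code; the class LinearityOfExpectation and its method names are hypothetical.

```java
class LinearityOfExpectation {
    // Running time of x() in seconds for a given coin outcome.
    static int seconds(boolean heads) {
        return heads ? 1 : 3;
    }

    // E[X + Y] when the flips are independent:
    // four equally likely outcomes (HH, HT, TH, TT).
    static double independent() {
        boolean[] flips = {true, false};
        double total = 0;
        for (boolean first : flips) {
            for (boolean second : flips) {
                total += 0.25 * (seconds(first) + seconds(second));
            }
        }
        return total;
    }

    // E[X + Y] when the second flip always matches the first: HH or TT.
    static double alwaysMatching() {
        return 0.5 * (seconds(true) + seconds(true))
             + 0.5 * (seconds(false) + seconds(false));
    }

    // E[X + Y] when the second flip is always the opposite: HT or TH.
    static double alwaysOpposite() {
        return 0.5 * (seconds(true) + seconds(false))
             + 0.5 * (seconds(false) + seconds(true));
    }

    public static void main(String[] args) {
        System.out.println("independent:     " + independent());
        System.out.println("always matching: " + alwaysMatching());
        System.out.println("always opposite: " + alwaysOpposite());
    }
}
```

The distribution of T differs in each case (for instance, T is always 4 when
the flips are opposite, but is 2 or 6 when they match), yet the expected value
is the same in all three.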
So if we determine the expected running time of each individual operation, we
can determine the expected running time of a whole program by adding up the
expected costs of all the operations.
Hash Tables
-----------
The implementations of hash tables we have studied don’t use random numbers,
but we can model the effects of collisions on running time by pretending we
have a random hash code.
A _random_hash_code_ maps each possible key to a number that’s chosen randomly.
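One way to picture a random hash code is as a table that assigns each key a
random number the first time the key is seen and returns that same number ever
after.  The sketch below is an illustration under assumptions not stated in the
notes: the class RandomHashCodeModel is hypothetical, keys are Strings, and the
number is drawn uniformly from 0 to numBuckets - 1.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Random;

class RandomHashCodeModel {
    // Remembers the number already chosen for each key, so that the
    // mapping is random but consistent across calls.
    private final Map<String, Integer> assigned = new HashMap<>();
    private final Random rng;
    private final int numBuckets;

    RandomHashCodeModel(int numBuckets, long seed) {
        this.numBuckets = numBuckets;
        this.rng = new Random(seed);
    }

    // Returns the number for key, choosing it at random (uniformly, by
    // assumption) on first use and reusing it on later calls.
    int bucketFor(String key) {
        return assigned.computeIfAbsent(key, k -> rng.nextInt(numBuckets));
    }

    public static void main(String[] args) {
        RandomHashCodeModel model = new RandomHashCodeModel(8, 61L);
        System.out.println("cat -> " + model.bucketFor("cat"));
        System.out.println("dog -> " + model.bucketFor("dog"));
        System.out.println("cat -> " + model.bucketFor("cat")); // same as before
    }
}
```

The java.util.HashMap inside the model is only bookkeeping for the analysis; a
real hash table would compute the code from the key instead of storing it.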
This note was uploaded on 02/21/2010 for the course CS 61B taught by Professor Canny during the Spring '01 term at University of California, Berkeley.