Massachusetts Institute of Technology
Handout 8
6.854J/18.415J: Advanced Algorithms
Wednesday, October 7, 2009
David Karger
Problem Set 3 Solutions
Problem 1.
(a) Consider the $(k+1)$st item inserted. Since at most $k$ buckets are occupied, the probability that both candidate locations are occupied is at most $(k/n^{1.5})^2$. Thus, the expected number of times an item is actually inserted into an already-occupied bucket is at most
$$\sum_{k=0}^{n-1} \left(\frac{k}{n^{1.5}}\right)^2 = \frac{(n-1)(n)(2n-1)}{6n^3} \le \frac{1}{3}.$$
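As a sanity check on the arithmetic above, the following sketch evaluates the sum exactly with rational numbers and compares it to the closed form. The function names `collision_bound` and `closed_form` are illustrative and not part of the handout.

```python
from fractions import Fraction


def collision_bound(n):
    """Exact value of sum_{k=0}^{n-1} (k / n^1.5)^2, i.e. (sum of k^2) / n^3."""
    return sum(Fraction(k * k, n ** 3) for k in range(n))


def closed_form(n):
    """The closed form (n - 1) * n * (2n - 1) / (6 n^3) for the same sum."""
    return Fraction((n - 1) * n * (2 * n - 1), 6 * n ** 3)


if __name__ == "__main__":
    for n in (1, 2, 10, 100, 1000):
        s = collision_bound(n)
        # The sum matches the closed form and never exceeds 1/3.
        print(n, s == closed_form(n), s <= Fraction(1, 3))
```

The bound approaches 1/3 from below as $n$ grows, so the constant in the solution cannot be improved by this argument alone.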
Now let’s consider pairwise collisions. Item $k$ collides with item $j < k$ only if (i) one of the candidate locations of item $k$ is the location of item $j$ (this has probability at most $2/n^{1.5}$) and (ii) the other candidate location for item $k$ contains at least one element (probability at most $k/n^{1.5}$). Thus, the probability that $k$ collides with $j$ is at most $2k/n^3$. Summing over the at most $k$ possible values of $j < k$, we find the expected number of collisions for item $k$ is at most $2k^2/n^3$. Summing over all $k$, we get the same result as above, up to a constant factor: $O(1)$ expected collisions.
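The $O(1)$ bound can also be checked empirically. The simulation below is a rough sketch under stated assumptions: each item draws two candidate buckets independently and uniformly from $n^{1.5}$ buckets, and takes the first empty candidate if one exists. The insertion rule and the names `count_occupied_insertions` and `average_occupied_insertions` are assumptions made for illustration, since the problem statement itself is not reproduced in this handout.

```python
import random


def count_occupied_insertions(n, rng):
    """Insert n items with two random candidate buckets each (assumed model).

    Returns how many items found both candidates already occupied.
    """
    m = round(n ** 1.5)          # number of buckets, n^1.5
    occupied = set()
    forced = 0
    for _ in range(n):
        c1, c2 = rng.randrange(m), rng.randrange(m)
        if c1 not in occupied:
            occupied.add(c1)
        elif c2 not in occupied:
            occupied.add(c2)
        else:
            forced += 1          # both candidates were occupied
    return forced


def average_occupied_insertions(n, trials, seed=0):
    """Average the count above over independent trials (seeded for repeatability)."""
    rng = random.Random(seed)
    total = sum(count_occupied_insertions(n, rng) for _ in range(trials))
    return total / trials


if __name__ == "__main__":
    # The analysis predicts an average of at most 1/3 in expectation.
    print(average_occupied_insertions(n=100, trials=2000))
```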
(b) Start with a 2-universal family of hash functions mapping $n$ items to $2n^{1.5}$ locations. Consider any particular set of $n$ items. Consider choosing a random function from the hash family. The probability that item $k$ collides with item $j$ is $1/(2n^{1.5})$ by pairwise independence, implying by the union bound that the probability $k$ collides with any item is at most $1/(2\sqrt{n})$.
Now suppose that we allocate two arrays of size $2n^{1.5}$ and choose a random 2-universal hash function from the family independently for each array. If an item has no collision in at least one of the two arrays, then it will be placed in an empty bucket by the bash function. We need merely analyze the probability that this happens for every item (this would make the bash function perfect).
The probability that item $k$ has a collision in both arrays is at most $(1/(2\sqrt{n}))^2 = 1/(4n)$. It follows that the expected number of items colliding with some other item in both arrays is at most $1/4$. By Markov's inequality, this implies in turn that with probability at least $3/4$, every item is placed in an empty bucket by the (perfect) bash function. This in turn implies that some pair of 2-universal hash functions defines a perfect bash for our set of $n$ items.
Since every set of items gets a perfect bash from this scheme, it follows that the family of pairs of 2-universal functions above is a perfect bash family. Since the 2-universal family has size polynomial in the universe, so does the family of pairs of 2-universal functions.
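The two-array construction can be sketched as follows. This is an illustration under assumptions, not the handout's own code: it uses the standard family $h(x) = ((ax + b) \bmod p) \bmod m$ as a stand-in for the 2-universal family, assumes integer keys smaller than the prime `P`, and the names `sample_hash` and `try_perfect_bash` are hypothetical.

```python
import math
import random
from collections import Counter

P = 2_147_483_647  # the prime 2^31 - 1; keys are assumed to be integers below P


def sample_hash(m, rng):
    """Draw h(x) = ((a*x + b) mod P) mod m, an assumed stand-in 2-universal family."""
    a = rng.randrange(1, P)
    b = rng.randrange(0, P)
    return lambda x: ((a * x + b) % P) % m


def try_perfect_bash(items, rng):
    """Sample one hash function per array and try to place every item alone.

    Returns {item: (array_index, bucket)} if every item is collision-free in
    at least one array, otherwise None.
    """
    n = len(items)
    m = 2 * math.ceil(n ** 1.5)              # each array has about 2 n^1.5 buckets
    h1, h2 = sample_hash(m, rng), sample_hash(m, rng)
    load1 = Counter(h1(x) for x in items)    # bucket loads in array 0
    load2 = Counter(h2(x) for x in items)    # bucket loads in array 1
    placement = {}
    for x in items:
        if load1[h1(x)] == 1:
            placement[x] = (0, h1(x))
        elif load2[h2(x)] == 1:
            placement[x] = (1, h2(x))
        else:
            return None                      # x collides in both arrays
    return placement


if __name__ == "__main__":
    rng = random.Random(0)
    items = rng.sample(range(10 ** 6), 200)
    print(try_perfect_bash(items, rng) is not None)
```

An item is only ever placed in a bucket where it is the sole item hashing there, so a non-`None` result assigns distinct (array, bucket) slots to all items.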
(c) When we sample a hash function from the above 2-universal family, we get a probability of at least $3/4$ of having no collisions. It follows that the expected number of attempts needed to find a collision-free hash function is at most $4/3$, which is less than 2.
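The retry argument is a geometric-distribution calculation, which the short sketch below makes explicit. It assumes each attempt succeeds independently with probability exactly $3/4$ (the solution only guarantees at least $3/4$, so these figures are worst-case); the names `P_SUCCESS`, `expected_attempts`, and `failure_probability` are illustrative.

```python
from fractions import Fraction

P_SUCCESS = Fraction(3, 4)  # assumed per-attempt success probability (a lower bound)


def expected_attempts(p=P_SUCCESS):
    """Mean of a geometric distribution: expected attempts until first success."""
    return 1 / p


def failure_probability(attempts, p=P_SUCCESS):
    """Probability that all of the given independent attempts fail."""
    return (1 - p) ** attempts


if __name__ == "__main__":
    print(expected_attempts())        # 4/3
    print(failure_probability(2))     # 1/16
```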