Assignment 2 : Spanning Tree Algorithms
Note : This assignment is to be done individually. Group working is not allowed. Use the FibHeap Library built in first assignment to perform the work. Aim : To verify the known bounds of various spanning tree algor
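\[
E\Big(\sum_{j=0}^{m-1} n_j^2\Big)
= E\Big(\sum_{j=0}^{m-1} n_j\Big) + 2E\Big(\sum_{j=0}^{m-1} \binom{n_j}{2}\Big)
= n + 2E(\#\text{collisions})
\le n + 2\,\frac{n(n-1)}{2m}
= n + (n-1) < 2n \qquad (m = n)
\]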
This is a rough argument. To make the odds higher and the counting more precise,
it is convenient to use 6n, which also works.
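The counting step behind this argument, n_j^2 = n_j + 2 * C(n_j, 2) summed over all buckets, can be checked numerically. The following is only a sketch: the function names, the random key set, and the hash family ((a*x + b) mod P) mod m are assumptions chosen for illustration, not taken from the slides.

```python
import random
from math import comb

P = 2_147_483_647  # the prime 2^31 - 1, assumed larger than every key

def random_hash(m, rng):
    """Draw h(x) = ((a*x + b) mod P) mod m from a standard universal family."""
    a, b = rng.randrange(1, P), rng.randrange(P)
    return lambda x: ((a * x + b) % P) % m

def bucket_stats(keys, h, m):
    """Return (sum of squared bucket sizes, number of colliding key pairs)."""
    sizes = [0] * m
    for x in keys:
        sizes[h(x)] += 1
    sum_sq = sum(s * s for s in sizes)
    collisions = sum(comb(s, 2) for s in sizes)
    return sum_sq, collisions

rng = random.Random(0)
keys = rng.sample(range(1, 10**6), 50)   # 50 distinct example keys
n = m = len(keys)                        # first-level table of size m = n
trials = [bucket_stats(keys, random_hash(m, rng), m) for _ in range(1000)]
mean_sum_sq = sum(s for s, _ in trials) / len(trials)
print(f"mean sum of n_j^2 = {mean_sum_sq:.1f}, rough bound 2n = {2 * n}")
```

Every trial satisfies sum of n_j^2 = n + 2 * (#collisions) exactly; the printed mean can then be compared with the rough bound 2n.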
What is the space consu
Perfect Hashing (2)
A table of size n^2 makes it easy to find a perfect hash function.
Theorem 1. If we store n keys in a hash table of size m = n^2 using a hash function h randomly chosen from a universal class of hash functions, then the probability of t
Perfect Hashing
The ultimate combination of the ideas presented above leads to perfect hashing.
In (static) perfect hashing we can achieve a worst-case search time of O(1) while using only O(n) space. This is achieved by a clever two-step hashing sche
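\[
\frac{1}{n}\sum_{i=0}^{n-1}\frac{m}{m-i}
= \frac{m}{n}\sum_{i=m-n+1}^{m}\frac{1}{i}
\le \frac{1}{\alpha}\int_{m-n}^{m}\frac{dx}{x}
= \frac{1}{\alpha}\ln\frac{m}{m-n}
= \frac{1}{\alpha}\ln\frac{1}{1-\alpha}
\]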
Hence, if the table is half full, the expected number of probes in a successful search
is at most (1/0.5) ln(1/(1 - 0.5)) = 2 ln 2 ≈ 1.386.
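The bound for a successful search is easy to tabulate for other load factors. A minimal sketch follows; the function name `successful_search_bound` is made up for this example.

```python
import math

def successful_search_bound(alpha):
    """Upper bound (1/alpha) * ln(1/(1 - alpha)), valid for 0 < alpha < 1."""
    if not 0.0 < alpha < 1.0:
        raise ValueError("load factor must satisfy 0 < alpha < 1")
    return (1.0 / alpha) * math.log(1.0 / (1.0 - alpha))

for alpha in (0.25, 0.5, 0.75, 0.9):
    bound = successful_search_bound(alpha)
    print(f"alpha = {alpha}: at most {bound:.3f} probes")
```

For a half-full table this evaluates 2 ln 2, and the bound grows slowly until the load factor gets close to 1.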
\[
\frac{1}{n}\sum_{i=0}^{n-1}\frac{1}{1 - i/m}
\]
Proof. A successful search has the
Theorem. Given an open address hash table with load factor α = n/m < 1, the
expected number of probes in a successful search is at most (1/α) ln(1/(1 - α)), assuming
uniform hashing and assuming that each key in the table is equally likely to be
searched for.
Analysis of open addressing
Hence, if the table is half full, at most 2 probes will be required on average, but if it is 80% full, then on average up to 5 probes are needed.
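These figures follow from the bound 1/(1 - α) for an unsuccessful search. The sketch below compares that bound with a small simulation of uniform hashing, in which the probe sequence visits the slots in a uniformly random order. The function names and the parameters m = 100, n = 50 are assumptions made for illustration.

```python
import random

def unsuccessful_search_bound(alpha):
    """Upper bound 1/(1 - alpha) on the expected number of probes."""
    return 1.0 / (1.0 - alpha)

def simulate_unsuccessful(m, n, trials, rng):
    """Average probes until the first free slot under uniform hashing.

    Visiting the slots in a random order is modelled by shuffling the
    occupancy flags; the probe that finds the free slot is counted too.
    """
    occupied = [True] * n + [False] * (m - n)
    total = 0
    for _ in range(trials):
        rng.shuffle(occupied)
        total += occupied.index(False) + 1
    return total / trials

rng = random.Random(0)
average = simulate_unsuccessful(m=100, n=50, trials=5000, rng=rng)
print(f"simulated: {average:.2f}, bound: {unsuccessful_search_bound(0.5):.2f}")
print(f"bound at 80% full: {unsuccessful_search_bound(0.8):.2f}")
```

The simulated average should land slightly below the bound, since 1/(1 - α) is an upper estimate rather than the exact expectation.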
Corollary. Inserting an item into an open-address hash table with load factor 1 r
Analysis of open addressing (2)
Proof. Define p_i = Pr(exactly i probes access occupied slots) for i = 0, 1, 2, .... (Note that for i > n, p_i = 0.) The expected number of probes is then 1
Now define q_i = Pr(at least i probes access occupied slots), then
Theorem.
Given an open address hash table with load factor α = n/m < 1, the expected number
of probes in an unsuccessful search is at most 1/(1 - α), assuming uniform
hashing.
Analysis of open addressing
(7)
Double hashing is an improvement over linear and quadratic probing in that Θ(m^2)
sequences are used rather than Θ(m), since every (h1(k), h2(k)) pair yields a distinct probe sequence, and the initial probe position, h1(k), and offset h2(k) vary
Perfect Hashing (4)
The hash function used in perfect hashing is of the form h_k(x) = (kx mod p) mod s, where p is a prime. It was introduced and analyzed in the paper of Fredman, Komlos, and Szemeredi in 1984. A proof that it is universal is similar to t
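A two-level table built from this hash function can be sketched as follows. This is an illustrative sketch, not the construction from the paper: the prime P, the acceptance threshold 4n for the first level, the choice of n_j^2 slots for a bucket with n_j keys, and all function names are assumptions, and the keys are assumed to be distinct integers in 1..P-1.

```python
import random

P = 2_147_483_647  # the prime 2^31 - 1; keys are assumed to lie in 1..P-1

def h(k, x, s):
    """The hash function from the text: h_k(x) = (k*x mod p) mod s."""
    return (k * x % P) % s

def build(keys, rng):
    """Return (k, second_level) for a non-empty list of distinct keys."""
    n = len(keys)
    while True:  # level 1: retry k until the buckets are small enough
        k = rng.randrange(1, P)
        buckets = [[] for _ in range(n)]
        for x in keys:
            buckets[h(k, x, n)].append(x)
        if sum(len(b) ** 2 for b in buckets) <= 4 * n:  # assumed threshold
            break
    second_level = []
    for bucket in buckets:
        s = len(bucket) ** 2  # quadratic space per bucket
        while True:  # level 2: retry k_j until the bucket has no collision
            kj = rng.randrange(1, P)
            slots = [None] * s
            for x in bucket:
                i = h(kj, x, s)
                if slots[i] is not None:
                    break  # collision, draw a new k_j
                slots[i] = x
            else:
                break  # every key of the bucket found its own slot
        second_level.append((kj, slots))
    return k, second_level

def contains(table, x):
    """Look up x with two hash evaluations and one comparison."""
    k, second_level = table
    kj, slots = second_level[h(k, x, len(second_level))]
    return bool(slots) and slots[h(kj, x, len(slots))] == x

keys = [10, 22, 37, 40, 52, 60, 70, 72, 75]
table = build(keys, random.Random(1))
print(all(contains(table, x) for x in keys), contains(table, 11))
```

Lookups never probe more than one slot per level, which is where the worst-case O(1) search time comes from; the retries are paid once, at construction time.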
mod m   for i = 0, 1, ..., m - 1,
Quadratic probing is better than linear probing, because it spreads subsequent
probes out from the initial probe position. However, when two keys have the same
initial probe position, their probe sequences are the sa
Thus, runs of occupied slots tend to get longer, and linear probing is not a very good
approximation to uniform hashing.
Clusters are likely to arise, since if an empty slot is preceded by i full slots, then the
probability that the empty slot is
Probe sequences (3)
For example, if we have n = m/2 keys in the table, where every even-indexed slot is occupied and every odd-indexed slot is free, then the average search takes 1.5 probes.
If the first n = m/2 locations are the ones occupied, howev
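The effect of the occupancy pattern can be reproduced with a few lines of code. The sketch below assumes linear probing and an unsuccessful search whose start slot is uniformly distributed over the table; the function name and the table size m = 16 are chosen only for illustration.

```python
def average_unsuccessful_probes(occupied):
    """Average number of probes of linear probing over all m start slots.

    occupied[j] says whether slot j holds a key. A search stops at the
    first free slot, and that final probe is counted as well.
    """
    m = len(occupied)
    if all(occupied):
        raise ValueError("table must contain at least one free slot")
    total = 0
    for start in range(m):
        i = 0
        while occupied[(start + i) % m]:
            i += 1
        total += i + 1
    return total / m

m = 16
alternating = [j % 2 == 0 for j in range(m)]  # every even slot occupied
clustered = [j < m // 2 for j in range(m)]    # first m/2 slots occupied
print(average_unsuccessful_probes(alternating))  # 1.5
print(average_unsuccessful_probes(clustered))    # 3.25
```

Both tables hold the same number of keys, yet the clustered layout more than doubles the average, and the gap widens as m grows.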
mod m   for i = 0, 1, ..., m - 1.
This method is easy to implement but suffers from primary clustering, that is, two
keys that hash to different locations compete with each other for successive
rehashes. Hence, long runs of occupied slots buil
2. quadratic prob
3. double hashing
These techniques guarantee that h(k, 0), h(k, 1), ..., h(k, m - 1) is a permutation
of 0, 1, ..., m - 1 for each k, but none fulfills the assumption of uniform hashing,
since none can generate more than m^2 sequences.
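The permutation property and the limit on the number of distinct probe sequences can be checked by brute force for a small table. In this sketch a probe sequence is described by a start slot and a step, so step 1 models linear probing and an arbitrary non-zero step models double hashing; the function names and the table size m = 7 (a prime) are assumptions for illustration.

```python
from itertools import product

def probe_sequence(start, step, m):
    """Slots visited by (start + i * step) mod m for i = 0, ..., m - 1."""
    return tuple((start + i * step) % m for i in range(m))

def is_permutation(seq, m):
    """True if seq visits every slot 0, ..., m - 1 exactly once."""
    return sorted(seq) == list(range(m))

m = 7  # prime, so every step 1, ..., m - 1 is coprime to m
linear = {probe_sequence(s, 1, m) for s in range(m)}
double = {probe_sequence(s, d, m) for s, d in product(range(m), range(1, m))}

print(len(linear), len(double), m * m)  # 7 42 49
```

Uniform hashing would require all m! permutations to be possible (5040 for m = 7), far more than either scheme can produce.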
Open addressing
(2)
The main problem with open addressing is the deletion of elements. We cannot simply set an element to NIL, since this could break a probe sequence for other elements in the table.
It is possible to use a special purpose marker instead
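One way to realise such a marker is sketched below. The class and the sentinel names are hypothetical, linear probing is assumed as the probe sequence, and duplicate keys are not checked, so this is an illustration of the idea rather than a complete table.

```python
EMPTY = object()    # slot was never used
DELETED = object()  # special marker: slot held a key that was removed

class LinearProbingTable:
    def __init__(self, m):
        self.m = m
        self.slots = [EMPTY] * m

    def _probe(self, key):
        """Yield the probe sequence (h'(k) + i) mod m for i = 0, ..., m - 1."""
        start = hash(key) % self.m
        for i in range(self.m):
            yield (start + i) % self.m

    def insert(self, key):
        for j in self._probe(key):
            if self.slots[j] is EMPTY or self.slots[j] is DELETED:
                self.slots[j] = key  # deleted slots may be reused
                return j
        raise OverflowError("hash table overflow")

    def search(self, key):
        for j in self._probe(key):
            if self.slots[j] is EMPTY:
                return None  # a never-used slot ends the probe sequence
            if self.slots[j] is not DELETED and self.slots[j] == key:
                return j
        return None

    def delete(self, key):
        j = self.search(key)
        if j is not None:
            self.slots[j] = DELETED  # keep probe sequences intact

table = LinearProbingTable(8)
for key in (1, 9, 17):  # all three keys start probing at slot 1
    table.insert(key)
table.delete(9)
print(table.search(17))  # 3, still reachable across the marker
```

Had the slot of key 9 been reset to empty instead, the search for 17 would stop at slot 2 and wrongly report the key as missing.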
To perform an insertio
The hash function is redefined as
h : U × {0, 1, ..., m - 1} → {0, 1, ..., m - 1}
For every key k the probe sequence
h(k, 0), h(k, 1), ..., h(k, m - 1)
is considered. If no free position is found in the sequence, the hash table overflows.
Each pair of keys x and y collides for exactly m^r values of a, once for each possible
value of a_1, a_2, ..., a_r. Hence, out of m^(r+1) combinations of a_0, a_1, a_2, ..., a_r, there
are exactly m^r collisions of x and y, and hence the probability that x and y c
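\[
\sum_{i=0}^{r} a_i x_i \equiv \sum_{i=0}^{r} a_i y_i \pmod{m}
\]
\[
\sum_{i=0}^{r} a_i (x_i - y_i) \bmod m = 0
\]
\[
a_0 (x_0 - y_0) \equiv -\sum_{i=1}^{r} a_i (x_i - y_i) \pmod{m}
\]
Note that m is prime and (x_0 - y_0) is non-zero, hence it has a (unique) multiplicative
inverse modulo m. Multipl
\[
a_0 \equiv \Big(-\sum_{i=1}^{r} a_i (x_i - y_i)\Big)\,(x_0 - y_0)^{-1} \pmod{m}
\]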
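For small parameters the collision count can be verified by enumerating every coefficient vector a. The sketch below assumes keys are given as vectors of r + 1 digits in 0..m-1; the function names and the example values m = 5, r = 2 are illustrative choices.

```python
from itertools import product

def h(a, x, m):
    """h_a(x) = (sum of a_i * x_i) mod m for digit vectors a and x."""
    return sum(ai * xi for ai, xi in zip(a, x)) % m

def colliding_choices(x, y, m):
    """Count the vectors a in {0..m-1}^(r+1) with h_a(x) == h_a(y)."""
    return sum(
        h(a, x, m) == h(a, y, m)
        for a in product(range(m), repeat=len(x))
    )

m, r = 5, 2                  # m prime; keys have r + 1 digits
x, y = (1, 4, 2), (3, 4, 0)  # two distinct keys with x_0 != y_0
count = colliding_choices(x, y, m)
print(count, m ** r, count / m ** (r + 1))  # 25 25 0.2
```

Exactly m^r of the m^(r+1) vectors make x and y collide, so a randomly chosen a produces a collision with probability 1/m.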
Mehlhorn et al. showed that you can also use a simple doubling technique in conjunction with static perfect hashing, such that you can construct a dynamic hash
table that supports insertion, deletion, and lookup in expected amortized time
O(1)
Probe sequences (6)
Double hashing is one of the best open addressing methods, because the permutations produced have many characteristics of randomly chosen permutations. It uses a hash function of the form
h(k, i) = (h1(k) + i · h2(k)) mod m   for i = 0, 1, ..., m - 1,
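A concrete probe sequence makes the role of the two hash functions visible. The auxiliary functions used below, h1(k) = k mod m and h2(k) = 1 + (k mod (m - 2)), and the table size m = 13 are common textbook choices assumed here for illustration; the slides may use different ones.

```python
def h1(k, m):
    """Initial probe position (assumed choice: k mod m)."""
    return k % m

def h2(k, m):
    """Probe offset, never zero (assumed choice: 1 + k mod (m - 2))."""
    return 1 + (k % (m - 2))

def double_hash_sequence(k, m):
    """Probe sequence h(k, i) = (h1(k) + i * h2(k)) mod m, i = 0..m-1."""
    return [(h1(k, m) + i * h2(k, m)) % m for i in range(m)]

m = 13  # prime, so any offset in 1..m-1 visits every slot
print(double_hash_sequence(14, m))
# [1, 5, 9, 0, 4, 8, 12, 3, 7, 11, 2, 6, 10]
print(double_hash_sequence(27, m))
# [1, 7, 0, 6, 12, 5, 11, 4, 10, 3, 9, 2, 8]
```

Keys 14 and 27 share the initial position 1 but use different offsets (4 and 6), so their probe sequences separate immediately, which is what linear and quadratic probing cannot achieve.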