Lecture 24:
Advanced Hashing
PIC 10B
Todd Wittman
Note:
We'll end 10 minutes
early today for evaluations.
Hash Table Choices
• Hash tables are a very powerful data structure, but unlike the other containers we've studied, they have parameters and method choices that must be decided up front.
• How many buckets M should we use?
  - More buckets = more memory, but fewer collisions and faster lookups.
• What hash function should we use?
  - A hash function has 2 basic steps.
  - First, we need to map the key to a (large) integer x.
    - e.g. we could add up the ASCII values of the chars in a string.
  - Second, we need to convert the integer x to an index in the range [0, M-1].
    - Division Method: x % M
    - Multiplication Method: (int) (M * (A*x - (int) (A*x)))
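To make the memory/speed trade-off concrete, here is a minimal sketch that runs both steps (ASCII sum, then the division method) and counts how many keys land in an already occupied bucket. The function names keyToInteger, bucketIndex and countCollisions, and the three sample keys, are assumptions made for illustration; they are not taken from the course program.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Step 1: map the key to an integer by summing its character codes.
unsigned long keyToInteger(const std::string& s) {
    unsigned long x = 0;
    for (std::size_t i = 0; i < s.length(); i++)
        x += static_cast<unsigned char>(s[i]);
    return x;
}

// Step 2 (division method): reduce x to an index in [0, M-1].
// Assumes M >= 1.
std::size_t bucketIndex(const std::string& key, std::size_t M) {
    return keyToInteger(key) % M;
}

// Count the keys that land in a bucket already holding an earlier key.
// Assumes M >= 1.
std::size_t countCollisions(const std::vector<std::string>& keys, std::size_t M) {
    std::vector<bool> occupied(M, false);
    std::size_t collisions = 0;
    for (const std::string& key : keys) {
        std::size_t index = bucketIndex(key, M);
        if (occupied[index])
            collisions++;
        else
            occupied[index] = true;
    }
    return collisions;
}
```

For the sample keys "Yoda", "Luke" and "Leia" (ASCII sums 397, 401 and 379), this gives 2 collisions with M = 1, 1 collision with M = 3, and 0 collisions with M = 5.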
Choosing the Number of Buckets
• To choose the number of buckets, you have to ask:
  - How fast do searches need to be?
  - How much memory do I have available?
• In our example for 14 students at Jedi Academy, it takes a large number of buckets to get down to zero collisions.
• Can you modify our program so that it finds the smallest number of buckets needed for 0 collisions?
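One possible approach to the question above is a brute-force search: try M = n, n+1, n+2, ... and stop at the first M for which every key gets its own bucket. The sketch below assumes the ASCII-sum hash with the division method; the names asciiSum, collisionFree and smallestCollisionFreeM, and the search cap maxM, are hypothetical and not part of the course program. The cap matters because two keys with the same ASCII sum collide for every M.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Map a string to an integer by summing its character codes.
unsigned long asciiSum(const std::string& s) {
    unsigned long x = 0;
    for (std::size_t i = 0; i < s.length(); i++)
        x += static_cast<unsigned char>(s[i]);
    return x;
}

// True if no two keys share a bucket under index = x % M. Assumes M >= 1.
bool collisionFree(const std::vector<std::string>& keys, std::size_t M) {
    std::vector<bool> occupied(M, false);
    for (const std::string& key : keys) {
        std::size_t index = asciiSum(key) % M;
        if (occupied[index])
            return false;
        occupied[index] = true;
    }
    return true;
}

// Smallest M in [max(1, n), maxM] that gives zero collisions, or 0 if
// no such M exists in that range. Starting at n is safe because fewer
// buckets than keys always forces a collision.
std::size_t smallestCollisionFreeM(const std::vector<std::string>& keys,
                                   std::size_t maxM = 100000) {
    std::size_t start = std::max<std::size_t>(1, keys.size());
    for (std::size_t M = start; M <= maxM; M++) {
        if (collisionFree(keys, M))
            return M;
    }
    return 0;
}
```

With the sample keys "Yoda", "Luke" and "Leia" the search returns 5. With "Yoda" and "odaY" it returns 0, since both strings have the same ASCII sum.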
Choosing the Hash Function
• First we need to decide how to convert the key to an integer x.
• Preferably x is a large integer, bigger than the number of buckets M, so we don't map to just one part of our table.
• For strings, we added up the ASCII values of each character.
    for (int i = 0; i < s.length(); i++)
        x += (int) s[i];
• But with this method, the order of the characters does not matter, so permutations map to the same integer.
    H("Yoda") = H("odaY")
• A common solution is to weight the chars. For example, we could fix a constant 1 < B < 2 and compute
    for (int i = 0; i < s.length(); i++)
        x += (int) (pow(B, i) * s[i]);
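A small sketch comparing the two string-to-integer schemes on a permuted key. The function names asciiSum and weightedSum are assumptions for illustration, and B = 1.5 is just one value in the allowed range. Instead of calling pow(B, i) on every pass, the sketch keeps a running product, which yields the same weights B^0, B^1, B^2, ...

```cpp
#include <cstddef>
#include <string>

// Plain sum of character codes: insensitive to character order.
long asciiSum(const std::string& s) {
    long x = 0;
    for (std::size_t i = 0; i < s.length(); i++)
        x += static_cast<unsigned char>(s[i]);
    return x;
}

// Weighted sum: character i is scaled by B^i before being added,
// and each term is truncated to an integer.
long weightedSum(const std::string& s, double B) {
    long x = 0;
    double weight = 1.0;  // B^0
    for (std::size_t i = 0; i < s.length(); i++) {
        x += static_cast<long>(weight * static_cast<unsigned char>(s[i]));
        weight *= B;      // advance to B^(i+1)
    }
    return x;
}
```

With B = 1.5, asciiSum gives 397 for both "Yoda" and "odaY", while weightedSum gives 807 for "Yoda" and 779 for "odaY", so the permutation no longer collides at the integer stage.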
Choosing the Hash Function
• After we have an integer x, we need to restrict it to the range [0, M-1].
• There are 2 popular methods for doing this.
• Division Method: Take x mod the number of buckets.
    index = x % M
• Multiplication Method: Fix a constant 0 < A < 1. Then take the fractional part of A*x and multiply that by M. Then round the result down to the nearest integer.
    index = (int) (M * (A*x - (int) (A*x)))
• The value of A needs to be set experimentally. Knuth recommends:
    A = (sqrt(5) - 1) / 2 ≈ 0.618
• Can you modify our program so that it uses the multiplication method? (Note you are asked to do this for HW9.)
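The two range-reduction methods can be sketched side by side as follows. This is an illustration under stated assumptions, not the HW9 solution: the names divisionIndex, multiplicationIndex and kKnuthA are hypothetical, and the sketch uses std::floor rather than an int cast to take the fractional part, so that a large A*x cannot overflow an int.

```cpp
#include <cmath>
#include <cstddef>

// Knuth's suggested constant, (sqrt(5) - 1) / 2.
const double kKnuthA = 0.6180339887498949;

// Division method: index = x % M. Assumes M >= 1.
std::size_t divisionIndex(unsigned long x, std::size_t M) {
    return x % M;
}

// Multiplication method: take the fractional part of A*x, scale it by M,
// and round down. Assumes 0 < A < 1 and M >= 1.
std::size_t multiplicationIndex(unsigned long x, std::size_t M,
                                double A = kKnuthA) {
    double product = A * static_cast<double>(x);
    double fractional = product - std::floor(product);  // lies in [0, 1)
    return static_cast<std::size_t>(M * fractional);
}
```

For x = 397 and M = 10, the division method gives index 7. With Knuth's A, A*x is about 245.359, so the fractional part is about 0.359 and the multiplication method gives index 3.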
Are Hash Tables Used in Real
This note was uploaded on 04/27/2010 for the course PIC 15705120 taught by Professor Wittman during the Winter '10 term at UCLA.