Lec23 - Advanced Hashing


Lecture 23: Advanced Hashing
PIC 10B, Todd Wittman

Note: We'll end 10 minutes early today for evaluations.

Hash Table Choices
- Hash tables are a very powerful data structure, but unlike the other containers we've studied, they have parameters and method choices that need to be determined.
- How many buckets M should we use?
    - More buckets = more memory, but fewer collisions and so a faster run-time.
- What hash function should we use?
    - A hash function has 2 basic steps.
    - First we need to map the key to a (large) integer x.
        - e.g. we could add up the ASCII values of the chars in a string.
    - Second we need to convert the integer x to an index in the range [0,M-1].
        - Division Method: x % M
        - Multiplication Method: (int) (M * (A*x - (int)(A*x)))
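To make the trade-off between bucket count and collisions concrete, here is a minimal C++ sketch of the two steps, using the ASCII sum for step one and the division method for step two. The function names (keyToInt, hashIndex, countCollisions, smallestCollisionFreeM) and the search cap maxM are hypothetical choices for illustration; they are not taken from the course program.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Step 1: map the key to an integer by adding up its character codes.
unsigned long keyToInt(const std::string& s) {
    unsigned long x = 0;
    for (std::size_t i = 0; i < s.length(); i++)
        x += static_cast<unsigned char>(s[i]);
    return x;
}

// Step 2 (division method): restrict x to an index in [0, M-1].
std::size_t hashIndex(const std::string& s, std::size_t M) {
    assert(M > 0);  // a table needs at least one bucket
    return keyToInt(s) % M;
}

// Count the keys that land in a bucket an earlier key already occupies.
std::size_t countCollisions(const std::vector<std::string>& keys, std::size_t M) {
    std::vector<bool> used(M, false);
    std::size_t collisions = 0;
    for (const std::string& k : keys) {
        std::size_t idx = hashIndex(k, M);
        if (used[idx]) collisions++;
        else used[idx] = true;
    }
    return collisions;
}

// Smallest M in [keys.size(), maxM] with zero collisions, or 0 if none exists.
// The cap matters: two keys with the same integer x collide for every M.
std::size_t smallestCollisionFreeM(const std::vector<std::string>& keys, std::size_t maxM) {
    std::size_t start = keys.empty() ? 1 : keys.size();  // fewer buckets than keys must collide
    for (std::size_t M = start; M <= maxM; M++)
        if (countCollisions(keys, M) == 0) return M;
    return 0;
}
```

For example, with the illustrative keys "Yoda", "Luke", "Leia", and "Han" (not the lecture's 14 students), countCollisions reports 2 collisions for M = 4, and smallestCollisionFreeM returns 7.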

Choosing the Number of Buckets
- To choose the number of buckets, you have to ask:
    - How fast do searches need to be?
    - How much memory do I have available?
- In our example for 14 students at Jedi Academy, it takes a large number of buckets to get down to zero collisions.
- Can you modify our program so that it finds the smallest number of buckets needed for 0 collisions?

Choosing the Hash Function
- First we need to decide how to convert the key to an integer x.
- Preferably x is a large integer, bigger than the number of buckets M, so we don't map to just one part of our table.
- For strings, we added up the ASCII values of each character.
      for (int i = 0; i < s.length(); i++)
          x += (int) s[i];
- But with this method, the order of the characters does not matter, so permutations map to the same integer.
      H("Yoda") = H("odaY")
- A common solution is to weight the chars. For example, we could fix a constant 1<B<2 and compute
      for (int i = 0; i < s.length(); i++)
          x += (int) (pow(B,i) * s[i]);
  Note that the cast applies to the whole product. Casting pow(B,i) alone would truncate the weight to an integer before multiplying.

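The permutation problem and the weighting fix can be sketched side by side. This is a minimal illustration, assuming B = 1.5 as an arbitrary constant in the range 1<B<2; the names plainSum and weightedSum are hypothetical.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <string>

// Plain sum of character codes: the order of the characters does not matter.
long plainSum(const std::string& s) {
    long x = 0;
    for (std::size_t i = 0; i < s.length(); i++)
        x += static_cast<unsigned char>(s[i]);
    return x;
}

// Weighted sum: character i is scaled by B^i, and the whole product
// is truncated to an integer before being added to x.
long weightedSum(const std::string& s, double B) {
    assert(B > 1.0 && B < 2.0);  // range suggested in the lecture
    long x = 0;
    for (std::size_t i = 0; i < s.length(); i++) {
        double weight = std::pow(B, static_cast<double>(i));
        x += static_cast<long>(weight * static_cast<unsigned char>(s[i]));
    }
    return x;
}
```

With these definitions, plainSum gives 397 for both "Yoda" and "odaY", while weightedSum with B = 1.5 gives different values for the two strings. Weighting makes permutation collisions less likely; it does not rule out collisions between arbitrary keys.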
Choosing the Hash Function (continued)
- After we have an integer x, we need to restrict it to the range [0,M-1].
- There are 2 popular methods for doing this.
- Division Method: Take x mod the number of buckets.
      index = x % M
- Multiplication Method: Fix a constant 0<A<1. Take the fractional part of A*x and multiply it by M, then round the result down to the nearest integer.
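Both methods can be written as short functions. The sketch below is one possible rendering, not the course's code: the names divisionIndex and multiplicationIndex are hypothetical, and the value of A is left to the caller.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>

// Division method: index = x mod M.
std::size_t divisionIndex(unsigned long x, std::size_t M) {
    assert(M > 0);
    return x % M;
}

// Multiplication method: take the fractional part of A*x, scale it by M,
// and round down. The result lies in [0, M-1].
std::size_t multiplicationIndex(unsigned long x, std::size_t M, double A) {
    assert(M > 0 && A > 0.0 && A < 1.0);
    double product = A * static_cast<double>(x);
    double frac = product - std::floor(product);  // fractional part, in [0, 1)
    return static_cast<std::size_t>(static_cast<double>(M) * frac);
}
```

For x = 397 and M = 10, the division method gives index 7. The multiplication method with A = 0.5 gives 5, since A*x = 198.5 has fractional part 0.5. A commonly cited choice for A is (sqrt(5) - 1) / 2, roughly 0.618, though any constant in the range 0<A<1 fits the description above.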


