
Lecture 24: Advanced Hashing
PIC 10B
Todd Wittman

Note: We'll end 10 minutes early today for evaluations.

Hash Table Choices
- Hash tables are a very powerful data structure, but unlike the other containers we've studied, they have parameters and method choices that need to be determined.
- How many buckets M should we use?
    - More buckets = More memory = Faster run-time
- What hash function should we use?
    - A hash function has 2 basic steps.
    - First we need to map the key to a (large) integer x.
        - e.g. we could add up the ASCII values of the chars in a string
    - Second we need to convert the integer x to an index in the range [0,M-1].
        - Division Method: x % M
        - Multiplication Method: (int) (M * (A*x - (int) (A*x)))
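The two steps above, combined with the division method, can be sketched as follows. This is only an illustrative sketch, not the program from lecture: the function names keyToInt, divisionIndex, and countCollisions are hypothetical, and a "collision" is assumed here to mean a key landing in a bucket that is already occupied.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Step 1: map the key to an integer x by adding up the character codes.
unsigned long keyToInt(const std::string& s) {
    unsigned long x = 0;
    for (std::size_t i = 0; i < s.length(); i++)
        x += static_cast<unsigned char>(s[i]);
    return x;
}

// Step 2 (division method): reduce x to an index in the range [0, M-1].
std::size_t divisionIndex(unsigned long x, std::size_t M) {
    return x % M;
}

// Count collisions for a given number of buckets M (assumed M > 0).
// Each key that lands in an already occupied bucket counts as one collision.
int countCollisions(const std::vector<std::string>& keys, std::size_t M) {
    std::vector<int> bucketSize(M, 0);
    int collisions = 0;
    for (const std::string& k : keys) {
        std::size_t idx = divisionIndex(keyToInt(k), M);
        if (bucketSize[idx] > 0)
            collisions++;
        bucketSize[idx]++;
    }
    return collisions;
}
```

Calling countCollisions on the same list of keys with increasing values of M is one way to see the memory versus run-time tradeoff directly.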

Choosing the Number of Buckets
- To choose the number of buckets, you have to ask:
    - How fast do searches need to be?
    - How much memory do I have available?
- In our example for 14 students at Jedi Academy, it takes a large number of buckets to get down to zero collisions.
- Can you modify our program so that it finds the smallest number of buckets needed for 0 collisions?

Choosing the Hash Function
- First we need to decide how to convert the key to an integer x.
- Preferably x is a large integer, bigger than the # of buckets M, so we don't map to just one part of our table.
- For strings, we added up the ASCII values of each character.
      for (int i = 0; i < s.length(); i++)
          x += (int) s[i];
- But with this method, the order of the characters does not matter, so permutations map to the same integer.
      H("Yoda") = H("odaY")
- A common solution is to weight the chars. For example, we could fix a constant 1 < B < 2 and compute
      for (int i = 0; i < s.length(); i++)
          x += (int) (pow(B, i) * s[i]);
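A minimal sketch of the weighted sum is shown below, to make the effect on permutations concrete. The function name weightedKeyToInt and the default B = 1.5 are assumptions chosen for illustration; any constant with 1 < B < 2 fits the description above.

```cpp
#include <cmath>
#include <cstddef>
#include <string>

// Weighted sum: character i is scaled by B^i, so its position matters.
// B = 1.5 is an arbitrary illustrative value in the range 1 < B < 2.
unsigned long weightedKeyToInt(const std::string& s, double B = 1.5) {
    unsigned long x = 0;
    for (std::size_t i = 0; i < s.length(); i++) {
        double weight = std::pow(B, static_cast<double>(i));
        // Cast after multiplying: casting the weight alone would truncate
        // it to an integer and throw away most of the weighting.
        x += static_cast<unsigned long>(weight * static_cast<unsigned char>(s[i]));
    }
    return x;
}
```

With this weighting, weightedKeyToInt("Yoda") and weightedKeyToInt("odaY") are no longer equal, whereas the plain ASCII sum gives the same value for both.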
Choosing the Hash Function
- After we have an integer x, we need to restrict it to the range [0,M-1].
- There are 2 popular methods for doing this.
- Division Method: Take x mod the # of buckets.
      index = x % M
- Multiplication Method: Fix a constant 0 < A < 1. Then take the fractional part of A*x and multiply that by M. Then round the result down to the nearest integer.
      index = (int) (M * (A*x - (int) (A*x)))
- The value of A needs to be set experimentally. Knuth recommends:
      $A = \frac{\sqrt{5} - 1}{2} \approx 0.618$
- Can you modify our program so that it uses the multiplication method? (Note you are asked to do this for HW9.)

Are Hash Tables Used in Real
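Returning to the multiplication method described on this slide, one possible sketch is given below. The function name multiplicationIndex is hypothetical, the default A is the value (sqrt(5) - 1) / 2 attributed to Knuth above, and the final clamp is an added safeguard rather than part of the formula.

```cpp
#include <cmath>
#include <cstddef>

// Multiplication method: index = floor(M * frac(A * x)), with 0 < A < 1.
// Assumes M > 0. The default A is (sqrt(5) - 1) / 2, roughly 0.618.
std::size_t multiplicationIndex(unsigned long x, std::size_t M,
                                double A = (std::sqrt(5.0) - 1.0) / 2.0) {
    double product = A * static_cast<double>(x);
    double frac = product - std::floor(product);  // fractional part, in [0, 1)
    std::size_t idx = static_cast<std::size_t>(static_cast<double>(M) * frac);
    // Guard against floating-point rounding pushing the index up to M.
    if (idx >= M)
        idx = M - 1;
    return idx;
}
```

For example, with A = 0.5, x = 3, and M = 10, the product is 1.5, its fractional part is 0.5, and the index is 5.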
