                              CS 61B: Lecture 22
                           Monday, October 18, 2010

Today's reading:

DICTIONARIES
============
Suppose you have a set of two-letter words and their definitions.  You want to
be able to look up the definition of any word, very quickly.  The two-letter
word is the _key_ that addresses the definition.

Since there are 26 English letters, there are 26 * 26 = 676 possible two-letter
words.  To implement a dictionary, we declare an array of 676 references, all
initially set to null.  To insert a Definition into the dictionary, we define
a function hashCode() that maps each two-letter word (key) to a unique integer
between 0 and 675.  We use this integer as an index into the array, and make
the corresponding bucket (array position) point to the Definition object.

public class Word {
  public static final int LETTERS = 26, WORDS = LETTERS * LETTERS;
  public String word;

  // Map a two-letter word to a unique integer in 0...675.
  // Assumes both letters are lowercase, 'a' through 'z'.
  public int hashCode() {
    return LETTERS * (word.charAt(0) - 'a') + (word.charAt(1) - 'a');
  }
}

public class WordDictionary {
  private Definition[] defTable = new Definition[Word.WORDS];

  // Store d in the bucket addressed by w, replacing any earlier entry.
  public void insert(Word w, Definition d) {
    defTable[w.hashCode()] = d;
  }

  // Return the Definition stored for w, or null if there is none.
  Definition find(Word w) {
    return defTable[w.hashCode()];
  }
}

What if we want to store every English word, not just the two-letter words?
The table "defTable" must be long enough to accommodate
pneumonoultramicroscopicsilicovolcanoconiosis, 45 letters long.  Unfortunately,
declaring an array of length 26^45 is out of the question.  English has fewer
than one million words, so we should be able to do better.

Hash Tables  (the most common implementation of dictionaries)
-----------
Suppose n is the number of keys (words) whose definitions we want to store, and
suppose we use a table of N buckets, where N is perhaps a bit larger than n,
but much smaller than the number of _possible_ keys.  A hash table maps a huge
set of possible keys into N buckets by applying a _compression_function_ to

