hashing - Hashing and Hash Tables Hash Tables Many...

Hashing and Hash Tables

Hash Tables

Many applications require a dynamic set that supports only the dictionary operations INSERT, SEARCH, and DELETE. Example: a symbol table.

A hash table is effective for implementing a dictionary. The expected time to search for an element in a hash table is O(1), under some reasonable assumptions. The worst-case search time, however, is Θ(n).

A hash table is a generalization of an ordinary array. With an ordinary array, we store the element whose key is k in position k of the array. Given a key k, we find the element whose key is k by just looking in the kth position of the array. This is called direct addressing. Direct addressing is applicable when we can afford to allocate an array with one position for every possible key.

We use a hash table when we do not want to (or cannot) allocate an array with one position per possible key, typically when the number of keys actually stored is small relative to the number of possible keys. A hash table is an array, but it typically uses a size proportional to the number of keys to be stored (rather than the number of possible keys). Given a key k, we don't just use k as the index into the array. Instead, we compute a function of k, called the hash function, and use that value to index into the array.

Direct-Address Tables

Scenario: Maintain a dynamic set. Each element has a key drawn from a universe U = {0, 1, ..., m-1}, where m isn't too large. No two elements have the same key.

Represent the set by a direct-address table, or array, T[0..m-1]. Each slot, or position, corresponds to a key in U. If there's an element x with key k, then T[k] contains a pointer to x. Otherwise, T[k] is empty, represented by NIL.
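The direct-address table described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the names Element and DirectAddressTable are hypothetical, None stands in for NIL, and a Python list plays the role of the array T[0..m-1]. Each dictionary operation reduces to a single array access.

```python
from dataclasses import dataclass


@dataclass
class Element:
    """Hypothetical element type: an integer key plus some satellite data."""
    key: int
    data: str


class DirectAddressTable:
    """Array T[0..m-1] with one slot per possible key in U = {0, ..., m-1}."""

    def __init__(self, m):
        # None plays the role of NIL: the slot holds no element.
        self.slots = [None] * m

    def search(self, k):
        # Return the element with key k, or None if the slot is empty.
        return self.slots[k]

    def insert(self, x):
        # Store a reference to x in the slot given by its own key.
        self.slots[x.key] = x

    def delete(self, x):
        # Empty the slot that x occupies.
        self.slots[x.key] = None


table = DirectAddressTable(8)          # universe U = {0, ..., 7}
table.insert(Element(3, "x"))
table.insert(Element(5, "count"))
print(table.search(3))                 # Element(key=3, data='x')
print(table.search(4))                 # None
```

Note that the list has m slots no matter how many elements are stored, which is exactly the cost that becomes a problem when the universe of keys is large.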
Dictionary operations are trivial and take O(1) time each:

    DIRECT-ADDRESS-SEARCH(T, k)
        return T[k]

    DIRECT-ADDRESS-INSERT(T, x)
        T[key[x]] ← x

    DIRECT-ADDRESS-DELETE(T, x)
        T[key[x]] ← NIL

The problem with direct addressing: if the universe U is large, storing a table of size |U| may be impractical or impossible. Often, the set K of keys actually stored is small compared to U, so that most of the space allocated for T is wasted.

When |K| << |U|, a hash table needs much less space than a direct-address table. It can reduce the storage requirement to Θ(|K|) while still providing O(1) search time, but in the average case, not the worst case.

Idea: Instead of storing an element with key k in slot k, use a function h and store the element in slot h(k). We call h a hash function. h : U → {0, 1, ..., m-1}, so that h(k) is a legal slot number in T. We say that k hashes to slot h(k).

Collisions: A collision occurs when two or more keys hash to the same slot. Collisions can happen when there are more possible keys than slots (|U| > m).
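To make the idea of a hash function and a collision concrete, here is a small Python sketch. The notes above do not say which function h to use, so the choice h(k) = k mod m (often called the division method) is an assumption made purely for illustration, as are the table size M = 8 and the sample keys.

```python
M = 8  # number of slots m (hypothetical value chosen for this example)


def h(k, m=M):
    """Assumed hash function: map a non-negative integer key to a slot in 0..m-1."""
    return k % m


def group_by_slot(keys, m=M):
    """Return a dict mapping each used slot to the list of keys that hash to it."""
    slots = {}
    for k in keys:
        slots.setdefault(h(k, m), []).append(k)
    return slots


keys = [3, 11, 1000003, 42]
slots = group_by_slot(keys)

# Slots that received more than one key are collisions.
colliding = {slot: ks for slot, ks in slots.items() if len(ks) > 1}

print(slots[2])     # [42]
print(colliding)    # {3: [3, 11, 1000003]}
```

Every key lands in a legal slot of an 8-slot table even though the keys range up to about a million, but 3, 11, and 1000003 all hash to slot 3. A complete hash table needs some strategy for resolving such collisions.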

This document was uploaded on 06/12/2011.

