Lecture 15 - Introduction to Hashing

Copyright @ 2009 Ananda Gunawardena

Lecture 15
Introduction to Hashing

Why Hashing?

The Internet has grown to millions of users generating terabytes of content every day. According to internet data tracking services, the amount of content on the internet doubles every six months. With this kind of growth, it becomes impractical to find anything on the internet unless we develop new data structures and algorithms for storing and accessing data.

So what is wrong with traditional data structures like arrays and linked lists? Suppose we have a very large data set stored in an array. The time required to look up an element is O(log n) if the array is sorted, because a technique such as binary search can be used, and O(n) otherwise, because the array must be searched linearly. Either cost may be undesirable if we need to process a very large data set. Therefore we discuss a new technique called hashing that allows us to update and retrieve any entry in constant time, O(1). Constant time, or O(1), performance means that the time to perform the operation does not depend on the data size n.

The Map Data Structure

In a mathematical sense, a map is a relation between two sets. We can define a map M as a set of pairs of the form (key, value), where, given a key, we can find the corresponding value using some kind of “function” that maps keys to values. In a hash table, the location at which a given key is stored is calculated from the key by a function called a hash function.

In its simplest form, we can think of an array as a map where the key is the index and the value is the element at that index. For example, given an array A, if i is the key, then we can find the value by simply looking up A[i]. A hash table generalizes this idea of an array: the key does not have to be an integer. We can have a name as a key, or for that matter any object.
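The three lookup costs discussed so far can be compared in a short Python sketch. This is only an illustration, not part of the original notes: the function names and the sample data are made up, and Python's built-in dict stands in for a hash table.

```python
from bisect import bisect_left

def linear_search(items, target):
    """O(n): examine each element in turn; works on unsorted data."""
    for i, item in enumerate(items):
        if item == target:
            return i
    return -1

def binary_search(sorted_items, target):
    """O(log n): narrow the range by halves; requires sorted data."""
    i = bisect_left(sorted_items, target)
    if i < len(sorted_items) and sorted_items[i] == target:
        return i
    return -1

# Hypothetical sample data for the sketch.
names = ["dana", "bob", "carol", "alice"]   # unsorted array
sorted_names = sorted(names)                # alice, bob, carol, dana
ages = {"alice": 30, "bob": 25}             # dict: keys are names, not indices

print(linear_search(names, "carol"))        # 2
print(binary_search(sorted_names, "carol")) # 2
print(ages["bob"])                          # 25
```

The dict lookup uses the name itself as the key, which is the generalization of A[i] described above.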
The trick is to find a hash function that computes an index from the key, so that the object can be stored at a specific location in a table and easily found again.

Example: Suppose we have a set of strings {“abc”, “def”, “ghi”} that we’d like to store in a table. Our objective is to find or update them quickly, ideally in O(1) time. We are not concerned about maintaining any order among them. Let us think of a simple scheme to do this. Suppose we assign “a” = 1, “b” = 2, … etc. to all alphabetical characters. We can then compute a number for each string by summing the values of its characters:

“abc” = 1 + 2 + 3 = 6
“def” = 4 + 5 + 6 = 15
“ghi” = 7 + 8 + 9 = 24

Now if we assume that we have a table of size 5 to store these strings, we can compute the location of each string by taking its sum mod 5. So we will store “abc” at 6 mod 5 = 1, “def” at 15 mod 5 = 0, and “ghi” at 24 mod 5 = 4.
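The letter-sum scheme in this example can be written as a small Python function. This is a minimal sketch: the name letter_sum_hash is made up for illustration, and the code assumes keys contain only the lowercase letters a to z.

```python
def letter_sum_hash(key, table_size):
    """Sum the letter positions (a=1, b=2, ..., z=26), then take the
    remainder modulo the table size to get an index into the table.
    Assumes key consists only of lowercase letters a-z."""
    total = sum(ord(ch) - ord("a") + 1 for ch in key)
    return total % table_size

for s in ["abc", "def", "ghi"]:
    print(s, letter_sum_hash(s, 5))
# abc 1
# def 0
# ghi 4
```

The printed indices match the hand calculation: 6 mod 5 = 1, 15 mod 5 = 0, and 24 mod 5 = 4.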
The table then holds the strings in locations 1, 0 and 4 as follows.

Index:   0       1       2        3        4
Value:   “def”   “abc”   (empty)  (empty)  “ghi”
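Filling the table and looking a string up again might be sketched as follows. The names TABLE_SIZE, letter_sum_hash and contains are made up for this illustration, keys are assumed to be lowercase letters only, and the sketch makes no attempt to handle two keys that map to the same index.

```python
TABLE_SIZE = 5  # table size used in the example above

def letter_sum_hash(key, table_size=TABLE_SIZE):
    """Letter positions (a=1, ..., z=26) summed, modulo the table size."""
    return sum(ord(ch) - ord("a") + 1 for ch in key) % table_size

# Store each string at the index its hash selects.
# Assumption: no two keys share an index; a key such as "cab" would
# land on the same slot as "abc" and overwrite it in this sketch.
table = [None] * TABLE_SIZE
for s in ["abc", "def", "ghi"]:
    table[letter_sum_hash(s)] = s

def contains(table, key):
    """Check a single slot: one hash computation and one comparison."""
    return table[letter_sum_hash(key, len(table))] == key

print(table)                   # ['def', 'abc', None, None, 'ghi']
print(contains(table, "def"))  # True
print(contains(table, "xyz"))  # False
```

A lookup inspects only one slot regardless of how many strings are stored, which is where the O(1) behavior comes from.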
This note was uploaded on 11/27/2009 for the course CS 123 taught by Professor Bajkzek during the Fall '08 term at Carnegie Mellon.