HASHING and COLLISION

Hashing
Hash tables are a common technique for storing and searching data; hashing is also used in cryptography.
A hash table is implemented as an array of objects, where the search keys correspond to the array indexes.
Example: Empdata[1000], index = employee ID number
◦ search for the employee with employee number = 500
◦ return: Empdata[500]
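The Empdata example can be sketched in a few lines of Python. This is only an illustration: the record layout (a dict with a "name" field) and the helper names store and search are assumptions, not part of the slides.

```python
# Direct addressing: the employee ID is used as the array index itself.
# The record layout (a dict with "name") is a hypothetical stand-in.
TABLE_SIZE = 1000
empdata = [None] * TABLE_SIZE  # one slot per possible employee ID (0..999)

def store(emp_id, record):
    """Place the record in the slot whose index equals the employee ID."""
    empdata[emp_id] = record

def search(emp_id):
    """Return the record for emp_id, or None if that slot is unused."""
    return empdata[emp_id]

store(500, {"name": "A. Example"})
found = search(500)      # a single array access, no scanning
missing = search(501)    # None: nothing was stored under this ID
```

The lookup costs one array access regardless of how many employees are stored, which is the property hashing tries to keep when keys cannot be used as indexes directly.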

Hashing
The employee example was relatively easy because the employee number is an integer
Problem 1: the possible integer key values might be too large; creating an array big enough to index them directly might be impractical
◦ Need to map large integer values to smaller array indexes
Problem 2: what if the key is a word made of letters of the English alphabet (e.g. last names)?
◦ Need to map names to integers (indexes)
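One possible way to address both problems is sketched below. The table size N = 1000, the function names, and the base-31 polynomial scheme for strings are all assumptions chosen for illustration; the slides do not prescribe a particular mapping.

```python
N = 1000  # hypothetical table size

def index_for_int(key):
    """Problem 1: map an arbitrarily large non-negative integer to 0..N-1."""
    return key % N

def index_for_name(name):
    """Problem 2: map a string to 0..N-1.

    Uses a simple polynomial scheme (base 31) over the character codes,
    reducing modulo N at each step so the running value stays small.
    """
    value = 0
    for ch in name:
        value = (value * 31 + ord(ch)) % N
    return value

big = index_for_int(123456789)   # 789: only the remainder is kept
idx = index_for_name("Smith")    # some index in the range 0..N-1
```

Both functions always return a valid index for an array of N slots, although different keys may end up at the same index.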

Hash Functions and Hash Tables
Hashing has 2 major components
◦ Hash function h
◦ Hash table data structure of size N
A hash function h maps keys (a key is the identifying element of a record) to a hash value, also called a hash key, which refers to a specific location in the hash table
Example: h(x) = x mod N is a hash function for integer keys
The integer h(x) is called the hash value of key x
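The two components can be put together in a minimal sketch. The table size N = 13 and the put/get helpers are assumptions for illustration, and the sketch deliberately ignores what happens when two keys share a hash value.

```python
N = 13                      # hypothetical table size
table = [None] * N          # the hash table: N slots, all empty at first

def h(x):
    """Hash function for integer keys: h(x) = x mod N."""
    return x % N

def put(key, record):
    """Store (key, record) at slot h(key). Collisions are not handled here."""
    table[h(key)] = (key, record)

def get(key):
    """Return the record stored under key, or None if it is absent."""
    entry = table[h(key)]
    if entry is not None and entry[0] == key:
        return entry[1]
    return None

put(27, "record A")     # h(27) = 27 mod 13 = 1
put(18, "record B")     # h(18) = 18 mod 13 = 5
hit = get(27)           # "record A"
miss = get(14)          # h(14) = 1 too, but the stored key is 27, so None
```

Storing the key next to the record lets get tell a genuine match apart from a different key that happens to share the same hash value.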

Hash Table Size
◦ Should be appropriate for the hash function used
◦ Too big will waste memory; too small will increase collisions and may eventually force rehashing (copying into a larger table)
◦ Rule of thumb: the table size should be about twice the size of the data set (2s, where s is the number of items)
◦ For 50,000 words, use a table of 100,000 elements
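The sizing rule and the idea of rehashing can be sketched as follows. The function names are hypothetical, and each slot is a Python list so that keys sharing a slot are kept rather than lost; that choice is an assumption of this sketch, not a scheme named on the slide.

```python
def suggested_table_size(num_items):
    """Rule of thumb: about twice the size of the data set."""
    return 2 * num_items

def rehash(old_table, new_size):
    """Copy every key from old_table into a new table with new_size slots.

    Each key is re-placed with h(x) = x mod new_size, so its slot can change.
    """
    new_table = [[] for _ in range(new_size)]
    for slot in old_table:
        for key in slot:
            new_table[key % new_size].append(key)
    return new_table

size = suggested_table_size(50_000)     # 100000, as in the example

small = [[] for _ in range(4)]
for key in (3, 7, 11, 15):              # with 4 slots, all four keys share slot 3
    small[key % 4].append(key)
bigger = rehash(small, 8)               # with 8 slots they split over slots 3 and 7
```

In the small example, doubling the table from 4 to 8 slots halves the number of keys crowded into one slot, which is the effect rehashing is meant to achieve.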

Example
We design a hash table for a dictionary storing items (SSN, Name), where SSN (social security number) is a nine-digit positive integer
The actual data is not stored in the hash table; each entry pinpoints the location of the actual data or set of data
Our hash table uses an array of size N = 10,000 and the hash function h(x) = last four digits of x
Index   Key stored
0       (empty)
1       025-612-0001
2       981-101-0002
3       (empty)
4       451-229-0004
…
9997    (empty)
9998    200-751-9998
9999    (empty)
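The slot for each key in this example can be checked with a short sketch. It assumes the keys arrive as dashed strings exactly as printed on the slide; the dictionary name slots is hypothetical.

```python
N = 10_000  # table size from the example

def h(ssn):
    """Return the last four digits of a key given as a dashed string."""
    digits = ssn.replace("-", "")
    return int(digits) % N   # x mod 10,000 keeps exactly the last four digits

keys = ["451-229-0004", "981-101-0002", "200-751-9998", "025-612-0001"]
slots = {ssn: h(ssn) for ssn in keys}   # key -> index in the table
```

Taking the key modulo 10,000 is the same as reading off its last four digits, so this hash function is a special case of h(x) = x mod N.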

Hash Function
The mapping of keys into the table is called the hash function
A hash function should ideally:
◦ Distribute keys and entries evenly throughout the table
◦ Be easy and quick to compute
◦ Minimize collisions, where the position given by the hash function is already occupied
◦ Be applicable to all objects
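The effect of the hash function on collisions can be seen by counting them for a fixed set of keys. The helper count_collisions and the sample keys are hypothetical, chosen only to contrast an uneven mapping with an even one.

```python
from collections import Counter

def count_collisions(keys, h):
    """Count keys whose slot was already taken by an earlier key."""
    per_slot = Counter(h(k) for k in keys)
    return sum(n - 1 for n in per_slot.values())

keys = [10, 20, 30, 40, 50, 60]
bad = count_collisions(keys, lambda x: x % 10)   # every key maps to slot 0
good = count_collisions(keys, lambda x: x % 7)   # each key gets its own slot
```

Both functions are equally quick to compute, but for these keys x mod 10 sends everything to one position while x mod 7 spreads the keys over six different positions.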

Hash Table
The simplest kind of hash table is an array of records.
This example has 701 records.
[Figure: an array of records, with indexes [0] to [700]]

What is a Hash Table?
Each record has a special field, called its key.
In this example, the key is a long integer field called Number.
[Figure: the array of records, indexes [0] to [700]; the record at [4] has Number 506643548]

What is a Hash Table?
The number might be a person's identification number, and the rest of the record has information about the person.

What is a Hash Table?
When a hash table is in use, some spots contain
valid records, and other spots are "empty".
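A table with mostly empty spots might look like the sketch below. It assumes that None marks an empty spot and that a record's position is its Number modulo the table size of 701; the slides do not state the placement rule at this point, so treat that as an assumption.

```python
CAPACITY = 701              # number of spots in the example table
EMPTY = None                # marker for a spot holding no valid record
table = [EMPTY] * CAPACITY

def slot_for(number):
    # Assumption: the spot is chosen as Number mod table size.
    return number % CAPACITY

record = {"Number": 506643548}
position = slot_for(record["Number"])   # 506643548 mod 701 = 4
table[position] = record

used = sum(1 for r in table if r is not EMPTY)   # spots with valid records
free = CAPACITY - used                           # spots still "empty"
```

Under this assumption the record with Number 506643548 lands at index 4, and the other 700 spots remain empty.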
