COP 3503 – Computer Science II – Spring 2000 CLASS NOTES

DAY #24
Hashed Tables
• Hash tables (files) rely on hashing to perform insertion, deletion, and retrieval
in constant time.
• A hashing function is a mapping from a key value to a location in the table.
The location is typically produced as an offset from the first position in the
table.
• The key value is the data field on which retrieval, insertion, and deletion are
based. For an item to be retrieved from, inserted into, or deleted from the file,
its key value must be specified.
• If the hashing function is a 1:1 function from key value to location, then any
operation (such as retrieval) can be done in constant time. If the hashing
function is not 1:1 but M:1, then it is possible for more than one key value to
map to the same location in the hash table. This is called a collision.
• Typically, some restriction is placed on the key values that may be used.
Without any restriction, the set of possible key values is infinite. [The domain
is infinite and the range will normally be finite.]
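To make the idea of an M:1 mapping concrete, here is a minimal sketch in Java. The class name `CollisionDemo`, the function `toyHash`, and the table size of 10 are assumptions chosen for illustration and do not come from the notes; the function simply sums the character codes of the key and takes the remainder.

```java
class CollisionDemo {
    // Hypothetical toy hashing function: sums the character codes of the key
    // and reduces the sum to an offset in the range 0 .. tableSize - 1.
    static int toyHash(String key, int tableSize) {
        int sum = 0;
        for (int i = 0; i < key.length(); i++) {
            sum += key.charAt(i);
        }
        return sum % tableSize;
    }

    public static void main(String[] args) {
        int tableSize = 10; // assumed table size
        // "ab" and "ba" contain the same characters, so their sums are equal
        // and both keys map to the same location: a collision.
        System.out.println(toyHash("ab", tableSize));
        System.out.println(toyHash("ba", tableSize));
    }
}
```

Because many different strings share the same character sum, this function is M:1, and both calls print the same offset.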
Hashing Methods
Method #1
Convert the key value (a string, let’s say) into an integer by summing, for each
character, the product of the character (its ASCII or Unicode value) and some
number (say 128) raised to the power of the character's position (counted from
the right, starting at 0).
Note: 128 is used because the standard ASCII character set (not extended ASCII
or Unicode) requires 7 bits to encode. Since $2^7 = 128$, we can encode 128
different characters using the integers 0 through 127.
Example:
$$\text{CSII} = (C \times 128^3) + (S \times 128^2) + (I \times 128^1) + (I \times 128^0)$$
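As a worked instance of this example, assume the ASCII codes C = 67, S = 83, and I = 73 (the notes do not fix an encoding for this example, so the choice of ASCII is an assumption). The sum then works out as:

```latex
\begin{aligned}
\text{CSII} &= 67 \times 128^3 + 83 \times 128^2 + 73 \times 128^1 + 73 \times 128^0 \\
            &= 140{,}509{,}184 + 1{,}359{,}872 + 9{,}344 + 73 \\
            &= 141{,}878{,}473
\end{aligned}
```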
This method has a serious potential for overflow! For example, using ASCII codes,
the value for the word “junk” is 224,229,227! A long string will generate a huge
number. Also note that this technique is not a 1:1 mapping, so collisions will be
possible. Collision resolution will be discussed later.
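The overflow risk can be checked with a short sketch of Method #1 in Java. The class and method names (`PowerHashDemo`, `powerHash`) are hypothetical, and a 64-bit `long` is used only so that the value for “junk” can be printed exactly; this is an illustration under those assumptions, not code from the notes.

```java
class PowerHashDemo {
    // Direct evaluation of Method #1: each character code is multiplied by
    // 128 raised to the character's position, counted from the right.
    static long powerHash(String key) {
        long total = 0;
        for (int i = 0; i < key.length(); i++) {
            int exponent = key.length() - 1 - i;
            long power = 1;
            for (int j = 0; j < exponent; j++) {
                power *= 128; // builds 128^exponent explicitly
            }
            total += key.charAt(i) * power;
        }
        return total;
    }

    public static void main(String[] args) {
        // Matches the value quoted in the notes for "junk".
        System.out.println(powerHash("junk")); // 224229227
        // One more character already exceeds the range of a 32-bit int.
        System.out.println(powerHash("junky") > Integer.MAX_VALUE); // true
    }
}
```

With 7-bit character codes, even a `long` overflows once the key is longer than nine characters, so a wider integer type only postpones the problem.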
To avoid calculating a number such as $128^i$ directly, a technique that applies
to general polynomials (commonly known as Horner's rule) can be used.
A general polynomial:
$$A_3X^3 + A_2X^2 + A_1X^1 + A_0X^0$$
can be evaluated as:
$$(((A_3)X + A_2)X + A_1)X + A_0$$
This has three distinct advantages over the earlier method:
(1) large intermediate results, which may overflow, are deferred until the end
of the calculation,
(2) only three multiplications and three additions are required to evaluate the
polynomial, and
(3) the entire calculation proceeds from left to right (exponentiation is
evaluated from right to left).
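The nested evaluation can be sketched in Java as follows. The names `HornerDemo` and `horner` are hypothetical, and the method assumes a non-empty coefficient array listed from the highest power down to the constant term.

```java
class HornerDemo {
    // Evaluates coeffs[0]*x^(n-1) + ... + coeffs[n-1]*x^0 by nested
    // multiplication. Assumes coeffs is non-empty and ordered from the
    // highest power to the lowest.
    static long horner(long[] coeffs, long x) {
        long result = coeffs[0];
        for (int i = 1; i < coeffs.length; i++) {
            // One multiplication and one addition per remaining coefficient.
            result = result * x + coeffs[i];
        }
        return result;
    }

    public static void main(String[] args) {
        // The character codes of "junk" act as the coefficients A3..A0.
        long[] junk = {'j', 'u', 'n', 'k'};
        System.out.println(horner(junk, 128)); // 224229227
    }
}
```

For four coefficients the loop runs three times, which gives the three multiplications and three additions mentioned above, and no power of 128 is ever computed explicitly.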
A better solution is:
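public int hash1 (String s, int tablesize)
{
    int HashVal = 0;
    // Nested (left-to-right) polynomial evaluation with base 128.
    // Reducing modulo tablesize at every step keeps HashVal below tablesize,
    // so the intermediate value stays small for reasonable table sizes.
    for (int i = 0; i < s.length(); i++)
        HashVal = (HashVal * 128 + s.charAt(i)) % tablesize;
    return HashVal;
}// end hash1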
Note that the only improvement in this solution compared to the previous one is
that modulo arithmetic is applied at every step. However, modulo operations are
expensive, and here one is performed for every character of the key.
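A short usage sketch of hash1 follows. A static copy of the method is repeated so the block is self-contained; the class name `Hash1Demo`, the table size of 101, and the sample keys are assumptions chosen for illustration.

```java
class Hash1Demo {
    // Static copy of hash1 from the notes, repeated here so that the
    // example can be compiled and run on its own.
    static int hash1(String s, int tablesize) {
        int hashVal = 0;
        for (int i = 0; i < s.length(); i++) {
            hashVal = (hashVal * 128 + s.charAt(i)) % tablesize;
        }
        return hashVal;
    }

    public static void main(String[] args) {
        int tablesize = 101; // assumed table size
        // Reducing at every step gives the same answer as reducing the
        // full polynomial value once at the end.
        System.out.println(hash1("junk", tablesize)); // 36
        System.out.println(224229227L % tablesize);   // 36
        // A key far too long for the direct method still hashes safely.
        System.out.println(hash1("a much longer key value", tablesize));
    }
}
```

The result always lies in the range 0 to tablesize - 1, so it can be used directly as an offset into the table.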
An even better solution: