Reducing the Retrieval Time of Scatter Storage Techniques
Richard P. Brent,
IBM Thomas J. Watson Research Center
A new method for entering and retrieving
information in a hash table is described. The method is
intended to be efficient if most entries are looked up several
times. The expected number of probes to look up an entry,
predicted theoretically and verified by Monte Carlo
experiments, is considerably less than for other comparable
methods if the table is nearly full. An example of a possible
Fortran implementation is given.
Key Words and Phrases: address calculation, content
addressing, file searching, hash addressing, hash code,
linear probing, linear quotient method, scatter storage,
searching, symbol table
CR Categories: 3.7, 3.73, 3.74, 4.1, 4.9
Scatter storage (hash coding) techniques are used to
minimize the time required to enter and retrieve information
in tables. Rather similar techniques can be used
for internal tables, such as the symbol tables of compilers
and assemblers, and for large files which are stored
on random-access devices such as disks or drums.
Some of these techniques are described in an excellent
survey paper and more recently in [1, 2, and 6].
Copyright © 1973, Association for Computing Machinery, Inc.
General permission to republish, but not for profit, all or part
of this material is granted, provided that reference is made to this
publication, to its date of issue, and to the fact that reprinting
privileges were granted by permission of the Association for
Computing Machinery.
Author's present address: Computer Centre, Australian National
University, P.O. Box 4, Canberra, ACT 2600, Australia.
Our aim is to describe a method for entering information
so that subsequent retrievals are very efficient.
Suppose that each item consists of an identifying name or key $k$,
which may be regarded as an integer, and an associated value.
If $m$ keys $k_1, \ldots, k_m$ are stored at addresses
$a(k_1), \ldots, a(k_m)$ in a table $T$ of length $n \ge m$
(i.e. $T(a(k_i)) = k_i$ for $i = 1, \ldots, m$) and a
key $k$ is given, the problem is to determine efficiently
whether $k$ is in $T$, and if so, to find its address $a(k)$. In order to
compare the efficiency of different algorithms, we count
the number of fetches of elements of $T$, i.e. probes.
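
The problem statement above can be made concrete with a minimal sketch. It is not the method of this paper: it assumes plain linear probing (one of the comparison methods named in the key words), a hypothetical hash function `home_address` that reduces the key modulo the table length, and `None` as the marker for an empty slot. The point is only to show what is being counted: each fetch of an element of the table is one probe.

```python
from typing import List, Optional, Tuple

EMPTY = None  # marker for an unused slot (an assumption of this sketch)


def home_address(key: int, n: int) -> int:
    """Hypothetical hash function: the key modulo the table length n."""
    return key % n


def insert(table: List[Optional[int]], key: int) -> int:
    """Store key by linear probing and return its address a(key)."""
    n = len(table)
    start = home_address(key, n)
    for step in range(n):
        addr = (start + step) % n
        if table[addr] is EMPTY or table[addr] == key:
            table[addr] = key
            return addr
    raise OverflowError("table is full")


def lookup(table: List[Optional[int]], key: int) -> Tuple[Optional[int], int]:
    """Return (address or None, probes), counting one probe per fetch."""
    n = len(table)
    start = home_address(key, n)
    probes = 0
    for step in range(n):
        addr = (start + step) % n
        probes += 1  # one fetch of table[addr]
        if table[addr] == key:
            return addr, probes
        if table[addr] is EMPTY:
            return None, probes  # an empty slot ends an unsuccessful search
    return None, probes  # every slot examined without finding the key


# Table of length n = 7 holding m = 4 keys; 10, 17 and 24 share home address 3.
T: List[Optional[int]] = [EMPTY] * 7
for k in (10, 17, 24, 5):
    insert(T, k)

for k in (10, 17, 24, 5, 3):
    print(k, lookup(T, k))
```

In this small table the key 10 is found in one probe while 24, inserted later with the same home address, needs three. The order in which colliding keys are inserted therefore affects how many probes each later retrieval costs.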
In practical applications it usually happens that most
entries in the table are looked up several times. Bell and
Kaman found that their hashing routine was entered
10,988 times, but with only 735 different keys,
when a typical COBOL program was compiled. As a
more extreme example, a table of opcode mnemonics
or reserved words may be built up once and thereafter
used purely for retrieval. Thus it is very important to
minimize the number of probes required to look up keys
which are already in the table. The number of probes
required to look up (and perhaps insert) keys which are
not already there is not so important.
The idea of our method, which is described in detail
in Section 2, is to take more care than usual when
keys are inserted, in an attempt to reduce the number of
probes required for subsequent lookups. Although we