COP 3503 – Computer Science II – CLASS NOTES

DAY #6
Additional Data Structures
Balancing Trees
As search trees get large, it becomes important to ensure that the tree is
balanced; otherwise, the time required by the various tree operations (primarily
searching) will increase to a worst case of O(N).
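To see where the O(N) worst case comes from, here is a minimal Python sketch. It assumes a plain binary search tree with no rebalancing and counts height as the number of levels; the names Node, insert, build, and height are illustrative and do not come from the course materials.

```python
class Node:
    """One node of a plain (unbalanced) binary search tree."""
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None


def insert(root, key):
    """Insert key with no rebalancing and return the (possibly new) root."""
    if root is None:
        return Node(key)
    cur = root
    while True:
        if key < cur.key:
            if cur.left is None:
                cur.left = Node(key)
                return root
            cur = cur.left
        else:
            if cur.right is None:
                cur.right = Node(key)
                return root
            cur = cur.right


def build(keys):
    """Insert the keys one at a time, in the order given."""
    root = None
    for k in keys:
        root = insert(root, k)
    return root


def height(root):
    """Height counted as the number of levels (an empty tree has height 0)."""
    levels = 0
    frontier = [root] if root else []
    while frontier:
        levels += 1
        # Move down one level: collect every existing child of the frontier.
        frontier = [c for n in frontier for c in (n.left, n.right) if c]
    return levels
```

With this sketch, height(build(range(1000))) is 1000: every key inserted in increasing order becomes the right child of the previous one, so the tree degenerates into a list and a search may have to visit every node.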
Later in the term, we will examine several different variants of trees and see
how they are balanced.
Some trees require that balance be maintained by all
operations on the tree while other trees allow balancing to occur only after the
tree has become unbalanced to the point of requiring too much time for
individual operations on the tree.
Recall that a binary tree is height-balanced, or simply balanced, if the
difference between the heights of the two subtrees of any node is either zero
or one. A perfectly balanced tree is one in which all leaf nodes are found on
one or two levels.
For example, for a perfectly balanced binary tree consisting of 10,000 nodes,
the height of the tree will be ⌈log₂(10,001)⌉ = ⌈13.288⌉ = 14.
In practical terms, this
means that if 10,000 elements are stored in a perfectly balanced tree, then at most
14 nodes will need to be checked to locate a specific element.
This is a substantial
difference when compared to the worst case of checking all 10,000 elements in a list!
Therefore, in trees which are to be used primarily for searching, it is worth the
effort to either build the tree so that it is balanced or modify the existing tree so
that it is balanced.
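The 10,000-node figure can be checked with a short Python sketch. It assumes the keys are already available in sorted order and builds the tree by repeatedly choosing the middle key as the root; this is one simple way to obtain a balanced tree, not necessarily the method used later in the course. The names build_balanced, min_levels, and nodes_checked are illustrative.

```python
import math


class Node:
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None


def build_balanced(sorted_keys, lo=0, hi=None):
    """Build a balanced BST from sorted_keys[lo:hi] by splitting at the middle."""
    if hi is None:
        hi = len(sorted_keys)
    if lo >= hi:
        return None
    mid = (lo + hi) // 2
    node = Node(sorted_keys[mid])
    node.left = build_balanced(sorted_keys, lo, mid)
    node.right = build_balanced(sorted_keys, mid + 1, hi)
    return node


def height(node):
    """Height counted as the number of levels (an empty tree has height 0)."""
    if node is None:
        return 0
    return 1 + max(height(node.left), height(node.right))


def min_levels(n):
    """Smallest possible height of a binary tree with n nodes."""
    return math.ceil(math.log2(n + 1))


def nodes_checked(root, key):
    """Count the nodes visited while searching the BST for key."""
    checked = 0
    node = root
    while node is not None:
        checked += 1
        if key == node.key:
            break
        node = node.left if key < node.key else node.right
    return checked
```

For 10,000 sorted keys, min_levels(10000) and the height of the tree returned by build_balanced are both 14, and no search visits more than 14 nodes.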
A binary tree is height-balanced (or simply balanced) if the difference in
height of both subtrees of any node in the tree is either zero or one. A tree
is said to be perfectly balanced if it is balanced and all of the leaves are
found on one or two levels.
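Both definitions translate directly into code. The following Python sketch is one possible reading of them, assuming that height is counted in levels and that the root is on level 1; the function names are illustrative only.

```python
class Node:
    def __init__(self, key, left=None, right=None):
        self.key = key
        self.left = left
        self.right = right


def is_balanced(root):
    """True if, at every node, the two subtree heights differ by 0 or 1."""
    def walk(node):
        # Returns the height of node, or -1 once an imbalance has been found.
        if node is None:
            return 0
        left_h = walk(node.left)
        if left_h < 0:
            return -1
        right_h = walk(node.right)
        if right_h < 0:
            return -1
        if abs(left_h - right_h) > 1:
            return -1
        return 1 + max(left_h, right_h)

    return walk(root) >= 0


def leaf_levels(node, level=1):
    """Set of levels (root = level 1) on which leaves occur."""
    if node is None:
        return set()
    if node.left is None and node.right is None:
        return {level}
    return leaf_levels(node.left, level + 1) | leaf_levels(node.right, level + 1)


def is_perfectly_balanced(root):
    """Balanced, and all leaves are found on one or two levels."""
    return is_balanced(root) and len(leaf_levels(root)) <= 2
```

Each node is visited once per function, so both checks run in time proportional to the number of nodes. A tree can pass is_balanced and still fail is_perfectly_balanced when its leaves are spread over three or more levels.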
Hash Tables
Hash functions are a specific case of a more general technique known as
key-to-address transformations (KTA transformations). There are many different
KTA transformation techniques possible. Figure 1 illustrates the hierarchy of
KTA transformations.
Figure 1 – Key-to-address transformation hierarchy.
Distribution-dependent transformations depend on at least approximate knowledge
of the key values that will be expected. The benefits that can be gained by
distribution-dependent techniques depend on open addressing, bucket size, file
density, and the appropriateness of the transformation itself. For a small
bucket size and a good distribution algorithm, the improvement over randomizing
transformations can be significant. On the other hand, the liabilities of
distribution-dependent transformations are major, since a change in the key
distribution can cause these methods to generate many more collisions than a
randomization would generate for the same data. A benefit of some
distribution-dependent KTA transforms is that they can allow for maintaining
sequentiality. Such sequence-maintaining transforms allow the addresses produced
to increase with increasing value of the key. Serial access is made possible in
this case.
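As a rough illustration, the Python sketch below contrasts two transformations. The first is a linear scaling that assumes keys are spread roughly evenly over a known range; it is only one hypothetical example of a distribution-dependent, sequence-maintaining transform. The second is the common division (modulo) method, standing in for a randomizing transform. The function names, the key range, and the table size are all assumptions made for the example.

```python
def sequence_maintaining_address(key, key_min, key_max, table_size):
    """Scale key linearly from [key_min, key_max] onto 0 .. table_size - 1.

    Assumes keys are roughly uniform over the range. Larger keys never get
    smaller addresses, so key order is preserved.
    """
    if not key_min <= key <= key_max:
        raise ValueError("key outside the expected range")
    span = key_max - key_min + 1
    return (key - key_min) * table_size // span


def randomizing_address(key, table_size):
    """Division-method hash: ignores key order and the key distribution."""
    return key % table_size


def collisions(addresses):
    """Number of keys that land on an address some earlier key already used."""
    addresses = list(addresses)
    return len(addresses) - len(set(addresses))


TABLE_SIZE = 101                  # hypothetical table size (a prime)
KEY_MIN, KEY_MAX = 1000, 1999     # hypothetical expected key range

spread = list(range(1000, 2000, 10))   # 100 keys spread evenly over the range
clustered = list(range(1000, 1100))    # 100 keys bunched at the low end


def scaled(keys):
    """Addresses produced by the sequence-maintaining transform."""
    return [sequence_maintaining_address(k, KEY_MIN, KEY_MAX, TABLE_SIZE)
            for k in keys]


def hashed(keys):
    """Addresses produced by the division-method hash."""
    return [randomizing_address(k, TABLE_SIZE) for k in keys]
```

For the evenly spread keys, both transforms place all 100 keys without a collision, and the scaled addresses increase with the keys. For the clustered keys, the scaled addresses squeeze into only ten slots (90 collisions), while the modulo hash still places every key in its own slot.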
Otherwise, a direct