6.897: Advanced Data Structures
Spring 2010
Lecture 2 — February 4, 2010
Prof. Erik Demaine
Scribe: Hui Tang, Prasant Gopal
1 Overview
In the last lecture we discussed Binary Search Trees (BSTs) and introduced them as a model of
computation. A quick recap: a search is conducted with a pointer starting at the root, which is
free to move about the tree and perform rotations; however, the pointer must at some point in
the operation visit the item being searched for. The cost of the search is simply the total number
of distinct nodes in the tree that have been visited by the pointer during the operation. We measure
the total cost of executing a sequence of searches $S = \langle s_1, s_2, s_3, \ldots \rangle$, where
each search $s_i$ is chosen from among the fixed set of $n$ keys in the BST.
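
To make the cost accounting concrete, here is a minimal sketch under a strong simplifying assumption: the BST algorithm never rotates and simply walks from the root to the key on every search. The model above allows far more freedom than this. The names `build_balanced`, `search_cost`, and `sequence_cost` are hypothetical and do not come from the lecture.

```python
def build_balanced(keys):
    """Build a balanced BST over sorted keys as nested tuples (key, left, right)."""
    if not keys:
        return None
    mid = len(keys) // 2
    return (keys[mid], build_balanced(keys[:mid]), build_balanced(keys[mid + 1:]))


def search_cost(root, key):
    """Count the distinct nodes touched by a root-to-key walk (no rotations)."""
    cost = 0
    node = root
    while node is not None:
        cost += 1  # the pointer visits this node
        k, left, right = node
        if key == k:
            return cost
        node = left if key < k else right
    raise KeyError(key)  # the model assumes every search is for a stored key


def sequence_cost(root, searches):
    """Total cost of a search sequence S on a static tree (assumption: no restructuring)."""
    return sum(search_cost(root, s) for s in searches)


tree = build_balanced([1, 2, 3, 4, 5, 6, 7])
print(sequence_cost(tree, [4, 2, 1]))  # 1 + 2 + 3 = 6
```

A static tree like this pays up to about $\log n$ per search regardless of the sequence; the point of the model is that an algorithm allowed to rotate may do better on some sequences.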
We have witnessed that there are access sequences which can be served in $o(\log n)$ time per
operation. There are also some deterministic sequences of $n$ queries (for example, the bit reversal
permutation) which require a total running time of $\Omega(n \log n)$ for any BST algorithm. This
disparity, however, does not rule out the possibility of having an instance-optimal BST. By this we
mean the following. Let $OPT(S)$ denote the minimal cost of executing the access sequence $S$ in
the BST model, that is, the cost of the best BST algorithm which has access to the sequence a priori.
A BST algorithm is instance optimal if its cost on every sequence $S$ is $O(OPT(S))$. It is believed
that splay trees are the "best BST". However, they are not known to have an $o(\log n)$ competitive
ratio. Also, notice that we are only concerned with the cost of the specified operations on the BST
and we are not accounting for the work done outside the model, for example the computation done
to decide on rotations.
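
For readers who have not seen it, the bit reversal permutation mentioned above can be generated as follows. This is a small sketch assuming $n = 2^k$ keys numbered $0, \ldots, n-1$; the function names are hypothetical.

```python
def reverse_bits(i, k):
    """Return the integer whose k-bit binary representation is that of i, reversed."""
    r = 0
    for _ in range(k):
        r = (r << 1) | (i & 1)  # move the lowest bit of i onto the end of r
        i >>= 1
    return r


def bit_reversal_permutation(k):
    """Access sequence on n = 2**k keys: the i-th search is for key reverse_bits(i, k)."""
    return [reverse_bits(i, k) for i in range(1 << k)]


print(bit_reversal_permutation(3))  # [0, 4, 2, 6, 1, 5, 3, 7]
```

Consecutive searches in this sequence jump between far-apart keys, which gives some intuition for why no BST algorithm can serve it cheaply.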
This motivates us to search for a BST which is optimal (or close to optimal) on any sequence of
searches. Given that splay trees satisfy a number of properties such as static optimality, the
working set bound, the dynamic finger bound, and linear traversal, they are a natural candidate
for dynamic optimality. They are, however, notoriously hard to analyse and understand, and
sometimes appear magical.
This led researchers to look for alternative approaches to building a dynamically optimal BST. The
best guarantee so far is the $O(\log \log n)$ competitive ratio achieved by Tango trees, which we
shall see in the later part of the lecture.
Another perspective is the recently proposed geometric view of BSTs [DHIKP09]. In this
approach, a correspondence between the BST model of computation and points in $\mathbb{R}^2$ is
given. Informally, call a set $P$ of points arborally satisfied if, for any two points $a, b \in P$
not on a common horizontal or vertical line, there is at least one point of $P \setminus \{a, b\}$
in the axis-parallel rectangle defined by $a$ and $b$. Each search is mapped into $\mathbb{R}^2$ in
the following way: $P = \{(s_1, 1), (s_2, 2), \ldots, (s_n, n)\}$.
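
The definition above can be checked directly by brute force. The sketch below assumes points are integer pairs (key, time) and that the rectangle includes its boundary; `points_from_searches` and `is_arborally_satisfied` are hypothetical names, and the cubic-time check is only meant to illustrate the definition, not to be efficient.

```python
from itertools import combinations


def points_from_searches(searches):
    """Map the i-th search s_i to the point (s_i, i), with time starting at 1."""
    return {(s, i) for i, s in enumerate(searches, start=1)}


def is_arborally_satisfied(points):
    """Check every pair of points not sharing a row or column for a third point
    inside (or on the boundary of) their axis-parallel rectangle."""
    pts = set(points)
    for a, b in combinations(pts, 2):
        (ax, ay), (bx, by) = a, b
        if ax == bx or ay == by:
            continue  # pairs on a common vertical or horizontal line are exempt
        lo_x, hi_x = min(ax, bx), max(ax, bx)
        lo_y, hi_y = min(ay, by), max(ay, by)
        witness = any(
            lo_x <= px <= hi_x and lo_y <= py <= hi_y
            for (px, py) in pts
            if (px, py) != a and (px, py) != b
        )
        if not witness:
            return False
    return True


P = points_from_searches([1, 2])            # {(1, 1), (2, 2)}
print(is_arborally_satisfied(P))            # False: the rectangle holds only its two corners
print(is_arborally_satisfied(P | {(1, 2)}))  # True: (1, 2) lies in the rectangle
```

The second call illustrates that the point set of a search sequence is generally not arborally satisfied on its own, but can become so once further points are added.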