Spring 2009
Instructor: Martin Ester
TA: Bahareh Bina
Sample Exam
Problem 1
Given two relations R(a,b,c) and S(a,b,d) with the following statistics:
T(R) = 1000,
V(R,a) = 1000, V(R,b) = 50, V(R,c) = 20
T(S) = 2000, V(S,a) = 50, V(S,b) = 100, V(S,d) = 200
a) What is the estimated number of tuples of the natural join of R and S?
The estimated number of tuples is equal to
20
100
1000
2000
1000
))
,
(
),
,
(
max(
))
,
(
),
,
(
max(
)
(
)
(
=
⋅
⋅
=
⋅
⋅
b
S
V
b
R
V
a
S
V
a
R
V
S
T
R
T
.
b) What is the estimated number of distinct values of attribute c in the join result?
According to the assumption of preservation of value sets, this number is equal to V(R,c), i.e.
20. This is indeed our estimate, since the join is expected to have that many tuples.
c) What join algorithm requires the smallest number of buffer blocks to implement the given
natural join?
How many buffer blocks does it require?
The block-based nested-loop join with chunk size equal to one block, and it requires only two
buffer blocks.
d)
Assuming that an unlimited number of buffer blocks is available, what join algorithm requires
the smallest number of IO to implement the given natural join? How many IO
does it require?
The block-based nested-loop join with chunk size equal to the number of blocks needed to
hold the smaller of the input relations. The number of IO required is equal to the sum of the
number of blocks to hold the smaller input relation and the number of blocks to hold the
larger input relation.
Note that the answers to c) and d) are independent from the characteristics of the given input