The idea keeps getting reinvented about 3
times a year somewhere in the world.
So let's see it once and for all, and then if
you ever need it you can just implement
it rather than reinventing it.
Example of Problem that the UR Solves
Sometimes, the size of a join result can be
exponential in the size of the input relations, even if
the join is acyclic.
Consider $A_1A_2 \bowtie A_2A_3 \bowtie \cdots \bowtie A_{n-1}A_n$.
Let each $A_i$ have domain $\{1, 2, 3, 4\}$.
Let each relation consist of the eigh
Hypergraph = nodes plus hyperedges that are
sets of any number of nodes.
Applications include optimizing queries
that are joins and representing "universal
relations," a useful data-modeling concept.
Typically, nodes represent attributes and
In relation R, we say MVD $X \twoheadrightarrow Y$ holds if
whenever there are tuples s and t in R such that
$\pi_X(s) = \pi_X(t)$, then there is a tuple r in R such that:
1. $\pi_{XY}(r) = \pi_{XY}(s)$.
2. $\pi_{(R-Y)X}(r) = \pi_{(R-Y)X}(t)$.
I.e., r agrees with s on the attributes
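The MVD condition can be tested directly on a small relation instance. The following is a minimal brute-force sketch, not an efficient algorithm; the function name `satisfies_mvd`, the schema `C, T, B` (course, teacher, book), and the sample rows are assumptions made up for illustration.

```python
from itertools import product

def satisfies_mvd(rows, schema, X, Y):
    """Return True if the MVD X ->> Y holds in the instance `rows`.

    rows   : iterable of tuples, one value per attribute of `schema`
    schema : list of attribute names, in column order
    X, Y   : iterables of attribute names
    """
    idx = {a: i for i, a in enumerate(schema)}
    X, Y = set(X), set(Y)
    x_cols = sorted(idx[a] for a in X)
    xy_cols = sorted(idx[a] for a in (X | Y))
    # Attributes of (R - Y) together with X.
    rest_cols = sorted(idx[a] for a in ((set(schema) - Y) | X))

    def proj(t, cols):
        return tuple(t[c] for c in cols)

    rel = set(map(tuple, rows))
    for s, t in product(rel, repeat=2):
        if proj(s, x_cols) != proj(t, x_cols):
            continue
        # Need a tuple r that agrees with s on XY and with t on (R - Y)X.
        if not any(proj(r, xy_cols) == proj(s, xy_cols)
                   and proj(r, rest_cols) == proj(t, rest_cols)
                   for r in rel):
            return False
    return True

# Hypothetical instance: teachers and books vary independently per course.
schema = ["C", "T", "B"]
full = [("db", "ann", "b1"), ("db", "ann", "b2"),
        ("db", "bob", "b1"), ("db", "bob", "b2")]
holds = satisfies_mvd(full, schema, ["C"], ["T"])
fails = satisfies_mvd(full[:3], schema, ["C"], ["T"])
```

With all four combinations present the check succeeds; dropping one tuple breaks the required "swap" and the check fails.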
Background: Functional Dependencies
We are always talking about a relation R,
with a fixed schema (set of attributes) and a
varying instance (set of tuples).
Conventions: A, B, ... are attributes; ..., Y, Z
are sets of attributes. Concatenation means
CS345 Midterm Examination
Wednesday, May 14, 2003, 9:30-11:30 AM
The exam is open book; any written materials may be used.
Answer all 9 questions on the exam paper itself.
The total number of points is 120 (i.e., 1 point per minute).
Do not forget
The CURE Algorithm
Problem with BFR/k-means:
Assumes clusters are normally distributed
in each dimension.
And axes are fixed - ellipses at an angle
are not OK.
Assumes a Euclidean distance.
= safe, Datalog rules:
H :- G1 & ... & Gn
Most common form of query; equivalent to
Useful for optimization of active elements,
e.g., checking distributed constraints,
maintaining materialized views.
CQ's With Negation
General form of conjunctive query with negation
H :- G1 & ... & Gn &
NOT F1 & ... & NOT Fm
G's are positive subgoals; F's are negative
Apply CQN Q to DB D by considering all
possible substitutions of constants for the
Five Groups of Rules for Magic Construction
Let r be a typical rule
H :- G1 & G2 & ... & Gn
Supplementary magic for next subgoal. If Gi
has IDB predicate p:
m_p(bound args of Gi) :- sup_{i-1}(variables)
Magic 0th supplementary. If head has predicate
Rule/Goal Trees
Nodes correspond to rules and to subgoals of rules.
Rule node: children = subgoals of the rule.
Goal node: children = rules whose heads unify
with the goal. Unifying substitution must be
made in the rule.
Be careful that local variables
Optimization technique for recursive Datalog.
Also a win on some nonrecursive SQL
Mumick, Finkelstein, Pirahesh, and
Ramakrishnan, 1990 SIGMOD, pp. 247-258.
Combines benefits of both top-down
backward chaining, recursive tree search
Getting All You Can Out of Views
The situation is that we are given a collection
of views and a query (possibly recursive).
We want to find all the answers to the
query that we can using the views.
This technology, due to Oliver Duschka, comes
The Bucket Algorithm
We can "answer queries using views" by trying
all CQ's with no more view subgoals than the
query has subgoals.
However, a more organized exploration of the
possibilities, called the bucket algorithm, looks
at how views can "cover" each
Using CQ Theory in Information Integration
Yes; this stuff really does get used in systems. We
shall talk about three somewhat different systems
that use the theory in various ways:
1. Information Manifold, developed by Alon
Levy at AT&T Research Labs (Levy is
Relationships Among Semantics
If a program + EDB has a stratified or perfect
(locally stratified) model, then that is the unique
A program + EDB can have a unique stable
model even if there is no perfect model.
p :- NOT q
q :- NOT p
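The stable models of a small propositional program such as the two rules above can be found by brute force: guess a set M of atoms, delete every rule with a negated atom in M, drop the remaining negations, and check that the least model of what is left equals M. The sketch below assumes a hypothetical rule encoding `(head, positive_body, negative_body)` and made-up function names; it is exponential in the number of atoms and meant only for toy programs.

```python
from itertools import chain, combinations

def least_model(definite_rules):
    """Least model of negation-free rules given as (head, positive_body)."""
    model = set()
    changed = True
    while changed:
        changed = False
        for head, pos in definite_rules:
            if head not in model and set(pos) <= model:
                model.add(head)
                changed = True
    return model

def is_stable(rules, candidate):
    """Check a candidate set of atoms against the reduct of `rules`."""
    # Keep only rules whose negated atoms are all absent from the candidate,
    # then drop the negative subgoals.
    reduct = [(h, pos) for h, pos, neg in rules if not (set(neg) & candidate)]
    return least_model(reduct) == candidate

def stable_models(rules):
    """Enumerate all stable models by trying every subset of the atoms."""
    atoms = sorted({h for h, _, _ in rules}
                   | {a for _, pos, neg in rules for a in chain(pos, neg)})
    subsets = chain.from_iterable(
        combinations(atoms, k) for k in range(len(atoms) + 1))
    return [set(s) for s in subsets if is_stable(rules, set(s))]

# p :- NOT q
# q :- NOT p
program = [("p", [], ["q"]), ("q", [], ["p"])]
models = stable_models(program)   # [{'p'}, {'q'}]
```

For this program the empty set and {p, q} both fail the check, leaving exactly the two stable models {p} and {q}.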
Local Stratification
Instantiate rules; i.e., substitute all
possible constants for variables, but reject
instantiations that cause some EDB subgoal
to be false.
Ground atom = atom with no variables.
Build dependency graph at the level of ground
Review of Logic as a Query Language
Datalog programs are collections of rules, which are
Horn clauses or if-then expressions.
The following rules express what is needed to
"make" a file. It assumes these relations, or EDB
(extensional database) predicat
Using Views to Implement
Idea: invert the view definitions to give
the global predicates definitions in terms
of views and function symbols.
Plug the globals' definitions into the body