Hypergraphs

Hypergraph = nodes plus hyperedges that are sets of any number of nodes.
Applications include optimizing queries
that are joins and representing "universal
relations," a useful data-modeling concept.
Typically, nodes represent attributes and
hyperedges are sets of attributes.

Example

Suppose we have relations with schemas ABC, ACD, and BE. This database schema could be
represented by the hypergraph:

[Figure: hypergraph on nodes A, B, C, D, E with hyperedges ABC, ACD, and BE.]

Acyclic Hypergraphs

These have some useful properties that make
query optimization easier than the general case.
Most "natural" queries correspond to acyclic
hypergraphs.
The definition depends on GYO reduction; GYO =
Graham-Yu-Ozsoyoglu.
An ear is a hyperedge H such that we can
divide its nodes into two groups: those that
appear in H and no other hyperedge and
those that are contained in another hyperedge
G.
  - Note that an isolated edge is an ear; no G
    is needed.
GYO reduction of a hypergraph is the process
of repeatedly finding ears and removing them.
That is, we remove those nodes that are in the
ear and no other hyperedge; then we remove
the hyperedge itself, leaving the other nodes.
  - We say that ear H is consumed by G if
    all the nodes that are not unique to H
    are in G.
  - If a hypergraph is reduced to nothing
    by GYO reduction, then it is said to be
    acyclic.
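As a rough illustration of ear removal, here is a minimal Python sketch. It assumes each hyperedge is given as a string of single-letter node names; the function names `find_ear`, `gyo_reduce`, and `is_acyclic` are hypothetical, chosen only for this example.

```python
def find_ear(edges):
    """Return (ear, consumer) for some ear in `edges`, or None if no ear exists.

    `consumer` is None when the ear is isolated, i.e. no G is needed.
    """
    for i, h in enumerate(edges):
        others = edges[:i] + edges[i + 1:]
        # Nodes of h that also occur in at least one other hyperedge.
        shared = {v for v in h if any(v in g for g in others)}
        if not shared:
            return h, None
        for g in others:
            if shared <= g:  # all non-unique nodes of h lie in g
                return h, g
    return None


def gyo_reduce(schemas):
    """Repeatedly remove ears; return the hyperedges that are left over."""
    edges = [frozenset(s) for s in schemas]
    while edges:
        found = find_ear(edges)
        if found is None:
            break  # stuck: no ear remains
        edges.remove(found[0])
    return edges


def is_acyclic(schemas):
    """A hypergraph is acyclic if ear removal reduces it to nothing."""
    return not gyo_reduce(schemas)
```

For the schemas ABC, ACD, and BE, `is_acyclic(["ABC", "ACD", "BE"])` holds: ACD is consumed by ABC, then ABC by BE, and BE is finally an isolated ear.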
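Aside: "acyclic" makes sense: if the
hypergraph is an ordinary graph, it is
acyclic iff it is a tree (more precisely, a forest).

Example

Here is an acyclic hypergraph:

[Figure: hypergraph on nodes A, B, C, D, E, F with a central hyperedge DEF and three other hyperedges.]

The central hyperedge DEF can consume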
each of the other three hyperedges.
At that point, the remaining hyperedge is
trivially an ear, since all of its nodes are
unique to it.

Formal GYO Reduction

The original definition of GYO reduction consisted
of the following two steps:
1. Eliminate a node that is in only one
hyperedge.
2. Delete a hyperedge that is contained in
another.
The goal is to reduce a hypergraph to a single,
empty hyperedge.
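The two rules can be sketched in Python as follows. This is only an illustration under the same assumption as before (hyperedges written as strings of single-letter nodes); the name `gyo_two_step` is hypothetical.

```python
def gyo_two_step(schemas):
    """Apply the two rules until neither applies; return the remaining hyperedges."""
    edges = [set(s) for s in schemas]
    changed = True
    while changed:
        changed = False
        # Rule 1: eliminate a node that is in only one hyperedge.
        for e in edges:
            for v in list(e):
                if sum(v in g for g in edges) == 1:
                    e.remove(v)
                    changed = True
        # Rule 2: delete a hyperedge that is contained in another.
        for i, e in enumerate(edges):
            if any(j != i and e <= g for j, g in enumerate(edges)):
                del edges[i]
                changed = True
                break
    return [frozenset(e) for e in edges]
```

With this sketch, the schemas ABC, ACD, and BE reduce to a single empty hyperedge, while AB, BC, and AC are left unchanged because neither rule ever applies.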
You need to look at GYO reduction this way
to show that there is a unique GYO reduction
of any hypergraph, acyclic or not.
  - Key idea of proof: candidates for step 1
    remain candidates, no matter what other
    steps are taken.

Dangling Tuple Elimination

Useful as a first step in optimizing large joins.
A collection of relations $R_1, R_2, \ldots, R_n$ is
locally join consistent if for each i and j there
are no tuples that dangle between $R_i$ and $R_j$.
Formally: $\pi_{R_i}(R_i \bowtie R_j) = R_i$, and similarly
when i and j are reversed.
These relations are globally join consistent if
there are no dangling tuples when considered
as a group. Formally, for all i:
$\pi_{R_i}(R_1 \bowtie R_2 \bowtie \cdots \bowtie R_n) = R_i$
It is easy to check that global consistency implies local
consistency.
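Both definitions can be checked directly on small relations. The sketch below is an assumption-laden illustration, not an efficient method: a relation is modeled as a pair (attribute tuple, set of value tuples), and the helper names `rel`, `join`, `project`, `locally_consistent`, and `globally_consistent` are hypothetical.

```python
from itertools import combinations


def rel(schema, rows):
    """Build a relation from a schema string and rows given as strings."""
    return tuple(schema), {tuple(row) for row in rows}


def join(r, s):
    """Natural join of two relations."""
    (ra, rr), (sa, sr) = r, s
    common = [a for a in ra if a in sa]
    extra = [a for a in sa if a not in ra]
    rows = set()
    for t in rr:
        for u in sr:
            if all(t[ra.index(a)] == u[sa.index(a)] for a in common):
                rows.add(t + tuple(u[sa.index(a)] for a in extra))
    return ra + tuple(extra), rows


def project(r, attrs):
    """Projection of relation r onto the attributes `attrs`."""
    ra, rr = r
    idx = [ra.index(a) for a in attrs]
    return tuple(attrs), {tuple(t[i] for i in idx) for t in rr}


def locally_consistent(rels):
    """No tuple dangles between any pair of relations."""
    for r, s in combinations(rels, 2):
        j = join(r, s)
        if project(j, r[0]) != r or project(j, s[0]) != s:
            return False
    return True


def globally_consistent(rels):
    """No tuple dangles in the join of all relations (rels must be non-empty)."""
    full = rels[0]
    for r in rels[1:]:
        full = join(full, r)
    return all(project(full, r[0]) == r for r in rels)
```

Note that `globally_consistent` computes the full join, so it is useful only for checking small examples.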
  - What about the opposite?

Theorem

If the relation schemas $R_1, R_2, \ldots, R_n$ form an
acyclic hypergraph, then whenever relations for
these schemas are locally consistent, they are
globally consistent.

Proof

Induction on n, the number of hyperedges
(relations) in the join.
Basis: For n = 1 there is nothing to check.
Induction: Assume for n - 1 hyperedges, and
prove for n.
Let E be the first ear in a GYO reduction,
and let G be the remaining hypergraph.
Since G has local consistency and n - 1
hyperedges, by the inductive hypothesis, G
is globally consistent.
  - That is, every tuple of every relation of G
    appears in the result of the join.
E was consumed by some hyperedge H, and E
is locally consistent with H. Therefore, each
tuple t of E joins with some tuple s of H.
s appears as part of some tuple r in the join
of the relations in G. Since attributes of E are
either unique to it or in H, t joins with r.
  - Thus, t participates in the join of all n
    relations.

However, if the hypergraph is not acyclic,
we can always find relations that are locally
consistent but not globally consistent.
Example

Consider $AB = \{00, 11\}$, $BC = \{00, 11\}$, and
$AC = \{01, 10\}$.
Any two relations are join-consistent. E.g.,
$AB \bowtie AC = \{001, 110\}$, which projected onto
AB is $\{00, 11\}$.
But $AB \bowtie BC \bowtie AC = \emptyset$, so the relations are
not globally consistent.

Reduction by Semijoins

If we are to take the join of several relations, it is
often efficient to first remove the dangling tuples.
It guarantees that whatever order we join in,
the result never shrinks. Thus, the total work
is proportional to the output, and we can't do
more than a constant factor better than that.
To reduce relations to globally consistent
subsets, we can use the semijoin operation:
$R := R \ltimes S = \pi_R(R \bowtie S) = R \bowtie \pi_{R \cap S}(S)$
Sometimes, semijoins don't help eliminate
dangling tuples.
  - For example, AB, BC, and AC above are
    not changed by any semijoin.
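This can be checked with a small Python sketch of the semijoin. It assumes a relation is a pair (attribute tuple, set of value tuples); the function name `semijoin` and the variable names are hypothetical. The implementation follows the last form of the definition: keep the tuples of R whose shared attributes match the projection of S onto those attributes.

```python
def semijoin(r, s):
    """Keep the tuples of r that join with at least one tuple of s."""
    (ra, rr), (sa, sr) = r, s
    common = [a for a in ra if a in sa]
    # Projection of s onto the attributes it shares with r.
    s_keys = {tuple(u[sa.index(a)] for a in common) for u in sr}
    kept = {t for t in rr if tuple(t[ra.index(a)] for a in common) in s_keys}
    return ra, kept


# The three relations from the example above.
AB = (("A", "B"), {(0, 0), (1, 1)})
BC = (("B", "C"), {(0, 0), (1, 1)})
AC = (("A", "C"), {(0, 1), (1, 0)})
rels = [AB, BC, AC]

# True: no semijoin between any two of them removes a tuple.
unchanged = all(semijoin(r, s) == r for r in rels for s in rels)
```

Every value of a shared attribute appears on both sides of each pair, so each semijoin keeps all tuples even though the three-way join is empty.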
However, if the hypergraph is acyclic, the
following algorithm produces a full reducer
for a set of relations.
  - That is, the result is a set of globally join-consistent relations.
1. Pick an ear E that can be consumed
by hyperedge H. Execute the semijoin
$H := H \ltimes E$.
2. Recursively generate a full reducer for the
hypergraph with E removed.
3. Append the semijoin $E := E \ltimes H$.

Example

Consider the relation schemas ABC, ACD, and
BE.
ACD is an ear that is consumed by ABC.
In the resulting hypergraph, ABC can be
consumed by BE.
The full reducer:
$ABC := ABC \ltimes ACD$
$BE := BE \ltimes ABC$
$ABC := ABC \ltimes BE$
$ACD := ACD \ltimes ABC$

Proof It Works

After step 1, it is impossible for the join of
the remaining hyperedges to have a tuple that
doesn't join with any tuple of E.
Inductively, step 2 leaves the relations other
than E in a globally join-consistent state.
Then, step 3 eliminates from E any tuples
that do not join with the other relations.
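The three-step construction can be sketched in Python as a generator of semijoin steps. This is an illustration only: schemas are assumed to be strings of single-letter attributes, the name `full_reducer` is hypothetical, and the recursion is unrolled into a loop that collects the step 1 semijoins in order and the step 3 semijoins in reverse.

```python
def full_reducer(schemas):
    """Return (target, source) pairs, each meaning target := target semijoin source.

    Raises ValueError if the hypergraph is not acyclic.
    """
    edges = list(schemas)
    forward, backward = [], []
    while len(edges) > 1:
        for i, e in enumerate(edges):
            others = edges[:i] + edges[i + 1:]
            # Attributes of e that also occur in some other hyperedge.
            shared = {v for v in e if any(v in g for g in others)}
            # Assumption of this sketch: an isolated ear (no shared
            # attributes) is paired with the first remaining hyperedge.
            consumer = next((g for g in others if shared <= set(g)), None)
            if consumer is not None:
                break
        else:
            raise ValueError("hypergraph is not acyclic")
        forward.append((consumer, e))   # step 1: H := H semijoin E
        backward.append((e, consumer))  # step 3: E := E semijoin H, run last
        del edges[i]                    # step 2 continues without E
    return forward + backward[::-1]
```

For the schemas ABC, ACD, and BE this yields the pairs (ABC, ACD), (BE, ABC), (ABC, BE), (ACD, ABC), matching the full reducer in the example above.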
This document was uploaded on 01/06/2012.
 Spring '09
