slides01-3 - Hypergraphs

Hypergraphs

Hypergraph = nodes plus hyperedges that are sets of any number of nodes.

Applications include optimizing queries that are joins and representing "universal relations," a useful data-modeling concept.

Typically, nodes represent attributes and hyperedges are sets of attributes.

Example

Suppose we have relations with schemas ABC, ACD, and BE. This database schema could be represented by the hypergraph

[Figure: hypergraph on nodes A, B, C, D, E with hyperedges ABC, ACD, and BE]

Acyclic Hypergraphs

These have some useful properties that make query optimization easier than the general case. Most "natural" queries correspond to acyclic hypergraphs.

The definition depends on GYO reduction; GYO = Graham-Yu-Ozsoyoglu.

An ear is a hyperedge H such that we can divide its nodes into two groups: those that appear in H and no other hyperedge, and those that are contained in another hyperedge G. Note that an isolated edge is an ear; no G is needed.

GYO reduction of a hypergraph is the process of repeatedly finding ears and removing them. That is, we remove those nodes that are in the ear and no other hyperedge; then we remove the hyperedge itself, leaving the other nodes. We say that ear H is consumed by G if all the nodes that are not unique to H are in G.

If a hypergraph is reduced to nothing by GYO reduction, then it is said to be acyclic.

Aside: "acyclic" makes sense: if the hypergraph is an ordinary graph, it is acyclic if and only if it is a tree (more precisely a forest, if the graph is not connected).

Example

Here is an acyclic hypergraph

[Figure: hypergraph on nodes A, B, C, D, E, F with a central hyperedge DEF and three surrounding hyperedges]

The central hyperedge DEF can consume each of the other three hyperedges. After that, the remaining hyperedge DEF is trivially an ear, since all of its nodes are unique to it.

Formal GYO Reduction

The original definition of GYO reduction consisted of the following two steps:

1. Eliminate a node that is in only one hyperedge.
2. Delete a hyperedge that is contained in another.

The goal is to reduce a hypergraph to a single, empty hyperedge.
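
The two steps can be sketched in Python as follows. This is only an illustrative sketch, not code from the slides: the function names gyo_reduce and is_acyclic are hypothetical, and each hyperedge is assumed to be given as a string of single-letter attribute names. The input is assumed to contain at least one hyperedge.

```python
def gyo_reduce(hyperedges):
    """Apply the two GYO steps until neither applies; return what is left."""
    # Assumption: each hyperedge is an iterable of node names, e.g. "ABC".
    edges = [set(e) for e in hyperedges]
    changed = True
    while changed:
        changed = False
        # Step 1: eliminate every node that occurs in exactly one hyperedge.
        for e in edges:
            unique = {v for v in e if sum(v in f for f in edges) == 1}
            if unique:
                e -= unique
                changed = True
        # Step 2: delete one hyperedge that is contained in another.
        for i, e in enumerate(edges):
            if any(i != j and e <= f for j, f in enumerate(edges)):
                del edges[i]
                changed = True
                break
    return edges


def is_acyclic(hyperedges):
    """Acyclic iff the reduction ends with a single, empty hyperedge."""
    return gyo_reduce(hyperedges) == [set()]


# The schema from the first example reduces completely.
print(is_acyclic(["ABC", "ACD", "BE"]))
# A triangle of binary hyperedges is stuck immediately: neither step applies.
print(is_acyclic(["AB", "BC", "AC"]))
```

On the schema ABC, ACD, BE the sketch first strips D and E, then deletes the hyperedge AC because it is contained in ABC, and continues until only one empty hyperedge is left.
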
The two-step formulation is what is needed to show that there is a unique GYO reduction of any hypergraph, acyclic or not. Key idea of proof: candidates for step 1 remain candidates, no matter what other steps are taken.

Dangling Tuple Elimination

Useful as a first step in optimizing large joins.

A collection of relations $R_1, R_2, \ldots, R_n$ is locally join consistent if for each $i$ and $j$ there are no tuples that dangle between $R_i$ and $R_j$. Formally: $\pi_{R_i}(R_i \bowtie R_j) = R_i$, and similarly when $i$ and $j$ are reversed.

These relations are globally join consistent if there are no dangling tuples when they are considered as a group. Formally, for all $i$: $\pi_{R_i}(R_1 \bowtie R_2 \bowtie \cdots \bowtie R_n) = R_i$.

It is easy to check that global consistency implies local consistency. What about the opposite?

Theorem

If the relation schemas $R_1, R_2, \ldots, R_n$ form an acyclic hypergraph, then whenever relations for these schemas are locally consistent, they are globally consistent.

Proof

Induction on $n$, the number of hyperedges (relations) in the join.

Basis: For $n = 1$ there is nothing to check.

Induction: Assume for $n - 1$ hyperedges, and prove for $n$. Let E be the first ear in a GYO reduction, and let G be the remaining hypergraph. G is acyclic, since the rest of the same GYO reduction reduces it to nothing. Since G has local consistency and $n - 1$ hyperedges, by the inductive hypothesis, G is globally consistent. That is, every tuple of every relation of G appears in the result of the join.

E was consumed by some hyperedge H, and E is locally consistent with H. Therefore, each tuple t of E joins with some tuple s of H. s appears as part of some tuple r in the join of the relations in G. Since attributes of E are either unique to it or in H, t joins with r. Thus, t participates in the join of all $n$ relations. Symmetrically, every tuple of a relation in G is part of some such r, and the H-component of r joins with some tuple of E by local consistency, so it too participates in the join of all $n$ relations.

However, if the hypergraph is not acyclic, we can always find relations that are locally consistent but not globally consistent.

Example

Consider $AB = \{00, 11\}$, $BC = \{00, 11\}$, and $AC = \{01, 10\}$. Any two relations are join-consistent. E.g., $AB \bowtie AC = \{001, 110\}$ (as tuples over ABC), which projected onto AB is $\{00, 11\}$.
But $AB \bowtie BC \bowtie AC = \emptyset$, so the relations are not globally consistent.

Reduction by Semijoins

If we are to take the join of several relations, it is often efficient to first remove the dangling tuples. Doing so guarantees that, whatever order we join in, the result never shrinks from one join to the next. Thus, the total work is proportional to the output, and we can't do more than a constant factor better than that.

To reduce relations to globally consistent subsets, we can use the semijoin operation:

$R := R \ltimes S = \pi_R(R \bowtie S) = R \bowtie \pi_R(S)$

Here $\pi_R(S)$ means the projection of S onto the attributes it shares with R.

Sometimes semijoins do not help eliminate dangling tuples. For example, AB, BC, and AC above are not changed by any semijoin.

However, if the hypergraph is acyclic, the following algorithm produces a full reducer for a set of relations. That is, the result is a set of globally join-consistent relations.

1. Pick an ear E that can be consumed by hyperedge H. Execute the semijoin $H := H \ltimes E$.
2. Recursively generate a full reducer for the hypergraph with E removed.
3. Append the semijoin $E := E \ltimes H$.

Example

Consider the relation schemas ABC, ACD, and BE. ACD is an ear that is consumed by ABC. In the resulting hypergraph, ABC can be consumed by BE. The full reducer:

$ABC := ABC \ltimes ACD$
$BE := BE \ltimes ABC$
$ABC := ABC \ltimes BE$
$ACD := ACD \ltimes ABC$

Proof It Works

After step 1, it is impossible for the join of the remaining hyperedges to have a tuple that doesn't join with any tuple of E. Inductively, step 2 leaves the relations other than E in a globally join-consistent state. Then, step 3 eliminates from E any tuples that do not join with the other relations.
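
As a rough illustration of the full-reducer algorithm, the following Python sketch generates the semijoin program for an acyclic schema and then runs it. The names semijoin, full_reducer, and apply_reducer are hypothetical, as are the sample tuples. A schema is assumed to be a string of single-letter attributes, and a relation a set of tuples in schema order. For an isolated ear, which needs no consumer, the sketch simply pairs it with the first other hyperedge.

```python
def semijoin(r_schema, r, s_schema, s):
    """Return the tuples of r that join with at least one tuple of s."""
    shared = [a for a in r_schema if a in s_schema]

    def key(schema, t):
        # Values of the shared attributes, in a fixed order.
        return tuple(t[schema.index(a)] for a in shared)

    s_keys = {key(s_schema, t) for t in s}
    return {t for t in r if key(r_schema, t) in s_keys}


def full_reducer(schemas):
    """Return (target, source) pairs, each meaning target := target semijoin source."""
    edges = list(schemas)
    if len(edges) <= 1:
        return []
    for i, e in enumerate(edges):
        others = edges[:i] + edges[i + 1:]
        # Nodes of e that also occur in some other hyperedge.
        shared = {a for a in e if any(a in h for h in others)}
        for h in others:
            if shared <= set(h):  # e is an ear consumed by h
                # Step 1, then the recursive program (step 2), then step 3.
                return [(h, e)] + full_reducer(others) + [(e, h)]
    raise ValueError("no ear found: the hypergraph is not acyclic")


def apply_reducer(program, db):
    """Run the semijoin program on a dict mapping schema -> set of tuples."""
    db = dict(db)
    for target, source in program:
        db[target] = semijoin(target, db[target], source, db[source])
    return db


# Hypothetical sample data for the schemas ABC, ACD, BE.
db = {
    "ABC": {(1, 1, 1), (2, 2, 2), (3, 3, 3)},
    "ACD": {(1, 1, 5), (2, 2, 6), (4, 4, 7)},
    "BE": {(1, 9), (3, 8)},
}
program = full_reducer(["ABC", "ACD", "BE"])
reduced = apply_reducer(program, db)
print(program)
print(reduced)
```

For the schemas ABC, ACD, BE the generated program is the same four semijoins as in the example, in the same order. On the sample data only one tuple survives in each relation, and those three tuples join with one another.
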

This document was uploaded on 01/06/2012.
