slides01-7

CQ's With Negation

General form of a conjunctive query with negation (CQN):

    H :- G1 & ... & Gn & NOT F1 & ... & NOT Fm

The G's are positive subgoals; the F's are negative subgoals.

Apply a CQN Q to a DB D by considering all possible substitutions of constants for the variables of Q. If for some substitution:

1. All the positive subgoals become facts in D, and
2. None of the negative subgoals do,

then infer the substituted head. The set of inferred facts is $Q(D)$.

Containment of CQ's doesn't change: $Q_1 \subseteq Q_2$ iff for every database D, $Q_1(D) \subseteq Q_2(D)$.

Example

    C1: p(X,Z) :- a(X,Y) & a(Y,Z) & NOT a(X,Z)
    C2: p(A,C) :- a(A,B) & a(B,C) & NOT a(A,D)

Intuitively, C1 looks for paths of length 2 that are not "short-circuited" by a single arc from beginning to end. C2 looks for paths of length 2 that start from a node A that is not a "universal source"; i.e., there is at least one node D not reachable from A by an arc. We thus expect $C_1 \subseteq C_2$, but not vice-versa.

Levy-Sagiv Test

There is a straightforward, time-consuming test for $Q_1 \subseteq Q_2$:

- Create a large but finite family of canonical DB's: all DB's that use only the constants $1, 2, \ldots, n$, where n is the number of variables in $Q_1$.
- Test each canonical DB. If $Q_1(D)$ is not contained in $Q_2(D)$ for even one canonical DB D, then containment of the CQ's surely doesn't hold.
- Otherwise, we claim that $Q_1 \subseteq Q_2$.

Proof of L/S Test

Suppose $Q_1(D) \subseteq Q_2(D)$ for each canonical DB D, but there is some other DB E for which containment doesn't hold. That is, $Q_1(E)$ contains a tuple t that $Q_2(E)$ does not contain.

Consider the at most n symbols that the variables of $Q_1$ map to when showing that $Q_1(E)$ contains t. We may rename these symbols $1, 2, \ldots, n$; the counterexample still holds.

Let D be the canonical DB consisting of E restricted to the tuples having only the symbols $1, 2, \ldots, n$. The same substitution shows that $Q_1(D)$ contains t, so, since the L/S test passed, we know that $Q_2(D)$ contains t.
Since the assignment of $Q_2$'s variables that shows t is in $Q_2(D)$ maps variables only to $1, 2, \ldots, n$ (remember, all CQ's are assumed safe), the same assignment maps the positive subgoals of $Q_2$ to tuples of E and the negative subgoals of $Q_2$ to tuples not in E. Hence $Q_2(E)$ contains t after all, a contradiction.

- In the proof, note that D and E, after renaming of symbols, agree on all tuples that involve only $1, 2, \ldots, n$. That is, D and E "look the same" whenever we assign variables to only $1, 2, \ldots, n$.

CQ's With Arithmetic

Suppose we allow subgoals with $<$, $\neq$, and other comparison operators.

- We must assume database constants can be compared.
- The technique is a generalization of the L/S algorithm, but it is due to Tony Klug.
- We shall work the case where $<$ is a total order; other assumptions lead to other algorithms, and we shall later give an all-purpose technique using a different approach.

Example

Consider the rules:

    C1: p(X,Z) :- a(X,Y) & a(Y,Z) & X < Y
    C2: p(A,C) :- a(A,B) & a(B,C) & A < C

Both ask for paths of length 2. But C1 requires that the first node be numerically less than the second, while C2 requires that the first node be numerically less than the third.

Klug/Levy/Sagiv Test

Construct a family of canonical databases by considering all partitions of the variables of $Q_1$ (assuming we are testing $Q_1 \subseteq Q_2$) and all orderings of the blocks of each partition. To represent the canonical DB's, assign the variables in the first block the value 0, those in the second block the value 1, and so on.

Example

To test $C_1 \subseteq C_2$ for

    C1: p(X,Z) :- a(X,Y) & a(Y,Z) & X < Y
    C2: p(A,C) :- a(A,B) & a(B,C) & A < C

we need to consider the partitions of $\{X, Y, Z\}$ and order them. The number of ordered partitions is 13:

- For the partition $\{X\}\{Y\}\{Z\}$ we have $3! = 6$ possible orders of the blocks.
- For each of the three partitions that group two variables and leave the other separate we have 2 different orders.
- For the partition that groups all three, there is one order.

In this example, the containment test fails. We have only to find one of the 13 cases to show failure. For instance, consider $\{X, Z\}\{Y\}$.
The canonical database D for this case is $\{a(0,1), a(1,0)\}$, and since $X < Y$, the body of C1 is true. Thus, $C_1(D)$ includes $p(0,0)$, the frozen head of C1.

- However, no assignment of values to A, B, and C makes all three subgoals of C2 true when D is the database. Thus, $p(0,0)$ is not in $C_2(D)$, and D is a counterexample to $C_1 \subseteq C_2$.

Key Theorems No Longer Hold When Some Predicates are Interpreted (e.g., Arithmetic Comparisons)

Union of CQ's theorem is false.

Example

Consider something we've seen before:

    Q1: p(X) :- a(X) & 10 <= X & X <= 20
    R1: p(X) :- a(X) & 5 <= X & X <= 15
    R2: p(X) :- a(X) & 15 <= X & X <= 25

$Q_1 \subseteq R_1 \cup R_2$, but neither $Q_1 \subseteq R_1$ nor $Q_1 \subseteq R_2$ is true.

Containment mapping theorem is false.

Example

    Q1: panic :- r(U,V) & r(V,U)
    Q2: panic :- r(U,V) & U <= V

Note, "panic" is a 0-ary predicate, i.e., a propositional variable.

- 0-ary predicates in the head present no problems for CQ's but don't make anything easier either.

Informally: Q1 = "cycle of length 2"; Q2 = "nondecreasing arc." Thus, $Q_1 \subseteq Q_2$.

- That is, whenever there is a pair of arcs $U \to V$ and $V \to U$, surely one is nondecreasing.

However, if $\mu$ is a containment mapping from Q2 to Q1, there is no subgoal of Q1 that $\mu(U) \le \mu(V)$ can be. Hence, there is no containment mapping from Q2 to Q1.

Generalizing the Containment-Mapping Theorem

The Klug/Levy/Sagiv approach uses canonical databases to handle arithmetic. Another approach, due to Ashish Gupta and Zhang/Ozsoyoglu, uses containment mappings.

- It has the advantage of working for any kind of interpreted ("built-in") predicate, although we shall use arithmetic comparisons in our examples.

The G/Z/O Test

To test whether $Q_1 \subseteq Q_2$, where Q1 and Q2 are CQ's with interpreted predicates:

1. Rectification: replace variables and constants by new variables so that no variable appears twice among the relational subgoals and the head. Also, no constant may appear there at all.
2. Add equality comparisons so the new variables are equated to the variable or constant they replace.
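Rectification (steps 1 and 2) can be sketched in Python as follows. This is only an illustrative sketch under stated assumptions: atoms are represented as (predicate, list-of-terms) pairs, variables are taken to start with an uppercase letter, and the fresh variables get the hypothetical names N1, N2, and so on. None of these choices come from the slides, so the result agrees with a hand rectification only up to renaming of the new variables.

```python
from itertools import count


def is_var(term):
    # Assumption: Datalog convention, variables start with an uppercase letter.
    return term[:1].isupper()


def rectify(head, body):
    """Rectify a CQ whose atoms are (predicate, [terms]) pairs.

    Returns (new_head, new_body, equalities); each equality (old, new)
    records that the fresh variable `new` stands for `old`.
    """
    used = {t for _, args in [head, *body] for t in args if is_var(t)}
    # Hypothetical fresh names N1, N2, ... that avoid existing variables.
    fresh = (f"N{i}" for i in count(1) if f"N{i}" not in used)
    seen = set()
    equalities = []

    def rewrite(atom):
        pred, args = atom
        new_args = []
        for term in args:
            if is_var(term) and term not in seen:
                seen.add(term)       # first occurrence keeps its name
                new_args.append(term)
            else:
                new = next(fresh)    # repeated variable or constant
                equalities.append((term, new))
                new_args.append(new)
        return (pred, new_args)

    # Relational subgoals are processed first, then the head.
    new_body = [rewrite(atom) for atom in body]
    new_head = rewrite(head)
    return new_head, new_body, equalities


# p(X) :- q(X,Y,X) & r(Y,a)
head, body, eqs = rectify(("p", ["X"]), [("q", ["X", "Y", "X"]), ("r", ["Y", "a"])])
print(head, body, eqs)
# Result: p(N4) :- q(X,Y,N1) & r(N2,N3) with X=N1, Y=N2, a=N3, X=N4
```

Processing the body before the head is a design choice of this sketch; it means a head variable that also occurs in the body is the one that gets renamed.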
Examples

a) Q1 above:

    panic :- r(U,V) & r(V,U)

becomes

    panic :- r(U,V) & r(X,Y) & U=Y & V=X

b) The rule

    p(X) :- q(X,Y,X) & r(Y,a)

would become:

    p(Z) :- q(X,Y,W) & r(V,U) & X=W & X=Z & Y=V & U=a

G/Z/O Test (Continued)

3. Having modified the CQ's, let M be the set of all containment mappings from the relational subgoals of Q2 to the relational subgoals of Q1.

- Note that with all variables appearing only once, every mapping from subgoals to subgoals that matches predicates gives us a containment mapping.

Then $Q_1 \subseteq Q_2$ iff the interpreted subgoals of Q1 logically imply the OR, over all $\mu$ in M, of $\mu$ applied to the interpreted subgoals of Q2.

Example

Let

    Q1: panic :- r(U,V) & r(X,Y) & U=Y & V=X
    Q2: panic :- r(U,V) & U <= V

Two containment mappings:

1. $\mu_1(U) = U$; $\mu_1(V) = V$. Here, the r(U,V) subgoal of Q2 maps to the first subgoal of Q1.
2. $\mu_2(U) = X$; $\mu_2(V) = Y$. Here, r(U,V) of Q2 maps to the second subgoal of Q1.

We must check:

$$(U = Y \land V = X) \Rightarrow \mu_1(U \le V) \lor \mu_2(U \le V)$$

That is:

$$(U = Y \land V = X) \Rightarrow (U \le V) \lor (X \le Y)$$

Use the equalities $U = Y$ and $V = X$ in the hypothesis. It is sufficient to show:

$$(U \le V) \lor (V \le U)$$

Obviously true.

Test For Logical Expressions Involving Inequalities

For arbitrary interpreted predicates, we can only make the necessary test by using whatever algorithm is appropriate for those predicates.

For interpreted predicates that are arithmetic inequalities, we can use the same test that was hidden inside the K/L/S test:

- Consider all total orders of variables, including those with equalities. If the implication holds for each order, then the expression is true, else false.

Example

For the implication above:

$$(U = Y \land V = X) \Rightarrow (U \le V) \lor (X \le Y)$$

two possible orders are:

    U < V < X < Y
    X < U = V < Y

For this implication, the only orders that make the hypothesis $(U = Y \land V = X)$ true are:

    U = V = X = Y
    U = Y < V = X
    V = X < U = Y

The conclusion $(U \le V) \lor (X \le Y)$ holds for each of the three orders.

The test is exponential but works.

Extensions

- Extends to test for a CQ contained in a union of CQ's. The logical implication includes the OR over all containment mappings from any of the CQ's in the union.
- Extends to containment of unions of CQ's: handle each CQ in the contained union separately.
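The enumeration of total orders with equalities, which underlies both the K/L/S canonical databases and the implication test for inequalities, can be sketched in Python. This is a brute-force illustration, not an optimized procedure; the function names `weak_orders` and `implication_holds` are hypothetical, and each order is represented as a dict from variable name to block rank (0 for the first block, 1 for the second, and so on).

```python
from itertools import product


def weak_orders(variables):
    """Yield every total order with ties as a dict variable -> rank.

    A rank tuple is kept only if it uses exactly the ranks 0..k-1,
    so each dict corresponds to one ordered partition into k blocks.
    """
    n = len(variables)
    for ranks in product(range(n), repeat=n):
        k = max(ranks, default=-1) + 1
        if set(ranks) == set(range(k)):
            yield dict(zip(variables, ranks))


def implication_holds(variables, hypothesis, conclusion):
    # The implication is valid iff the conclusion is true in every
    # order that satisfies the hypothesis.
    return all(conclusion(o) for o in weak_orders(variables) if hypothesis(o))


def hypothesis(o):
    # U = Y AND V = X
    return o["U"] == o["Y"] and o["V"] == o["X"]


def conclusion(o):
    # U <= V OR X <= Y
    return o["U"] <= o["V"] or o["X"] <= o["Y"]


print(len(list(weak_orders(["X", "Y", "Z"]))))                        # 13
print(sum(1 for o in weak_orders(["U", "V", "X", "Y"]) if hypothesis(o)))  # 3
print(implication_holds(["U", "V", "X", "Y"], hypothesis, conclusion))     # True
```

The first count reproduces the 13 ordered partitions of three variables, and the second the three orders that satisfy the hypothesis. Filtering the full product of ranks keeps the sketch short at the cost of examining $n^n$ tuples, which is acceptable only for small n.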