slides01-6


Conjunctive Queries

= safe Datalog rules: H :- G1 & ... & Gn
- Most common form of query; equivalent to select-project-join queries.
- Useful for optimization of active elements, e.g., checking distributed constraints, maintaining materialized views.
- Useful for information integration.

Applying a CQ to a Database

If Q is a CQ, and D is a database of EDB facts, then Q(D) is the set of heads of Q that we get when we:
- Substitute constants for variables in the body of Q in all possible ways.
- Require all subgoals to become true.

Example

p(X,Y) :- q(X,Z) & q(Z,Y)
EDB = {q(1,2), q(2,3), q(3,4)}.
Only two substitutions make both subgoals true:
1. X → 1; Y → 3; Z → 2.
2. X → 2; Y → 4; Z → 3.
They yield the heads p(1,3) and p(2,4).

Containment

Q1 ⊆ Q2 iff for every database D, Q1(D) ⊆ Q2(D).
- The containment problem is NP-complete, but it is not a "hard" problem in practical situations (short queries, few pairs of subgoals with the same predicate).
- Function symbols do not make the problem more difficult.
- Adding negated subgoals and/or arithmetic subgoals, e.g., X < Y, makes things more complex, but these are important special cases.

Example

A: p(X,Y) :- r(X,W) & b(W,Z) & r(Z,Y)
B: p(X,Y) :- r(X,W) & b(W,W) & r(W,Y)
Claim: B ⊆ A.
- In proof, suppose p(x,y) is in B(D).
- Then there is some w such that r(x,w), b(w,w), and r(w,y) are in D.
- In A, make the substitution X → x, Y → y, W → w, Z → w.
- Thus, the head of A becomes p(x,y), and all subgoals of A are in D.
- Thus, p(x,y) is also in A(D), proving B ⊆ A.

Testing Containment of CQ's

1. Containment mappings.
2. Canonical databases.
- The two tests are similar for the basic CQ case, but (2) is useful for more general cases like negated subgoals.

Containment Mappings

A mapping from the variables of CQ Q2 to the variables of CQ Q1 such that:
1. The head of Q2 becomes the head of Q1.
2. Each subgoal of Q2 becomes some subgoal of Q1.
  - It is not necessary that every subgoal of Q1 is the target of some subgoal of Q2.

Example

A, B as above:
A: p(X,Y) :- r(X,W) & b(W,Z) & r(Z,Y)
B: p(X,Y) :- r(X,W) & b(W,W) & r(W,Y)
- Containment mapping from A to B: X → X, Y → Y, W → W, Z → W.
- No containment mapping from B to A. Subgoal b(W,W) in B can only go to b(W,Z) in A. That would require both W → W and W → Z.

Example

C1: p(X) :- a(X,Y) & a(Y,Z) & a(Z,W)
C2: p(X) :- a(X,Y) & a(Y,X)
- Containment mapping from C1 to C2: X → X, Y → Y, Z → X, W → Y.
- No containment mapping from C2 to C1. Proof:
  a) X → X is required for the head.
  b) Thus, the first subgoal of C2 must map to the first subgoal of C1; Y must map to Y.
  c) Similarly, the second subgoal of C2 must map to the second subgoal of C1, so X must map to Z.
  d) But we already found that X maps to X.

Containment Mapping Theorem

Q1 ⊆ Q2 iff there exists a containment mapping from Q2 to Q1.

Proof (If)

Let μ: Q2 → Q1 be a containment mapping. Let D be any DB.
- Every tuple t in Q1(D) is produced by some substitution σ on the variables of Q1 that makes Q1's subgoals all become facts in D.
- Claim: σ∘μ is a substitution for the variables of Q2 that produces t.
  1. For each subgoal Fi of Q2, σ(μ(Fi)) = σ(some Gj). Therefore, it is in D.
  2. For the heads H2 of Q2 and H1 of Q1, σ(μ(H2)) = σ(H1) = t.
- Thus, every t in Q1(D) is also in Q2(D); i.e., Q1 ⊆ Q2.

Proof (Only If)

Key idea: frozen CQ.
1. Create a unique constant for each variable of the CQ Q.
2. Frozen Q is a database consisting of all the subgoals of Q, with the chosen constants substituted for variables.

Example

p(X) :- a(X,Y) & a(Y,Z) & a(Z,W)
Let x be the constant for X, etc. The relation for predicate a consists of the three tuples (x,y), (y,z), and (z,w).

Proof (Only If), Continued

Let Q1 ⊆ Q2. Let database D be the frozen Q1.
- Q1(D) contains t, the "frozen" head of Q1.
  - Sounds gruesome, but the reason is that we can use the substitution in which each variable of Q1 is replaced by its corresponding constant.
- Since Q1 ⊆ Q2, Q2(D) must also contain t.
- Let σ be the substitution of constants from D for the variables of Q2 that makes each subgoal of Q2 a tuple of D and yields t as the head.
- Let τ be the substitution that maps constants of D to their unique, corresponding variable of Q1.

Q2: E :- F1 & ... & Fm
Q1: H :- G1 & ... & Gn
[Figure: σ sends a subgoal of Q2 with arguments (X,Y) to a tuple (a,b) of D, and τ sends that tuple back to a subgoal Gi(A,B) of Q1.]

τ∘σ is a containment mapping from Q2 to Q1 because:
  a) The head of Q2 is mapped by σ to t, and t is the frozen head of Q1, so τ∘σ maps the head of Q2 to the "unfrozen" t, that is, the head of Q1.
  b) Each subgoal Fi of Q2 is mapped by σ to some tuple of D, which is a frozen version of some subgoal Gj of Q1. Then τ∘σ maps Fi to the unfrozen tuple, that is, to Gj itself.

Dual View of Containment Mappings

- A containment mapping, defined as a mapping on variables, induces a mapping on subgoals.
- Therefore, we can alternatively define a containment mapping as a function on subgoals, thus inducing a mapping on variables.
- The containment mapping condition becomes: the subgoal mapping does not cause a variable to be mapped to two different variables or constants, nor cause a constant to be mapped to a variable or a constant other than itself.

Example

Again consider
A: p(X,Y) :- r(X,W) & b(W,Z) & r(Z,Y)
B: p(X,Y) :- r(X,W) & b(W,W) & r(W,Y)
- Previously, we found the containment mapping X → X, Y → Y, W → W, Z → W from A to B.
- We could as well describe this mapping as r(X,W) → r(X,W), b(W,Z) → b(W,W), and r(Z,Y) → r(W,Y).

Method of Canonical Databases

Instead of looking for a containment mapping from Q2 to Q1 in order to test Q1 ⊆ Q2, we can apply the following test:
1. Create a canonical database D that is the frozen body of Q1.
2. Compute Q2(D).
3. If Q2(D) contains the frozen head of Q1, then Q1 ⊆ Q2; else not.
The proof that this method works is essentially the same as the argument for containment mappings:
  - The only way the frozen head of Q1 can be in Q2(D) is for there to be a containment mapping Q2 → Q1.

Example

C1: p(X) :- a(X,Y) & a(Y,Z) & a(Z,W)
C2: p(X) :- a(X,Y) & a(Y,X)
Here is the test for C2 ⊆ C1:
- Choose constants X → 0, Y → 1.
- The canonical DB from C2 is D = {a(0,1), a(1,0)}.
- C1(D) = {p(0), p(1)}.
- Since the frozen head of C2 is p(0), which is in C1(D), we conclude C2 ⊆ C1.
  - Note that the instantiation of C1 that shows p(0) is in C1(D) is X → 0, Y → 1, Z → 0, and W → 1.
  - If we replace 0 and 1 by the variables X and Y they stand for, we have the containment mapping from C1 to C2.

Saraiya's Containment Test

Containment of CQ's is NP-complete in general.
- Saraiya's algorithm is a polynomial-time test of Q1 ⊆ Q2 for the common case that no predicate appears more than twice among the subgoals of Q1.
  - Predicates can appear any number of times in Q2.
- Saraiya's algorithm is a reduction to 2SAT and yields a linear-time algorithm. Our algorithm is more direct, but quadratic.

The Algorithm

Pick a subgoal of Q2, and consider the consequences of mapping it to the two possible subgoals of Q1.
Follow all consequences of this choice: subgoals that must map to subgoals, and variables that must map to variables.
  - If we know p(X1,...,Xn) must map to p(Y1,...,Yn), then infer that each Xi must map to Yi.
  - If p(X1,...,Xn) is a subgoal of Q2, and we know Xi maps to some variable Z, and exactly one of the p-subgoals of Q1 has Z in the ith component, then conclude p(X1,...,Xn) maps to this subgoal.
One of two things must happen:
1. We derive a contradiction: a subgoal or variable that must map to two different things.
  - If so, try the other choice if there is one; fail if there is no other choice.
2. We close the set of inferences we must make.
  - Then we can forever forget about the question of how to map the determined subgoals and variables.
  - We have found one mapping that works and that can't interfere with the mapping of any other subgoals or variables, so we make another arbitrary choice if there are any unmapped subgoals.

Example

Let us test C1 ⊆ C2, where:
C1: p(B) :- a(A,B) & a(B,A) & b(A,C) & b(C,B)
C2: p(X) :- a(X,Y) & b(Y,Z) & b(Z,W) & a(W,X)
Note this simple example omits some options: C1 could have a predicate appearing only once in the body, and C2 could have 3 or more occurrences of some predicates.
Here is a description of inferences that might be made:
 1) Suppose a(X,Y) → a(A,B)
 2) Then X → A, Y → B
 3) Now, b(Y,Z) → b(B,?)
 4) Since there is no b(B,?), fail
 5) Thus, we must map a(X,Y) → a(B,A)
 6) Then X → B and Y → A,
 7) b(Y,Z) → b(A,C), Z → C,
 8) b(Z,W) → b(C,B), W → B
 9) Now, a(W,X) must map to a(B,B)
10) Since a(B,B) does not exist, fail
Note, however, that if the last subgoal of C1 were b(C,A), we would have W → A at line 8 and a(W,X) → a(A,B) at line 9.
  - That completes the containment mapping successfully, with X → B, Y → A, Z → C, and W → A.

Generalization to Unions of CQ's

P1 ∪ P2 ∪ ... ∪ Pk ⊆ Q1 ∪ Q2 ∪ ... ∪ Qn iff for all Pi there exists some Qj such that Pi ⊆ Qj.

Proof (If)

Obvious.

Proof (Only If)

Assume the containment holds.
- Let D be the canonical (frozen) database from CQ Pi.
- Since the containment holds, and Pi(D) surely includes the frozen head of Pi, there must be some Qj such that Qj(D) includes the frozen head of Pi.
- Thus, Pi ⊆ Qj.

Union Theorem Just Misses Being False

Consider generalized CQ's allowing arithmetic-comparison subgoals.
P1: p(X) :- e(X) & 10 ≤ X & X ≤ 20
Q1: p(X) :- e(X) & 10 ≤ X & X ≤ 15
Q2: p(X) :- e(X) & 15 ≤ X & X ≤ 20
P1 ⊆ Q1 ∪ Q2, but P1 ⊆ Q1 and P1 ⊆ Q2 are both false.

CQ Contained in Recursive Datalog

The test relies on the method of canonical DB's; the containment mapping approach doesn't work (it's meaningless).
- Make DB D from the frozen body of the CQ.
- Apply the program to D. If the frozen head of the CQ appears in the result, then yes (contained), else no.

Example

CQ Q1 is:
Q1: path(X,Y) :- arc(X,Z) & arc(Z,W) & arc(W,Y)
Q2 is the value of path in the following recursive Datalog program:
r1: path(X,Y) :- arc(X,Y)
r2: path(X,Y) :- path(X,Z) & path(Z,Y)
- Intuitively, Q1 = paths of length 3; Q2 = paths of length 1 or more.
- Freeze Q1, say with 0, 1, 2, 3 as constants for X, Z, W, Y, respectively.
  D = {arc(0,1), arc(1,2), arc(2,3)}
  The frozen head is path(0,3).
- It is easy to infer that path(0,3) is in Q2(D): use r1 three times to infer path(0,1), path(1,2), path(2,3), then use r2 to infer path(0,2), path(0,3).

Harder Cases

- Datalog program ⊆ CQ: doubly exponential complexity.
  Reference: Chaudhuri, S. and M. Y. Vardi [1992]. "On the equivalence of recursive and nonrecursive datalog programs," Proc. Eleventh ACM Symposium on Principles of Database Systems, pp. 55-66.
- Datalog program ⊆ Datalog program: undecidable.

This document was uploaded on 01/06/2012.
