midterm03

midterm03 - CS345 Midterm Examination, Wednesday, May 14,...

CS345 Midterm Examination
Wednesday, May 14, 2003, 9:30-11:30 AM

Directions

• The exam is open book; any written materials may be used.
• Answer all 9 questions on the exam paper itself.
• The total number of points is 120 (i.e., 1 point per minute).
• Do not forget to sign the pledge below.

I acknowledge and accept the honor code.

Print your name here:

Problem 1: (12 points) Let C1 and C2 denote two columns (items) of a matrix that represents market-basket data. Let C1 ∨ C2 denote a column that is the row-wise logical OR of the two columns; i.e., C1 ∨ C2 has a 1 when either C1 or C2 or both has a 1, and has 0 otherwise. Similarly, C1 ∧ C2 denotes the row-wise logical AND of the two columns; i.e., C1 ∧ C2 has a 1 if and only if both columns have 1. Let h(C) denote the minhash value for column C. That is, h(C) is the smallest i such that the ith row in the chosen (permuted) order of rows has a 1 in column C. For each of the following statements, indicate whether it is always true or sometimes false, by circling T or F, respectively.

a) h(C1 ∨ C2) ≥ [illegible]  T F
b) h(C1 ∧ C2) ≥ max(h(C1), h(C2))  T F
c) If h(C1) ≥ h(C2), then h(C1 ∨ C2) ≥ [illegible]  T F

Problem 2: (12 points) Let In(x) denote the set of pages with a link to page x, and let Out(x) denote the set of pages to which page x links. Let h(x), a(x), and p(x) denote the "hubbiness," authority, and PageRank of page x, respectively. Indicate whether each of the following statements is always true (T) or sometimes false (F).

a) If Out(i) ⊆ Out(j), then [illegible] ≤ [illegible].  T F
b) If Out(i) ⊆ Out(j), then [illegible] ≤ [illegible].  T F
c) If [illegible] ⊆ In(j), then [illegible] ≤ [illegible].  T F
d) If [illegible] ⊆ In(i), then a(i) [illegible] a(j).  T F

Problem 3: (15 points) What are all the stable models for the following propositional-logic program?

    p1 :- NOT q1
    q1 :- NOT p1
    p2 :- p1
    p2 :- NOT q2
    q2 :- NOT p2

Problem 4: (15 points) A collection of market-basket data has 100,000 frequent items and 1,000,000 infrequent items.
Each pair of frequent items appears 100 times; each pair consisting of one frequent and one infrequent item appears 10 times; and each pair of infrequent items appears once. Answer each of the following questions. Your answers only have to be correct to within 1%, and for convenience, you may optionally use scientific notation, e.g., 3.14 × 10^8 instead of 314,000,000.

a) What is the total number of pair occurrences? That is, what is the sum of the counts of all pairs?
b) We did not state the support threshold, but the given information lets us put bounds on the support threshold s. What are the tightest upper and lower bounds on s?
c) Suppose we apply the PCY algorithm to this data. If the actual support threshold s is 10,000,000 (i.e., 10^7), and pairs in each of the three categories distribute as evenly as possible, what is the smallest number of buckets we can use so that most of the buckets are not frequent?

Problem 5: (16 points) Consider the following rules:

    p(X) :- int(X) & X ≥ 2 & NOT c(X)
    c(X) :- int(X) & p(Y) & divides(X,Y) & X ≠ Y

Think of p(X) as meaning "X is a prime" and c(X) as "X is composite." The EDB predicate int(X) says that X is a positive integer, and in practice it will hold a finite set of integers. The EDB predicate divides(X,Y) means that Y evenly divides X. Suppose that int = {1,2,3,4}, and divides is the expected relation on these four integers; that is, divides = {(1,1), (2,1), (3,1), (4,1), (2,2), (4,2), (3,3), (4,4)}. If we instantiate these rules in all possible ways, eliminate rules with a known false subgoal, and then eliminate known true subgoals from the remaining rules, we are left with the following:

    p(2) :- NOT c(2)
    c(2) :- p(1)
    p(3) :- NOT c(3)
    c(3) :- p(1)
    p(4) :- NOT c(4)
    c(4) :- p(1)
    c(4) :- p(2)

a) Use the alternating-fixedpoint method to compute the well-founded model for this program plus EDB, by filling in the following table and then indicating the truth value (T, F, UNK) of each of the eight ground atoms.
The table may have extra space for rounds that need not be computed; you may fill in the table only until you are sure you have reached convergence.

    Round    0    1    2    3    4    Truth Value

b) In the space below, draw the dependency graph for the instantiated atoms p(i) and c(i) for 1 ≤ i ≤ 4.
c) Are the rules with the given EDB locally stratified? If so, tell what the strata are; if not, describe an infinite negative path.
d) Suppose int contains the integers from 1 to n, and divides contains all those pairs (i,j) such that j divides i and i and j are integers between 1 and n. For what values of n will the rules and EDB be locally stratified? Explain briefly.

Problem 6: (16 points) A view-centric information system has a single view:

    v(X,Y,Z) :- e(X,Y) & e(Y,Z) & e(X,Z)

We wish to answer the following query:

    q(A,B,C,D) :- e(A,B) & e(B,C) & e(C,D) & e(A,C) & e(B,D) & e(A,D)

Notice that in this unusual case, neither the view definition nor the query has any variables that do not appear in the head. That fact may simplify reasoning about the problem. Also observe that the view describes a triangle in a graph, but the edges are directed, and go in the direction from one argument of the head (representing a node) to another that appears to the right, among the arguments of the head. Likewise, the query asks for a complete graph of 4 nodes, again with direction determined by "to the right, among the arguments of the head."

A conjunctive query Q, all of whose subgoals have predicate v, is a solution if, after expansion, it is contained in the query. For Q to be a minimal solution, any conjunctive query P formed by deleting one or more subgoals from the body of Q must not be a solution; i.e., the expansion of P is not contained in the query. For each of the proposed solutions below, tell whether it is: not a solution, a solution but not minimal, or a minimal solution. In each case, explain your reasoning briefly.
Suggestions: describe the expansions of the proposed solutions and indicate containment mappings when needed.

a) q(A,B,C,D) :- v(A,B,C) & v(B,C,D)
b) q(A,B,C,D) :- v(A,B,C) & v(B,C,D) & v(A,C,D)
c) q(A,B,C,D) :- v(A,B,C) & v(A,E,D) & v(B,F,D) & v(G,C,D)
d) q(A,B,C,D) :- v(A,B,C) & v(B,C,D) & v(B,A,D)

Problem 7: (8 points) A market-basket data set contains 10 items. For a particular sample of the data, the set of all maximal frequent itemsets is precisely the set of all pairs of items. How many itemsets that are subsets of these 10 items will be a part of the negative border (as used in Toivonen's Algorithm)? Explain your answer briefly.

Problem 8: (16 points) Consider the following conjunctive queries with arithmetic:

    Q2: panic :- a(X,Y) & a(Y,X) & X < Y
    Q1: panic :- a(A,B) & a(B,A) & A ≠ B

We wish to check whether or not Q1 ⊆ Q2.

a) Rewrite Q1 and Q2 as rectified rules.
b) What are all the containment mappings from the uninterpreted subgoals of Q2 to those of Q1?
c) Write the statement about arithmetic that must be checked to verify that Q1 ⊆ Q2.
d) Is the condition of (c) true? Explain briefly.

Problem 9: (10 points) Suppose a Web graph is undirected, i.e., page i points to page j if and only if page j points to page i. Are the following statements true or false? Justify your answers briefly.

a) The hubbiness and authority vectors are identical, i.e., for each page, its hubbiness is equal to its authority.
b) The matrix M that we use to compute PageRank is symmetric; i.e., M[i,j] = M[j,i] for all i and ...
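
The setting of Problem 9 can be explored numerically on a small example. The sketch below is not an answer key: it builds one hypothetical undirected graph (a triangle with a pendant page, chosen here only for illustration), forms the link matrix, and then checks symmetry and runs a plain hub/authority iteration. The convention M[i][j] = 1/outdeg(j) when page j links to page i is an assumption, since the preview does not show the course's definition of M; the function names and the max-entry rescaling are likewise choices made for this sketch. A single graph can refute a universal claim but cannot establish one.

```python
# Hypothetical undirected graph on 4 pages: a triangle 0-1-2 plus the edge 2-3.
EDGES = [(0, 1), (1, 2), (0, 2), (2, 3)]
N = 4


def adjacency(edges, n):
    """Link matrix A with A[i][j] = 1 when page i links to page j."""
    adj = [[0] * n for _ in range(n)]
    for i, j in edges:
        adj[i][j] = 1
        adj[j][i] = 1  # undirected: every link is reciprocated
    return adj


def transition_matrix(adj):
    """Assumed convention: M[i][j] = 1/outdeg(j) if page j links to page i, else 0."""
    n = len(adj)
    out_deg = [sum(row) for row in adj]
    return [[adj[j][i] / out_deg[j] if adj[j][i] else 0.0 for j in range(n)]
            for i in range(n)]


def is_symmetric(m, tol=1e-12):
    """True when m[i][j] and m[j][i] agree (within tol) for every pair i, j."""
    n = len(m)
    return all(abs(m[i][j] - m[j][i]) <= tol for i in range(n) for j in range(n))


def hits(adj, rounds=50):
    """Hub/authority iteration; each vector is rescaled so its largest entry is 1."""
    n = len(adj)
    hub = [1.0] * n
    auth = [1.0] * n
    for _ in range(rounds):
        # Authority of i: sum of the hubbiness of the pages j that link to i.
        auth = [sum(adj[j][i] * hub[j] for j in range(n)) for i in range(n)]
        top = max(auth)
        auth = [x / top for x in auth]
        # Hubbiness of i: sum of the authority of the pages j that i links to.
        hub = [sum(adj[i][j] * auth[j] for j in range(n)) for i in range(n)]
        top = max(hub)
        hub = [x / top for x in hub]
    return hub, auth


if __name__ == "__main__":
    A = adjacency(EDGES, N)
    M = transition_matrix(A)
    hub, auth = hits(A)
    print("link matrix symmetric:", is_symmetric(A))
    print("M symmetric on this graph:", is_symmetric(M))
    print("hubbiness:", [round(x, 6) for x in hub])
    print("authority:", [round(x, 6) for x in auth])
```

The graph deliberately contains a triangle so that it is not bipartite; on a bipartite graph the iteration can settle to vectors that depend on the starting point, which would muddy the comparison.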

This document was uploaded on 01/25/2012.
