General formula T RXY 1 Y 2 1 SY 1 Y 2 Z TR TS max VRY 1VSY 1 max VRY 2 VSY 2

# General formula t rxy 1 y 2 1 sy 1 y 2 z tr ts max

• Notes
• 106

This preview shows page 88 - 95 out of 106 pages.

General formula: T ( R(X,Y 1 ,Y 2 ) 1 S(Y 1 ,Y 2 ,Z) ) = T(R) × T(S) max ( V(R,Y 1 ),V(S,Y 1 ) ) × max ( V(R,Y 2 ),V(S,Y 2 ) ) 88 / 106
Result size estimation: R 1 S , joining on 2 attributes Example R(a,b) S(b,c) U(c,d) T(R) =1000 T(S) =2000 T(U) =5000 V(R,b) =20 V(S,b) =50 V(S,c) =100 V(U,c) =500 Find T(R 1 S 1 U) ? Computed using this ordering: R(a,b) 1 S(b,c) 1 U (c,d) = ( R(a,b) 1 U(c,d) ) 1 S(b,c) A join operation with no common attributes will degenerates into a Cartesian product Example: R(a,b) 1 U(c,d) R(a,b) × U(c,d) 89 / 106
Result size estimation: R 1 S , joining on 2 attributes Method 3: (ordering 3) R(a,b) 1 S(b,c) 1 U(c,d) = ( R(a,b) 1 U(c,d) ) 1 S(b,c) T ( R(a,b) 1 U(c,d) ) =T ( R(a,b) × U(c,d) ) =1000 × 5000=5,000,000 T ( R(a,b) 1 U(c,d) ) 1 S(b,c) = T ( R(a,b) 1 U(c,d) ) × T(S) max V ( R(a,b) 1 U(c,d),b ) , V(S,b) × max V ( R(a,b) 1 U(c,d),c ) , V(S,c) From the preservation of value sets assumption, we have: V ( R(a,b) 1 U(c,d),b ) = V(R,b) , V(R,b) =20 according to data V ( R(a,b) 1 U(c,d),c ) = V(U,c) , V(U,c) =500 according to data T(R 1 S 1 U) = 5 , 000 , 000 × 2 , 000 max(20 , 50) × max(500 , 100) =400,000 The 2 assumptions ( containment and preservation of value sets ) allows us to re-order the join-order without affecting the size of the result set estimation 90 / 106
Estimating Join Sizes: General Case of Natural Join The general case of a natural join, R 1 1 R 2 1 R 3 1 ... 1 R n : an attribute A appear in k of the n relations the probability for that all these k relations agreeing in attribute A is then 1 v 2 * v 3 *...*v k v 1 = min(V(R 1 ,A),V(R 2 ,A),V(R 3 ,A),...,V(R k ,A)) general formula for finding size of any join: find the maximum number of tuples using the product of the number of tuples in all relations T(R 1 ) * T(R 2 ) * T(R 3 ) *...* T(R n ) then, for each attribute A appearing in more than one relation, divide the above result by all, but the least V(R,A) 91 / 106
Estimating Join Sizes: General Case of Natural Join Example R(a,b,c) S(b,c,d) U(b,d,e) T(R) =10,000 T(S) =2,000 T(U) =5,000 V(R,a) =5,000 V(R,b) =1,000 V(S,b) =50 V(U,b) =100 V(R,c) =50 V(S,c) =1,000 V(S,d) =200 V(U,d) =100 V(U,e) =100 Find T(R 1 S 1 U) ? maximum number of tuples: T(R)*T(S)*T(U)=10000*2000*5000=100,000,000,000 for each attribute X appearing in more than one relation b appear in all relations, V(R,b)=1000 , V(S,b)=50 , V(U,b)=100 divide by 1000 * 100 c appear in all R and S , V(R,c)=50 , V(S,c)=1000 divide by 1000 d appear in all S and U , V(S,d)=200 , V(U,d)=100 divide by 200 T(R 1 S 1 U) = 100 , 000 , 000 , 000 (1000 * 100) * (1000) * (200) = 5 92 / 106
Estimating Join Sizes So far, we have only calculated the number of tuples, but the size of a join is given by sizeof(R 1 S) = T(R 1 S) * S(R 1 S) However, the size of the tuples from a join is dependent on which kind of join we perform, e.g., in a natural join, the join attributes only appear once in a theta-join, all attributes from all relations appear thus, before calculating the total size in number of bytes, we must find the correct size of each tuple 93 / 106
Result size estimation: Size of a Union: R S The number of tuples of a union ( ) is dependent of whether it is a set - or bag -version: Bag-based: the result is exactly the sum of the tuples of all the arguments: T(R bag S) = T(R)+T(S) Set-based: Range of the result set of R set S : max ( T(R),T(S) ) T(R set S) T(R)+T(S) max ( T(R),T(S) ) : R S or S R as bag-version if disjoint relations: T(R)+T(S) : R S= Recommended estimate for R set S usually somewhere between sum of both and the number of the larger relation may for example use: T(R set S) = max ( T(R),T(S) ) + 1 2 × min ( T(R),T(S) ) i.e.: maximum + 1 2 × (smaller size) 94 / 106

#### You've reached the end of your free preview.

Want to read all 106 pages?

• Fall '19
• Joseph Rosen
• Relational model, StarsIn, R 1c S

### What students are saying

• As a current student on this bumpy collegiate pathway, I stumbled upon Course Hero, where I can find study resources for nearly all my courses, get online help from tutors 24/7, and even share my old projects, papers, and lecture notes with other students.

Kiran Temple University Fox School of Business ‘17, Course Hero Intern

• I cannot even describe how much Course Hero helped me this summer. It’s truly become something I can always rely on and help me. In the end, I was not only able to survive summer classes, but I was able to thrive thanks to Course Hero.

Dana University of Pennsylvania ‘17, Course Hero Intern

• The ability to access any university’s resources through Course Hero proved invaluable in my case. I was behind on Tulane coursework and actually used UCLA’s materials to help me move forward and get everything together on time.

Jill Tulane University ‘16, Course Hero Intern