General formula T RXY 1 Y 2 1 SY 1 Y 2 Z TR TS max VRY 1VSY 1 max VRY 2 VSY 2

General formula t rxy 1 y 2 1 sy 1 y 2 z tr ts max

This preview shows page 88 - 95 out of 106 pages.

General formula: T ( R(X,Y 1 ,Y 2 ) 1 S(Y 1 ,Y 2 ,Z) ) = T(R) × T(S) max ( V(R,Y 1 ),V(S,Y 1 ) ) × max ( V(R,Y 2 ),V(S,Y 2 ) ) 88 / 106
Image of page 88
Result size estimation: R 1 S , joining on 2 attributes Example R(a,b) S(b,c) U(c,d) T(R) =1000 T(S) =2000 T(U) =5000 V(R,b) =20 V(S,b) =50 V(S,c) =100 V(U,c) =500 Find T(R 1 S 1 U) ? Computed using this ordering: R(a,b) 1 S(b,c) 1 U (c,d) = ( R(a,b) 1 U(c,d) ) 1 S(b,c) A join operation with no common attributes will degenerates into a Cartesian product Example: R(a,b) 1 U(c,d) R(a,b) × U(c,d) 89 / 106
Image of page 89
Result size estimation: R 1 S , joining on 2 attributes Method 3: (ordering 3) R(a,b) 1 S(b,c) 1 U(c,d) = ( R(a,b) 1 U(c,d) ) 1 S(b,c) T ( R(a,b) 1 U(c,d) ) =T ( R(a,b) × U(c,d) ) =1000 × 5000=5,000,000 T ( R(a,b) 1 U(c,d) ) 1 S(b,c) = T ( R(a,b) 1 U(c,d) ) × T(S) max V ( R(a,b) 1 U(c,d),b ) , V(S,b) × max V ( R(a,b) 1 U(c,d),c ) , V(S,c) From the preservation of value sets assumption, we have: V ( R(a,b) 1 U(c,d),b ) = V(R,b) , V(R,b) =20 according to data V ( R(a,b) 1 U(c,d),c ) = V(U,c) , V(U,c) =500 according to data T(R 1 S 1 U) = 5 , 000 , 000 × 2 , 000 max(20 , 50) × max(500 , 100) =400,000 The 2 assumptions ( containment and preservation of value sets ) allows us to re-order the join-order without affecting the size of the result set estimation 90 / 106
Image of page 90
Estimating Join Sizes: General Case of Natural Join The general case of a natural join, R 1 1 R 2 1 R 3 1 ... 1 R n : an attribute A appear in k of the n relations the probability for that all these k relations agreeing in attribute A is then 1 v 2 * v 3 *...*v k v 1 = min(V(R 1 ,A),V(R 2 ,A),V(R 3 ,A),...,V(R k ,A)) general formula for finding size of any join: find the maximum number of tuples using the product of the number of tuples in all relations T(R 1 ) * T(R 2 ) * T(R 3 ) *...* T(R n ) then, for each attribute A appearing in more than one relation, divide the above result by all, but the least V(R,A) 91 / 106
Image of page 91
Estimating Join Sizes: General Case of Natural Join Example R(a,b,c) S(b,c,d) U(b,d,e) T(R) =10,000 T(S) =2,000 T(U) =5,000 V(R,a) =5,000 V(R,b) =1,000 V(S,b) =50 V(U,b) =100 V(R,c) =50 V(S,c) =1,000 V(S,d) =200 V(U,d) =100 V(U,e) =100 Find T(R 1 S 1 U) ? maximum number of tuples: T(R)*T(S)*T(U)=10000*2000*5000=100,000,000,000 for each attribute X appearing in more than one relation b appear in all relations, V(R,b)=1000 , V(S,b)=50 , V(U,b)=100 divide by 1000 * 100 c appear in all R and S , V(R,c)=50 , V(S,c)=1000 divide by 1000 d appear in all S and U , V(S,d)=200 , V(U,d)=100 divide by 200 T(R 1 S 1 U) = 100 , 000 , 000 , 000 (1000 * 100) * (1000) * (200) = 5 92 / 106
Image of page 92
Estimating Join Sizes So far, we have only calculated the number of tuples, but the size of a join is given by sizeof(R 1 S) = T(R 1 S) * S(R 1 S) However, the size of the tuples from a join is dependent on which kind of join we perform, e.g., in a natural join, the join attributes only appear once in a theta-join, all attributes from all relations appear thus, before calculating the total size in number of bytes, we must find the correct size of each tuple 93 / 106
Image of page 93
Result size estimation: Size of a Union: R S The number of tuples of a union ( ) is dependent of whether it is a set - or bag -version: Bag-based: the result is exactly the sum of the tuples of all the arguments: T(R bag S) = T(R)+T(S) Set-based: Range of the result set of R set S : max ( T(R),T(S) ) T(R set S) T(R)+T(S) max ( T(R),T(S) ) : R S or S R as bag-version if disjoint relations: T(R)+T(S) : R S= Recommended estimate for R set S usually somewhere between sum of both and the number of the larger relation may for example use: T(R set S) = max ( T(R),T(S) ) + 1 2 × min ( T(R),T(S) ) i.e.: maximum + 1 2 × (smaller size) 94 / 106
Image of page 94
Image of page 95

You've reached the end of your free preview.

Want to read all 106 pages?

  • Fall '19
  • Joseph Rosen
  • Relational model, StarsIn, R 1c S

What students are saying

  • Left Quote Icon

    As a current student on this bumpy collegiate pathway, I stumbled upon Course Hero, where I can find study resources for nearly all my courses, get online help from tutors 24/7, and even share my old projects, papers, and lecture notes with other students.

    Student Picture

    Kiran Temple University Fox School of Business ‘17, Course Hero Intern

  • Left Quote Icon

    I cannot even describe how much Course Hero helped me this summer. It’s truly become something I can always rely on and help me. In the end, I was not only able to survive summer classes, but I was able to thrive thanks to Course Hero.

    Student Picture

    Dana University of Pennsylvania ‘17, Course Hero Intern

  • Left Quote Icon

    The ability to access any university’s resources through Course Hero proved invaluable in my case. I was behind on Tulane coursework and actually used UCLA’s materials to help me move forward and get everything together on time.

    Student Picture

    Jill Tulane University ‘16, Course Hero Intern

Stuck? We have tutors online 24/7 who can help you get unstuck.
A+ icon
Ask Expert Tutors You can ask You can ask ( soon) You can ask (will expire )
Answers in as fast as 15 minutes
A+ icon
Ask Expert Tutors