Result size estimation Size of an Intersection R S Range of the result set ofR

Result size estimation size of an intersection r s

This preview shows page 95 - 100 out of 106 pages.

Result size estimation: Size of an Intersection: R S Range of the result set of R S 0 T(R S) min ( T(R),T(S) ) 0 : R S = - if disjoint relations min ( T(R),T(S) ) : R S or S R - if one relation contains only a subset of the other Recommended estimate for R S usually somewhere in-between may for example use average: T(R S) = 1 2 × min ( T(R),T(S) ) 95 / 106
Image of page 95
Result size estimation: Size of a Difference R - S Range of the result set of R - S : max { 0,T(R) - T(S) } T(R - S) T(R) T(R) - T(S) : all tuples in S also is in R T(R) : R S = - disjoint relations Recommended estimate for R - S usually somewhere in-between may for example use: T(R - S) = T(R) - 1 2 × T(S) Note: if T(R) - 1 2 × T(S) 0 , then T(R - S) = 0 (estimate) 96 / 106
Image of page 96
Result size estimation: δ (R,A) The number of tuples of a duplicate elimination ( δ ) is the same as the number of distinct tuples Range of the result set of δ (R,A) 1 T ( δ (R,A) ) T(R) 1 : all tuples have same attribute value T(R) : all tuples have different attribute values Recommended estimate for δ (R,a 1 ,a 2 ,...,a n ) : given V(R,a i ) for all n attributes, the maximum number of different tuples are T ( δ (R,a 1 ,a 2 ,...,a n ) ) = V(R,a 1 ) * V(R,a 2 ) *...* V(R,a n ) let estimated number of tuples be the smaller of this number and the number of tuples in the relation Otherwise, if no statistics available, then we use this estimate: T ( δ (R,a 1 ,...,a n ) ) = min ( ( 1 2 ) n × T(R) , V(R,a 1 )*...*V(R,a n ) ) Reason - V(R,a 1 )*...* V(R,a n ) : The upper bound limit on the number of distinct tuples that could exist - ( 1 2 ) n * T ( R ): The size can be as small as 1 or as big as T(R) 97 / 106
Image of page 97
Result size estimation: γ L (R) The number of tuples of a grouping ( γ ) is the same as the number of groups Range of the result set of γ L (R) 1 T ( γ L (R) ) ) T(R) 1 : all tuples have same attribute value T(R) : all tuples have different attribute values for attribute L Recommended estimate for T ( γ L (R) ) ) If the database maintains statistics on the attribute values: T ( γ L (R) ) = V(R,L) If no statistics available, then we use this estimate: T ( γ L (R) ) = ( 1 2 ) n × T(R) , where n = number of attributes in the attribute list L 98 / 106
Image of page 98
Histogram Modern DBMS often maintain a number of statistical information to help the Query Optimizer decide on the optimal query plan Better selectivity estimates are possible if we have more detailed statistics A DBMS typically collects histograms that detail the distribution of values. A histogram can be of two types: equi-width histogram : All buckets contain the same number of values Easy, but inaccurate The range of values is divided into equal-sized subranges. equi-depth histograms : (used by most DBMS) All buckets contain the same number of tuples Better accuracy, need to sort data to compute The sub ranges are chosen in such a way that the number of tuples within each sub range is equal. Such histograms are only available for base relations, however, not for sub-results. If a histogram is available for the attribute A , the number of tuples can be estimated with more accuracy.
Image of page 99
Image of page 100

You've reached the end of your free preview.

Want to read all 106 pages?

  • Fall '19
  • Joseph Rosen
  • Relational model, StarsIn, R 1c S

What students are saying

  • Left Quote Icon

    As a current student on this bumpy collegiate pathway, I stumbled upon Course Hero, where I can find study resources for nearly all my courses, get online help from tutors 24/7, and even share my old projects, papers, and lecture notes with other students.

    Student Picture

    Kiran Temple University Fox School of Business ‘17, Course Hero Intern

  • Left Quote Icon

    I cannot even describe how much Course Hero helped me this summer. It’s truly become something I can always rely on and help me. In the end, I was not only able to survive summer classes, but I was able to thrive thanks to Course Hero.

    Student Picture

    Dana University of Pennsylvania ‘17, Course Hero Intern

  • Left Quote Icon

    The ability to access any university’s resources through Course Hero proved invaluable in my case. I was behind on Tulane coursework and actually used UCLA’s materials to help me move forward and get everything together on time.

    Student Picture

    Jill Tulane University ‘16, Course Hero Intern

Ask Expert Tutors You can ask You can ask ( soon) You can ask (will expire )
Answers in as fast as 15 minutes