Result size estimation Size of an Intersection R S Range of the result set ofR

# Result size estimation size of an intersection r s

• Notes
• 106

This preview shows page 95 - 100 out of 106 pages.

Result size estimation: Size of an Intersection: R S Range of the result set of R S 0 T(R S) min ( T(R),T(S) ) 0 : R S = - if disjoint relations min ( T(R),T(S) ) : R S or S R - if one relation contains only a subset of the other Recommended estimate for R S usually somewhere in-between may for example use average: T(R S) = 1 2 × min ( T(R),T(S) ) 95 / 106
Result size estimation: Size of a Difference R - S Range of the result set of R - S : max { 0,T(R) - T(S) } T(R - S) T(R) T(R) - T(S) : all tuples in S also is in R T(R) : R S = - disjoint relations Recommended estimate for R - S usually somewhere in-between may for example use: T(R - S) = T(R) - 1 2 × T(S) Note: if T(R) - 1 2 × T(S) 0 , then T(R - S) = 0 (estimate) 96 / 106
Result size estimation: δ (R,A) The number of tuples of a duplicate elimination ( δ ) is the same as the number of distinct tuples Range of the result set of δ (R,A) 1 T ( δ (R,A) ) T(R) 1 : all tuples have same attribute value T(R) : all tuples have different attribute values Recommended estimate for δ (R,a 1 ,a 2 ,...,a n ) : given V(R,a i ) for all n attributes, the maximum number of different tuples are T ( δ (R,a 1 ,a 2 ,...,a n ) ) = V(R,a 1 ) * V(R,a 2 ) *...* V(R,a n ) let estimated number of tuples be the smaller of this number and the number of tuples in the relation Otherwise, if no statistics available, then we use this estimate: T ( δ (R,a 1 ,...,a n ) ) = min ( ( 1 2 ) n × T(R) , V(R,a 1 )*...*V(R,a n ) ) Reason - V(R,a 1 )*...* V(R,a n ) : The upper bound limit on the number of distinct tuples that could exist - ( 1 2 ) n * T ( R ): The size can be as small as 1 or as big as T(R) 97 / 106
Result size estimation: γ L (R) The number of tuples of a grouping ( γ ) is the same as the number of groups Range of the result set of γ L (R) 1 T ( γ L (R) ) ) T(R) 1 : all tuples have same attribute value T(R) : all tuples have different attribute values for attribute L Recommended estimate for T ( γ L (R) ) ) If the database maintains statistics on the attribute values: T ( γ L (R) ) = V(R,L) If no statistics available, then we use this estimate: T ( γ L (R) ) = ( 1 2 ) n × T(R) , where n = number of attributes in the attribute list L 98 / 106
Histogram Modern DBMS often maintain a number of statistical information to help the Query Optimizer decide on the optimal query plan Better selectivity estimates are possible if we have more detailed statistics A DBMS typically collects histograms that detail the distribution of values. A histogram can be of two types: equi-width histogram : All buckets contain the same number of values Easy, but inaccurate The range of values is divided into equal-sized subranges. equi-depth histograms : (used by most DBMS) All buckets contain the same number of tuples Better accuracy, need to sort data to compute The sub ranges are chosen in such a way that the number of tuples within each sub range is equal. Such histograms are only available for base relations, however, not for sub-results. If a histogram is available for the attribute A , the number of tuples can be estimated with more accuracy.

#### You've reached the end of your free preview.

Want to read all 106 pages?

• Fall '19
• Joseph Rosen
• Relational model, StarsIn, R 1c S

### What students are saying

• As a current student on this bumpy collegiate pathway, I stumbled upon Course Hero, where I can find study resources for nearly all my courses, get online help from tutors 24/7, and even share my old projects, papers, and lecture notes with other students.

Kiran Temple University Fox School of Business ‘17, Course Hero Intern

• I cannot even describe how much Course Hero helped me this summer. It’s truly become something I can always rely on and help me. In the end, I was not only able to survive summer classes, but I was able to thrive thanks to Course Hero.

Dana University of Pennsylvania ‘17, Course Hero Intern

• The ability to access any university’s resources through Course Hero proved invaluable in my case. I was behind on Tulane coursework and actually used UCLA’s materials to help me move forward and get everything together on time.

Jill Tulane University ‘16, Course Hero Intern