ch9-2 - Chapter 9 Optimization of Distributed Queries 9.3...

Chapter 9 Optimization of Distributed Queries

9.3 JOIN ORDER IN FRAGMENT QUERY

Join ordering is important in a centralized DB and even more important in a distributed DB.

9.3.1 Join Ordering

Understand the difficulty of join ordering without using semijoin.

Assumptions necessary to state the main issues:
- Fragments and relations are indistinguishable
- Local processing cost is omitted
- Relations are transferred in one-set-at-a-time mode
- Cost to transfer data to produce the final result at the result site is omitted

Single join $R \bowtie S$: transfer the smaller relation.
- R -> S if size(R) < size(S)
- S -> R if size(S) < size(R)
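The single-join rule above can be sketched in a few lines of Python. This is only an illustration under the stated assumptions (transfer cost is the only cost, and size is a single number); the function name `ship_direction` and the handling of equal sizes are assumptions made here, not part of the notes.

```python
def ship_direction(size_r: int, size_s: int) -> str:
    """Pick which operand to ship for a single join of R and S.

    Only transfer cost is considered, so the smaller relation moves
    to the site of the larger one.
    """
    if size_r < size_s:
        return "R -> S"   # ship R to the site of S
    if size_s < size_r:
        return "S -> R"   # ship S to the site of R
    # Equal sizes: the notes do not cover this case; either choice
    # costs the same under the assumptions (hypothetical tie handling).
    return "either"
```

For example, `ship_direction(100, 500)` selects shipping R, since R is the smaller operand.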
Join of multirelations

The size of intermediate relations must be considered. The number of possible ways to process a query grows rapidly with the number of relations involved.

Example: $E \bowtie_{ENO} G \bowtie_{JNO} J$, with E at Site 1, G at Site 2, and J at Site 3.

Strategies:
- Choose two relations to join and send the result to the 3rd site: 6 ways.
- Choose two relations and send them to the 3rd site to join: 3 ways.
- ……

Shortcoming of the method: it transfers entire relations, which may contain tuples that do not contribute to the join.

9.3.2 Semijoin Based Algorithms
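Semijoin reduces the size of the operand relation to be transferred.

$R \bowtie_A S \Leftrightarrow (R \ltimes_A S) \bowtie_A S$
$\Leftrightarrow R \bowtie_A (S \ltimes_A R)$
$\Leftrightarrow (R \ltimes_A S) \bowtie_A (S \ltimes_A R)$

A semijoin is beneficial if the cost to produce it and send it to the other site is less than the cost of sending the whole relation.

Four steps in computing $R \bowtie_A S$ with a semijoin:
1. Compute $\pi_A(S)$ at the site of S and send it to the site of R.
2. At the site of R, compute $R \ltimes_A S$.
3. Send $R \ltimes_A S$ to the site of S.
4. At the site of S, compute $(R \ltimes_A S) \bowtie_A S$.

The semijoin method is beneficial if

size($\pi_A(S)$) + size($R \ltimes_A S$) < size(R)

The join method is better if size($R \ltimes_A S$) is close to size(R).

In general, semijoin can be used to reduce the size of operand relations in multiple join queries.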
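The four steps and the benefit condition can be traced on toy data. The sketch below is an illustration only: relations are lists of dictionaries, "size" is approximated by the number of values or tuples shipped (a real optimizer would count bytes), and the function names `semijoin_strategy` and `join_strategy` are assumptions introduced here.

```python
from typing import Dict, List, Tuple

Row = Dict[str, int]


def join_strategy(r: List[Row], s: List[Row], attr: str) -> Tuple[List[Row], int]:
    """Ship all of R to the site of S, then join there."""
    transferred = len(r)  # the whole of R crosses the network
    result = [{**x, **y} for x in r for y in s if x[attr] == y[attr]]
    return result, transferred


def semijoin_strategy(r: List[Row], s: List[Row], attr: str) -> Tuple[List[Row], int]:
    """Compute the join of R and S on attr via the four semijoin steps."""
    # Step 1: project S on the join attribute and ship it to the site of R.
    a_values = {row[attr] for row in s}
    # Step 2: at the site of R, keep only the tuples of R that can join.
    r_reduced = [row for row in r if row[attr] in a_values]
    # Step 3: ship the reduced R to the site of S.
    transferred = len(a_values) + len(r_reduced)
    # Step 4: at the site of S, join the reduced R with S.
    result = [{**x, **y} for x in r_reduced for y in s if x[attr] == y[attr]]
    return result, transferred


# Toy data (made up for illustration): only 2 of the 10 tuples of R join.
R = [{"A": i, "B": i * 10} for i in range(10)]
S = [{"A": 1, "C": 7}, {"A": 3, "C": 9}]

semi_result, semi_cost = semijoin_strategy(R, S, "A")
join_result, join_cost = join_strategy(R, S, "A")
```

Here the semijoin method ships 2 join-attribute values plus 2 tuples, against 10 tuples for the plain join, and both produce the same result. If nearly every tuple of R found a match in S, the reduced R would be about as large as R and the extra transfer of the projection would make the plain join cheaper.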
Example: the join of E, G, and J, where E and G join on ENO and G and J join on JNO.

Semijoin can be used more than once to reduce operand relations:
$E \ltimes (G \ltimes J)$ -- to reduce E
$J \ltimes (G \ltimes E)$ -- to reduce J

A sequence of semijoins is called a semijoin program. For a given relation, the number of semijoin programs is exponential in the number of relations, but only one is optimal; it is called the full reducer. The full reducer is difficult to find, so most systems just use a single semijoin to reduce the size of operand relations.

9.4 DISTRIBUTED QUERY OPTIMIZATION ALGORITHMS

The algorithms vary from system to system.
Four distributed query optimization algorithms: distributed INGRES, System R*, SDD-1, and AHY.

Features

This note was uploaded on 12/23/2009 for the course DBST 663 taught by Professor Tba during the Spring '09 term at MD University College.
