• 2 partitioned hash joins
• Cost: 3B(R) + 3B(S) + 4k + 3B(U) = 75,000 + 4k
Completing Physical Query Plan (8 of 13) Query Optimization (49 of 64)
Each of 3B(R), 3B(S), and 3B(U) arises because the relation is read once, written out once as hash buckets, and read back into memory for the join. The 4k term covers the k-block intermediate result: it is written to disk after the first join, read back to be hashed, written out again as hash buckets, and finally read back into memory for the second join. 3B(R) + 3B(S) cannot be reduced; we will try to reduce 4k and 3B(U) with a new method that depends on the value of k.
Example
• Smarter:
• Step 1: hash R on x into 100 buckets, each of 50 blocks; to disk
• Step 2: hash S on x into 100 buckets; to disk
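• Step 3: read each Ri into memory (50 buffers) and join it with Si (1 buffer); hash the result on y into 50 buckets (50 buffers). Here we pipeline: the join result is hashed on y as it is produced, without first being written to disk.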
• Cost so far: 3B(R) + 3B(S)
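Steps 1 to 3 can be sketched at the tuple level as follows. This is a minimal in-memory illustration, not the slide's block-level algorithm: the function names `partition` and `join_and_rehash` are hypothetical, Python lists stand in for disk buckets, and the sample relations are made up for the usage lines.

```python
from collections import defaultdict

def partition(tuples, key_index, num_buckets):
    """Hash-partition tuples into num_buckets lists on the attribute at key_index."""
    buckets = [[] for _ in range(num_buckets)]
    for t in tuples:
        buckets[hash(t[key_index]) % num_buckets].append(t)
    return buckets

def join_and_rehash(r_tuples, s_tuples, x_buckets=100, y_buckets=50):
    """Partition R(w,x) and S(x,y) on x, join bucket by bucket, and hash
    every result tuple (w,x,y) on y as soon as it is produced."""
    r_parts = partition(r_tuples, 1, x_buckets)  # Step 1: hash R on x
    s_parts = partition(s_tuples, 0, x_buckets)  # Step 2: hash S on x
    out = [[] for _ in range(y_buckets)]
    for r_i, s_i in zip(r_parts, s_parts):       # Step 3: one bucket pair at a time
        table = defaultdict(list)                # in-memory hash table for R_i
        for w, x in r_i:
            table[x].append(w)
        for x, y in s_i:                         # stream S_i past the table
            for w in table.get(x, ()):
                # Pipelining: the joined tuple goes straight into its y bucket.
                out[hash(y) % y_buckets].append((w, x, y))
    return out

# Made-up sample data: every R tuple matches exactly one S tuple.
R = [(w, w % 7) for w in range(20)]
S = [(x, x * 10) for x in range(7)]
buckets = join_and_rehash(R, S)
result = sorted(t for b in buckets for t in b)
```

Because both relations are partitioned with the same hash function on x, matching tuples always land in the same bucket pair, so joining Ri with Si alone is sufficient.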
[Figure: query plan. R(w,x), 5,000 blocks, is joined with S(x,y), 10,000 blocks; the k-block result is joined with U(y,z), 10,000 blocks.]
Completing Physical Query Plan (9 of 13) Query Optimization (50 of 64)
If k is really small, its contribution can be neglected. For example, if k is at most 50 blocks, the entire intermediate result can be hashed and kept in memory. It then never has to be written out to disk, and U is read only once.
Example
• Continuing:
• How large are the 50 buckets on y? Answer: k/50 blocks each.
• If k <= 50 then keep all 50 buckets from Step 3 in memory, then:
• Step 4: read U from disk, hash on y and join with the buckets in memory
• Total cost: 3B(R) + 3B(S) + B(U) = 55,000
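As a quick check of the arithmetic, the two cost formulas can be evaluated side by side. This is only a sketch: the function names are hypothetical, and the block counts B(R) = 5,000, B(S) = 10,000, B(U) = 10,000 are read off the slide's figure.

```python
def baseline_cost(b_r, b_s, b_u, k):
    """Two partitioned hash joins with the intermediate result going
    through disk: 3B(R) + 3B(S) + 4k + 3B(U)."""
    return 3 * b_r + 3 * b_s + 4 * k + 3 * b_u

def small_k_cost(b_r, b_s, b_u, k, y_buckets=50):
    """Pipelined plan when the k-block result stays in memory:
    3B(R) + 3B(S) + B(U). Only valid for k <= y_buckets."""
    if k > y_buckets:
        raise ValueError("this formula only covers k <= y_buckets")
    return 3 * b_r + 3 * b_s + b_u

B_R, B_S, B_U = 5_000, 10_000, 10_000  # block counts from the figure
k = 50                                 # largest k covered by this case

baseline = baseline_cost(B_R, B_S, B_U, k)  # 75,000 + 4k
pipelined = small_k_cost(B_R, B_S, B_U, k)  # 55,000
print(baseline, pipelined, baseline - pipelined)
```

The difference between the two plans is 4k + 2B(U): the intermediate result never touches disk, and U is read once instead of being read, partitioned to disk, and read again.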
Completing Physical Query Plan (10 of 13) Query Optimization (51 of 64)
Example
• Continuing:
• If 50 < k <= 5000 then sen...
 Fall '08