This preview shows pages 1–7. Sign up to view the full content.
This preview has intentionally blurred sections. Sign up to view the full version.
View Full DocumentThis preview has intentionally blurred sections. Sign up to view the full version.
View Full DocumentThis preview has intentionally blurred sections. Sign up to view the full version.
View Full Document
Unformatted text preview: Relational Query Optimization Overview of Query Optimization • Plan : Tree of R.A. ops, with choice of alg for each op. ▫ Each operator typically implemented using a `pull’ interface: when an operator is `pulled’ for the next output tuples, it `pulls’ on its inputs and computes them. • Two main issues: ▫ For a given query, what plans are considered?  Algorithm to search plan space for cheapest (estimated) plan. ▫ How is the cost of a plan estimated? • Ideally: Want to find best plan. Practically: Avoid worst plans! • We will study the System R approach. Highlights of System R Optimizer • Impact: ▫ Most widely used currently; works well for < 10 joins. • Cost estimation: Approximate art at best. ▫ Statistics, maintained in system catalogs, used to estimate cost of operations and result sizes. ▫ Considers combination of CPU and I/O costs. • Plan Space: Too large, must be pruned. ▫ Only the space of leftdeep plans is considered.  Leftdeep plans allow output of each operator to be pipelined into the next operator without storing it in a temporary relation. ▫ Cartesian products avoided. Schema for Examples • Similar to old schema; rname added for variations. • Reserves: ▫ Each tuple is 40 bytes long, 100 tuples per page, 1000 pages. • Sailors: ▫ Each tuple is 50 bytes long, 80 tuples per page, 500 pages. Sailors ( sid : integer , sname : string, rating : integer, age : real) Reserves ( sid : integer, bid : integer, day : dates , rname : string) Motivating Example • Cost: 1000+1000*500 I/Os • By no means the worst plan! • Misses several opportunities: selections could have been `pushed’ earlier, no use is made of any available indexes, etc. • Goal of optimization: To find more efficient plans that compute the same answer. SELECT S.sname FROM Reserves R, Sailors S WHERE R.sid=S.sid AND R.bid=100 AND S.rating>5 Reserves Sailors sid=sid bid=100 rating > 5 sname Reserves Sailors sid=sid bid=100 rating > 5 sname (Simple Nested Loops) (Onthefly) (Onthefly) RA Tree: Plan: Alternative Plans 1 (No Indexes) • Main difference: push selects. • With 5 buffers, cost of plan: ▫ Scan Reserves (1000) + write temp T1 (10 pages, if we have 100 boats, uniform distribution). ▫ Scan Sailors (500) + write temp T2 (250 pages, if we have 10 ratings)....
View
Full
Document
This note was uploaded on 02/24/2011 for the course CS 442 taught by Professor Mlittman during the Fall '08 term at Rutgers.
 Fall '08
 mlittman

Click to edit the document details