442_QueryOpt

# 442_QueryOpt - Relational Query Optimization Overview of...

This preview shows pages 1–7. Sign up to view the full content.

This preview has intentionally blurred sections. Sign up to view the full version.

View Full Document

This preview has intentionally blurred sections. Sign up to view the full version.

View Full Document

This preview has intentionally blurred sections. Sign up to view the full version.

View Full Document
This is the end of the preview. Sign up to access the rest of the document.

Unformatted text preview: Relational Query Optimization Overview of Query Optimization • Plan : Tree of R.A. ops, with choice of alg for each op. ▫ Each operator typically implemented using a `pull’ interface: when an operator is `pulled’ for the next output tuples, it `pulls’ on its inputs and computes them. • Two main issues: ▫ For a given query, what plans are considered? &#2; Algorithm to search plan space for cheapest (estimated) plan. ▫ How is the cost of a plan estimated? • Ideally: Want to find best plan. Practically: Avoid worst plans! • We will study the System R approach. Highlights of System R Optimizer • Impact: ▫ Most widely used currently; works well for < 10 joins. • Cost estimation: Approximate art at best. ▫ Statistics, maintained in system catalogs, used to estimate cost of operations and result sizes. ▫ Considers combination of CPU and I/O costs. • Plan Space: Too large, must be pruned. ▫ Only the space of left-deep plans is considered. &#2; Left-deep plans allow output of each operator to be pipelined into the next operator without storing it in a temporary relation. ▫ Cartesian products avoided. Schema for Examples • Similar to old schema; rname added for variations. • Reserves: ▫ Each tuple is 40 bytes long, 100 tuples per page, 1000 pages. • Sailors: ▫ Each tuple is 50 bytes long, 80 tuples per page, 500 pages. Sailors ( sid : integer , sname : string, rating : integer, age : real) Reserves ( sid : integer, bid : integer, day : dates , rname : string) Motivating Example • Cost: 1000+1000*500 I/Os • By no means the worst plan! • Misses several opportunities: selections could have been `pushed’ earlier, no use is made of any available indexes, etc. • Goal of optimization: To find more efficient plans that compute the same answer. SELECT S.sname FROM Reserves R, Sailors S WHERE R.sid=S.sid AND R.bid=100 AND S.rating>5 Reserves Sailors sid=sid bid=100 rating > 5 sname Reserves Sailors sid=sid bid=100 rating > 5 sname (Simple Nested Loops) (On-the-fly) (On-the-fly) RA Tree: Plan: Alternative Plans 1 (No Indexes) • Main difference: push selects. • With 5 buffers, cost of plan: ▫ Scan Reserves (1000) + write temp T1 (10 pages, if we have 100 boats, uniform distribution). ▫ Scan Sailors (500) + write temp T2 (250 pages, if we have 10 ratings)....
View Full Document

## This note was uploaded on 02/24/2011 for the course CS 442 taught by Professor Mlittman during the Fall '08 term at Rutgers.

### Page1 / 26

442_QueryOpt - Relational Query Optimization Overview of...

This preview shows document pages 1 - 7. Sign up to view the full document.

View Full Document
Ask a homework question - tutors are online