cs411-08-queryprocessing-2

cs411-08-queryprocessing-2 - 1 CS411 Database Systems 8:...

Info iconThis preview shows pages 1–11. Sign up to view the full content.

View Full Document Right Arrow Icon

Info iconThis preview has intentionally blurred sections. Sign up to view the full version.

View Full DocumentRight Arrow Icon

Info iconThis preview has intentionally blurred sections. Sign up to view the full version.

View Full DocumentRight Arrow Icon

Info iconThis preview has intentionally blurred sections. Sign up to view the full version.

View Full DocumentRight Arrow Icon

Info iconThis preview has intentionally blurred sections. Sign up to view the full version.

View Full DocumentRight Arrow Icon

Info iconThis preview has intentionally blurred sections. Sign up to view the full version.

View Full DocumentRight Arrow Icon
This is the end of the preview. Sign up to access the rest of the document.

Unformatted text preview: 1 CS411 Database Systems 8: Query Processing 2 2 Why Do We Learn This? How to implement (efficiently) SQL queries in a database system? As a DBA, how to tune you database system in order to have better performance? Must-know knowledge if you want to get a job in Oracle, Microsoft SQL-Server branch, IBM DB2 branch, Google, 2 3 Outline Logical/physical operators Cost parameters and sorting One-pass algorithms Nested-loop joins Two-pass algorithms 4 Query Execution Query compiler Execution engine Index/record mgr. Buffer manager Storage manager storage User/ Application Query or update Query execution plan Record, index requests Page commands Read/write pages 5 Logical v.s. Physical Operators Logical operators what they do e.g., union, selection, project, join , grouping Physical operators how they do it e.g., nested loop join, sort-merge join, hash join, index join 6 Query Execution Plans Purchase Person Buyer=name City=urbana phone>5430000 buyer (Simple Nested Loops) SELECT Q.name FROM Purchase P, Person Q WHERE P.buyer=Q.name AND P.city=urbana AND Q.phone > 5430000 Query Plan: logical tree implementation choice at every node scheduling of operations. (Table scan) (Index scan) Some operators are from relational algebra, and others (e.g., scan, group) are not. 7 How do We Combine Operations? The iterator model. Each operation is implemented by 3 functions: Open: sets up the data structures and performs initializations GetNext: returns the the next tuple of the result. Close: ends the operations. Cleans up the data structures. Enables pipelining! 8 Cost Parameters Cost parameters M = number of blocks that fit in main memory B(R) = number of blocks holding R T(R) = number of tuples in R V(R,a) = number of distinct values of the attribute a Estimating the cost: Important in optimization (cost-based optimization: next lecture) Compute I/O cost only ( memory disk ) access to disk drives is much slower than access to RAM! We compute the cost to read the tables We dont compute the cost to write the result (because pipelining) 9 Review: Main-Memory Merge Sort 10 Sorting 2 pass multi-way merge sort (2PMMS) Step 1: Read M blocks at a time, sort, write Result: B/M runs of length M on disk (the last run may be shorter) Step 2: Merge M-1 at a time, write to disk (M-1): input buffer...
View Full Document

Page1 / 44

cs411-08-queryprocessing-2 - 1 CS411 Database Systems 8:...

This preview shows document pages 1 - 11. Sign up to view the full document.

View Full Document Right Arrow Icon
Ask a homework question - tutors are online