CS411 - Query Processing 2 Final Scribe



[…] where the blocks are
• Index scan: if we have an index to find the blocks
• The table is unclustered (e.g. its records are placed on blocks with other tables)
  • May need one read for each record

If a table is unclustered, every record may be placed in a different block, so in the worst case reading the table costs T(R) block reads, where T(R) is the number of tuples in R. The professor didn’t really touch on this one.

Cost of the Scan Operator
• Clustered relation:
  • Table scan: B(R); to sort: 3B(R)
  • Index scan: B(R); to sort: B(R) or 3B(R)
• Unclustered relation:
  • T(R); to sort: T(R) + 2B(R)

One-pass Algorithms

A pass is a “phase” of processing. Imagine it as one pass through the data.

Selection σ(R), projection Π(R)
• Both are tuple-at-a-time algorithms
• Cost: B(R)
• Data flow: input buffer → unary operator → output buffer

In each pass, a unary operator such as a selection or a projection goes through the data in the input buffer, processes one tuple at a time, and writes its result to the output buffer.

Duplicate elimination δ(R)
• Need to keep a dictionary in memory:
  • balanced search tree
  • hash table
  • etc.
• Cost: B(R)
• Assumption: B(δ(R)) <= M

Duplicate elimination uses the delta operator δ. The SQL keyword DISTINCT is a good example: the query SELECT DISTINCT names FROM users returns all the distinct user names. To do this in one pass, however, we need to maintain a dictionary in main memory. When a tuple comes in, we look it up in the dictionary. If the tuple is not there, it is output and added to the data structure. If it is already there, it is a duplicate and is skipped. The assumption B(δ(R)) <= M says that the distinct tuples must fit in the M blocks of available memory. Other operators, such as grouping and union, follow the same pattern.

Grouping: γcity, sum(price) (R)
• Need to keep a dictionary in memory
• Also store the sum(price) for each city...
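The one-pass operators described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the course's implementation: a relation is modeled as a list of blocks, each block is a list of tuples, and the names scan, select, distinct and group_sum, as well as the sample data, are hypothetical. The in-memory dictionary is a Python set for δ and a Python dict for γ, so the sketch only works if the distinct tuples (or the groups) fit in memory.

```python
from typing import Callable, Dict, Hashable, Iterable, Iterator, List

# Assumption: a block is modeled as a list of tuples, a relation as a list of blocks.
Block = List[tuple]


def scan(blocks: Iterable[Block]) -> Iterator[tuple]:
    """Read the relation block by block and yield one tuple at a time."""
    for block in blocks:          # one read per block, B(R) reads in total
        for t in block:
            yield t


def select(tuples: Iterable[tuple], predicate: Callable[[tuple], bool]) -> Iterator[tuple]:
    """Selection: tuple-at-a-time, no state kept between tuples."""
    for t in tuples:
        if predicate(t):
            yield t


def distinct(tuples: Iterable[tuple]) -> Iterator[tuple]:
    """Duplicate elimination: output a tuple only the first time it is seen."""
    seen = set()                  # in-memory dictionary (a hash table here)
    for t in tuples:
        if t not in seen:         # not seen before: output it and remember it
            seen.add(t)
            yield t
        # otherwise it is a duplicate and is skipped


def group_sum(tuples: Iterable[tuple], key_index: int, value_index: int) -> Dict[Hashable, float]:
    """Grouping with SUM: keep one running sum per group key in memory."""
    sums: Dict[Hashable, float] = {}
    for t in tuples:
        key = t[key_index]
        sums[key] = sums.get(key, 0) + t[value_index]
    return sums


if __name__ == "__main__":
    # Hypothetical sample relation R(city, price), stored in two blocks.
    r = [
        [("Urbana", 10), ("Chicago", 5)],
        [("Urbana", 10), ("Urbana", 7)],
    ]
    print(list(distinct(scan(r))))   # [('Urbana', 10), ('Chicago', 5), ('Urbana', 7)]
    print(group_sum(scan(r), 0, 1))  # {'Urbana': 27, 'Chicago': 5}
```

Each operator reads every block exactly once, which matches the B(R) cost given above for a clustered relation. Selection keeps no state, while duplicate elimination and grouping differ only in what they store per dictionary entry: the tuple itself, or a running aggregate per group.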