lec05_query_processing_2up.pdf

Do if t has been seen before keep going 3 t i getnext

This preview shows page 16 - 20 out of 69 pages.

do % if t has been seen before, keep going 3: t i. GetNext () 4: end while 5: if t ̸ = <EOF> then 6: add t to s % this avoids returning duplicates! 7: return t 8: else 9: return <EOF> 10: end if 29 Implementation considerations Again, the previous slides are meant to give a gist of what an iterator looks like. There are many practical considerations that are too low level to list in the notes. For example, for the distinct () iterator (slide 29), the DBMS would need to decide whether to keep the a hash set in memory or on disk , depending on the system load at query execution time! Similarly for joins (and most other binary operators), the DBMS chooses the actual algorithm and data structure to use in real-time, based on the resources (especially RAM) available. 30
Image of page 16

Subscribe to view the full document.

Main Memory Algorithms iterators for when data fits in RAM Cost model for query processing Because I/O operations are much more costly than CPU operations, we are not concerned with algorithmic cost (as in CMPUT204). Instead, we measure the cost of a query plan by: (1) the number of I/O operations and (2) the number of memory buffers needed to execute it. 3
Image of page 17
Table scans The table scan is the simplest way to get tuples from a table. It goes over every tuple in every block (recall slide 25). table (heap file) on disk disk block operator buffer in RAM A table scan requires only one buffer of RAM (which can be reused). But its I/O cost is O ( | R | ) (as it goes over every disk block). 32 How do we find the cost of a plan? Focus on the iterators that actually perform I/O. π a 1 ,...,a n σ C scan ( R ) Recall that the iterators for σ and π do not read any data from disk, thus have 0 I/O cost. Also, they do not need to keep any data in buffers. So their memory cost is also 0. So the final cost of the plan is O (1) buffers in RAM and O ( | R | ) I/O operations 33
Image of page 18

Subscribe to view the full document.

How many buffers should the DBMS use? A complete table scan can be done with a single buffer . But... IF there is more RAM available, should the DBMS read more blocks at once (maybe all of them) OR ... should it save RAM ? Reading many consecutive blocks at once is better: - this amortizes the seek time (for HDDs) and - reduces the impact of BUS congestion 34 The I/O cost of joins and products To find out the actual I/O cost of the nested-loop-join algorithm of slide 28, consider what happens when we take the calls to GetNext() in table scans into account: for each buffer b R of R do for each tuple t R in b R do for each buffer b S of S do for each tuple t S in b S do t join ( t R , t S ) if t satisfies C then add t to output buffer end if end for end for end for end for C scan ( R ) scan ( S ) So, how many block reads are done? 35
Image of page 19
Case 1: use one buffer for each scan. Memory Cost = 2 buffers I/O Cost = O ( | R | · | S | ) block reads for each buffer b R of R do for each tuple t R in b R do for each buffer b S of S do for each tuple t S in b S do t join ( t R , t S ) if t satisfies C then add t to output buffer end if end for end for end for end for This is the worst case scenario for I/O: quadratic number of block reads.
Image of page 20
You've reached the end of this preview.
  • Winter '16
  • Bill Murray

{[ snackBarMessage ]}

What students are saying

  • Left Quote Icon

    As a current student on this bumpy collegiate pathway, I stumbled upon Course Hero, where I can find study resources for nearly all my courses, get online help from tutors 24/7, and even share my old projects, papers, and lecture notes with other students.

    Student Picture

    Kiran Temple University Fox School of Business ‘17, Course Hero Intern

  • Left Quote Icon

    I cannot even describe how much Course Hero helped me this summer. It’s truly become something I can always rely on and help me. In the end, I was not only able to survive summer classes, but I was able to thrive thanks to Course Hero.

    Student Picture

    Dana University of Pennsylvania ‘17, Course Hero Intern

  • Left Quote Icon

    The ability to access any university’s resources through Course Hero proved invaluable in my case. I was behind on Tulane coursework and actually used UCLA’s materials to help me move forward and get everything together on time.

    Student Picture

    Jill Tulane University ‘16, Course Hero Intern