cs411-20110413 - Two-Pass Algorithms...


Two-Pass Algorithms Based on Sorting

Q: What can sorting help? And, how?
- Selection?
- Projection?
- Join?
- Duplicate elimination?
- Grouping?

Duplicate elimination δ(R)
- Step 1: sort runs of size M, write them to disk
  - Cost: 2B(R)
- Step 2: merge M-1 runs, but include each tuple only once
  - Cost: B(R)
- Total cost: 3B(R)
- Assumption: B(R) <= M^2

Grouping: γ city, sum(price) (R)
- Same as before: sort, then compute the sum(price) for each group
- As before: compute sum(price) during the merge phase
- Total cost: 3B(R)
- Assumption: B(R) <= M^2

Binary operations: R ∩ S, R ∪ S, R - S
- Idea: sort R, sort S, then do the right thing
- A closer look:
  - Step 1: split R into runs of size M, then split S into runs of size M. Cost: 2B(R) + 2B(S)
  - Step 2: merge all x runs from R; merge all y runs from S; output a tuple on a case-by-case basis (x + y <= M)
- Total cost: 3B(R) + 3B(S)
- Assumption: B(R) + B(S) <= M^2

Join R ⋈ S
- Start by sorting both R and S on the join attribute:
  - Cost: 4B(R) + 4B(S) (because we need to write to disk)
- Read both relations in sorted order, match tuples
  - Cost: B(R) + B(S)
- Difficulty: many tuples in R may match many in S
  - If at least one set of tuples fits in M, we are OK
  - Otherwise we need a nested loop, at higher cost
- Total cost: 5B(R) + 5B(S)
- Assumption: B(R) <= M^2, B(S) <= M^2

Q: Why is sorting-based "two" pass?

Two-Pass Algorithms Based on Hashing

[Figure: a relation on disk is hashed through M main-memory buffers into buckets that are written back to disk]
- Does each bucket fit in main memory?

Q: What can hashing help? And, how?
- Selection?
- Projection?
- Set operations?
- Join?
- Duplicate elimination?
- Grouping?
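The two steps of sort-based duplicate elimination can be sketched in a few lines of Python. This is only an in-memory illustration, not the slides' implementation: the function name `two_pass_distinct` and the parameter `run_size` (standing in for M blocks of memory) are hypothetical, and plain lists stand in for runs written to disk.

```python
import heapq
from itertools import islice


def two_pass_distinct(tuples, run_size):
    """Sort-based duplicate elimination in two passes (illustrative sketch).

    run_size is a hypothetical stand-in for M; lists stand in for disk runs.
    """
    # Pass 1: read run_size tuples at a time, sort each run, "write" it out.
    it = iter(tuples)
    runs = []
    while True:
        run = sorted(islice(it, run_size))
        if not run:
            break
        runs.append(run)

    # Pass 2: merge all sorted runs, emitting each distinct tuple only once.
    result = []
    for t in heapq.merge(*runs):
        if not result or result[-1] != t:
            result.append(t)
    return result


distinct = two_pass_distinct([3, 1, 3, 2, 1, 5, 2], run_size=3)
```

On disk, the merge needs one input buffer per run, so the number of runs, roughly B(R)/M, must not exceed M-1. That is where the assumption B(R) <= M^2 comes from.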
Hash Based Algorithms for δ
- Recall: δ(R) = duplicate elimination
- Step 1. Partition R into buckets
- Step 2. Apply δ to each bucket (may read in main memory)
- Cost: 3B(R)
- Assumption: B(R) <= M^2

Hash Based Algorithms for γ
- Recall: γ(R) = grouping and aggregation
- Step 1. Partition R into buckets
- Step 2. Apply γ to each bucket (may read in main memory)
- Cost: 3B(R)
- Assumption: B(R) <= M^2

Hash-based Join R ⋈ S
- Simple version: main memory hash-based join
  - Scan S, build buckets in main memory
  - Then scan R and join
- Requirement: min(B(R), B(S)) <= M

Partitioned Hash Join R ⋈ S
- Step 1:
  - Hash S into M buckets
  - Send all buckets to disk
- Step 2:
  - Hash R into M buckets
  - Send all buckets to disk
- Step 3:
  - Join every pair of buckets

Partitioned Hash-Join
- Partition both relations using hash function h: R tuples in partition i will only match S tuples in partition i.
- Read in a partition of R, hash it using h2 (<> h). Scan the matching partition of S, search for matches.

Partitioned Hash Join
- Assumption: at least one full bucket of the smaller relation must fit in memory: min(B(R), B(S)) <= M^2

Index-Based Algorithms
- In a clustered index all tuples with the same value of the key are clustered on as few blocks as possible

Index Based Selection
- Selection on equality: σ a=v (R)
- Clustered index on a: cost B(R)/V(R,a)
- Unclustered index on a: cost T(R)/V(R,a)
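Going back to the partitioned hash join steps earlier in this section, the following is a minimal in-memory sketch of the idea. The function name `partitioned_hash_join`, its parameters, and the use of Python's built-in `hash` as h and a dictionary as the second hash function h2 are all assumptions made for illustration; lists stand in for buckets written to disk.

```python
from collections import defaultdict


def partitioned_hash_join(r, s, key_r, key_s, num_buckets):
    """Join r and s on key_r(t) == key_s(u) (illustrative sketch)."""

    # Steps 1 and 2: partition each relation with the same hash function h.
    def partition(rel, key):
        buckets = [[] for _ in range(num_buckets)]
        for t in rel:
            buckets[hash(key(t)) % num_buckets].append(t)
        return buckets

    s_parts = partition(s, key_s)
    r_parts = partition(r, key_r)

    # Step 3: join each pair of matching buckets. Tuples in R's bucket i
    # can only match tuples in S's bucket i.
    out = []
    for r_part, s_part in zip(r_parts, s_parts):
        table = defaultdict(list)  # in-memory table, plays the role of h2
        for t in r_part:
            table[key_r(t)].append(t)
        for u in s_part:
            for t in table.get(key_s(u), []):
                out.append((t, u))
    return out


r = [(1, "a"), (2, "b"), (3, "c")]
s = [(2, "x"), (3, "y"), (3, "z"), (4, "w")]
joined = sorted(partitioned_hash_join(r, s, lambda t: t[0], lambda u: u[0], 4))
```

The sketch builds the in-memory table on the R bucket, as in the slide. In practice the table would be built on whichever relation is smaller, since only that bucket has to fit in memory.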
Index Based Selection
- Example: B(R) = 2000, T(R) = 100,000, V(R, a) = 20; compute the cost of σ a=v (R)
- Cost of table scan:
  - If R is clustered: B(R) = 2000 I/Os
  - If R is unclustered: T(R) = 100,000 I/Os
- Cost of index based selection:
  - If index is clustered: B(R)/V(R,a) = 100
  - If index is unclustered: T(R)/V(R,a) = 5000
- Notice: when V(R,a) is small, the unclustered index is useless

Index Based Join R ⋈ S
- Assume S has an index on the join attribute
- Iterate over R; for each tuple fetch the corresponding tuple(s) from S
- Assume R is clustered. Cost:
  - If index is clustered: B(R) + T(R)B(S)/V(S,a)
  - If index is unclustered: B(R) + T(R)T(S)/V(S,a)

Survey results
- Average SQLLite Score: 3.2
- Average SQL Tuning Score: 3.55

Suggestions (the per-suggestion counts are illegible in the source):
- more engaging lectures
- psql demo
- Combine into 1 lecture
- hot topics in db field
- more integration, no standalone ST lectures
- [illegible] RDBMS
- topic on web crawling
- vote on topics beforehand
- enum exam topics from ST lectures
- topic on massively scalable DBs
- easier topics
- have kevin teach ST lectures
- lecture from industry
- more famous speakers
- more hands-on topics
- more variety
- move ST lectures up
- no ST lecture on Fridays
- remove SQLLite
- topic on DB's behind facebook, twitter
- topic on hash tables
- topic on OODBMS
- topic on Oracle
- topic on speeding up sql
- Use Previous Projects rather than SQL ...
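Returning to the index-based selection example (B(R) = 2000, T(R) = 100,000, V(R, a) = 20), the arithmetic can be checked with a short script. The function name `selection_costs` and the dictionary keys are hypothetical; the formulas are the ones from the slides.

```python
def selection_costs(b, t, v):
    """Estimated I/O costs for an equality selection on attribute a.

    b = B(R) blocks, t = T(R) tuples, v = V(R, a) distinct values of a.
    """
    return {
        "scan_clustered": b,         # read every block of R
        "scan_unclustered": t,       # one I/O per tuple in the worst case
        "index_clustered": b / v,    # matching tuples sit on adjacent blocks
        "index_unclustered": t / v,  # one I/O per matching tuple
    }


costs = selection_costs(b=2000, t=100_000, v=20)
for name, io in costs.items():
    print(f"{name}: {io:g} I/Os")
```

With these numbers the unclustered index costs 5000 I/Os, more than the 2000 I/Os of scanning a clustered R, which is the sense in which the slide calls it useless when V(R,a) is small.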