06_btree

06_btree - CPS216: Data-Intensive Computing Systems...



CPS216: Data-Intensive Computing Systems
Operators for Data Access (contd.)
Shivnath Babu

Insertion in a B-Tree (n = 2)
  [Tree diagrams; the layout was lost in extraction. Keys are listed in extracted order, with the keys printed at the top of each diagram before the "|".]
  Insert 62, before:  49 | 49 15 36
  Insert 62, after:   49 | 49 62 15 36
  Insert 50, before:  49 | 49 62 15 36
  Insert 50, after:   49 62 | 49 50 62 15 36
  Insert 75, before:  49 62 | 49 50 62 15 36
  Insert 75, after:   49 62 | 49 50 62 75 15 36

Insertion
  [Slides 8-18: figures only; no text was extracted.]

Insertion: Primitives
  Inserting into a leaf node
  Splitting a leaf node
  Splitting an internal node
  Splitting root node

Inserting into a Leaf Node
  [Diagram keys in extracted order.]
  Step 1:  58 | 54 57 60 62
  Step 2:  58 | 54 57 60 62
  Step 3:  58 | 54 57 58 60 62

Splitting a Leaf Node
  [Diagram keys in extracted order.]
  Step 1:  61 54 66 54 57 58 60 62
  Step 2:  61 54 66 54 57 58 60 62
  Step 3:  61 54 66 54 57 58 60 61 62
  Step 4:  59 61 54 66 54 57 58 60 61 62
  Step 5:  61 54 59 66 54 57 58 60 61 62

Splitting an Internal Node
  [Diagram keys and subtree ranges in extracted order.]
  Step 1:  ... 21 99 ... 59 40 54 66 74 84   [54, 59) [59, 66) [66, 74)
  Step 2:  ... 21 99 ... 59 40 54 66 74 84   [54, 59) [59, 66) [66, 74)
  Step 3:  66 ... 21 99 ... [21, 66) 40 54 59 [66, 99) 74 84   [54, 59) [59, 66) [66, 74)

Splitting the Root
  [Diagram keys and subtree ranges in extracted order.]
  Step 1:  59 40 54 66 74 84   [54, 59) [59, 66) [66, 74)
  Step 2:  59 40 54 66 74 84   [54, 59) [59, 66) [66, 74)
  Step 3:  66 40 54 59 74 84   [54, 59) [59, 66) [66, 74)

Deletion
  [Slides 34-36: figures only. Label on slide 35: "redistribute".]

Deletion II
  [Slides 37-43: figures only. Labels extracted: "merge"; "Not needed merge".]

Deletion: Primitives
  Delete key from a leaf
  Redistribute keys between sibling leaves
  Merge a leaf into its sibling
  Redistribute keys between two sibling internal nodes
  Merge an internal node into its sibling

Merge Leaf into Sibling
  [Diagram keys in extracted order.]
  Step 1:  72 ... 67 85 54 58 64 68 72 75
  Step 2:  72 ... 67 85 54 58 64 68 75
  Step 3:  72 ... 67 85 54 58 64 68 75
  Step 4:  72 ... 85 54 58 64 68 75

Merge Internal Node into Sibling
  [Diagram keys and subtree ranges in extracted order.]
  Step 1:  ... 59 ... 41 48 52 63 74   [52, 59) [59, 63)
  Step 2:  ... 59 ... 41 48 52 59 63   [52, 59) [59, 63)

B-Tree Roadmap
  B-Tree Recap
  Insertion (recap)
  Deletion
  Construction
  Efficiency
  B-Tree variants
  Hash-based Indexes

Question
  How does insertion-based construction perform?

B-Tree Construction
  Sort:  48 57 41 15 75 21 62 34 81 11 97 13
  Sorted keys:  11 13 15 21 34 41 48 57 62 75 81 97
  Scan:  11 13 15 21 34 41 48 57 62 75 81 97
  Keys above the scanned sequence after the scan:  21 48 75

B-Tree Construction
  Why is sort-based construction better than insertion-based one?

Cost of B-Tree Operations
  Height of B-Tree: H
  Assume no duplicates
  Question: what is the random I/O cost of:
    Insertion:
    Deletion:
    Equality search:
    Range search:

Height of B-Tree
  Number of keys: N
  B-Tree parameter: n
  $\text{Height} \approx \log_n N = \frac{\log N}{\log n}$
  In practice: 2-3 levels

Question: How do you pick parameter n?
  1. Ignore inserts and deletes
  2. Optimize for equality searches
  3. Assume no duplicates

Roadmap
  B-Tree
  B-Tree variants
    Sparse Index
    Duplicate Keys
  Hash-based Indexes

Roadmap
  B-Tree
  B-Tree variants
  Hash-based Indexes
    Static Hash Table
    Extensible Hash Table
    Linear Hash Table

Hash-Based Indexes
  Adaptations of main memory hash tables
  Support equality searches
  No range searches

Indexing Problem (recap)
  [Diagram: a lookup "A = val" goes to the Index, which holds Keys a1, a2, ai, an and returns record pointers.]

Main Memory Hash Table
  h(key) = key % 8
  Buckets: 0 1 2 3 4 5 6 7
  [Diagram: keys 55, 32, 10, 27, 21, 48, 75, 64 stored in linked lists, each list ending in (null).]

Adapting to disk
  1 Hash Bucket = 1 Block
  All keys that hash to bucket stored in the block
  Intuition: keys in a bucket usually accessed together
  No need for linked lists of keys ...

Adapting to Disk
  How do we handle this?
  [Figure only.]

Adapting to disk
  1 Hash Bucket = 1 Block
  All keys that hash to bucket stored in the block
  Intuition: keys in a bucket usually accessed together
  No need for linked lists of keys ...
  ... but need linked list of blocks (overflow blocks)

Adapting to Disk
  [Figure only.]

Adapting to disk
  Bucket Id to Disk Address mapping
    Contiguous blocks
    Store mapping in main memory
      Too large?
      Dynamic: Linear and Extensible hash tables

Beware of claims that assume 1 I/O for hash tables and 3 I/Os for B-Tree!!

...
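
The height estimate on the "Height of B-Tree" slide can be checked numerically. The sketch below is an illustration only, not part of the lecture: the function name `estimate_height` is hypothetical, and it makes the simplifying assumption that every level multiplies the number of reachable keys by exactly n (real nodes may be partly full, and the exact fan-out convention depends on how n is defined).

```python
def estimate_height(num_keys: int, n: int) -> int:
    """Smallest h such that n**h >= num_keys, i.e. ceil(log_n(num_keys)).

    Assumption (not from the slides): each level multiplies capacity by
    exactly n. Integer arithmetic is used to avoid floating-point log errors.
    """
    if num_keys < 1 or n < 2:
        raise ValueError("need num_keys >= 1 and n >= 2")
    height, capacity = 0, 1
    while capacity < num_keys:
        capacity *= n      # one more level multiplies capacity by n
        height += 1
    return height


if __name__ == "__main__":
    # With a fan-out in the hundreds, even large key counts need few levels.
    for keys in (10_000, 1_000_000, 100_000_000):
        print(keys, estimate_height(keys, 100))
```

Under this assumption, a million keys with n = 100 needs three levels, which is in line with the "2-3 levels" remark on the slide.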
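
The sort-then-scan idea on the "B-Tree Construction" slides can be sketched as follows. This is a minimal one-level illustration, not the lecture's algorithm: the name `bulk_load` and the `leaf_capacity` parameter are hypothetical, and the choice of three keys per leaf is an assumption that happens to reproduce the keys 21 48 75 shown on the slide.

```python
def bulk_load(keys, leaf_capacity):
    """Build one level of leaves plus the separator keys above them.

    Assumptions (not from the slides): leaves are filled completely, and the
    separator for each leaf after the first is that leaf's smallest key.
    """
    sorted_keys = sorted(keys)                      # the "Sort" step
    leaves = [
        sorted_keys[i:i + leaf_capacity]            # the "Scan" step: cut the
        for i in range(0, len(sorted_keys), leaf_capacity)  # run into leaves
    ]
    separators = [leaf[0] for leaf in leaves[1:]]   # keys for the parent node
    return separators, leaves


if __name__ == "__main__":
    slide_keys = [48, 57, 41, 15, 75, 21, 62, 34, 81, 11, 97, 13]
    separators, leaves = bulk_load(slide_keys, leaf_capacity=3)
    print(separators)
    print(leaves)
```

Each leaf is written once in key order, which is the contrast with inserting keys one at a time that the slide's question points at.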
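
The "Adapting to disk" slides (one bucket per block, plus a chain of overflow blocks) can be illustrated with a small in-memory model. This is a sketch under stated assumptions, not the lecture's implementation: the class `StaticHashIndex`, its methods, and the block capacity of two keys are all hypothetical. Only h(key) = key % 8 and the example keys come from the slides.

```python
class StaticHashIndex:
    """Toy static hash table: each bucket is a chain of fixed-size blocks."""

    def __init__(self, num_buckets=8, block_capacity=2):
        self.num_buckets = num_buckets
        self.block_capacity = block_capacity      # assumed keys per block
        # buckets[b][0] is the primary block; later entries are overflow blocks
        self.buckets = [[[]] for _ in range(num_buckets)]

    def _bucket(self, key):
        return key % self.num_buckets             # h(key) = key % 8 by default

    def insert(self, key):
        blocks = self.buckets[self._bucket(key)]
        if len(blocks[-1]) >= self.block_capacity:
            blocks.append([])                     # chain a new overflow block
        blocks[-1].append(key)

    def search(self, key):
        """Return (found, blocks_read); blocks_read models block I/Os."""
        blocks_read = 0
        for block in self.buckets[self._bucket(key)]:
            blocks_read += 1
            if key in block:
                return True, blocks_read
        return False, blocks_read


if __name__ == "__main__":
    index = StaticHashIndex()
    for k in (55, 32, 10, 27, 21, 48, 75, 64):    # keys from the slide
        index.insert(k)
    print(index.search(55))   # found in the primary block of bucket 7
    print(index.search(64))   # bucket 0 overflowed, so a second block is read
```

With two keys per block, keys 32, 48 and 64 all hash to bucket 0, so looking up 64 reads two blocks. That is one way to see why the closing slide warns against assuming a single I/O per hash lookup.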

This document was uploaded on 01/17/2012.
