External Memory Dictionary

Task: Given a large amount of data that does not fit into main memory, process it into a dictionary data structure.
- Need to minimize the number of disk accesses.
- With each disk read, read a whole block of data.
- Construct a balanced search tree that uses one disk block per tree node.
- Each node needs to contain more than one key.

From Binary to k-ary

A $k$-ary search tree $T$ is defined as follows. For each node $x$ of $T$:
- $x$ has at most $k$ children.
- $x$ stores an ordered list of pointers to its children.
- $x$ stores an ordered list of keys.
- $x$ fulfils the search tree property: keys in the subtree rooted at the $i$th child $\le$ keys in the subtree rooted at the $(i+1)$st child.

Chapter 18: B-Trees

A B-tree is a balanced tree scheme in which balance is achieved by permitting the nodes to have multiple keys and more than two children.

Definition

Let $t \ge 2$ be an integer. A tree $T$ is called a B-tree having minimum degree $t$ if the leaves of $T$ are at the same depth and each node $u$ has the following properties:
1. $u$ has at most $2t - 1$ keys.
2. If $u$ is not the root, $u$ has at least $t - 1$ keys.
3. The keys in $u$ are sorted in increasing order.
4. The number of $u$'s children is precisely one more than the number of $u$'s keys.
5. For all $i \ge 1$, if $u$ has at least $i$ keys and has children, then every key appearing in the subtree rooted at the $i$th child of $u$ is less than the $i$th key, and every key appearing in the subtree rooted at the $(i+1)$st child of $u$ is greater than the $i$th key.

[Figure: an example B-tree whose keys are letters, with the root labeled root[T].]

Notation

Let $u$ be a node in a B-tree. By $n[u]$ we denote the number of keys in $u$. For each $i$, $1 \le i \le n[u]$, $\mathrm{key}_i[u]$ denotes the $i$th key of $u$. For each $i$, $1 \le i \le n[u] + 1$, $c_i[u]$ denotes the $i$th child of $u$.

Terminology

For the sake of simplicity we will introduce some terminology.
We say that a node is full if it has $2t - 1$ keys, and we say that a node is lean if it has the minimum number of keys, that is, $t - 1$ keys in the case of a nonroot and 1 in the case of the root.

2-3-4 Trees

B-trees with minimum degree 2 are called 2-3-4 trees to signify that the number of children is two, three, or four.

Depth of a B-tree

Theorem A. Let $t \ge 2$ and $n$ be integers. Let $T$ be an arbitrary B-tree with minimum degree $t$ having $n$ keys. Let $h$ be the height of $T$. Then

$$h \le \log_t \frac{n+1}{2}.$$

Proof. The height is maximized when all the nodes are lean. If $T$ is of that form, the root holds 1 key and has 2 children, so depth $i \ge 1$ contains $2t^{i-1}$ nodes with $t - 1$ keys each, and the number of keys in $T$ is

$$1 + \sum_{i=1}^{h} 2(t-1)\,t^{i-1} = 2(t-1)\,\frac{t^{h}-1}{t-1} + 1 = 2t^{h} - 1.$$

Hence $n \ge 2t^{h} - 1$ for every B-tree of height $h$, and solving for $h$ gives the bound.

Thus the depth of a B-tree is at most $\frac{1}{\lg t}$ of the depth of an RB-tree.

Searching for a key $k$ in a B-tree

Start with $x = \mathrm{root}[T]$....
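As a numerical check of Theorem A, the following Python sketch counts the keys of an all-lean B-tree level by level, compares the result with the closed form $2t^h - 1$, and derives the largest height that $n$ keys allow. The function names `min_keys` and `max_height` are illustrative assumptions and do not come from the slides.

```python
def min_keys(t: int, h: int) -> int:
    """Fewest keys a B-tree of minimum degree t and height h can hold.

    This is the all-lean case from the proof: the root holds 1 key,
    and depth i >= 1 has 2 * t**(i-1) nodes with t - 1 keys each.
    """
    if t < 2 or h < 0:
        raise ValueError("need t >= 2 and h >= 0")
    return 1 + sum(2 * (t - 1) * t ** (i - 1) for i in range(1, h + 1))


def max_height(t: int, n: int) -> int:
    """Largest integer h with 2 * t**h - 1 <= n.

    Equivalent to floor(log_t((n + 1) / 2)), but computed with exact
    integer arithmetic to avoid floating-point rounding.
    """
    if t < 2 or n < 1:
        raise ValueError("need t >= 2 and n >= 1")
    h = 0
    while 2 * t ** (h + 1) - 1 <= n:
        h += 1
    return h


if __name__ == "__main__":
    # The level-by-level sum agrees with the closed form 2 * t**h - 1.
    for t in (2, 3, 10):
        for h in range(6):
            assert min_keys(t, h) == 2 * t ** h - 1

    # Height bounds for a 2-3-4 tree and for a wide, disk-oriented node.
    print(max_height(2, 1000))
    print(max_height(1000, 10 ** 9))
```

With a large minimum degree the bound is very small: under these assumptions, a B-tree with $t = 1000$ holding $10^9$ keys has height at most 2, which is why one node per disk block keeps the number of disk reads per lookup low.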
