6.851: Advanced Data Structures
Spring 2010
Lecture 21, 27 April 2010
Prof. Erik Demaine
Scribe: Tom Morgan

1 From Last Lecture...

In the previous lecture, we discussed the External Memory and Cache-Oblivious memory models. We also discussed some substantial results on data structures under each model; the most complex of these was the Cache-Oblivious B-Tree. In that construction, we used the Ordered File Maintenance (OFM) data structure as a black box to maintain an ordered array with updates that rearrange $O(\log^2 N)$ elements. Now we will fill in the details of the OFM.

2 Ordered File Maintenance

The OFM problem is to store $N$ elements in an array of size $O(N)$, in a specified order. Additionally, the gaps between elements must be $O(1)$ elements wide, so that scanning $k$ elements costs $O(\lceil k/B \rceil)$ memory transfers. The data structure must support deletion and insertion (between two existing elements). These updates are accomplished by rearranging a contiguous block of $O(\log^2 N)$ elements using $O(1)$ interleaved scans. Thus the cost in memory transfers is $O((\log^2 N)/B)$; note that these bounds are amortized.

The OFM structure obtains its performance by guaranteeing that no part of the array becomes too densely or too sparsely populated. When a density threshold is violated, we rebalance, i.e., uniformly redistribute the elements.

To motivate the discussion, imagine that the array (of size $O(N)$) is split into pieces of size $\Theta(\log N)$ each. Now imagine a complete binary tree (of depth $O(\log N) - \Theta(\log \log N) = O(\log N)$) over these pieces. Each tree node tracks the number of elements and the total number of array slots in its range. The density of an interval is then defined as the ratio of the number of elements stored in that interval to the total number of array slots in that interval.

2.1 Updates

To update element $X$:
• Update a leaf chunk of size $\Theta(\log N)$ containing $X$.
• Walk up the tree to the first node within the density threshold.
• Uniformly redistribute the elements in this node’s interval.

The density threshold is depth-dependent. The depth is defined such that the tree root has depth 0 and the tree leaves have depth $h = \Theta(\log N)$. For a node at depth $d$, we require:
• density $\geq \frac{1}{2} - \frac{1}{4} \cdot \frac{d}{h} \in [\frac{1}{4}, \frac{1}{2}]$ (not too sparse)
• density $\leq \frac{3}{4} + \frac{1}{4} \cdot \frac{d}{h} \in [\frac{3}{4}, 1]$ (not too dense)

Notice that the density constraints are tightest at the shallowest node: the root must keep its density in $[\frac{1}{2}, \frac{3}{4}]$, while a leaf may range over $[\frac{1}{4}, 1]$. Intuitively, saving work (i.e., having tight constraints) at the deepest nodes gains relatively little performance because the working sets are comparatively small.

Keep in mind that the binary tree is never physically constructed. It is only a conceptual tool useful for understanding and analyzing the OFM structure. To perform the tree search operations, we can instead examine the binary representation of the left/right edges of a “node” to determine whether this node is a left/right child. Then the scan can proceed left/right accordingly; this is...
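The update procedure of Section 2.1 can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not the construction from the lecture: the names `thresholds`, `spread`, `insert_after`, and `leaf_size` are hypothetical, gaps are represented by `None`, the number of leaf chunks is assumed to be a power of two, densities are recounted by scanning rather than read from per-node counters, and the array is never resized.

```python
def thresholds(depth, height):
    """Return (min_density, max_density) for a node at `depth` (root = 0)."""
    frac = depth / height if height > 0 else 1.0
    return 0.5 - 0.25 * frac, 0.75 + 0.25 * frac


def spread(slots, lo, hi, items):
    """Write `items` into slots[lo:hi] at evenly spaced positions, in order."""
    width = hi - lo
    if len(items) > width:
        # Assumption: a full structure would grow the array here instead.
        raise OverflowError("interval too small for its elements")
    slots[lo:hi] = [None] * width
    for i, item in enumerate(items):
        slots[lo + (i * width) // len(items)] = item


def insert_after(slots, pos, value, leaf_size):
    """Insert `value` just after index `pos`, then rebalance.

    Walks up the conceptual tree (never built; intervals are computed from
    `pos`) to the first node whose density, counting the new element, is
    within its thresholds, and redistributes that node's interval.
    Returns the (lo, hi) interval that was rebalanced.
    """
    n_leaves = len(slots) // leaf_size      # assumed to be a power of two
    height = n_leaves.bit_length() - 1      # leaves sit at depth `height`
    size, depth = leaf_size, height
    while True:
        lo = (pos // size) * size           # aligned interval containing pos
        hi = lo + size
        used = sum(x is not None for x in slots[lo:hi]) + 1
        low, high = thresholds(depth, height)
        if depth == 0 or low <= used / size <= high:
            break
        size, depth = size * 2, depth - 1   # move to the parent node
    items = []
    for i in range(lo, hi):
        if slots[i] is not None:
            items.append(slots[i])
        if i == pos:
            items.append(value)             # new element follows slot `pos`
    spread(slots, lo, hi, items)
    return lo, hi


if __name__ == "__main__":
    slots = [10, 20, 30, 40] + [None] * 12
    print(insert_after(slots, 1, 25, leaf_size=4))  # (0, 8)
    print(slots[:8])  # [10, 20, None, 25, 30, None, 40, None]
```

In the demo, the leaf chunk would hold 5 elements in 4 slots, so the walk moves up one level to an 8-slot interval with density 5/8, which lies within that node's bounds of [3/8, 7/8], and the five elements are spread evenly across it.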
This note was uploaded on 01/20/2012 for the course CS 6.849 taught by Professor Erik Demaine during the Fall '10 term at MIT.