Neyer paper

A Comparison of Dictionary Implementations

Mark P Neyer

April 10, 2009

1 Introduction

A common problem in computer science is the representation of a mapping between two sets. A mapping f : A → B is a function taking as input a member a ∈ A, and returning b, an element of B. A mapping is also sometimes referred to as a dictionary, because dictionaries map words to their definitions.

Knuth [?] explores the map / dictionary problem in Volume 3, Chapter 6 of his book The Art of Computer Programming. He calls it the problem of 'searching', and presents several solutions.

This paper explores implementations of several different solutions to the map / dictionary problem: hash tables, Red-Black Trees, AVL Trees, and Skip Lists. This paper is inspired by the author's experience in industry, where a dictionary structure was often needed, but the natural C# hash table-implemented dictionary was taking up too much space in memory. The goal of this paper is to determine which data structure gives the best performance, in terms of both memory and processing time.

AVL and Red-Black Trees were chosen because Pfaff [?] has shown that they are the ideal balanced trees to use. Pfaff did not compare hash tables, however. Also considered for this project were Splay Trees [?].

2 Background

2.1 The Dictionary Problem

A dictionary is a mapping between two sets of items, K and V. It must support the following operations:

1. Insert an item v for a given key k. If key k already exists in the dictionary, its item is updated to be v.

2. Retrieve a value v for a given key k.

3. Remove a given k and its value from the dictionary.

2.2 AVL Trees

The AVL Tree was invented by G. M. Adel'son-Vel'skiĭ and E. M. Landis, two Soviet mathematicians, in 1962 [?]. It is a self-balancing binary search tree data structure. Each node has a balance factor, which is the height of its right subtree minus the height of its left subtree. A node with a balance factor of -1, 0, or 1 is considered 'balanced'. Nodes with any other balance factor are considered 'unbalanced' and must be rebalanced after the operation on the tree that unbalanced them.

1. Insert: Inserting a value into an AVL tree requires a tree search for the appropriate location. The tree must often be re-balanced after insertion. The re-balancing algorithm 'rotates' branches of the tree to ensure that the balance factors are kept in the range [-1, 1]. Both the re-balancing algorithm and the binary search take O(log n) time.

2. Retrieve: Retrieving a key from an AVL tree is performed with a tree search. This takes O(log n) time.

3. Remove: Just like an insertion, a removal from an AVL tree requires the tree to be re-balanced. This operation also takes O(log n) time.
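
The three AVL tree operations of Section 2.2 (insert, retrieve, remove) can be illustrated with a minimal sketch. It is written in Python for brevity and is not the implementation measured in this paper. The names Node, rebalance, insert, retrieve, and remove are assumptions chosen for illustration. The balance factor follows the convention given in Section 2.2 (right subtree height minus left subtree height), and a leaf is assumed to have height 1.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class Node:
    key: Any
    value: Any
    left: Optional["Node"] = None
    right: Optional["Node"] = None
    height: int = 1  # height of the subtree rooted here; a leaf has height 1


def height(node):
    return node.height if node else 0


def balance_factor(node):
    # Convention from Section 2.2: right subtree height minus left subtree height.
    return height(node.right) - height(node.left)


def update(node):
    node.height = 1 + max(height(node.left), height(node.right))


def rotate_left(x):
    # Promote x's right child; x becomes its left child.
    y = x.right
    x.right, y.left = y.left, x
    update(x)
    update(y)
    return y


def rotate_right(y):
    # Promote y's left child; y becomes its right child.
    x = y.left
    y.left, x.right = x.right, y
    update(y)
    update(x)
    return x


def rebalance(node):
    # Restore the balance factor of this node to the range [-1, 1].
    update(node)
    bf = balance_factor(node)
    if bf > 1:  # right-heavy
        if balance_factor(node.right) < 0:  # right-left case
            node.right = rotate_right(node.right)
        return rotate_left(node)
    if bf < -1:  # left-heavy
        if balance_factor(node.left) > 0:  # left-right case
            node.left = rotate_left(node.left)
        return rotate_right(node)
    return node


def insert(node, key, value):
    if node is None:
        return Node(key, value)
    if key == node.key:
        node.value = value  # existing key: its item is updated
        return node
    if key < node.key:
        node.left = insert(node.left, key, value)
    else:
        node.right = insert(node.right, key, value)
    return rebalance(node)


def retrieve(node, key):
    # Plain binary tree search; raises KeyError if the key is absent.
    while node is not None:
        if key == node.key:
            return node.value
        node = node.left if key < node.key else node.right
    raise KeyError(key)


def remove(node, key):
    if node is None:
        raise KeyError(key)
    if key < node.key:
        node.left = remove(node.left, key)
    elif key > node.key:
        node.right = remove(node.right, key)
    else:
        if node.left is None:
            return node.right
        if node.right is None:
            return node.left
        # Two children: copy in the in-order successor, then remove it.
        succ = node.right
        while succ.left is not None:
            succ = succ.left
        node.key, node.value = succ.key, succ.value
        node.right = remove(node.right, succ.key)
    return rebalance(node)
```

Each function returns the (possibly new) root of the subtree it was given, so a caller would write root = insert(root, k, v) or root = remove(root, k). Rebalancing happens on the way back up the search path, which is why both operations stay within O(log n).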