trie - Symbol Table Review 6.2 String Sets Symbol table....

Robert Sedgewick and Kevin Wayne • Copyright © 2006 • http://www.Princeton.EDU/~cos226

6.2 String Sets

Symbol Table Review

Symbol table.
- Associate a value with a key.
- Search for value given key.
- Balanced trees use O(log N) key comparisons.
- Hashing uses O(1) probes, but each probe takes time proportional to key length.

Q. Are key comparisons necessary? No.
Q. Is time proportional to key length required? No.
Best possible. Examine O(log N) bits.

This lecture. Specialized symbol table/set for string keys.
- Faster than hashing.
- More flexible than BST.

Applications

Applications.
- Spell checkers.
- Auto-complete.
- Data compression. [stay tuned]
- Computational biology.
- Inverted index of Web.
- Routing tables for IP addresses.
- Storing and querying XML documents.
- Linux kernel API: memory management.
- T9 predictive text input for cell phones.

String Set: Operations

Operations.
- set.add(String s): insert string s into set.
- set.contains(String s): is string s in the set?

Removes duplicates from input stream:

    StringSET set = new StringSET();
    while (!StdIn.isEmpty()) {
        String key = StdIn.readString();
        if (!set.contains(key)) {
            set.add(key);
            System.out.println(key);
        }
    }

Goal: implement this efficiently.
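As a point of reference for the dedup client above, here is a minimal sketch of the two-operation interface backed by Java's standard HashSet. The class name HashStringSET and the helper dedup are hypothetical names chosen for illustration, not the course's StringSET, and the input is read from an array rather than StdIn so that the example is self-contained.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;

// Hypothetical baseline: a string set backed by java.util.HashSet.
// The name HashStringSET is an assumption, not the course's StringSET class.
class HashStringSET {
    private final HashSet<String> keys = new HashSet<>();

    // Insert string s into the set.
    void add(String s) { keys.add(s); }

    // Is string s in the set?
    boolean contains(String s) { return keys.contains(s); }

    // Return the words in input order with duplicates removed,
    // mirroring the dedup loop but reading from an array instead of StdIn.
    static List<String> dedup(String[] words) {
        HashStringSET set = new HashStringSET();
        List<String> distinct = new ArrayList<>();
        for (String key : words) {
            if (!set.contains(key)) {
                set.add(key);
                distinct.add(key);
            }
        }
        return distinct;
    }

    public static void main(String[] args) {
        String[] words = "sells sea shells by the sea".split(" ");
        System.out.println(dedup(words));   // prints [sells, sea, shells, by, the]
    }
}
```

Each contains or add call here hashes the whole key, which is the "time proportional to key length" cost noted in the review.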
Keys

Key = sequence of "digits."
- DNA: sequence of a, c, g, t.
- IPv6 address: sequence of 128 bits.
- English words: sequence of lowercase letters.
- Protein: sequence of amino acids A, C, ..., Y.
- Credit card number: sequence of 16 decimal digits.
- International words: sequence of Unicode characters.
- Library call numbers: sequence of letters, numbers, periods.

This lecture. Key = string over ASCII alphabet.

String Set: Implementations Cost Summary

                  Typical Case                      Dedup
Implementation    Search hit    Insert    Space     Moby    Actors
Red-black         L + log N     log N     C         1.40    97.4
Hashing           L             L         C         0.76    40.6
Input *           L             L         L         0.26    15.1

* only reads in data

N = number of strings
L = size of string
C = number of characters in input
R = radix

Actor. 82MB, 11.4M words, 900K distinct.
Moby. 1.2MB, 210K words, 32K distinct.

Challenge. As fast as hashing, as flexible as BST.

Tries

Tries. [from retrieval, but pronounced "try"]
- Store characters in internal nodes, not keys.
- Store records in external nodes.
- Use the characters of the key to guide the search.

Ex. sells sea shells by the sea

[Figure: trie holding the keys by, sea, sells, shells, the]
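The idea of using the characters of the key to guide the search can be sketched as an R-way trie. This is only one possible formulation, under stated assumptions: R = 128 for 7-bit ASCII, a boolean end-of-key flag on each node in place of separate external nodes holding records, and the hypothetical names TrieStringSET, Node, and isKey.

```java
// Hypothetical sketch of an R-way trie string set over the ASCII alphabet.
// TrieStringSET and its fields are illustrative names, not the course's code.
class TrieStringSET {
    private static final int R = 128;   // radix: 7-bit ASCII (assumption)

    private static class Node {
        Node[] next = new Node[R];      // one link per possible character
        boolean isKey;                  // true if a key ends at this node
    }

    private final Node root = new Node();

    // Insert string s, creating one node per character as needed.
    void add(String s) {
        Node x = root;
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c >= R) throw new IllegalArgumentException("non-ASCII character: " + c);
            if (x.next[c] == null) x.next[c] = new Node();
            x = x.next[c];
        }
        x.isKey = true;
    }

    // Follow one link per character; examines at most s.length() characters.
    boolean contains(String s) {
        Node x = root;
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c >= R || x.next[c] == null) return false;
            x = x.next[c];
        }
        return x.isKey;
    }

    public static void main(String[] args) {
        // Same dedup loop as the client code, fed from a fixed string.
        TrieStringSET set = new TrieStringSET();
        for (String key : "sells sea shells by the sea".split(" ")) {
            if (!set.contains(key)) {
                set.add(key);
                System.out.println(key);
            }
        }
    }
}
```

In this sketch a search makes no key comparisons and never hashes the key: a miss can stop at the first character with no link, and a hit costs one array lookup per character. The price is space, since every node carries R links.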