day16 - COP 3503 Computer Science II CLASS NOTES - DAY #16...

COP 3503 – Computer Science II
CLASS NOTES - DAY #16

Tree Applications

The tree data structure has been applied across a vast array of areas. Searching, compression, expression evaluation, and priority queues are but a few of these; in the interest of time, they are the only ones we will explore. We'll also look at general tree traversals.

Huffman Coding Revisited

Example: This is the same example that appeared in earlier notes, repeated here for continuity purposes only.

Suppose that we have a four-letter alphabet consisting of a, b, c, and d only. Encoding four letters with a fixed-length code requires 2 bits per letter. Suppose that these are assigned as follows: a = 00, b = 01, c = 10, and d = 11. Now suppose that we have a sentence of these letters which is 15 characters long. This sentence will require 15 * 2 = 30 bits to encode.

Suppose that we also have some information about the frequency of occurrence of each of our letters and know that "a" occurs most frequently, followed by b, and so on. A Huffman coding tree is built with the most frequently occurring letters closest to the root.

[Figure: Huffman coding tree with edges labeled 0 and 1 and the letters at the leaves]

Reading the new codes from the tree we have: a = 0, b = 10, c = 110, and d = 111. Now suppose our 15-character sentence contains 8 a's, 4 b's, 2 c's, and 1 d. With the new code this sentence requires (8*1) + (4*2) + (2*3) + (1*3) bits = 8 + 8 + 6 + 3 = 25 bits. The original code required 30 bits, so we have saved (30 - 25)/30, or about 16.7%.
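The arithmetic in this example can be checked with a short sketch. The frequencies and both code tables come straight from the example; the function name encoded_bits is only an illustrative choice.

```python
# Frequencies and code tables taken from the example above.
freq = {"a": 8, "b": 4, "c": 2, "d": 1}
fixed_code = {"a": "00", "b": "01", "c": "10", "d": "11"}
huffman_code = {"a": "0", "b": "10", "c": "110", "d": "111"}


def encoded_bits(freq, code):
    """Total bits needed: occurrences times code length, summed over letters."""
    return sum(count * len(code[letter]) for letter, count in freq.items())


fixed_bits = encoded_bits(freq, fixed_code)      # 15 letters * 2 bits each
huffman_bits = encoded_bits(freq, huffman_code)  # 8*1 + 4*2 + 2*3 + 1*3
savings = (fixed_bits - huffman_bits) / fixed_bits

print(fixed_bits, huffman_bits, f"{savings:.1%}")  # prints: 30 25 16.7%
```

The saving depends entirely on how skewed the frequencies are. If all four letters occurred equally often, the variable-length code would cost more than the fixed 2-bit code.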
We mentioned that the occurrence frequency of every "character" in the file which is to be compressed must be known prior to building the coding tree. This information might appear in a frequency array like the one shown below:

Letter      a   b   c   d
Frequency   8   4   2   1

Building a Huffman Coding Tree

Let's assume that the file to be compressed consists of alphabetic characters, like a text file. (Huffman's coding algorithm can be used to compress a file of basically any type of object for which a frequency of occurrence can be established.)

The first step in building the Huffman coding tree is to generate the file of frequency values for each character in the file to be compressed. This can be accomplished in a number of different fashions involving a pass over the file to be compressed (works for static files) or by statistical methods (commonly historically based) applied to streams of characters to be compressed.

The next step is to generate a binary tree, which will not necessarily be balanced, that utilizes this frequency information to structure the tree. Each of the letters which appear in the file will be stored in a leaf node of the binary tree. Each non-leaf node will be the root of a subtree (think of the recursive definition of a binary tree). These internal nodes will store the sum of the frequencies of the letters stored in that subtree. In our example above, the root of the tree has a value of 15, which indicates that this subtree contains letters which occur 15 times in our "file".
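The text above describes the shape of the finished tree but not the merging procedure itself. The following is a minimal sketch, assuming the standard greedy construction: repeatedly remove the two lowest-frequency subtrees and join them under a new internal node. The names Node, build_huffman_tree, and assign_codes are illustrative, and placing the heavier subtree on the 0 branch is an assumption chosen so that the result matches the codes in the earlier example.

```python
import heapq
from itertools import count


class Node:
    """Leaf nodes hold a letter; internal nodes hold only a frequency sum."""

    def __init__(self, freq, letter=None, left=None, right=None):
        self.freq = freq
        self.letter = letter
        self.left = left
        self.right = right


def build_huffman_tree(freq):
    """Build the tree from a non-empty {letter: frequency} table."""
    tiebreak = count()  # unique counter so the heap never compares Node objects
    heap = [(f, next(tiebreak), Node(f, letter)) for letter, f in freq.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        f1, _, lighter = heapq.heappop(heap)  # lowest frequency
        f2, _, heavier = heapq.heappop(heap)  # second lowest
        # Assumed convention: heavier subtree on the left (edge 0).
        parent = Node(f1 + f2, left=heavier, right=lighter)
        heapq.heappush(heap, (parent.freq, next(tiebreak), parent))
    return heap[0][2]


def assign_codes(node, prefix="", codes=None):
    """Walk the tree, appending 0 for a left edge and 1 for a right edge."""
    if codes is None:
        codes = {}
    if node.letter is not None:
        codes[node.letter] = prefix or "0"  # single-letter alphabet edge case
    else:
        assign_codes(node.left, prefix + "0", codes)
        assign_codes(node.right, prefix + "1", codes)
    return codes


root = build_huffman_tree({"a": 8, "b": 4, "c": 2, "d": 1})
codes = assign_codes(root)
print(root.freq, codes)
# prints: 15 {'a': '0', 'b': '10', 'c': '110', 'd': '111'}
```

The root frequency of 15 agrees with the value stated above, since each internal node stores the sum of the frequencies beneath it. A binary heap is used here only as a convenient way to find the two smallest subtrees; a sorted list would give the same tree.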