14 - Huffman Coding

# 14 - Huffman Coding - Part IV Greedy Algorithms Lecture 14...


Part IV: Greedy Algorithms
Lecture 14: Huffman Coding

## Objective and Outline

Objective: Another example of greedy algorithms

Reference: Section 16.3 of CLRS

Outline:

- Coding and decoding
- The optimal source coding problem
- Huffman coding: a greedy algorithm
- Correctness

## Encoding Example

Suppose that we have a 100,000-character data file that we wish to store. The file contains only 6 distinct characters, appearing with the following frequencies:

| | a | b | c | d | e | f |
|---|---|---|---|---|---|---|
| Frequency (in thousands) | 45 | 13 | 12 | 16 | 9 | 5 |

A binary code encodes each character as a binary string, or codeword. A code is a set of codewords, e.g., {000, 001, 010, 011, 100, 101} and {0, 101, 100, 111, 1101, 1100}.

## Encoding

Given a code (corresponding to some alphabet Σ) and a message, it is easy to encode the message: just replace each character by its codeword.

Example: Σ = {a, b, c, d}

- If the code is C1 = {a = 00, b = 01, c = 10, d = 11}, then bad is encoded into 01 00 11.
- If the code is C2 = {a = 0, b = 110, c = 10, d = 111}, then bad is encoded into 110 0 111.

## Fixed-Length vs Variable-Length

In a fixed-length code, each codeword has the same length. In a variable-length code, codewords may have different lengths.

Example:

| | a | b | c | d | e | f |
|---|---|---|---|---|---|---|
| Frequency (in thousands) | 45 | 13 | 12 | 16 | 9 | 5 |
| Fixed-length code | 000 | 001 | 010 | 011 | 100 | 101 |
| Variable-length code | 0 | 101 | 100 | 111 | 1101 | 1100 |

Note that, since there are 6 characters, a fixed-length code must have at least 3 bits per codeword.

The fixed-length code requires 3 · 100,000 = 300,000 bits to store the file. The variable-length code uses only

(45 · 1 + 13 · 3 + 12 · 3 + 16 · 3 + 9 · 4 + 5 · 4) · 1000 = 224,000 bits,

a saving of 76,000 bits, or about 25%.

## Decoding

- C1 = {a = 00, b = 01, c = 10, d = 11}
- C2 = {a = 0, b = 110, c = 10, d = 111}
- C3 = {a = 1, b = 110, c = 10, d = 111}

Given an encoded message, decoding is the process of turning it back into the original message. A message is uniquely decodable (with respect to a particular code) if it can be decoded in only one way.

Example:

- Relative to C1, 010011 is uniquely decodable to bad.
- Relative to C2, 1100111 is uniquely decodable to bad.
- Relative to C3, 1101111 is not uniquely decodable, since it could have encoded, for example, either bad or acad.

In fact, one can show that every message encoded using C1 or C2 is uniquely decodable. Unique decodability is needed in order for a code to be useful.
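The encoding, cost, and decoding examples above can be checked with a minimal Python sketch. The function names `encode`, `total_bits`, and `decodings` are illustrative and do not come from the lecture. The codeword `0` for `a` is an assumption: it is the one-bit codeword implied by the 45 · 1 term in the cost sum. `decodings` is a brute-force enumeration, practical only for short bit strings.

```python
# Frequencies (in thousands) and codes taken from the examples above.
FREQ = {"a": 45, "b": 13, "c": 12, "d": 16, "e": 9, "f": 5}
FIXED = {"a": "000", "b": "001", "c": "010", "d": "011", "e": "100", "f": "101"}
# Assumption: a's codeword is "0", the one-bit codeword implied by the 45 * 1 term.
VARIABLE = {"a": "0", "b": "101", "c": "100", "d": "111", "e": "1101", "f": "1100"}

C1 = {"a": "00", "b": "01", "c": "10", "d": "11"}
C2 = {"a": "0", "b": "110", "c": "10", "d": "111"}
C3 = {"a": "1", "b": "110", "c": "10", "d": "111"}


def encode(message, code):
    """Encode a message by replacing each character with its codeword."""
    return "".join(code[ch] for ch in message)


def total_bits(freq, code):
    """Bits needed to store the file; freq counts thousands of characters."""
    return 1000 * sum(freq[ch] * len(code[ch]) for ch in freq)


def decodings(bits, code):
    """List every message whose encoding under `code` is exactly `bits`."""
    if not bits:
        return [""]
    results = []
    for ch, word in code.items():
        if bits.startswith(word):
            # Consume one codeword, then decode the remainder recursively.
            for rest in decodings(bits[len(word):], code):
                results.append(ch + rest)
    return results


if __name__ == "__main__":
    print(encode("bad", C1), encode("bad", C2))
    print(total_bits(FREQ, FIXED), total_bits(FREQ, VARIABLE))
    for name, code in (("C1", C1), ("C2", C2), ("C3", C3)):
        print(name, sorted(decodings(encode("bad", code), code)))
```

Under C1 and C2 the search finds exactly one decoding of the encoded `bad`. Under C3 it finds several, including both `bad` and `acad`, which is why C3 is not usable as a code.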