Chap1-Part3-T2_2010-11.ppt - Chapter 1: Information Sources & Source Coding (Part 3): Huffman coding, Lempel-Ziv coding, Run Length encoding




Chapter 1: Information Sources & Source Coding

Chapter 1 (Part 3)
• Huffman coding
• Lempel-Ziv coding
• Run Length encoding
• Differential coding
• Shannon-Fano coding
• Arithmetic coding

Huffman Coding

Having been introduced to prefix codes in Part 2, you will now learn how to construct a type of prefix code known as the Huffman code.

The basic idea behind Huffman coding is to encode each symbol with a binary codeword whose length is roughly equal to the amount of information conveyed by that symbol. (Why?)

The Huffman Coding Algorithm

STEP 1: The Splitting Stage
List the source symbols in order of decreasing probability. Assign a 0 and a 1 to the two symbols of lowest probability.

STEP 2: The Combining Stage
Combine the last two symbols into a single new symbol whose probability is the sum of their probabilities, and reorder the resulting probabilities. The list of symbols is now reduced by one.

STEP 3: Repeat
Repeat STEP 1 and STEP 2 until only two symbols are left, to which a 0 and a 1 are assigned.

STEP 4: Encode
The code for each source symbol is found by working backward and tracing the sequence of 0s and 1s assigned to that symbol and to its successors (the combined symbols it was merged into).

Example 1.6a

In this example, we demonstrate how a prefix code is constructed for a discrete memoryless source (DMS) with alphabet {s0, s1, s2, s3, s4} and corresponding probabilities {0.4, 0.2, 0.2, 0.1, 0.1}. Following through the Huffman algorithm, our computation ends after four iterations, resulting in the Huffman tree shown below:

[Figure: Huffman tree for Example 1.6a]

From Example 1.6a, we may make several observations:
• No two codewords consist of an identical arrangement of bits.
• No codeword is a prefix of another codeword => the Huffman code is a type of prefix code.
• Higher-probability symbols tend to have shorter codewords, and vice versa => the Huffman code is a variable-length code.
• The codewords of the two least probable symbols have equal length and differ only in the final digit.
• The average codeword length is very close to the source entropy.
• The average codeword length L satisfies H(S) ≤ L < H(S) + 1.

Variations in Huffman Coding

The Huffman tree constructed in Example 1.6a is not unique. In particular, there are two variations in the process that may produce different sets of Huffman codes for the same source:
1) At each splitting stage, the assignment of a 0 and a 1 to the last two source symbols is arbitrary.
2) When the probability of a combined symbol equals another probability in the list, the new combined symbol may be placed as high as possible or as low as possible.

Whichever variations are chosen, they must be applied consistently throughout the encoding process.
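The four steps of the Huffman algorithm, applied to the source of Example 1.6a, can be sketched in Python as follows. This is a minimal illustration, not the slides' own implementation: the function names (`huffman_code`, `average_length`, `entropy`, `is_prefix_free`), the `place_high` flag that selects between the two placement variations, and the small tolerance `EPS` used to detect equal probabilities in floating point are all assumptions introduced here. The sketch always gives a 0 to the upper of the two last entries and a 1 to the lower one, which is just one of the arbitrary assignments mentioned above.

```python
import math

EPS = 1e-12  # assumed tolerance for treating two float probabilities as equal


def huffman_code(probs, place_high=True):
    """Return {symbol: codeword} for a dict {symbol: probability}.

    place_high=True puts a combined symbol as high as possible among
    entries of equal probability; False puts it as low as possible.
    """
    # Each entry is (probability, symbols merged into this entry),
    # listed in order of decreasing probability (stable for ties).
    entries = sorted(((p, [s]) for s, p in probs.items()), key=lambda e: -e[0])
    codes = {s: "" for s in probs}
    while len(entries) > 1:
        # Splitting stage: the two lowest entries get a 0 and a 1.
        p0, syms0 = entries[-2]
        p1, syms1 = entries[-1]
        for s in syms0:
            codes[s] = "0" + codes[s]  # prepend: we are working backward
        for s in syms1:
            codes[s] = "1" + codes[s]
        # Combining stage: merge the two entries and reinsert in order.
        merged_p = p0 + p1
        rest = entries[:-2]
        if place_high:
            idx = sum(1 for p, _ in rest if p > merged_p + EPS)
        else:
            idx = sum(1 for p, _ in rest if p >= merged_p - EPS)
        rest.insert(idx, (merged_p, syms0 + syms1))
        entries = rest
    return codes


def average_length(probs, codes):
    """Average codeword length L in bits per symbol."""
    return sum(p * len(codes[s]) for s, p in probs.items())


def entropy(probs):
    """Source entropy H(S) in bits per symbol."""
    return -sum(p * math.log2(p) for p in probs.values() if p > 0)


def is_prefix_free(codes):
    """True if no codeword is a prefix of a different symbol's codeword."""
    words = list(codes.values())
    return not any(
        i != j and b.startswith(a)
        for i, a in enumerate(words)
        for j, b in enumerate(words)
    )


if __name__ == "__main__":
    source = {"s0": 0.4, "s1": 0.2, "s2": 0.2, "s3": 0.1, "s4": 0.1}
    for high in (True, False):
        code = huffman_code(source, place_high=high)
        print(high, code, round(average_length(source, code), 4))
    print("H(S) =", round(entropy(source), 4))
```

Under these assumptions, placing combined symbols as high as possible gives the codewords 00, 10, 11, 010, 011 for s0 to s4, while placing them as low as possible gives 1, 01, 000, 0010, 0011. The two codes differ, yet both are prefix-free and both have an average length of 2.2 bits per symbol. The exact bit patterns in the slide's tree may differ, since the 0/1 assignment at each split is arbitrary.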
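As a numerical check on the observations drawn from Example 1.6a, the entropy and average codeword length can be worked out by hand. The tree figure is not reproduced in this text, so the codeword lengths used below (2, 2, 2, 3, 3 for s0 to s4, which result when combined symbols are placed as high as possible) are an assumption about the variant shown; the symbols p_k and l_k for the probability and codeword length of s_k are notation introduced here.

```latex
\begin{align*}
H(S) &= \sum_{k=0}^{4} p_k \log_2 \frac{1}{p_k}
      = 0.4 \log_2 \frac{1}{0.4} + 2(0.2) \log_2 \frac{1}{0.2} + 2(0.1) \log_2 \frac{1}{0.1} \\
     &\approx 0.5288 + 0.9288 + 0.6644 \approx 2.122 \text{ bits/symbol} \\[4pt]
L    &= \sum_{k=0}^{4} p_k l_k
      = 0.4(2) + 0.2(2) + 0.2(2) + 0.1(3) + 0.1(3) = 2.2 \text{ bits/symbol} \\[4pt]
     &\; 2.122 < 2.2 < 3.122
\end{align*}
```

The average length exceeds the entropy by less than 0.08 bits per symbol, consistent with the stated bound. Choosing the other placement variation gives lengths 1, 2, 3, 4, 4, for which the same sum again evaluates to 2.2 bits per symbol.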