
# Arithmetic_coding_modified_2005 - Lecture notes on Data Compression: Arithmetic Coding


Lecture notes on Data Compression Arithmetic Coding

## Contents

- Huffman coding revisited
- History of arithmetic coding
- Ideal arithmetic coding
- Properties of arithmetic coding
## Huffman Coding Revisited: How to Create a Huffman Code

- Construct a binary tree over sets of source symbols:
  - Sort the symbols in non-decreasing order of probability.
  - Form a set from the two symbols of smallest probability.
  - Replace these two by a single set containing both symbols, whose probability is the sum of the two component probabilities.
  - Repeat the above steps until a single set contains all the symbols.
- Construct a binary tree whose nodes represent these sets, with the leaf nodes representing the individual source symbols.
- Traverse each path of the tree from the root to a symbol, assigning 0 to a left branch and 1 to a right branch. The sequence of 0s and 1s generated along the path is the code for that symbol.
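The merge procedure above can be sketched in Python with a min-heap of symbol sets. This is a minimal illustrative sketch, not code from the notes; the `huffman_codes` name, the tie-breaking counter, and the example alphabet are my own choices:

```python
import heapq

def huffman_codes(probs):
    """Build Huffman codes from a dict {symbol: probability}."""
    # Each heap entry: (probability, tie_breaker, {symbol: code_so_far}).
    # The counter breaks ties so dicts are never compared directly.
    heap = [(p, i, {s: ""}) for i, (s, p) in enumerate(probs.items())]
    heapq.heapify(heap)
    count = len(heap)
    while len(heap) > 1:
        p1, _, left = heapq.heappop(heap)    # two sets of smallest probability
        p2, _, right = heapq.heappop(heap)
        # Merging the sets prepends one bit to every codeword inside:
        merged = {s: "0" + c for s, c in left.items()}
        merged.update({s: "1" + c for s, c in right.items()})
        heapq.heappush(heap, (p1 + p2, count, merged))
        count += 1
    return heap[0][2]

codes = huffman_codes({"a": 0.5, "b": 0.25, "c": 0.125, "d": 0.125})
```

For this dyadic distribution the resulting code lengths (1, 2, 3, 3 bits) exactly match the ideal lengths -log2(p), so the average length equals the entropy.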

## Properties of Huffman Coding

- Huffman codes are minimum-redundancy codes for a given probability distribution of the message.
- Huffman coding guarantees a coding rate within one bit of the entropy: the average code length l_H of the Huffman coder on a source S is bounded by H(S) <= l_H <= H(S) + 1.
- Studies have shown that a tighter bound on Huffman coding exists: l_H < H(S) + p_max + 0.086, where p_max is the probability of the most frequently occurring symbol.
- So if p_max is quite large (as when the alphabet is small and the probabilities of the symbols are skewed), Huffman coding will be quite inefficient.
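Both bounds can be checked numerically. The sketch below computes the average Huffman code length from probabilities alone, using the fact that each merge of two subtrees adds one bit to every symbol beneath them, so the average depth is the sum of all internal-node weights. The helper name and the example distribution are illustrative assumptions, not from the notes:

```python
import heapq
import math

def huffman_avg_length(probs):
    """Average Huffman code length, from a list of probabilities."""
    heap = list(probs)
    heapq.heapify(heap)
    total = 0.0
    while len(heap) > 1:
        a = heapq.heappop(heap)
        b = heapq.heappop(heap)
        total += a + b            # each merge adds one bit to all symbols inside
        heapq.heappush(heap, a + b)
    return total

probs = [0.8, 0.1, 0.06, 0.04]    # skewed example distribution (assumed)
H = -sum(p * math.log2(p) for p in probs)
lH = huffman_avg_length(probs)
print(H, lH)   # both bounds hold: H <= lH <= H + p_max + 0.086
```

Here l_H is about 1.3 bits against an entropy of roughly 1.02 bits, illustrating the gap that appears when p_max is large.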
## Properties of Huffman Coding (continued)

- Huffman codes do not achieve true 'minimum redundancy' because they cannot assign fractional bits.
- Huffman coding needs at least one bit per symbol.
- For example, given an alphabet containing two symbols with probabilities p_1 = 0.99 and p_2 = 0.01, the optimal length for the first symbol is -log2(0.99) ≈ 0.0145 bits. Huffman coding, however, must still assign 1 bit to this symbol.
- If the alphabet is large and the probabilities are not skewed, the Huffman rate is quite close to the entropy.
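The 0.0145-bit figure from the example is a quick computation (variable names are my own):

```python
import math

p1, p2 = 0.99, 0.01
optimal_len = -math.log2(p1)                       # ideal length for the likely symbol
source_entropy = -(p1 * math.log2(p1) + p2 * math.log2(p2))
# Huffman must spend a whole bit per symbol here, far above the entropy.
print(round(optimal_len, 4), round(source_entropy, 4))
```

The entropy of this source is only about 0.081 bits/symbol, so Huffman's 1 bit/symbol is more than twelve times the optimum.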

## Properties of Huffman Coding (continued)

- If we block m symbols together, the average code length l_H of the Huffman coder on the source S is bounded, per original symbol, by H(S) <= l_H <= H(S) + 1/m.
- However, the problem is that we now need a big codebook: if the size of the original alphabet is K, the size of the new codebook is K^m.
- Thus, Huffman's performance becomes better at the expense of exponential codebook size.
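The trade-off can be seen on the skewed two-symbol source from the previous slide: blocking m symbols drives the per-symbol rate down while the codebook grows as K^m. A minimal sketch, with the same length-only Huffman helper as an assumed utility:

```python
import heapq
import math
from itertools import product

def huffman_avg_length(probs):
    """Average Huffman code length, from a list of probabilities."""
    heap = list(probs)
    heapq.heapify(heap)
    total = 0.0
    while len(heap) > 1:
        a = heapq.heappop(heap)
        b = heapq.heappop(heap)
        total += a + b
        heapq.heappush(heap, a + b)
    return total

base = [0.99, 0.01]                # K = 2 source symbols
rates = {}
for m in (1, 2, 3):
    # Codebook of size K**m: one entry per m-symbol block.
    blocked = [math.prod(c) for c in product(base, repeat=m)]
    rates[m] = huffman_avg_length(blocked) / m   # bits per original symbol
    print(m, len(blocked), round(rates[m], 4))
```

The rate falls from 1.0 toward the entropy (about 0.081 bits/symbol) as m grows, but the codebook doubles at every step.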
## Another View of Huffman Coding

- A Huffman code can be reinterpreted by mapping each symbol to a subinterval of [0, 1), positioned at the base value (cumulative probability) of that symbol's subinterval.
- The codewords, if regarded as binary fractions, are pointers to their particular subintervals.
- Extending this idea to encode an entire symbol sequence as a single subinterval leads to arithmetic coding.
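This reinterpretation can be verified directly: each codeword, read as a binary fraction, lands inside its symbol's subinterval of [0, 1). The distribution and codewords below are an assumed example (the canonical Huffman code for a dyadic source), not taken from the notes:

```python
probs = {"a": 0.5, "b": 0.25, "c": 0.125, "d": 0.125}
codes = {"a": "0", "b": "10", "c": "110", "d": "111"}

low = 0.0
pointers = {}
for sym, p in probs.items():
    # Codeword read as a binary fraction, e.g. "110" -> 0.110 (base 2) = 0.75.
    pointers[sym] = sum(int(bit) * 2.0 ** -(i + 1)
                        for i, bit in enumerate(codes[sym]))
    # The pointer falls at the base of the symbol's interval [low, low + p).
    assert low <= pointers[sym] < low + p
    print(sym, codes[sym], (low, low + p), pointers[sym])
    low += p
```

Each codeword points exactly at the base of its interval (0.0, 0.5, 0.75, 0.875), which is the picture arithmetic coding generalizes from single symbols to whole sequences.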


## This note was uploaded on 06/09/2011 for the course CAP 5015 taught by Professor Mukherjee during the Spring '11 term at University of Central Florida.
