04huffman-2x2

4.8 Huffman Codes

These lecture slides are supplied by Mathijs de Weerd.

Data Compression

Q. Given a text that uses 32 symbols (26 different letters, space, and some punctuation characters), how can we encode this text in bits?
A. We can encode $2^5 = 32$ different symbols using a fixed length of 5 bits per symbol. This is called fixed-length encoding.

Q. Some symbols (e, t, a, o, i, n) are used far more often than others. How can we use this to reduce the size of the encoding?
A. Encode these frequent symbols with fewer bits, and the others with more bits.

Q. How do we know when the next symbol begins?
A. Use a separation symbol (like the pause in Morse code), or avoid ambiguity by ensuring that no codeword is a prefix of another one.

Ex. c(a) = 01, c(b) = 010, c(e) = 1. What is 0101? It is ambiguous: it can be read as 01|01 ("aa") or as 010|1 ("be"), because c(a) is a prefix of c(b).
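The ambiguity of the code c(a) = 01, c(b) = 010, c(e) = 1 can be checked mechanically. The following is a minimal Python sketch, not part of the original slides; the function names `is_prefix_free` and `all_parsings` are hypothetical helpers chosen for illustration. It tests the "no codeword is a prefix of another" condition and lists every way the bit string 0101 can be split into codewords.

```python
def is_prefix_free(code):
    """Return True if no codeword is a prefix of a different symbol's codeword."""
    words = list(code.values())
    return not any(
        i != j and w2.startswith(w1)
        for i, w1 in enumerate(words)
        for j, w2 in enumerate(words)
    )


def all_parsings(bits, code):
    """Return every way to split `bits` into codewords, as lists of symbols."""
    if not bits:
        return [[]]  # exactly one way to parse the empty string
    results = []
    for symbol, word in code.items():
        if bits.startswith(word):
            for rest in all_parsings(bits[len(word):], code):
                results.append([symbol] + rest)
    return results


# The ambiguous example code from the slides.
ambiguous = {"a": "01", "b": "010", "e": "1"}

print(is_prefix_free(ambiguous))        # False
print(all_parsings("0101", ambiguous))  # [['a', 'a'], ['b', 'e']]
```

Two parsings come back, so a decoder cannot tell "aa" from "be" without extra information such as a separator.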
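Prefix Codes

Definition. A prefix code for a set S is a function c that maps each $x \in S$ to a string of 1s and 0s in such a way that for $x, y \in S$, $x \neq y$, c(x) is not a prefix of c(y).

Ex. c(a) = 11, c(e) = 01, c(k) = 001, c(l) = 10, c(u) = 000

Q. What is the meaning of 1001000001?
A. "leuk" (10|01|000|001)

Suppose the frequencies are known in a text of 1G symbols: $f_a = 0.4$, $f_e = 0.2$, $f_k = 0.2$, $f_l = 0.1$, $f_u = 0.1$
Q. What is the size of the encoded text?
A. Each symbol costs its frequency times its codeword length, and c(u) = 000 has 3 bits, so the average is $2 f_a + 2 f_e + 3 f_k + 2 f_l + 3 f_u = 2.3$ bits per symbol, which gives 2.3G bits.

Optimal Prefix Codes

Definition. The average bits per letter of a prefix code c is the sum over all symbols of its frequency times the number of bits of its encoding:

$$\mathrm{ABL}(c) = \sum_{x \in S} f_x \cdot |c(x)|$$

where $|c(x)|$ denotes the number of bits in c(x).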
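The decoding question and the size calculation for the example code c(a) = 11, c(e) = 01, c(k) = 001, c(l) = 10, c(u) = 000 can be reproduced with a short Python sketch. It is an illustration rather than part of the slides, and the names `decode` and `average_bits_per_letter` are hypothetical. Decoding reads bits left to right and emits a symbol as soon as the bits read so far match a codeword, which is safe only because the code is prefix-free. Every codeword here has 2 or 3 bits, so the frequency-weighted sum comes out at 2.3 bits per letter.

```python
def decode(bits, code):
    """Decode a bit string with a prefix code given as {symbol: codeword}."""
    inverse = {word: symbol for symbol, word in code.items()}
    out, current = [], ""
    for bit in bits:
        current += bit
        if current in inverse:  # first match is the only match for a prefix code
            out.append(inverse[current])
            current = ""
    if current:
        raise ValueError("trailing bits do not form a codeword")
    return "".join(out)


def average_bits_per_letter(code, freq):
    """Sum over all symbols of frequency times codeword length."""
    return sum(freq[x] * len(code[x]) for x in code)


code = {"a": "11", "e": "01", "k": "001", "l": "10", "u": "000"}
freq = {"a": 0.4, "e": 0.2, "k": 0.2, "l": 0.1, "u": 0.1}

print(decode("1001000001", code))                      # leuk
print(round(average_bits_per_letter(code, freq), 10))  # 2.3
```

For a text of 1G symbols, multiplying the average by the number of symbols gives about 2.3G bits, compared with 5G bits for a 5-bit fixed-length encoding.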