9.4 Text Representation

# The two smallest are 8 and 11; combine them


…them into a new tree. Label the new vertex with the sum of the frequencies (4 + 4 = 8).

Forest after this step: 8 (A: 4, B: 4), C: 5, D: 6, E: 12.

Repeat. Now we consider 8, 5, 6, and 12. The two smallest are 5 and 6. Combine them as in the first step, yielding a new vertex with frequency label 11.

Forest after this step: 8 (A: 4, B: 4), 11 (C: 5, D: 6), E: 12.

Repeat. Now we consider 8, 11, and 12. The two smallest are 8 and 11. Combine them into a new vertex with frequency label 19.

Forest after this step: 19 (8 (A: 4, B: 4), 11 (C: 5, D: 6)), E: 12.

Finally, combine the two remaining vertices (12 and 19) to get a single tree whose root is labeled 31.

The final code

| Letter | Frequency | Code |
|--------|-----------|------|
| A      | 4         | 000  |
| B      | 4         | 001  |
| C      | 5         | 010  |
| D      | 6         | 011  |
| E      | 12        | 1    |

Length of the entire encoding of the text, with 4 A's, 4 B's, 5 C's, 6 D's, and 12 E's:

4×3 + 4×3 + 5×3 + 6×3 + 12×1 = 12 + 12 + 15 + 18 + 12 = 69 bits

Compare to a fixed-length code. With 5 characters, we need 3 bits each:

4×3 + 4×3 + 5×3 + 6×3 + 12×3 = 12 + 12 + 15 + 18 + 36 = 93 bits

That is a savings of 24 bits. Longer text would mean more savings.

Summary

- Text is encoded in a natural way using binary.
- n bits allow 2^n characters to be represented (or: we need log2 n bits, rounded up, to represent n characters).
- ASCII is an 8-bit encoding as commonly stored; Unicode uses 16 bits.
- Fixed-length codes are simple, but not necessarily compact.
- Run-length coding takes advantage of repeated sequential characters.
- Lempel-Ziv compression takes advantage of repeating patterns that occurred previously.
- Huffman coding chooses a variable-length code with short codewords for the most frequent letters...
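The merging procedure walked through above can be sketched in Python. This is a minimal illustration, not the course's own implementation: the function name `huffman_codes`, the tie-breaking counter, and the choice to label the first-popped subtree with 0 are all assumptions. Because the 0/1 labeling of branches is arbitrary, the exact bit patterns may differ from the table (for example, E may come out as 0 rather than 1), but the codeword lengths and the 69-bit total should match the worked example.

```python
import heapq
from itertools import count


def huffman_codes(freqs):
    """Return {symbol: bitstring} by repeatedly merging the two lightest trees.

    Assumes freqs is a non-empty dict and that symbols are not tuples.
    """
    tiebreak = count()  # unique counter so tied frequencies never compare trees
    # Heap entries are (frequency, tiebreak, tree); a tree is a symbol
    # or a (left, right) pair of subtrees.
    heap = [(f, next(tiebreak), sym) for sym, f in freqs.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        f1, _, t1 = heapq.heappop(heap)  # smallest frequency
        f2, _, t2 = heapq.heappop(heap)  # second smallest
        # New vertex labeled with the sum of the two frequencies.
        heapq.heappush(heap, (f1 + f2, next(tiebreak), (t1, t2)))

    codes = {}

    def walk(tree, prefix):
        if isinstance(tree, tuple):
            walk(tree[0], prefix + "0")  # arbitrary choice: left edge is 0
            walk(tree[1], prefix + "1")
        else:
            codes[tree] = prefix or "0"  # a lone symbol still needs one bit

    walk(heap[0][2], "")
    return codes


# Frequencies from the worked example.
freqs = {"A": 4, "B": 4, "C": 5, "D": 6, "E": 12}
codes = huffman_codes(freqs)

huffman_bits = sum(freqs[s] * len(codes[s]) for s in freqs)
bits_per_char = (len(freqs) - 1).bit_length()  # 3 bits for 5 characters
fixed_bits = sum(freqs.values()) * bits_per_char

print(huffman_bits, fixed_bits, fixed_bits - huffman_bits)  # 69 93 24
```

A heap is used here only as a convenient way to find the two smallest frequencies at each step; scanning a list for the two minima would give the same tree shape for this example.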