
12-huffman - Spring 2009 CS216 Program and Data...

Data Compression: Huffman Coding
10.1.2 in Weiss
CS216: Program and Data Representation
University of Virginia Computer Science
Spring 2009
Aaron Bloomfield

2 Why compress files?
• Disk space is limited
• File transfer
  – Bigger files take longer to transfer
• Smaller file might fit in memory more easily
3 What is a file?
• Named collection of information
  – C++ program
  – Application executables
  – Word documents
  – Email
  – Web pages
  – Pictures, audio, video
Q: Which of these needs to be exactly the same when we use them?

4 Data Compression
X (original) → Encoder → Y (compressed) → Decoder → X’ (decompressed)
• Lossless compression: X = X’
• Lossy compression: X != X’
  – information is lost (irreversible)
• Compression ratio: |X|/|Y|
  – where |X| is the number of bits in X
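The encoder/decoder picture and the ratio |X|/|Y| can be tried out with a short Python sketch. It assumes the standard-library zlib module as a stand-in lossless encoder; the sample input and the helper name compression_ratio are hypothetical, chosen only for illustration.

```python
import zlib


def compression_ratio(original: bytes, compressed: bytes) -> float:
    """Return |X| / |Y|, with both sizes measured in bits."""
    return (len(original) * 8) / (len(compressed) * 8)


# Hypothetical sample input: very repetitive, so it compresses well.
x = b"hello there " * 100

y = zlib.compress(x)          # encoder: X -> Y
x_prime = zlib.decompress(y)  # decoder: Y -> X'

print(x == x_prime)                 # True: lossless, so X = X'
print(compression_ratio(x, y) > 1)  # True: Y has fewer bits than X
```

A lossy codec would fail the first check by design, since X’ only approximates X.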
5 Lossy Compression
• Some data is lost, but not too much.
Standards:
• JPEG (Joint Photographic Experts Group)
  – Still images
• MPEG (Moving Picture Experts Group)
  – Audio and video
• MP3 (MPEG-1 Audio Layer III), Ogg Vorbis
Compression ratios of 10:1 are possible

6 JPEG image quality comparison
Quality   Image size   Relative size
100       83,261       100%
50        15,138       18%
25        9,553        11%
10        4,787        6%
1         1,523        2%
7 Lossless Compression
• No data is lost.
Standards:
• Gzip, Unix compress, zip, Morse code
• PNG image file formats
• Run-length encoding (RLE)
Can get compression ratios of 4:1
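Run-length encoding is the simplest scheme on this list, so it makes a compact illustration of "no data is lost". The Python sketch below is one possible formulation, not a standard file format: it stores each run as a (symbol, count) pair, and the function names rle_encode and rle_decode are hypothetical.

```python
from itertools import groupby


def rle_encode(text):
    """Collapse each run of identical characters into a (char, count) pair."""
    return [(ch, len(list(run))) for ch, run in groupby(text)]


def rle_decode(pairs):
    """Expand (char, count) pairs back into the original string."""
    return "".join(ch * count for ch, count in pairs)


encoded = rle_encode("aaaabbbcca")
print(encoded)              # [('a', 4), ('b', 3), ('c', 2), ('a', 1)]
print(rle_decode(encoded))  # aaaabbbcca
```

RLE only pays off when the input has long runs; text with few repeated neighbours can come out larger than it went in.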

8 Lossless Compression of Text
ASCII = fixed 8 bits per character
Example: “hello there”
  – 11 characters * 8 bits = 88 bits
Can we encode this message using fewer bits?
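One way to start on that question, before any variable-length code, is to count how many distinct symbols the message actually uses. The Python sketch below redoes the 88-bit arithmetic and then assumes a hypothetical fixed-width code sized to the message's own alphabet; that second step is an illustration added here, not something the slide proposes.

```python
import math

message = "hello there"

# 8 bits per character, as on the slide.
ascii_bits = len(message) * 8

# Assumption: a fixed-width code with just enough bits to tell the
# message's distinct symbols apart (h, e, l, o, space, t, r).
distinct = len(set(message))
fixed_width = math.ceil(math.log2(distinct))
fixed_bits = len(message) * fixed_width

print(ascii_bits)   # 88
print(distinct)     # 7
print(fixed_width)  # 3
print(fixed_bits)   # 33
```

A narrower fixed-width code already helps; a code whose lengths depend on symbol frequency can do better still.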
9 Huffman Coding
• Uses frequencies of symbols in a string to build a prefix code.
• The more frequent a character is, the fewer bits we’ll use to represent it
Prefix code: no code in our encoding is a prefix of another code.
Letter   Code
a        0
b        100
c        101
d        11
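The two ideas on this slide, building a code from frequencies and the prefix property, can be sketched in Python as follows. This is a minimal sketch of the usual greedy construction (repeatedly merge the two least frequent subtrees) using the standard-library heapq module, not the course's own implementation. The names huffman_codes and is_prefix_free are hypothetical, and the exact bit patterns depend on how ties are broken.

```python
import heapq
from collections import Counter


def huffman_codes(text):
    """Build a {symbol: bitstring} prefix code from symbol frequencies."""
    freq = Counter(text)
    # Heap entries are (frequency, tie_breaker, tree). A tree is either a
    # single symbol or a (left, right) pair. The unique tie_breaker keeps
    # the heap from ever comparing two trees directly.
    heap = [(count, i, sym) for i, (sym, count) in enumerate(freq.items())]
    heapq.heapify(heap)
    next_id = len(heap)
    while len(heap) > 1:
        f1, _, t1 = heapq.heappop(heap)  # least frequent subtree
        f2, _, t2 = heapq.heappop(heap)  # second least frequent subtree
        heapq.heappush(heap, (f1 + f2, next_id, (t1, t2)))
        next_id += 1

    codes = {}

    def walk(tree, prefix):
        if isinstance(tree, tuple):
            walk(tree[0], prefix + "0")  # left branch
            walk(tree[1], prefix + "1")  # right branch
        else:
            codes[tree] = prefix or "0"  # lone-symbol input still needs a bit

    walk(heap[0][2], "")
    return codes


def is_prefix_free(codes):
    """True if no code word is a prefix of a different code word."""
    words = list(codes.values())
    return not any(
        i != j and b.startswith(a)
        for i, a in enumerate(words)
        for j, b in enumerate(words)
    )


codes = huffman_codes("hello there")
total_bits = sum(len(codes[ch]) for ch in "hello there")
print(total_bits)             # 30
print(is_prefix_free(codes))  # True

# The table from the slide is also a prefix code.
print(is_prefix_free({"a": "0", "b": "100", "c": "101", "d": "11"}))  # True
```

The 30-bit total counts only the encoded message; a real file would also need to store the code table or the tree.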

10 Decoding a Prefix Code
Create the Huffman coding tree
Loop
    start at root of tree
    loop
        if bit read = 1 then go right
        else, go left
    until node is a leaf
    Report character found!
Until there are no more bits to read
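The decoding loop above can be sketched in Python using the a/b/c/d code table from the Huffman Coding slide. The tree representation (nested dictionaries keyed by "0" for left and "1" for right) and the names build_tree and decode are assumptions made for this illustration.

```python
def build_tree(codes):
    """Turn a {symbol: bitstring} table into a nested-dict coding tree."""
    root = {}
    for symbol, bits in codes.items():
        node = root
        for bit in bits[:-1]:
            node = node.setdefault(bit, {})  # create inner nodes as needed
        node[bits[-1]] = symbol              # the last bit leads to a leaf
    return root


def decode(bits, root):
    """Walk the tree bit by bit, reporting a character at every leaf."""
    out = []
    node = root                    # start at root of tree
    for bit in bits:
        node = node[bit]           # "1" = go right, "0" = go left
        if isinstance(node, str):  # node is a leaf
            out.append(node)       # report character found
            node = root            # restart at the root for the next one
    return "".join(out)


codes = {"a": "0", "b": "100", "c": "101", "d": "11"}
tree = build_tree(codes)
print(decode("010011101", tree))  # abdc
```

Because no code word is a prefix of another, the decoder never needs separators between characters: reaching a leaf is unambiguous.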