12-huffman

Data Compression: Huffman Coding
10.1.2 in Weiss
CS216: Program and Data Representation
University of Virginia Computer Science
Spring 2009
Aaron Bloomfield

Why compress files?
• Disk space is limited
• File transfer
  – Bigger files take longer to transfer
• Smaller file might fit in memory more easily
What is a file?
• Named collection of information
  – C++ program
  – Application executables
  – Word documents
  – Email
  – Web pages
  – Pictures, audio, video
Q: Which of these needs to be exactly the same when we use them?

Data Compression
X (original) → Encoder → Y (compressed) → Decoder → X’ (decompressed)
• Lossless compression: X = X’
• Lossy compression: X != X’
  – information is lost (irreversible)
• Compression ratio: |X|/|Y|
  – where |X| is the # of bits in X
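The encoder/decoder picture and the ratio |X|/|Y| can be tried out in a few lines. This is only a sketch: it uses Python's standard `zlib` module as a stand-in lossless encoder (the slides do not name one here), and the helper name `compression_ratio` and the sample message are made up for illustration.

```python
import zlib


def compression_ratio(original: bytes, compressed: bytes) -> float:
    """Return |X| / |Y|, with both sizes measured in bits."""
    return (len(original) * 8) / (len(compressed) * 8)


# X: a deliberately repetitive message (an assumption for this demo),
# so that a lossless coder has redundancy to exploit.
x = b"hello there " * 100

y = zlib.compress(x)          # encoder: X -> Y
x_prime = zlib.decompress(y)  # decoder: Y -> X'

print("lossless (X == X'):", x == x_prime)
print("compression ratio: %.1f : 1" % compression_ratio(x, y))
```

A lossy coder would fail the `x == x_prime` check by design; the ratio is computed the same way in both cases.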
Lossy Compression
• Some data is lost, but not too much.
Standards:
• JPEG (Joint Photographic Experts Group)
  – Still images
• MPEG (Moving Picture Experts Group)
  – Audio and video
• MP3 (MPEG-1 Audio Layer 3), Ogg Vorbis
Compression ratios of 10:1 are possible

JPEG image quality comparison
Quality = 100 – Image size: 83,261 (100%)
Quality = 50 – Image size: 15,138 (18%)
Quality = 25 – Image size: 9,553 (11%)
Quality = 10 – Image size: 4,787 (6%)
Quality = 1 – Image size: 1,523 (2%)
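The percentages in this comparison follow directly from the listed sizes, taking the Quality = 100 image as the baseline (as its 100% suggests). A small sketch of that arithmetic is below; the slide does not state the unit of "image size", so the numbers are treated as plain counts, and the names `sizes` and `percent_of_baseline` are illustrative only.

```python
# Image sizes copied from the comparison above, keyed by JPEG quality setting.
sizes = {100: 83261, 50: 15138, 25: 9553, 10: 4787, 1: 1523}

baseline = sizes[100]  # assumption: the quality-100 image is the reference


def percent_of_baseline(size: int) -> int:
    """Size as a whole-number percentage of the quality-100 image."""
    return round(100 * size / baseline)


for quality, size in sizes.items():
    ratio = baseline / size  # compression ratio relative to the baseline
    print(f"quality {quality:3d}: {percent_of_baseline(size):3d}%  "
          f"ratio {ratio:.1f} : 1")
```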
Lossless Compression
• No data is lost.
Standards:
• Gzip, Unix compress, zip, Morse code
• PNG image file formats
• Run-length encoding (RLE)
Can get compression ratios of 4:1
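Of the schemes listed, run-length encoding is the easiest to sketch: each run of repeated symbols is replaced by the symbol and a count. The version below is a minimal illustration, not any particular file format's RLE; the function names and the (symbol, count) pair representation are assumptions.

```python
from itertools import groupby


def rle_encode(text):
    """Collapse each run of equal characters into a (symbol, run_length) pair."""
    return [(symbol, len(list(run))) for symbol, run in groupby(text)]


def rle_decode(pairs):
    """Expand (symbol, run_length) pairs back into the original string."""
    return "".join(symbol * length for symbol, length in pairs)


encoded = rle_encode("aaaabbbcca")
print(encoded)              # [('a', 4), ('b', 3), ('c', 2), ('a', 1)]
print(rle_decode(encoded))  # aaaabbbcca
```

Because decoding reproduces the input exactly, this is lossless; it only saves space when the data actually contains long runs.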

Lossless Compression of Text
ASCII = fixed 8 bits per character
Example: “hello there”
  – 11 characters * 8 bits = 88 bits
Can we encode this message using fewer bits?
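A quick way to explore that question is to count how often each character occurs. The sketch below reproduces the 88-bit figure and also shows that the message uses only 7 distinct characters, so even a fixed-width code tailored to this message would need just 3 bits per character. Variable names are illustrative.

```python
from collections import Counter

message = "hello there"

# Fixed 8 bits per character, as on the slide.
fixed_bits = len(message) * 8
print(fixed_bits)  # 88

# How often does each character occur?
frequencies = Counter(message)
print(frequencies.most_common())

# With n distinct symbols, a fixed-width code needs ceil(log2(n)) bits each;
# (n - 1).bit_length() computes that without floating point (for n >= 2).
distinct = len(frequencies)
narrow_width = (distinct - 1).bit_length()
narrow_bits = len(message) * narrow_width
print(distinct, narrow_width, narrow_bits)  # 7 3 33
```

The uneven counts ('e' occurs three times, several characters only once) are what a variable-length code can exploit to go lower still.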
Huffman Coding
• Uses frequencies of symbols in a string to build a prefix code.
• The more frequent a character is, the fewer bits we’ll use to represent it
Prefix Code
  – no code in our encoding is a prefix of another code.
Letter  code
a       0
b       100
c       101
d       11
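Both ideas on this slide can be sketched in code: checking that a table is a prefix code, and building one from symbol frequencies. The construction below is the standard greedy Huffman procedure (repeatedly merge the two least frequent subtrees), written with a heap; it is a sketch under that assumption, not necessarily the implementation used in the course, and all function names are illustrative.

```python
import heapq
from collections import Counter
from itertools import count, permutations


def is_prefix_code(codes):
    """True if no code word is a prefix of a different entry's code word."""
    return not any(b.startswith(a) for a, b in permutations(codes.values(), 2))


def huffman_codes(text):
    """Build a {symbol: bit-string} table by greedy merging of subtrees."""
    freq = Counter(text)
    if not freq:
        return {}
    if len(freq) == 1:  # a single distinct symbol still needs one bit
        return {symbol: "0" for symbol in freq}
    tiebreak = count()  # unique counter so heap never compares the dicts
    heap = [(weight, next(tiebreak), {symbol: ""})
            for symbol, weight in freq.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        w0, _, left = heapq.heappop(heap)   # two lightest subtrees
        w1, _, right = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in left.items()}
        merged.update({s: "1" + c for s, c in right.items()})
        heapq.heappush(heap, (w0 + w1, next(tiebreak), merged))
    return heap[0][2]


def encode(text, codes):
    return "".join(codes[symbol] for symbol in text)


# The table from the slide is a prefix code.
slide_codes = {"a": "0", "b": "100", "c": "101", "d": "11"}
print(is_prefix_code(slide_codes))  # True
print(encode("abad", slide_codes))  # 0100011

# A Huffman code for "hello there" needs 30 bits instead of 88.
codes = huffman_codes("hello there")
print(len(encode("hello there", codes)))  # 30
```

The exact bit patterns depend on how ties between equal frequencies are broken, but the total encoded length for a given message does not.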

Decoding a Prefix Code
Create the Huffman coding tree
Loop
    start at root of tree
    loop
        if bit read = 1 then go right
        else, go left
    until node is a leaf
    Report character found!
Until end of the message
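The decoding loop above can be sketched as follows. The tree is rebuilt here from a code table (the a/b/c/d table shown earlier in the slides), and the `Node`, `build_tree`, and `decode` names are illustrative. The sketch assumes the bit string is a valid encoding; it does no error handling for malformed input.

```python
class Node:
    def __init__(self):
        self.left = None    # child reached on bit 0
        self.right = None   # child reached on bit 1
        self.symbol = None  # set only on leaves


def build_tree(codes):
    """Create the coding tree from a {symbol: bit-string} prefix code."""
    root = Node()
    for symbol, code in codes.items():
        node = root
        for bit in code:
            if bit == "1":
                if node.right is None:
                    node.right = Node()
                node = node.right
            else:
                if node.left is None:
                    node.left = Node()
                node = node.left
        node.symbol = symbol
    return root


def decode(bits, root):
    """Walk the tree bit by bit; report a character at each leaf, then restart."""
    out = []
    node = root                      # start at root of tree
    for bit in bits:
        node = node.right if bit == "1" else node.left
        if node.symbol is not None:  # reached a leaf
            out.append(node.symbol)  # report character found
            node = root              # back to the root for the next character
    return "".join(out)


codes = {"a": "0", "b": "100", "c": "101", "d": "11"}
root = build_tree(codes)
print(decode("0100011", root))  # abad
```

The single `for` loop with a reset to the root is equivalent to the nested loops in the pseudocode: the inner loop ends at each leaf, the outer one at the end of the message. No separators are needed between code words because no code is a prefix of another.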

This note was uploaded on 09/11/2009 for the course CS 216 taught by Professor Bloomfield during the Spring '08 term at UVA.
