compress

compress - Robert Sedgewick and Kevin Wayne Copyright 2006

Data Compression

Robert Sedgewick and Kevin Wayne, Copyright 2006, http://www.Princeton.EDU/~cos226

Reference: Chapter 22, Algorithms in C, 2nd Edition, Robert Sedgewick.
Reference: Introduction to Data Compression, Guy Blelloch.

Data Compression

Compression reduces the size of a file:
- To save space when storing it.
- To save time when transmitting it.
- Most files have lots of redundancy.

Who needs compression?
- Moore's law: # transistors on a chip doubles every 18-24 months.
- Parkinson's law: data expands to fill space available.
- Text, images, sound, video, ...

Basic concepts ancient (1950s), best technology recently developed.

All of the books in the world contain no more information than is broadcast as video in a single large American city in a single year. Not all bits have equal value. - Carl Sagan

Applications of Data Compression

Generic file compression.
- Files: GZIP, BZIP, BOA.
- Archivers: PKZIP.
- File systems: NTFS.

Multimedia.
- Images: GIF, JPEG.
- Sound: MP3.
- Video: MPEG, DivX, HDTV.

Communication.
- ITU-T T4 Group 3 Fax.
- V.42bis modem.

Databases. Google.

Encoding and Decoding

Message. Binary data M we want to compress.
Encode. Generate a "compressed" representation C(M).
Decode. Reconstruct original message or some approximation M'.

[Diagram: M -> Encoder -> C(M) -> Decoder -> M'; C(M) hopefully uses fewer bits.]

Compression ratio. Bits in C(M) / bits in M.

Lossless. M = M', 50-75% or lower.
Ex. Natural language, source code, executables.

Lossy. M ≠ M', 10% or lower.
Ex. Images, sound, video.

Ancient Ideas

Ancient ideas.
- Braille.
- Morse code.
- Natural languages.
- Mathematical notation.
- Decimal number system.

"Poetry is the art of lossy data compression."

Natural Encoding

Natural encoding. (19 × 51) + 6 = 975 bits.
(The 6 extra bits are needed to encode the number of characters per line.)

000000000000000000000000000011111111111111000000000
000000000000000000000000001111111111111111110000000
000000000000000000000001111111111111111111111110000
000000000000000000000011111111111111111111111111000
000000000000000000001111111111111111111111111111110
000000000000000000011111110000000000000000001111111
000000000000000000011111000000000000000000000011111
000000000000000000011100000000000000000000000000111
000000000000000000011100000000000000000000000000111
000000000000000000011100000000000000000000000000111
000000000000000000011100000000000000000000000000111
000000000000000000001111000000000000000000000001110
000000000000000000000011100000000000000000000111000
011111111111111111111111111111111111111111111111111
011111111111111111111111111111111111111111111111111
011111111111111111111111111111111111111111111111111
011111111111111111111111111111111111111111111111111
011111111111111111111111111111111111111111111111111
011000000000000000000000000000000000000000000000011

19-by-51 raster of letter 'q' lying on its side.

Run-Length Encoding

Natural encoding. (19 × 51) + 6 = 975 bits. ...
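
The preview cuts off before the run-length scheme itself appears, so what follows is only a minimal sketch of one plausible scheme, not the slides' own. It assumes each row is stored as alternating run lengths of 0s and 1s (starting with 0s), that every run length costs a fixed 6 bits (enough because 51 < 2^6), and that the same 6-bit header as in the natural encoding is kept. The names ROW_RUNS, COUNT_BITS, decode_row and encode_row are hypothetical, and the run lengths were read off the raster by hand, so they should be checked against it. The last lines apply the compression-ratio definition given earlier (bits in C(M) / bits in M).

```python
from itertools import groupby

WIDTH = 51       # pixels per row, from the slide
COUNT_BITS = 6   # assumption: each run length is a fixed 6-bit count (51 < 2**6)

# Run lengths per row, alternating 0-runs and 1-runs and starting with 0s.
# Hand-transcribed from the 19-by-51 raster; hypothetical name, not from the slides.
ROW_RUNS = [
    [28, 14, 9],
    [26, 18, 7],
    [23, 24, 4],
    [22, 26, 3],
    [20, 30, 1],
    [19, 7, 18, 7],
    [19, 5, 22, 5],
    [19, 3, 26, 3],
    [19, 3, 26, 3],
    [19, 3, 26, 3],
    [19, 3, 26, 3],
    [20, 4, 23, 3, 1],
    [22, 3, 20, 3, 3],
    [1, 50],
    [1, 50],
    [1, 50],
    [1, 50],
    [1, 50],
    [1, 2, 46, 2],
]


def decode_row(runs):
    """Expand alternating run lengths (0s first) into a string of bits."""
    return "".join(str(i % 2) * n for i, n in enumerate(runs))


def encode_row(row):
    """Collapse a bit string into alternating run lengths, starting with 0s."""
    runs = [len(list(group)) for _, group in groupby(row)]
    if row.startswith("1"):
        runs.insert(0, 0)  # an empty leading run of 0s keeps the alternation
    return runs


# Rebuild the raster, then measure both encodings.
raster = [decode_row(runs) for runs in ROW_RUNS]

# Natural encoding: one bit per pixel plus the 6-bit header from the slide.
natural_bits = len(raster) * WIDTH + COUNT_BITS

# Run-length encoding under the assumptions above: 6 bits per run plus the header.
rle_bits = sum(len(encode_row(row)) for row in raster) * COUNT_BITS + COUNT_BITS

# Compression ratio as defined earlier: bits in C(M) / bits in M.
ratio = rle_bits / natural_bits

print(natural_bits)
print(rle_bits)
print(f"{ratio:.1%}")
```

Under these assumptions the script prints 975 (matching the slide's natural encoding), 384, and 39.4%: the raster has 63 runs, so the run-length form needs (63 × 6) + 6 bits. A different cost model for the run lengths, such as variable-length counts, would give a different ratio.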

This document was uploaded on 06/10/2011.
