15-853: Algorithms in the Real World
Data Compression: Lectures 1 and 2

Compression in the Real World

Generic file compression
– Files: gzip (LZ77), bzip (Burrows-Wheeler), BOA (PPM)
– Archivers: ARC (LZW), PKZip (LZW+)
– File systems: NTFS

Communication
– Fax: ITU-T Group 3 (run-length + Huffman)
– Modems: V.42bis protocol (LZW), MNP5 (run-length + Huffman)
– Virtual connections

Multimedia
– Images: GIF (LZW), JBIG (context), JPEG-LS (residual), JPEG (transform + RL + arithmetic)
– TV: HDTV (MPEG-4)
– Sound: MP3

Other structures
– Indexes: Google, Lycos
– Meshes (for graphics): Edgebreaker
– Graphs
– Databases

Compression Outline
– Introduction: lossless vs. lossy, model and coder, benchmarks
– Information Theory: entropy, etc.
– Probability Coding: Huffman + arithmetic coding
– Applications of Probability Coding: PPM + others
– Lempel-Ziv Algorithms: LZ77, gzip, compress, ...
– Other Lossless Algorithms: Burrows-Wheeler
– Lossy algorithms for images: JPEG, MPEG, ...
– Compressing graphs and meshes: BBK
Encoding/Decoding

We will use "message" in the generic sense to mean the data to be compressed.

Input Message → Encoder → Compressed Message → Decoder → Output Message

The encoder and decoder must agree on a common compressed format.
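To make the shared-format point concrete, here is a minimal sketch of a matched encoder/decoder pair, using a toy run-length scheme (not one of the codecs listed above); decoding works only because both sides assume the same (count, symbol) convention:

```python
def encode(message):
    """Run-length encode a string into (count, symbol) pairs."""
    runs = []
    for ch in message:
        if runs and runs[-1][1] == ch:
            runs[-1] = (runs[-1][0] + 1, ch)   # extend the current run
        else:
            runs.append((1, ch))               # start a new run
    return runs

def decode(runs):
    """Invert encode; relies on the same (count, symbol) format."""
    return "".join(ch * count for count, ch in runs)

message = "aaabbbbccd"
assert decode(encode(message)) == message      # lossless round trip
print(encode(message))  # [(3, 'a'), (4, 'b'), (2, 'c'), (1, 'd')]
```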
Lossless vs. Lossy

Lossless: input message = output message
Lossy: input message ≠ output message

Lossy does not necessarily mean loss of quality; in fact, the output can be "better" than the input:
– Drop random noise in images (dust on the lens)
– Drop background noise in music
– Fix spelling errors in text; put it into better form

Writing is the art of lossy text compression.

How much can we compress?

For lossless compression, assuming all input messages are valid, if even one string is compressed, some other string must expand. The reason is a counting (pigeonhole) argument: a lossless compressor must map distinct inputs to distinct outputs, and there are fewer short strings than messages, so not every message can shrink.
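A minimal sketch of that counting argument in code, with an illustrative choice of n:

```python
# Pigeonhole sketch: a lossless compressor is injective (distinct inputs
# must keep distinct outputs, or decoding is ambiguous). There are 2**n
# bit strings of length n but only 2**n - 1 of length < n, so if every
# length-n input mapped to a strictly shorter string, two would collide.
n = 16
inputs = 2 ** n
shorter_outputs = sum(2 ** i for i in range(n))   # lengths 0 .. n-1
assert shorter_outputs == inputs - 1
print(f"{inputs} inputs of length {n}, only {shorter_outputs} shorter strings")
```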
Model vs. Coder

To compress we need a bias on the probability of messages; the model determines this bias.

Messages → Model → Probabilities → Coder → Bits
(the model and the coder together make up the encoder)

Example models:
– Simple: character counts, repeated strings
– Complex: models of a human face
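A minimal sketch of the split, assuming a zero-order character-count model and an idealized coder that just reports the -log2 p(ch) cost an entropy coder would pay per symbol (real coders such as Huffman and arithmetic coding come later):

```python
import math
from collections import Counter

def model(text):
    """Model: zero-order character counts -> symbol probabilities."""
    counts = Counter(text)
    return {ch: c / len(text) for ch, c in counts.items()}

def coder_bits(text, probs):
    """Coder stand-in: an ideal entropy coder spends -log2 p(ch)
    bits per symbol, given the model's probabilities."""
    return sum(-math.log2(probs[ch]) for ch in text)

text = "abracadabra"
probs = model(text)
print(f"{coder_bits(text, probs):.1f} bits vs. {8 * len(text)} bits raw")
```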
Quality of Compression

Runtime vs. compression vs. generality.

Several standard corpora exist for comparing algorithms, e.g. the Calgary Corpus: 2 books, 5 papers, 1 bibliography, 1 collection of news articles, 3 programs, 1 terminal session, 2 object files, 1 file of geophysical data, and 1 black-and-white bitmap image.

The Archive Comparison Test maintains a comparison of just about all publicly available algorithms.

Comparison of Algorithms

Program  Algorithm   Time (compress+decompress)  BPC   Score
RK       LZ + PPM    111+115                     1.79  430
BOA      PPM var.    94+97                       1.91  407
PPMD     PPM         11+20                       2.07  265
IMP      BW          10+3                        2.14  254
BZIP     BW          20+6                        2.19  273
GZIP     LZ77 var.   19+5                        2.59  318
LZ77     LZ77        ?                           3.94  ?

BPC is bits per character of compressed output; lower is better.
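As an illustration of the BPC column, here is one way to measure bits per character with Python's standard gzip module; the input below is a toy string, not a Calgary Corpus file, so the number is only illustrative:

```python
import gzip

# BPC = compressed size in bits / input size in characters.
text = ("the quick brown fox jumps over the lazy dog " * 200).encode()
compressed = gzip.compress(text, compresslevel=9)  # gzip = an LZ77 variant
bpc = 8 * len(compressed) / len(text)
print(f"{len(text)} bytes -> {len(compressed)} bytes = {bpc:.2f} BPC")
```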
Compression Outline
– Introduction: lossy vs. lossless, benchmarks, ...
– Information Theory: entropy, conditional entropy, entropy of the English language
– Probability Coding: Huffman + arithmetic coding
– Applications of Probability Coding: PPM + others
– Lempel-Ziv Algorithms: LZ77, gzip, compress, ...
