
# 14 Compression Part 2


Compression, part 2
15-211: Fundamental Data Structures and Algorithms
Charlie Garrod
04 March 2010
Reading for today:


4 Announcements
- HW4 is available
- Theory due in lecture Thursday, 18 March
- Programming due 11:59 pm, Monday, 22 March
5 Last time: Prefix-free codes
“aaaebcbacfcaaaababaabbcciabbbbaaddcaccabcahdbaacabccbbaaba…”
[Figure: binary code tree with edges labeled 0/1 and leaves a through i; codewords 10, 11, 01, 0010, 0000, 0001, 00110, 001111, 001110]
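
The defining property of the code on this slide is that no codeword is a prefix of another, so a decoder never needs separators between codewords. A minimal sketch that checks this for the nine codewords shown; the function name `is_prefix_free` is made up for illustration, and the letter-to-codeword assignment is not needed for the check:

```python
def is_prefix_free(codewords):
    """Return True if no codeword is a prefix of a different codeword."""
    # After sorting, any word that extends `a` sits directly after `a`,
    # so comparing neighbours is enough.
    words = sorted(codewords)
    return all(not b.startswith(a) for a, b in zip(words, words[1:]))

# Codewords as listed on the slide.
slide_codewords = ["10", "11", "01", "0010", "0000", "0001",
                   "00110", "001111", "001110"]

print(is_prefix_free(slide_codewords))   # True
print(is_prefix_free(["0", "01", "11"])) # False: "0" is a prefix of "01"
```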

6 Last time: Huffman’s algorithm
“aaaebcbacfcaaaababaabbcciabbbbaaddcaccabcahdbaacabccbbaaba…”
1) Count frequencies for each letter in message
2) Use frequencies to build Huffman code dictionary
3) Use Huffman code to encode message
4) Write the dictionary and the encoded message
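
The four steps above can be sketched roughly as follows. This is an illustrative sketch, not the course's reference implementation: the names `huffman_code`, `encode`, and `decode` are hypothetical, bits are kept as Python strings for readability, and writing the dictionary to a file (step 4) is left out.

```python
import heapq
from collections import Counter
from itertools import count

def huffman_code(message):
    """Steps 1 and 2: count letters, then build {letter: bitstring}."""
    freqs = Counter(message)
    if len(freqs) == 1:
        # A one-letter alphabet still needs one bit per letter.
        return {next(iter(freqs)): "0"}
    tiebreak = count()  # keeps heap entries comparable when weights tie
    heap = [(f, next(tiebreak), {sym: ""}) for sym, f in freqs.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        # Merge the two lightest subtrees; prefix their codes with 0 and 1.
        f0, _, left = heapq.heappop(heap)
        f1, _, right = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in left.items()}
        merged.update({s: "1" + c for s, c in right.items()})
        heapq.heappush(heap, (f0 + f1, next(tiebreak), merged))
    return heap[0][2]

def encode(message, code):
    """Step 3: replace every letter by its codeword."""
    return "".join(code[s] for s in message)

def decode(bits, code):
    """Read bits until they match a codeword; prefix-freeness makes this safe."""
    inverse = {c: s for s, c in code.items()}
    out, current = [], ""
    for bit in bits:
        current += bit
        if current in inverse:
            out.append(inverse[current])
            current = ""
    return "".join(out)

message = "abcd" * 70
code = huffman_code(message)
bits = encode(message, code)
print(len(bits))                      # 560
print(decode(bits, code) == message)  # True
```
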
7 Last time: Huffman’s algorithm
“abcdabcdabcdabcdabcdabcdabcdabcdabcdabcd…” (70 times)

8 Last time: Huffman’s algorithm
“abcdabcdabcdabcdabcdabcdabcdabcdabcdabcd…” (70 times)
[Figure: Huffman tree with edges labeled 0/1 and leaves a, b, c, d; codewords 00, 01, 10, 11]
Length:
9 Last time: Huffman’s algorithm
“abcdabcdabcdabcdabcdabcdabcdabcdabcdabcd…” (70 times)
[Figure: Huffman tree with edges labeled 0/1 and leaves a, b, c, d; codewords 00, 01, 10, 11]
Length: Send the dictionary, plus 560 bits for the message
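
The 560-bit figure can be checked by hand, assuming "(70 times)" means the four-letter pattern abcd is repeated 70 times and each letter receives a 2-bit codeword:

$$
\underbrace{4 \times 70}_{280\ \text{letters}} \times\ 2\ \text{bits per letter} = 560\ \text{bits}
$$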

10 Last time: Huffman’s algorithm
“aaaaaaa…aabbbbbbb…bbcccccccc…ccddddddd….dd” (70 a’s, 70 b’s, 70 c’s, 70 d’s)
[Figure: Huffman tree with edges labeled 0/1 and leaves a, b, c, d; codewords 00, 01, 10, 11]
Length: Send the dictionary, plus 560 bits for the message
11 Last time: Huffman’s algorithm
“1111111111111111110111111111111111111111001111111111101111…”
[Figure: Huffman tree with two leaves, 1 and 0, and edges labeled 0/1]

12 Today: Compression, part 2
- N-gram compression
- Adaptive compression algorithms
- Move-to-front
- LZW
- Java bytes
13 N-gram compression
“aaaaaaa…aabbbbbbb…bbcccccccc…ccddddddd….dd” (70 a’s, 70 b’s, 70 c’s, 70 d’s)
1) Break message into chunks of size N
2) Apply Huffman’s algorithm to the chunks
e.g., 4-gram compression:

14 N-gram compression
“aaaaaaa…aabbbbbbb…bbcccccccc…ccddddddd….dd” (70 a’s, 70 b’s, 70 c’s, 70 d’s)
e.g., 4-gram compression:
[Figure: Huffman tree with edges labeled 0/1 over the chunks aaaa, aabb, bbbb, cccc, ccdd, dddd; codewords 00, 01, 10, 110, 1110, 1111]
Length: Send the dictionary, plus 161 bits for the message
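
The 161-bit figure can be reproduced with a short sketch of the two N-gram steps. The helper names `split_into_ngrams` and `huffman_total_bits` are hypothetical. Rather than building the code table, the second helper uses the fact that every Huffman merge adds one bit to each chunk beneath it, so the encoded length equals the sum of the merged weights.

```python
import heapq
from collections import Counter

def split_into_ngrams(message, n):
    """Step 1: cut the message into chunks of n letters (the last may be shorter)."""
    return [message[i:i + n] for i in range(0, len(message), n)]

def huffman_total_bits(symbols):
    """Step 2, length only: bits needed to Huffman-encode the symbol sequence."""
    weights = list(Counter(symbols).values())
    if len(weights) == 1:
        return weights[0]  # one distinct symbol: one bit per occurrence
    heapq.heapify(weights)
    total = 0
    while len(weights) > 1:
        # Merging two subtrees makes every symbol below them one bit longer.
        merged = heapq.heappop(weights) + heapq.heappop(weights)
        total += merged
        heapq.heappush(weights, merged)
    return total

message = "a" * 70 + "b" * 70 + "c" * 70 + "d" * 70
chunks = split_into_ngrams(message, 4)

print(sorted(Counter(chunks).items()))
# [('aaaa', 17), ('aabb', 1), ('bbbb', 17), ('cccc', 17), ('ccdd', 1), ('dddd', 17)]
print(huffman_total_bits(message))  # 560 (plain letter-by-letter Huffman)
print(huffman_total_bits(chunks))   # 161 (4-gram Huffman)
```

The dictionary still has to be sent in both cases, and it is larger for chunks than for single letters.
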
15 N-gram compression
“abcdabcdabcdabcdabcdabcdabcdabcdabcdabcd…” (70 times)
e.g., 6-gram compression:

16 N-gram compression
“abcdabcdabcdabcdabcdabcdabcdabcdabcdabcd…” (70 times)
e.g., 6-gram compression:
[Figure: Huffman tree with edges labeled 0/1 over the chunks abcdab, cdabcd, abcd]
Length: Send the dictionary, plus 71 bits for the message
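
The 71-bit figure can be worked out by hand, assuming the message is abcd repeated 70 times (280 letters). Since $280 = 46 \times 6 + 4$, the message splits into 46 full chunks that alternate between abcdab and cdabcd (23 of each), plus one trailing chunk abcd. Huffman's algorithm on the weights 23, 23, 1 gives one of the frequent chunks a 1-bit codeword and the other two chunks 2-bit codewords:

$$
23 \times 1 + 23 \times 2 + 1 \times 2 = 71\ \text{bits}
$$
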
17 N-gram compression
Problems?

18 N-gram compression
Problems?
- N = ?
- “2-pass” compression algorithm
19 Adaptive compression
- Reduce code-length for recent or frequent letters/patterns, e.g., abbbaaaaaddddddddaaaacccbbbbbcccccc…
- Add new patterns to the dictionary, e.g., abcdabcdabcdabcdabcdabcdabcdabcd…
- Encoding: input message → (adaptive transformation) → intermediate encoding → (bit encoding) → output
  “abbaaa…” → 1 3 0 1 5 … → 00100111101…

20 Move-to-front (MTF)
When we see a letter:
- Emit current list position
- Then move letter to front of list
e.g., abbbaaaaaddddddddaaaacccbbbbbcccccc…
[Figure: current list b a c d e f]
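
The two MTF rules can be sketched as below. Two details are assumptions rather than something the slide states: positions are counted from 0, and the list starts in alphabetical order a b c d e f (which would produce the list b a c d e f after the first "ab" of the example). The names `mtf_encode` and `mtf_decode` are hypothetical.

```python
def mtf_encode(message, alphabet):
    """Emit each letter's current list position, then move it to the front."""
    symbols = list(alphabet)
    positions = []
    for letter in message:
        index = symbols.index(letter)
        positions.append(index)
        symbols.insert(0, symbols.pop(index))
    return positions

def mtf_decode(positions, alphabet):
    """Invert mtf_encode by replaying the same list updates."""
    symbols = list(alphabet)
    letters = []
    for index in positions:
        letter = symbols.pop(index)
        letters.append(letter)
        symbols.insert(0, letter)
    return "".join(letters)

encoded = mtf_encode("abbbaaaaadddd", "abcdef")
print(encoded)                        # [0, 1, 0, 0, 1, 0, 0, 0, 0, 3, 0, 0, 0]
print(mtf_decode(encoded, "abcdef"))  # abbbaaaaadddd
```

Runs of a repeated letter turn into runs of 0, so the intermediate encoding is dominated by small numbers, which a later bit-encoding stage can give short codes.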