Module5_1 - Module 5, Lecture 1: Data Compression: Introduction

Module 5, Lecture 1
Data Compression: Introduction
G.L. Heileman

Introduction

In this module we study the science (art) of representing information (i.e., data) in a compact form.

Key idea: Compact representations are created by identifying and exploiting structure that exists in the data.

Ex: Morse code
- Uses an alphabet of four symbols -- dot ( · ), dash ( – ), letter space, and word space -- to encode the English alphabet.
- Shorter codewords are assigned to more frequently occurring letters, and longer ones to less frequently occurring letters. E.g., e ( · ), a ( · – ), q ( – – · – ), j ( · – – – ).
- What kind of structure is being exploited? Statistical.

Set-up

Overview:

  source: x^n in X^n  -->  [encoder]  -->  C(x^n)  -->  transmission (error free)  -->  [decoder]  -->  x̂^n in X^n
          (original data)                  (compressed data)                                           (reconstructed data)

We assume:
- The information source outputs a string x^n in X^n.
- The encoder performs a mapping from source data x^n to codewords C(x^n).

Two general types of codes:
- fixed length (e.g., ASCII, Unicode) -- all codewords have the same length. A q-character alphabet requires ⌈log₂ q⌉ bits/symbol.
- variable length -- the number of bits varies from codeword to codeword (hopefully |C(x^n)| is much smaller than |x^n|).

The communication channel could be a telephone line, radio waves through the atmosphere, or a disk drive (communication does not necessarily involve moving data from one place to another). In every case, we are assuming error-free communication.

The decoder must "know" the encoding algorithm (or the encoder needs to send a codebook with the encoded data).
- If x̂^n = x^n, this is called lossless compression (remember, we're assuming error-free communication).
- If x̂^n ≠ x^n, this is called lossy compression, e.g., JPEG.
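The two code types above can be illustrated with a small sketch. The fixed-length cost ⌈log₂ q⌉ follows directly from counting distinct codewords; the variable-length code table below is purely illustrative (not from the lecture), chosen to be prefix-free so that decoding recovers x^n exactly, i.e., the round trip is lossless.

```python
import math

def fixed_length_bits(q: int) -> int:
    """Bits per symbol for a fixed-length code over a q-character alphabet."""
    return math.ceil(math.log2(q))

# A toy variable-length, prefix-free code: frequent symbols get short
# codewords. These assignments are illustrative, not the lecture's.
code = {"e": "0", "a": "10", "t": "110", "q": "111"}

def encode(xs):
    return "".join(code[x] for x in xs)

def decode(bits):
    # Prefix-free, so greedy left-to-right decoding is unambiguous.
    inv = {v: k for k, v in code.items()}
    out, buf = [], ""
    for b in bits:
        buf += b
        if buf in inv:
            out.append(inv[buf])
            buf = ""
    return out

msg = list("eeteaq")
assert decode(encode(msg)) == msg  # lossless: x̂^n equals x^n
```

For a 26-letter alphabet, `fixed_length_bits(26)` gives 5 bits/symbol; the toy variable-length code spends only 11 bits on the 6-symbol message above, versus 12 bits for a fixed-length code over its 4-symbol alphabet.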
Performance

Finally, given some compression algorithm, we generally want to be able to say something about its performance. Factors include:
- Amount of compression.
  - Lossless:
    - compression ratio -- ratio of bits used before compression to bits used after.
    - expected codeword length -- average number of bits/symbol.
  - Lossy: same as above, except that we also need to quantify the difference between x^n and x̂^n (rate distortion theory).
- Efficiency of the compression algorithm.
- Complexity of the compression algorithm.

Models

Other types of structure can be exploited for compression.

Ex 1: Numerical Data -- Consider the data {x_1, x_2, ...} given by

  {9, 11, 11, 11, 14, 13, 15, 17, 16, 17, 20, 21}.

[Figure: plot of the samples x_i against the index i = 1, ..., 12, showing a rising trend.]

Since these numbers are in the range [0, 32), if we encoded them directly, we would need 5 bits/sample. ...
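The rising trend in the data is exactly the kind of structure a model can exploit. A minimal sketch, assuming we encode the first value directly and then fixed-length-code the successive differences (the specific residual code is an assumption for illustration, not the lecture's method):

```python
import math
from itertools import accumulate

data = [9, 11, 11, 11, 14, 13, 15, 17, 16, 17, 20, 21]

# Direct encoding: values lie in [0, 32), so ceil(log2(32)) = 5 bits each.
direct_bits = len(data) * 5

# Exploit the trend: keep the first value, then successive differences.
diffs = [data[0]] + [b - a for a, b in zip(data, data[1:])]
assert list(accumulate(diffs)) == data  # the difference code is lossless

# The residuals span a much smaller range than the raw samples
# ([-1, 3] here), so a fixed-length code for them needs fewer bits.
residual_range = max(diffs[1:]) - min(diffs[1:]) + 1
residual_bits = math.ceil(math.log2(residual_range))

diff_bits = 5 + (len(data) - 1) * residual_bits
ratio = direct_bits / diff_bits  # compression ratio: bits before / bits after
```

Here the direct encoding costs 60 bits while the difference encoding costs 38, a compression ratio of about 1.58, purely because the model (values change slowly) captured structure in the data.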
This note was uploaded on 05/06/2010 for the course ECE 549 taught by Professor G. L. Heileman during the Spring '10 term at University of New Brunswick.
