Ch. 7 - Data Compression

DATA COMPRESSION

Focus on Data Compression

Objectives
• Understand the essential ideas underlying data compression.
• Become familiar with the different types of compression algorithm.
• Be able to describe the most popular data compression algorithms in use today and know the applications for which each is suitable.

Introduction
• Data compression is important to storage systems because it allows more bytes to be packed into a given storage medium than when the data is uncompressed.
• Some storage devices (notably tape) compress data automatically as it is written, resulting in less tape consumption and significantly faster backup operations.
• Compression also reduces file transfer time, saving time and communications bandwidth.

• Communication bandwidth is an expensive resource.
• There is therefore interest in reducing the number of bits that must be transmitted.
• Text Compression
  o Data sent over any connection can be viewed as a sequence of symbols: S_1, S_2, …, S_n
  o The symbols S_1, S_2, …, S_n may be drawn from, for example:
    1. The set of bits: 0, 1
    2. The set of decimal digits: 0, 1, 2, 3, 4, 5, 6, 7, 8, 9
    3. The set of letters: A, B, C, …, X, Y, Z
    4. A set of countries: Argentina, Belgium, …
  o Text compression can be approached by exploiting:
    1. The finiteness of the set of symbols
    2. The relative frequency with which the symbols are used
    3. The context in which a symbol appears

• A good metric for compression is the compression factor (or compression ratio), given by:
• If we have a 100 KB file that we compress to 40 KB, we have a compression factor of:
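
As a rough sketch of the arithmetic for the 100 KB to 40 KB example, assume the compression factor is the percentage of the original size that compression removes: (1 - compressed size / uncompressed size) x 100%. This definition is an assumption; some texts instead use the plain ratio uncompressed size / compressed size, shown here for comparison. The function names are illustrative only.

```python
def compression_factor(uncompressed_size: float, compressed_size: float) -> float:
    """Percentage of the original size removed by compression.

    Assumed definition: (1 - compressed / uncompressed) * 100.
    """
    if uncompressed_size <= 0:
        raise ValueError("uncompressed_size must be positive")
    return (1 - compressed_size / uncompressed_size) * 100


def size_ratio(uncompressed_size: float, compressed_size: float) -> float:
    """Alternative convention (also an assumption): uncompressed / compressed."""
    if compressed_size <= 0:
        raise ValueError("compressed_size must be positive")
    return uncompressed_size / compressed_size


# The example from the text: a 100 KB file compressed to 40 KB.
# Sizes only need to share a unit, since both measures are ratios.
print(f"compression factor: {compression_factor(100, 40):.1f}%")
print(f"size ratio: {size_ratio(100, 40):.2f}:1")
```

Under both conventions a larger value means better compression, which agrees with the statement that low-entropy data permit a larger compression ratio.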

• Compression is achieved by removing data redundancy while preserving information content.
• The information content of a group of bytes (a message) is its entropy.
  o Data with low entropy permit a larger compression ratio than data with high entropy.
• Entropy,
