ECE 5620 Spring 2011
Compression Project: Issued Apr. 6, Due May 3 at Noon

Overview

Natural language, such as English, is an important and fundamental data source. From an information theory standpoint, natural language is quite compressible; the entropy of English is at most 1-2 bits per character. The goal of this project is to design an encoder and decoder for compressing English.

We will focus on compressing one particular piece of prose, a concatenation of the novels Pride and Prejudice and Sense and Sensibility, both by the early 19th-century British author Jane Austen. The concatenation consists of roughly 240,000 words and 1,400,000 characters. The goal is to compress it as much as possible.

Your assignment is to write a C or C++ program that, after reading from a data file that you create, prints out this concatenation in its entirety. We view the data file as the compressed version of the text and the C or C++ program as the decoder.

Rules

You are asked to submit exactly three files: a data file, a decompression program, and a report. If you choose to write your solution in C, then the data file must be named compressed (with no extension) and the decompression program must be named decompressit.c.

We place no constraints whatsoever on the compressed file. You may place whatever data you like in this file in whatever format you like. The file decompressit.c, on the other hand, must be a C program that compiles on the ECE School's Linux cluster (amdpool.ece) using the command

    gcc -Os -lm -o decompressit decompressit.c

(The -Os flag signals the compiler to make the compiled program as small as possible. The -lm flag signals the compiler to link against the standard Math library.) This program must create a file named out.txt whose contents exactly match the master file austen.txt that is posted on the course Blackboard site....
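The file contract described above (read the data file, write out.txt) can be illustrated with a minimal sketch. It is not the handout's solution: the function name decode_file is hypothetical, and the placeholder "decoder" assumes the data file simply stores the text uncompressed, so it only copies bytes. A real submission would replace the copy loop with its own decoding logic.

```c
#include <stdio.h>

/* Hypothetical helper (not from the handout): reads the data file at
 * in_path and writes the decoded text to out_path.
 * ASSUMPTION: the data file holds the text uncompressed, so "decoding"
 * is a plain byte copy. Returns 0 on success, -1 on any I/O error. */
int decode_file(const char *in_path, const char *out_path)
{
    FILE *in = fopen(in_path, "rb");
    if (in == NULL)
        return -1;

    FILE *out = fopen(out_path, "wb");
    if (out == NULL) {
        fclose(in);
        return -1;
    }

    unsigned char buf[4096];
    size_t n;
    int status = 0;

    /* A real decoder would replace this loop with its decoding logic. */
    while ((n = fread(buf, 1, sizeof buf, in)) > 0) {
        if (fwrite(buf, 1, n, out) != n) {
            status = -1;
            break;
        }
    }
    if (ferror(in))
        status = -1;

    fclose(in);
    if (fclose(out) != 0)   /* fclose flushes; a failure means lost output */
        status = -1;
    return status;
}
```

In a decompressit.c built along these lines, main would presumably call decode_file("compressed", "out.txt") and return a nonzero exit status on failure. Because only the standard I/O library is used, the sketch should compile with the command given in the Rules.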