project - ECE 5620 Spring 2011 Compression Project: Issued...


ECE 5620 Spring 2011
Compression Project: Issued Apr. 6, Due May 3 at Noon

Overview

Natural language, such as English, is an important and fundamental data source. From an information theory standpoint, natural language is quite compressible; the entropy of English is at most 1-2 bits per character. The goal of this project is to design an encoder and decoder for compressing English. We will focus on compressing one particular piece of prose, a concatenation of the novels Pride and Prejudice and Sense and Sensibility, both by the early 19th century British author Jane Austen. The concatenation consists of roughly 240,000 words and 1,400,000 characters. The goal is to compress it as much as possible.

Your assignment is to write a C or C++ program that, after reading from a data file that you create, prints out this concatenation in its entirety. We view the data file as the compressed version of the text and the C or C++ program as the decoder.

Rules

You are asked to submit exactly three files: a data file, a decompression program, and a report. If you choose to write your solution in C, then the data file must be named compressed (with no extension) and the decompression program must be named decompressit.c. We place no constraints whatsoever on the compressed file. You may place whatever data you like in this file in whatever format you like. The file decompressit.c, on the other hand, must be a C program that compiles on the ECE School's Linux cluster (amdpool.ece) using the command

gcc -Os -lm -o decompressit decompressit.c

(The -Os flag signals the compiler to make the compiled program as small as possible. The -lm flag signals the compiler to link against the standard math library.) This program must create a file named out.txt whose contents exactly match the master file austen.txt that is posted on the course Blackboard site....
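
As a rough illustration of the file contract described in the rules (read a file named compressed, write a file named out.txt), the sketch below shows the trivial baseline: an identity "decoder" that copies its input unchanged, so compressed would simply be a copy of austen.txt and nothing is actually compressed. The helper names decode_stream and decompress_file are hypothetical and do not come from the handout; a real submission would replace the body of decode_stream with its own decoding scheme.

```c
/* Sketch only: an identity "decoder" illustrating the required file names.
 * decode_stream and decompress_file are hypothetical helper names. */
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Copy every byte from `in` to `out`.
 * A real decoder would replace this loop with its decompression logic.
 * Returns 0 on success, -1 on an I/O error. */
int decode_stream(FILE *in, FILE *out)
{
    unsigned char buf[4096];
    size_t n;

    while ((n = fread(buf, 1, sizeof buf, in)) > 0) {
        if (fwrite(buf, 1, n, out) != n)
            return -1;          /* short write */
    }
    return ferror(in) ? -1 : 0; /* distinguish EOF from a read error */
}

/* Open both files in binary mode and run the decoder between them.
 * Returns 0 on success, -1 on failure. */
int decompress_file(const char *in_path, const char *out_path)
{
    FILE *in = fopen(in_path, "rb");
    if (in == NULL) {
        fprintf(stderr, "cannot open %s: %s\n", in_path, strerror(errno));
        return -1;
    }

    FILE *out = fopen(out_path, "wb");
    if (out == NULL) {
        fprintf(stderr, "cannot open %s: %s\n", out_path, strerror(errno));
        fclose(in);
        return -1;
    }

    int status = decode_stream(in, out);
    fclose(in);
    if (fclose(out) != 0)       /* a failed flush also counts as failure */
        status = -1;
    return status;
}

/* In decompressit.c, main would then only need to do:
 *     return decompress_file("compressed", "out.txt") != 0;
 */
```

Opening both files in binary mode keeps the copy byte-exact, which matters because out.txt has to match the master file exactly. Any scheme that improves on this baseline has to account for its own decoder as well as the data file.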

This note was uploaded on 10/03/2011 for the course ECE 5620 at Cornell.
