lz - Adaptive Huffman and arithmetic methods are universal...

Adaptive Huffman and arithmetic methods are universal in the sense that the encoder can adapt to the statistics of the source. But adaptation is computationally expensive, particularly when a k-th order Markov approximation is needed for some k > 2. As we know, the k-th order approximation approaches the source entropy rate as k → ∞. For example, for English text, a second-order Markov approximation requires estimating the probability of every possible triplet: with an alphabet of 35 symbols ({a-z, (, ), ... etc.}) there are 35^3 = 42,875 triplets, which is impractical. Arithmetic coding is inherently adaptive, but it is slow and works well for binary files.

The dictionary-based methods, such as the LZ family of encoders, do not use any statistical model, nor do they use variable-size prefix codes. Yet they are universal, adaptive, reasonably fast, and use modest amounts of storage and computational resources. Variants of the LZ algorithm form the basis of Unix compress, gzip, pkzip, stacker, and the compression used by modems operating at more than 14.4 kbps.

Dictionary Models

The dictionary model allows several consecutive symbols, called phrases, that are stored in a dictionary to be encoded as an address in the dictionary. Usually an adaptive model is used, in which the dictionary is built from previously encoded text: as the text is compressed, previously encountered substrings are added to the dictionary. Almost all adaptive dictionary models originate from the original papers by Ziv and Lempel, which led to several families of LZ coding techniques. Here we will present a couple of those techniques.
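The adaptive dictionary idea described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not necessarily one of the variants presented in these notes: it follows the LZ78 style of explicit dictionary, where each output pair is (address of the longest known phrase, next character), and the function names `dictionary_encode` and `dictionary_decode` are hypothetical.

```python
def dictionary_encode(text):
    """Encode text as (phrase_address, next_char) pairs (LZ78-style sketch).

    The dictionary starts with only the empty phrase at address 0 and
    grows by one phrase for every pair that is emitted.
    """
    dictionary = {"": 0}
    output = []
    phrase = ""
    for ch in text:
        if phrase + ch in dictionary:
            # Keep extending the current phrase while it is still known.
            phrase += ch
        else:
            # Emit the address of the known phrase plus the new character,
            # then add the extended phrase to the dictionary.
            output.append((dictionary[phrase], ch))
            dictionary[phrase + ch] = len(dictionary)
            phrase = ""
    if phrase:
        # Input ended inside a known phrase: flush it with no extra character.
        output.append((dictionary[phrase], ""))
    return output


def dictionary_decode(pairs):
    """Rebuild the text; the decoder grows the same dictionary as the encoder."""
    phrases = [""]
    pieces = []
    for address, ch in pairs:
        phrase = phrases[address] + ch
        phrases.append(phrase)
        pieces.append(phrase)
    return "".join(pieces)
```

The point of the sketch is that the decoder never receives the dictionary itself: it reconstructs the same phrases, in the same order, from the pairs alone, which is what makes the model adaptive.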

LZ77 algorithms

The prior text constitutes the codebook, or dictionary. Rather than keeping an explicit dictionary, the text decoded up to the current time is used as the dictionary. The figure below shows the characters abaabab just decoded, with the decoder looking at the triplet (5,3,b): the number 5 denotes how far back to look into the already decoded text stream, the number 3 gives the length of the matched phrase, which begins at the first character of the not-yet-encoded part of the text, and the character ‘b’ gives the next character from the input. This yields ‘aabb’ as the next phrase added.

[Figure: LZ77 decoding example]
Encoded output: (0,0,a) (0,0,b) (2,1,a) (3,2,b) (5,3,b) (10,1,a)
Decoded output: a b a a b a b

LZ77 Algorithm with Finite Buffers

Two buffers of finite size W, called the search (left) and look-ahead (right) buffers, are connected as a shift register. The text to be encoded is shifted in from right to left, initially placing W symbols in the right buffer and filling the left buffer with the first character of the text. The information transmitted is (p,L,S), where p is the position in the search buffer of the longest match with the start of the look-ahead buffer, L is the length of that match, and S is the symbol that follows it; the buffer is then shifted L+1 places to the left. Actually, rather than transmitting p itself, the offset backward in the search buffer is transmitted. The process is repeated until the text is fully encoded.
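The triplet scheme above can be sketched in Python as follows. This is a minimal sketch under stated assumptions rather than the notes' exact procedure: the names `lz77_decode` and `lz77_encode` and the `window` parameter (standing in for the buffer size W) are hypothetical, the encoder emits the backward offset rather than p, it picks the longest match greedily, and it starts with an empty search buffer instead of pre-filling it with the first character of the text.

```python
def lz77_decode(triples):
    """Decode (offset, length, symbol) triplets into text."""
    out = []
    for offset, length, symbol in triples:
        start = len(out) - offset
        # Copy one character at a time so that a match may overlap
        # the characters currently being produced.
        for i in range(length):
            out.append(out[start + i])
        out.append(symbol)
    return "".join(out)


def lz77_encode(text, window=16):
    """Greedy LZ77 encoder with search and look-ahead buffers of size `window`."""
    triples = []
    pos = 0
    while pos < len(text):
        best_offset, best_length = 0, 0
        search_start = max(0, pos - window)
        # Always leave one symbol to send as the literal S of the triplet.
        max_length = min(window, len(text) - pos - 1)
        for candidate in range(search_start, pos):
            length = 0
            while (length < max_length
                   and text[candidate + length] == text[pos + length]):
                length += 1
            if length > best_length:
                best_offset, best_length = pos - candidate, length
        triples.append((best_offset, best_length, text[pos + best_length]))
        # Shift by L + 1: the matched phrase plus the literal symbol.
        pos += best_length + 1
    return triples
```

Feeding the six triplets from the figure to `lz77_decode` gives abaababaabbba: the first four triplets produce abaabab, (5,3,b) appends aabb, and (10,1,a) appends ba. With `window=16`, `lz77_encode` applied to that string reproduces the same six triplets; a smaller window restricts how far back a match can be found and so may give different, typically shorter, matches.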

This note was uploaded on 06/09/2011 for the course CAP 5015 taught by Professor Mukherjee during the Spring '11 term at University of Central Florida.
