(011 1001)

Extending ASCII: Unicode

With more bits (16), we can represent more symbols: 2^16 = 65536.

Hangul, Zhuyin, Devanagari, Mongolian, Cherokee, Canadian Aboriginal Syllabics, Tifinagh (Berbers), Osmanya (Somali), Ogham (Irish), Cuneiform, Klingon, Tolkien

Text Compression

Problem: files can be large, requiring lots of storage space or time to transmit.
Solution: "compress" the data somehow.

Several techniques, all lossless:
– Codeword/table lookup for common words
– Run-length coding
– Huffman coding
– Lempel-Ziv-Welch

http://nostalgia.wikipedia.org/wiki/Data_compression
http://en.wikipedia.org/wiki/Data_compression

Codeword Table

Word     Symbol
as       ^
the      ~
and      &
that     $
these    #
for      *
etc.     %

The symbols cannot appear in the original text.
Longer words...
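The codeword table lookup can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the word-to-symbol pairing is read off the slide in order, the function names encode and decode are hypothetical, and words are split on single spaces, so a word with attached punctuation or a capital letter is left unencoded.

```python
# Codeword table from the slide; the word/symbol pairing is assumed
# to follow the order in which the two rows are listed.
CODEWORDS = {
    "as": "^",
    "the": "~",
    "and": "&",
    "that": "$",
    "these": "#",
    "for": "*",
    "etc.": "%",
}

# Reverse table used for decoding, plus the set of reserved symbols.
DECODE = {symbol: word for word, symbol in CODEWORDS.items()}
SYMBOLS = set(DECODE)


def encode(text: str) -> str:
    """Replace each common word with its one-character symbol."""
    # A reserved symbol in the input would make decoding ambiguous,
    # so such input is rejected (assumed reading of the slide's rule).
    if any(ch in SYMBOLS for ch in text):
        raise ValueError("text contains a reserved codeword symbol")
    return " ".join(CODEWORDS.get(word, word) for word in text.split(" "))


def decode(encoded: str) -> str:
    """Replace each symbol with the word it stands for."""
    return " ".join(DECODE.get(token, token) for token in encoded.split(" "))


if __name__ == "__main__":
    original = "the cat and the dog"
    packed = encode(original)
    print(packed)                       # ~ cat & ~ dog
    print(len(original), len(packed))   # 19 13
    print(decode(packed) == original)   # True
```

The scheme is lossless because every symbol maps back to exactly one word, and the savings come only from words that are both common and longer than one character.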
