Now, when we run again, we get the following output:

    UTF-8
    -----
    195 137 99 111 117 116 101 45 109 111 105 33

    ASCII
    -----
    63 99 111 117 116 101 45 109 111 105 33

We’ve quite clearly not got the same output in each case. The UTF-8 case starts with 195, 137, while the ASCII starts with 63. After this preamble, they’re again identical.

So, let’s try decoding those two byte arrays back into strings, and see what happens. Insert the code in Example 10-82 before the call to Console.ReadKey.
Example 10-82. Decoding text

    string decodedUtf8 = Encoding.UTF8.GetString(utf8Bytes);
    string decodedAscii = Encoding.ASCII.GetString(asciiBytes);

    Console.WriteLine();
    Console.WriteLine();
    Console.WriteLine("Decoded UTF-8");
    Console.WriteLine("-------------");
    Console.WriteLine(decodedUtf8);

    Console.WriteLine();
    Console.WriteLine();
    Console.WriteLine("Decoded ASCII");
    Console.WriteLine("-------------");
    Console.WriteLine(decodedAscii);

Figure 10-2. Charmap.exe in action
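Whether a particular encoding can carry a string without loss can also be checked in code rather than by eye. The following is a minimal sketch, not part of the original example: the class name EncodingChecks and the helpers RoundTrips and TryEncodeStrictAscii are hypothetical names chosen for illustration. It relies on the framework's EncoderFallback.ExceptionFallback, which makes the encoder throw instead of quietly substituting a replacement character.

```csharp
using System;
using System.Text;

static class EncodingChecks
{
    // True if encoding and then decoding gives back exactly the original string.
    public static bool RoundTrips(Encoding encoding, string text)
    {
        byte[] bytes = encoding.GetBytes(text);
        return encoding.GetString(bytes) == text;
    }

    // Encodes with an ASCII encoder configured to throw on characters it
    // cannot represent. Returns false (and an empty array) in that case.
    public static bool TryEncodeStrictAscii(string text, out byte[] bytes)
    {
        Encoding strictAscii = Encoding.GetEncoding(
            "us-ascii",
            EncoderFallback.ExceptionFallback,
            DecoderFallback.ExceptionFallback);
        try
        {
            bytes = strictAscii.GetBytes(text);
            return true;
        }
        catch (EncoderFallbackException)
        {
            bytes = new byte[0];
            return false;
        }
    }

    static void Main()
    {
        string listenUpFR = "Écoute-moi!";

        Console.WriteLine(RoundTrips(Encoding.UTF8, listenUpFR));   // True
        Console.WriteLine(RoundTrips(Encoding.ASCII, listenUpFR));  // False

        byte[] strictBytes;
        Console.WriteLine(TryEncodeStrictAscii(listenUpFR, out strictBytes)); // False
    }
}
```

A round-trip comparison is the cheaper check when you already have the default encoder to hand; the throwing fallback is more useful when silent data loss should stop the program.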
We’re now using the GetString method on our Encoding objects to decode the byte array back into a string. Here’s the output:

    UTF-8
    -----
    195 137 99 111 117 116 101 45 109 111 105 33

    ASCII
    -----
    63 99 111 117 116 101 45 109 111 105 33

    Decoded UTF-8
    -------------
    Écoute-moi!

    Decoded ASCII
    -------------
    ?coute-moi!

The UTF-8 bytes have decoded back to our original string. This is because the UTF-8 encoding supports the E-acute character, and it does so by inserting two bytes into the array: 195 137.

On the other hand, our ASCII bytes have been decoded and we see that the first character has become a question mark. If you look at the encoded bytes, you’ll see that the first byte is 63, which (if you look it up in an ASCII table somewhere) corresponds to the question mark character. So this isn’t the fault of the decoder. The encoder, when faced with a character it didn’t understand, inserted a question mark.

So, you need to be careful that any encoding you choose is capable of supporting the characters you are using (or be prepared for the information loss if it doesn’t).

OK, we’ve seen an example of the one-byte-per-character ASCII representation, and the at-least-one-byte-per-character UTF-8 representation. Let’s have a look at the underlying at-least-two-bytes-per-character UTF-16 encoding that the framework uses internally. Example 10-83 uses this.

Example 10-83. Using UTF-16 encoding

    static void Main(string[] args)
    {
        string listenUpFR = "Écoute-moi!";

        byte[] utf16Bytes = Encoding.Unicode.GetBytes(listenUpFR);

        Console.WriteLine("UTF-16");
        Console.WriteLine("-----");
        foreach (var encodedByte in utf16Bytes)
        {
            Console.Write(encodedByte);
            Console.Write(" ");
        }

        Console.ReadKey();
    }

Notice that we’re using the Unicode encoding this time. If we compile and run, we see the following output:

    UTF-16
    -----
    201 0 99 0 111 0 117 0 116 0 101 0 45 0 109 0 111 0 105 0 33 0

It is interesting to compare this with the ASCII output we had before:

    ASCII
    -----
    63 99 111 117 116 101 45 109 111 105 33

The first character is different, because UTF-16 can encode the E-acute correctly; thereafter, every other byte in the UTF-16 array is zero, and the bytes in between correspond to the ASCII values. As we said earlier, the Unicode standard is highly compatible with ASCII, and each 16-bit value (i.e., pair of bytes) corresponds to the equivalent 7-bit ASCII value.
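To make the pairing of bytes explicit, here is a small sketch that recombines the output of Encoding.Unicode into 16-bit values. The class name Utf16Units and the helper ToCodeUnits are hypothetical, introduced only for this illustration. It assumes the low byte comes first in each pair, which is the order Encoding.Unicode produces (Encoding.BigEndianUnicode writes the high byte first instead).

```csharp
using System;
using System.Text;

static class Utf16Units
{
    // Combines each pair of bytes (low byte first, then high byte)
    // into a single 16-bit value.
    public static ushort[] ToCodeUnits(byte[] utf16Bytes)
    {
        var units = new ushort[utf16Bytes.Length / 2];
        for (int i = 0; i < units.Length; i++)
        {
            int low = utf16Bytes[2 * i];
            int high = utf16Bytes[2 * i + 1];
            units[i] = (ushort)(low | (high << 8));
        }
        return units;
    }

    static void Main()
    {
        string listenUpFR = "Écoute-moi!";
        byte[] utf16Bytes = Encoding.Unicode.GetBytes(listenUpFR);

        // Prints: 201 99 111 117 116 101 45 109 111 105 33
        foreach (ushort unit in ToCodeUnits(utf16Bytes))
        {
            Console.Write(unit);
            Console.Write(" ");
        }
        Console.WriteLine();
    }
}
```

Apart from the first value, 201 for the E-acute, the recombined values are the same numbers as the ASCII bytes shown earlier.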