[CSAPP] The Unicode Standard for Text Encoding

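ASCII is suitable only for encoding English-language documents. It has little room for special characters such as accented letters (for example, the French "ç"), and it cannot represent languages such as Greek, Russian, or Chinese.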

The Unicode Consortium has devised the most comprehensive and widely accepted standard for encoding text.  

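The base encoding of Unicode, the Universal Character Set, uses a 32-bit representation of each character. On its own, this would require every string of text to use 4 bytes per character. UTF-8 avoids that cost: it is a variable-length encoding that uses 1 to 4 bytes per character, so common characters take fewer bytes than rare ones.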

In UTF-8, the standard ASCII characters use the same single-byte encodings as they have in ASCII, so every ASCII byte sequence has the same meaning in UTF-8 as it does in ASCII.
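
To make the byte counts concrete, here is a minimal sketch in Python (not from the book) that checks how many bytes a few characters need in UTF-8. The helper name `utf8_length` and the sample characters are my own choices for illustration; the only library call used is the built-in `str.encode`.

```python
def utf8_length(ch: str) -> int:
    """Return the number of bytes UTF-8 uses for a single character."""
    return len(ch.encode("utf-8"))


# Sample characters chosen for illustration: ASCII, Latin-1, CJK, emoji.
samples = ["A", "ç", "中", "😀"]
lengths = {ch: utf8_length(ch) for ch in samples}

# ASCII text encodes to identical bytes under ASCII and UTF-8.
ascii_text = "CSAPP"
same_bytes = ascii_text.encode("ascii") == ascii_text.encode("utf-8")

# A fixed 32-bit encoding (UTF-32, big-endian, no byte-order mark)
# spends 4 bytes on every character, even plain ASCII ones.
utf32_size = len(ascii_text.encode("utf-32-be"))

for ch, n in lengths.items():
    print(f"U+{ord(ch):04X} takes {n} byte(s) in UTF-8")
print("ASCII bytes match UTF-8 bytes:", same_bytes)
print("UTF-32 size of", repr(ascii_text), "is", utf32_size, "bytes")
```

The ASCII letter needs 1 byte, "ç" needs 2, "中" needs 3, and the emoji needs 4, while the 5-character ASCII string costs 20 bytes in UTF-32 but only 5 in UTF-8.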

Original article: https://www.cnblogs.com/KennyRom/p/6425571.html