UTF-8

§What is UTF-8

The name UTF-8 is derived from Universal Coded Character Set + Transformation Format, 8-bit.

§How many bytes

UTF-8 uses one byte for any ASCII character, all of which have the same code values in both UTF-8 and ASCII encoding, and up to four bytes for other characters.
8 bits = 1 byte (0x00-0xFF)
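
The byte counts can be checked with a few lines of Python. This is only a small sketch; the four sample characters are arbitrary choices made for illustration, one for each possible encoded length.

```python
# Sample characters chosen for illustration: one per UTF-8 length (1 to 4 bytes).
samples = ["A", "é", "中", "😀"]

# Map each character to the number of bytes in its UTF-8 encoding.
byte_counts = {ch: len(ch.encode("utf-8")) for ch in samples}

for ch, count in byte_counts.items():
    encoded = ch.encode("utf-8")
    print(f"U+{ord(ch):04X} {ch!r}: {count} byte(s), hex {encoded.hex(' ')}")

# An ASCII character has the same byte value in ASCII and in UTF-8.
ascii_matches_utf8 = "A".encode("ascii") == "A".encode("utf-8")
print("ASCII and UTF-8 agree for 'A':", ascii_matches_utf8)
```

The `bytes.hex(' ')` call with a separator needs Python 3.8 or later.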

§Why the byte count varies

The encoding is variable-length and uses 8-bit code units.
All code points in the BMP are accessed as a single code unit in UTF-16 encoding and can be encoded in one, two or three bytes in UTF-8.
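
To make the BMP statement concrete, the sketch below compares the two encodings at the code points where the UTF-8 length changes. The helper name `code_unit_counts` is made up for this example, and the boundary values are picked by hand; surrogate code points (U+D800 to U+DFFF) are left out because they cannot be encoded on their own.

```python
def code_unit_counts(code_point):
    """Return (UTF-8 byte count, UTF-16 code unit count) for one code point."""
    ch = chr(code_point)
    utf8_bytes = len(ch.encode("utf-8"))
    # "utf-16-le" writes no byte order mark; each UTF-16 code unit is 2 bytes.
    utf16_units = len(ch.encode("utf-16-le")) // 2
    return utf8_bytes, utf16_units

# Boundary code points: the last five-digit value lies just outside the BMP.
for cp in (0x7F, 0x80, 0x7FF, 0x800, 0xFFFF, 0x10000):
    utf8_bytes, utf16_units = code_unit_counts(cp)
    print(f"U+{cp:04X}: UTF-8 {utf8_bytes} byte(s), UTF-16 {utf16_units} code unit(s)")
```

Everything up to U+FFFF needs at most three UTF-8 bytes and exactly one UTF-16 code unit; U+10000, the first code point beyond the BMP, needs four UTF-8 bytes and two UTF-16 code units.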

§Advantages

UTF-8 is the dominant character encoding for the World Wide Web.
The Internet Mail Consortium (IMC) recommends that all e-mail programs be able to display and create mail using UTF-8,[5] and the W3C recommends UTF-8 as the default encoding in XML and HTML.

Original source: https://www.cnblogs.com/zno2/p/4629692.html