UTF-8

Code points U+0000 through U+FFFF need at most three bytes; the full range up to U+10FFFF needs at most four bytes.

https://en.wikipedia.org/wiki/UTF-8

UTF-8 encodes each of the 1,112,064 valid code points in the Unicode code space (1,114,112 code points minus 2,048 surrogate code points) using one to four 8-bit bytes (a group of 8 bits is known as an octet in the Unicode Standard). 
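As a rough illustration of the "one to four bytes" rule, the sketch below maps a code point to its UTF-8 byte length and cross-checks the result against Python's built-in encoder. The helper name `utf8_length` and the constant `VALID_CODE_POINTS` are made up for this example and do not come from the original note; the range boundaries (U+007F, U+07FF, U+FFFF, U+10FFFF) are the standard UTF-8 ones.

```python
def utf8_length(code_point: int) -> int:
    """Return how many bytes UTF-8 uses for a single code point.

    Hypothetical helper for illustration only.
    """
    if not 0 <= code_point <= 0x10FFFF:
        raise ValueError("outside the Unicode code space")
    if 0xD800 <= code_point <= 0xDFFF:
        # Surrogates are the 2,048 code points excluded from UTF-8.
        raise ValueError("surrogate code points are not encodable in UTF-8")
    if code_point <= 0x7F:
        return 1
    if code_point <= 0x7FF:
        return 2
    if code_point <= 0xFFFF:
        return 3
    return 4


# Cross-check a few sample code points against the built-in encoder.
for cp in (0x41, 0xE9, 0x20AC, 0xFFFF, 0x10000, 0x10FFFF):
    assert utf8_length(cp) == len(chr(cp).encode("utf-8"))

# 1,114,112 code points in the code space minus 2,048 surrogates.
VALID_CODE_POINTS = 0x110000 - 0x800
```

In everyday code, `len(text.encode("utf-8"))` is usually all that is needed; the explicit ranges are spelled out here only to make the byte-length boundaries visible.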

Original article: https://www.cnblogs.com/rsapaper/p/6351684.html