char[] and char*

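#include <cstring>   // strlen
#include <iostream>
using namespace std;

int main()
{
    // Array size is deduced from the initializer: 5 characters plus the terminating '\0'.
    char ch_array[] = "abcde";
    cout << sizeof(ch_array) << endl; // 6
    cout << strlen(ch_array) << endl; // 5

    // sizeof applied to a pointer gives the size of the pointer, not of the string.
    const char* cp = "abcde";
    cout << sizeof(cp) << endl; // 4 on a 32-bit build, typically 8 on a 64-bit build
    cout << strlen(cp) << endl; // 5

    // Array size is fixed at 100; the elements after "abcde" are zero-initialized.
    char ch_array2[100] = "abcde";
    cout << sizeof(ch_array2) << endl; // 100
    cout << strlen(ch_array2) << endl; // 5

    return 0;
}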

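// Note: sizeof is an operator; in the cases above it is evaluated at compile time.
// strlen is a function; internally it loops until it reaches \0 and does not count the \0 itself.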
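
The loop described in the note on strlen can be sketched as follows. This is only an illustration: my_strlen is a hypothetical name, and real library implementations of strlen are usually optimized well beyond a byte-by-byte loop.

```cpp
#include <cstddef>

// Hypothetical helper for illustration; not the library's actual strlen.
// Counts characters up to, but not including, the terminating '\0'.
std::size_t my_strlen(const char* s)
{
    std::size_t n = 0;
    while (s[n] != '\0') {
        ++n;
    }
    return n;
}
```

Because the loop stops at the first '\0', the result depends only on the string contents, whereas sizeof depends only on the declared type. That is why a char[100] holding "abcde" gives 100 for sizeof and 5 for the length.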

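Unicode was originally encoded as two-byte units (UCS-2), which existing systems built around single bytes could not transmit reliably. UTF-8 was introduced to solve this: it encodes Unicode in a way similar to MBCS (multi-byte character sets). Note that UTF-8 is an encoding, not a character set; the character set it encodes is Unicode.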

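The three Unicode encoding forms:
UTF-8: characters are encoded as sequences of 8-bit units, using one to four bytes per character. Its biggest advantage is that UTF-8 keeps the ASCII encoding as a subset of itself, so every ASCII character is encoded as the same single byte.
UTF-16: the 16-bit encoding form of Unicode; each character takes one or two 16-bit units.
UTF-32: the 32-bit encoding form of Unicode; each character takes exactly one 32-bit unit.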
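
To make the "one or several bytes" rule for UTF-8 concrete, here is a minimal sketch of an encoder for a single code point. encode_utf8 is a hypothetical helper written for this note, not a standard library function, and it simply returns an empty string for invalid input.

```cpp
#include <cstdint>
#include <string>

// Hypothetical helper: encodes one Unicode code point as UTF-8.
// Returns an empty string for surrogates and for values above U+10FFFF.
std::string encode_utf8(std::uint32_t cp)
{
    std::string out;
    if (cp <= 0x7F) {
        // 1 byte: 0xxxxxxx (identical to ASCII)
        out += static_cast<char>(cp);
    } else if (cp <= 0x7FF) {
        // 2 bytes: 110xxxxx 10xxxxxx
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp <= 0xFFFF) {
        if (cp >= 0xD800 && cp <= 0xDFFF) {
            return out; // surrogates are not valid code points to encode
        }
        // 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp <= 0x10FFFF) {
        // 4 bytes: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
        out += static_cast<char>(0xF0 | (cp >> 18));
        out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
    return out;
}
```

For example, encode_utf8(0x41) gives the single byte 0x41 ('A', unchanged from ASCII), while encode_utf8(0x4E2D) gives the three bytes E4 B8 AD.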

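big endian and little endian

Big endian and little endian are the two ways a CPU can order the bytes of a multi-byte number: big endian places the most significant byte at the lowest address, little endian places the least significant byte there.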
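
The difference can be observed by looking at the individual bytes of an integer in memory. The sketch below is only an illustration: is_little_endian and store_big_endian are hypothetical helper names, and the value 0x01020304 is an arbitrary test pattern.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical helper: returns true if the host stores the least
// significant byte at the lowest address (little endian).
bool is_little_endian()
{
    const std::uint32_t value = 0x01020304;
    unsigned char bytes[sizeof(value)];
    std::memcpy(bytes, &value, sizeof(value)); // copy the in-memory representation
    return bytes[0] == 0x04;
}

// Hypothetical helper: writes value most significant byte first,
// regardless of the host byte order, by using shifts instead of memory layout.
void store_big_endian(std::uint32_t value, unsigned char out[4])
{
    out[0] = static_cast<unsigned char>(value >> 24);
    out[1] = static_cast<unsigned char>(value >> 16);
    out[2] = static_cast<unsigned char>(value >> 8);
    out[3] = static_cast<unsigned char>(value);
}
```

Shifting, as in store_big_endian, gives the same byte sequence on every machine, which is why it is the usual choice when a multi-byte number has to be written to a file or sent over a network.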