Printing Chinese characters with a program

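    import java.io.ByteArrayOutputStream;
    import java.io.DataOutputStream;
    import java.io.IOException;

    // The class name is arbitrary; the original listing was a bare fragment.
    public class PrintHanzi {
        public static void main(String[] args) throws IOException {
            // Alternative start: 0xB0A0 begins at the first row of hanzi.
            // short start = (short) 0xB0A0;
            short start = (short) 0xA1A0;
            // Charsets to try: gbk, gb2312, big5, gb18030
            String encoder = "gb2312";

            ByteArrayOutputStream byteArrayOS = new ByteArrayOutputStream();
            DataOutputStream dataOS = new DataOutputStream(byteArrayOS);

            // k steps the lead byte (0xA1..0xF7); j + i step the trail byte (0xA0..0xFF).
            for (short k = 0x00; k < 0x5700; k += 0x100) {
                for (short j = 0x00; j < 0x60; j += 0x10) {
                    for (short i = 0x00; i < 0x10; i += 0x01) {
                        // writeShort keeps the low 16 bits and writes the high byte first
                        dataOS.writeShort(start + i + j + k);
                        // Write a space between characters
                        dataOS.write(" ".getBytes(encoder));
                    }
                    dataOS.write("\r\n".getBytes(encoder));
                }
                dataOS.write("\r\n".getBytes(encoder));
            }

            // Decode the collected bytes with the same charset and print them
            System.out.println(byteArrayOS.toString(encoder));
        }
    }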
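
The listing above also emits the trail bytes 0xA0 and 0xFF, which GB2312 does not assign, so every row of output contains a few undecodable cells. As a minimal sketch of a variant that stays inside the assigned trail range 0xA1 to 0xFE, the following builds one row at a time from a plain byte array. The names `Gb2312Row` and `row` are made up for illustration and are not part of the original post.

```java
import java.nio.charset.Charset;

class Gb2312Row {
    // Hypothetical helper: decode the 94 cells of one GB2312 row (lead byte fixed).
    static String row(int lead) {
        byte[] bytes = new byte[94 * 2];
        int pos = 0;
        for (int trail = 0xA1; trail <= 0xFE; trail++) {
            bytes[pos++] = (byte) lead;   // lead byte selects the row
            bytes[pos++] = (byte) trail;  // trail byte selects the cell
        }
        return new String(bytes, Charset.forName("GB2312"));
    }

    public static void main(String[] args) {
        // Rows 0xB0..0xF7 hold the hanzi; print the first three as a sample.
        for (int lead = 0xB0; lead <= 0xB2; lead++) {
            System.out.println(row(lead));
        }
    }
}
```

Using `Charset.forName` instead of a charset name string avoids the checked `UnsupportedEncodingException`, which keeps the helper free of `throws` clauses.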

Converting between Unicode, UTF-8, and GBK

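 There is no linear relationship between Unicode code points and GBK codes. Unicode orders its CJK ideographs by glyph shape (radical, then stroke count), whereas GB2312 orders its commonly used level-1 characters by pinyin; GBK and GB18030 keep the GB2312 layout and add further characters in other ranges.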
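
To make the missing linear relationship concrete, here is a small sketch that decodes a few consecutive GB2312 codes and prints the Unicode code point of each. The names `CodePointGaps` and `toCodePoint` are illustrative only.

```java
import java.nio.charset.Charset;

class CodePointGaps {
    // Hypothetical helper: Unicode code point of the character at a two-byte GB2312 code.
    static int toCodePoint(int gbCode) {
        byte[] pair = { (byte) (gbCode >> 8), (byte) gbCode };
        return new String(pair, Charset.forName("GB2312")).codePointAt(0);
    }

    public static void main(String[] args) {
        // Walk five adjacent GB2312 codes and show where each lands in Unicode.
        for (int gb = 0xB0A1; gb <= 0xB0A5; gb++) {
            System.out.printf("GB2312 0x%04X -> U+%04X%n", gb, toCodePoint(gb));
        }
    }
}
```

The adjacent codes 0xB0A1 and 0xB0A2, for instance, come out as U+554A and U+963F, which are far apart in Unicode, so no fixed offset can map one encoding onto the other.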

 Converting between Unicode and GBK therefore has to go through a mapping table. Reference: mapping table
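
In Java the mapping tables ship with the JDK's charset implementations, so a conversion can go through a `String` (UTF-16 internally) rather than a hand-built table. The sketch below assumes the `GBK` charset is available in the running JDK; `Transcode` and `convert` are hypothetical names chosen for this example.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class Transcode {
    // Hypothetical helper: decode with one charset, then re-encode with another.
    static byte[] convert(byte[] input, Charset from, Charset to) {
        return new String(input, from).getBytes(to);
    }

    public static void main(String[] args) {
        Charset gbk = Charset.forName("GBK");
        // "\u4E2D\u6587" is the two-character word for "Chinese (language)".
        byte[] utf8 = "\u4E2D\u6587".getBytes(StandardCharsets.UTF_8);
        byte[] asGbk = convert(utf8, StandardCharsets.UTF_8, gbk);
        // Print the GBK bytes in hex
        for (byte b : asGbk) {
            System.out.printf("%02X ", b);
        }
        System.out.println();
    }
}
```

The six UTF-8 bytes shrink to the four GBK bytes D6 D0 CE C4, and calling `convert` with the charsets swapped restores the original UTF-8 bytes.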

Microsoft NLS file format

Blog posts about character sets

 Byte order mark (BOM)

Original article: https://www.cnblogs.com/a-ray-of-sunshine/p/4561637.html