Character Set (Encoding) Conversion_Qt532

Character Set (Encoding) Conversion_Qt532_QString

1. Material found online:

  1.1. Reference URL: http://blog.csdn.net/changsheng230/article/details/6588447

  1.2. Page content:

“

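Qt uses Unicode to store and manipulate strings, but in many cases we have to handle data in other encodings. For example, Chinese text is often encoded as GBK or Big5, while Japanese text often uses Shift-JIS or ISO 2022. This article discusses how to convert strings in other encodings into a Unicode QString.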

// Method 1
    QString str = QString::fromLocal8Bit("本地文本");
    QString str2 = QString("本地文本");  // garbled when the literal's bytes are not in the encoding QString(const char*) assumes
// Method 2
    QTextCodec *codec = QTextCodec::codecForName("GBK"); // get the codec for GBK
    QString locallyEncoded = codec->toUnicode( "显示中文" );
    qDebug() << locallyEncoded << endl;

// For more details, see:

http://www.kuqin.com/qtdocument/qtextcodec.html

http://blog.csdn.net/catamout/article/details/5675878

”

2. My understanding

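  Internally, QString stores strings as Unicode, specifically as UTF-16 (Java's String does the same). Each QChar is one 16-bit code unit, so an English letter or a common Chinese character (anything in the Basic Multilingual Plane) takes 2 bytes. It is not the case that every character takes 2 bytes, though: characters outside that plane, such as many emoji, are stored as a surrogate pair of two code units, i.e. 4 bytes.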
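As a small illustration of code units versus characters in UTF-16, here is a sketch in plain standard C++ (no Qt, so it can be compiled on its own). The function name `countCodePoints` is made up for this example; it is not a Qt or standard-library API.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: count Unicode code points in a UTF-16 string.
// A high surrogate (0xD800..0xDBFF) followed by a low surrogate
// (0xDC00..0xDFFF) forms one code point stored in two code units.
std::size_t countCodePoints(const std::u16string &s)
{
    std::size_t n = 0;
    for (std::size_t i = 0; i < s.size(); ++i) {
        const char16_t u = s[i];
        const bool high = (u >= 0xD800 && u <= 0xDBFF);
        if (high && i + 1 < s.size()
            && s[i + 1] >= 0xDC00 && s[i + 1] <= 0xDFFF) {
            ++i;  // skip the low surrogate, it belongs to the same code point
        }
        ++n;
    }
    return n;
}
```

For the five-character test string used in these notes, `size()` and `countCodePoints` both give 5. A string that contains one character outside the Basic Multilingual Plane, such as U+1F600, has one more code unit than code points.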

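  Unicode serves as the intermediate form held in a QString. A byte string in some encoding (GBK, UTF-8, and so on) is first decoded into Unicode (that is, into a QString) using that encoding's codec; when a particular encoding is needed, the QString is then encoded into a byte array in the target encoding.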
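The "Unicode to target encoding" half of that round trip can be sketched by hand for UTF-8. This is plain standard C++, not what Qt does internally; `utf16ToUtf8` is a hypothetical name chosen for the example, and the sketch does not reject unpaired surrogates.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical helper: encode UTF-16 code units as UTF-8 bytes.
// Assumption: input is well-formed; unpaired surrogates are not rejected.
std::string utf16ToUtf8(const std::u16string &in)
{
    std::string out;
    for (std::size_t i = 0; i < in.size(); ++i) {
        std::uint32_t cp = in[i];
        // Combine a surrogate pair into one code point above U+FFFF.
        if (cp >= 0xD800 && cp <= 0xDBFF && i + 1 < in.size()
            && in[i + 1] >= 0xDC00 && in[i + 1] <= 0xDFFF) {
            cp = 0x10000 + ((cp - 0xD800) << 10) + (in[i + 1] - 0xDC00);
            ++i;
        }
        if (cp < 0x80) {                       // 1 byte: 0xxxxxxx
            out += static_cast<char>(cp);
        } else if (cp < 0x800) {               // 2 bytes: 110xxxxx 10xxxxxx
            out += static_cast<char>(0xC0 | (cp >> 6));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        } else if (cp < 0x10000) {             // 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx
            out += static_cast<char>(0xE0 | (cp >> 12));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        } else {                               // 4 bytes: 11110xxx followed by three 10xxxxxx
            out += static_cast<char>(0xF0 | (cp >> 18));
            out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        }
    }
    return out;
}
```

Each of the five characters in the test string falls in the 3-byte range, which is why a 5-character QString turns into a 15-byte UTF-8 array; for instance U+6211 becomes the bytes e6 88 91.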

  ZC: Differences from Java in practice (take note):
    My understanding is as follows:

3. My test code (this .cpp file is saved as "UTF-8 + BOM"):

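// http://blog.csdn.net/changsheng230/article/details/6588447
// Headers needed by this fragment:
#include <QByteArray>
#include <QDebug>
#include <QString>
#include <QTextCodec>
#include <cwchar>

    const wchar_t *pwc = L"我是中国人";  // ZC: the source encoding used here is "UTF-8 + BOM"
    qDebug() << "(1) ==>";
    for (size_t i=0; i<wcslen(pwc); i++)
    {
        ushort us = pwc[i];  // all five characters are in the BMP, so each fits in 16 bits
        qDebug() << "\t" << QString::number(us, 16).rightJustified(4, '0');
    }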

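    // ZC: the narrow literal below still has to be decoded as GBK
    // (likely because the compiler converts narrow literals to the local code page, GBK on this machine)
    QTextCodec *codec = QTextCodec::codecForName("GBK"); // get the codec for GBK
    QString locallyEncoded = codec->toUnicode( "我是中国人" );
    qDebug() << locallyEncoded << endl;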

    // unicode() returns a const pointer to the UTF-16 code units; no cast is needed
    const QChar *pcs = locallyEncoded.unicode();
    qDebug() << "(2) ==>";
    for (int i=0; i<locallyEncoded.length(); i++)
    {
        QChar c = pcs[i];
        ushort us = c.unicode();
        qDebug() << "\t" << QString::number(us, 16).rightJustified(4, '0');
    }

    // constData() exposes the same code units as unicode()
    pcs = locallyEncoded.constData();
    qDebug() << "(3) ==>";
    for (int i=0; i<locallyEncoded.length(); i++)
    {
        QChar c = pcs[i];
        ushort us = c.unicode();
        qDebug() << "\t" << QString::number(us, 16).rightJustified(4, '0');
    }


    // Encode the QString as UTF-8 through a codec
    QTextCodec *codecUtf8 = QTextCodec::codecForName("utf-8");
    QByteArray ba = codecUtf8->fromUnicode(locallyEncoded);

    qDebug() << "(4) ==>";
    for (int i=0; i<ba.length(); i++)
    {
        ushort us = static_cast<uchar>(ba.at(i));  // go through uchar to avoid sign extension
        qDebug() << "\t("<< QString::number(i).rightJustified(2, '0') <<")"
                 << QString::number(us, 16).rightJustified(2, '0');
    }

    // Same conversion through QString::toUtf8(); the bytes should match (4)
    ba = locallyEncoded.toUtf8();
    qDebug() << "(5) ==>";
    for (int i=0; i<ba.length(); i++)
    {
        ushort us = static_cast<uchar>(ba.at(i));
        qDebug() << "\t("<< QString::number(i).rightJustified(2, '0') <<")"
                 << QString::number(us, 16).rightJustified(2, '0');
    }

  3.1. Printed output:

(1) ==>
     "6211"
     "662f"
     "4e2d"
     "56fd"
     "4eba"
"我是中国人" 

(2) ==>
     "6211"
     "662f"
     "4e2d"
     "56fd"
     "4eba"
(3) ==>
     "6211"
     "662f"
     "4e2d"
     "56fd"
     "4eba"
(4) ==>
    ( "00" ) "e6"
    ( "01" ) "88"
    ( "02" ) "91"
    ( "03" ) "e6"
    ( "04" ) "98"
    ( "05" ) "af"
    ( "06" ) "e4"
    ( "07" ) "b8"
    ( "08" ) "ad"
    ( "09" ) "e5"
    ( "10" ) "9b"
    ( "11" ) "bd"
    ( "12" ) "e4"
    ( "13" ) "ba"
    ( "14" ) "ba"
(5) ==>
    ( "00" ) "e6"
    ( "01" ) "88"
    ( "02" ) "91"
    ( "03" ) "e6"
    ( "04" ) "98"
    ( "05" ) "af"
    ( "06" ) "e4"
    ( "07" ) "b8"
    ( "08" ) "ad"
    ( "09" ) "e5"
    ( "10" ) "9b"
    ( "11" ) "bd"
    ( "12" ) "e4"
    ( "13" ) "ba"
    ( "14" ) "ba"
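The bytes printed under (4) and (5) can be checked against the code units printed under (1) to (3) by decoding them by hand. Below is a minimal sketch in plain standard C++, independent of Qt; `utf8ToCodePoints` is a hypothetical name, and the sketch assumes well-formed UTF-8 and performs no validation.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: decode well-formed UTF-8 into Unicode code points.
// Assumption: no validation is done; malformed input yields meaningless values.
std::u32string utf8ToCodePoints(const std::string &bytes)
{
    std::u32string out;
    std::size_t i = 0;
    while (i < bytes.size()) {
        const unsigned char b0 = static_cast<unsigned char>(bytes[i]);
        char32_t cp;
        std::size_t len;
        // The lead byte tells how many bytes the sequence has.
        if (b0 < 0x80)      { cp = b0;        len = 1; }
        else if (b0 < 0xE0) { cp = b0 & 0x1F; len = 2; }
        else if (b0 < 0xF0) { cp = b0 & 0x0F; len = 3; }
        else                { cp = b0 & 0x07; len = 4; }
        // Each continuation byte contributes its low 6 bits.
        for (std::size_t k = 1; k < len && i + k < bytes.size(); ++k)
            cp = (cp << 6) | (static_cast<unsigned char>(bytes[i + k]) & 0x3F);
        out += cp;
        i += len;
    }
    return out;
}
```

Feeding it the 15 bytes listed above gives back the five values 0x6211, 0x662f, 0x4e2d, 0x56fd, 0x4eba, matching the first three listings.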
