Python 3, Unicode, UTF-8, and Encoding

text = u'你好,今天天气不错'
text
print(text)

text = '\u4f60\u597d\uff0c\u4eca\u5929\u5929\u6c14\u4e0d\u9519'
text
print(text)

text = u'\u4f60\u597d\uff0c\u4eca\u5929\u5929\u6c14\u4e0d\u9519'
text
print(text)

text = '\\u4f60\\u597d\\uff0c\\u4eca\\u5929\\u5929\\u6c14\\u4e0d\\u9519'
text
print(text)
text = text.encode('utf-8').decode('unicode_escape')
text
print(text)

text = '\\u4f60\\u597d\\uff0c今天天气不错'
text
print(text)
import re
text = re.sub(r'(\\u[0-9a-fA-F]{4})', lambda matched: matched.group(1).encode('utf-8').decode('unicode_escape'), text)
text
print(text)

The code above was run in the interactive interpreter. The results are as follows:

>>> text = u'你好,今天天气不错'
>>> text
'你好,今天天气不错'
>>> print(text)
你好,今天天气不错

>>> text = '\u4f60\u597d\uff0c\u4eca\u5929\u5929\u6c14\u4e0d\u9519'
>>> text
'你好,今天天气不错'
>>> print(text)
你好,今天天气不错

>>> text = u'\u4f60\u597d\uff0c\u4eca\u5929\u5929\u6c14\u4e0d\u9519'
>>> text
'你好,今天天气不错'
>>> print(text)
你好,今天天气不错

>>> text = '\\u4f60\\u597d\\uff0c\\u4eca\\u5929\\u5929\\u6c14\\u4e0d\\u9519'
>>> text
'\\u4f60\\u597d\\uff0c\\u4eca\\u5929\\u5929\\u6c14\\u4e0d\\u9519'
>>> print(text)
\u4f60\u597d\uff0c\u4eca\u5929\u5929\u6c14\u4e0d\u9519
>>> text = text.encode('utf-8').decode('unicode_escape')
>>> text
'你好,今天天气不错'
>>> print(text)
你好,今天天气不错

>>> text = '\\u4f60\\u597d\\uff0c今天天气不错'
>>> text
'\\u4f60\\u597d\\uff0c今天天气不错'
>>> print(text)
\u4f60\u597d\uff0c今天天气不错
>>> import re
>>> text = re.sub(r'(\\u[0-9a-fA-F]{4})', lambda matched: matched.group(1).encode('utf-8').decode('unicode_escape'), text)
>>> text
'你好,今天天气不错'
>>> print(text)
你好,今天天气不错
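
The last case uses a regular expression rather than decoding the whole string, presumably because the unicode_escape codec treats every non-ASCII byte as Latin-1 and would garble the Chinese characters that are already real text. Below is a minimal sketch of that idea as a reusable function. The names decode_unicode_escapes and ESCAPE_PATTERN are hypothetical and do not appear in the original post, and the sketch converts each escape with chr(int(..., 16)) instead of the encode/decode round trip used above.

```python
import re

# Hypothetical helper, not from the original post.
# Matches a literal backslash, the letter u, and exactly four hex digits.
ESCAPE_PATTERN = re.compile(r'\\u([0-9a-fA-F]{4})')

def decode_unicode_escapes(s):
    # Replace each literal \uXXXX sequence with the character it names,
    # leaving every other character in s untouched.
    return ESCAPE_PATTERN.sub(lambda m: chr(int(m.group(1), 16)), s)

mixed = '\\u4f60\\u597d\\uff0c今天天气不错'
decoded = decode_unicode_escapes(mixed)
print(decoded)

# For comparison: decoding the whole mixed string with unicode_escape
# reads the UTF-8 bytes of the real Chinese characters as Latin-1,
# so the result differs from the correctly decoded text.
garbled = mixed.encode('utf-8').decode('unicode_escape')
print(garbled == decoded)
```

This sketch handles only four-digit \uXXXX escapes. It does not combine surrogate pairs or handle eight-digit \UXXXXXXXX escapes.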

Original article: https://www.cnblogs.com/jacen789/p/9401877.html