Python IO Operations

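At its core, a computer processes data by having the CPU perform high-speed operations on values held in registers. Registers are loaded from main memory, and main memory is in turn filled from external devices or the network: hard disks, network cards, optical drives, floppy disks, USB devices, and so on.
Moving data from an external device into memory is called input; moving data from memory out to an external device is called output. Together they are called IO.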

IO with external storage devices: reading and writing files

Reading a text file

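with open("./file.txt", "r") as file:  # mode "r": read as text
    content = file.read()  # the entire content
    file.seek(0)  # rewind; every read advances the file position
    size = 10  # example value
    content = file.read(size)  # at most size characters (not bytes) in text mode
    line = file.readline()  # one line, starting from the current position
    lines = file.readlines()  # all remaining lines, as a list
    print(content)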
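
For large files, loading everything with read() or readlines() can be wasteful. A minimal sketch of the usual alternative, iterating over the file object line by line, is shown below. The temporary directory and file name are assumptions made only so the example runs on its own.

```python
import os
import tempfile

# Create a small throwaway file so the example is self-contained.
tmp_dir = tempfile.mkdtemp()
path = os.path.join(tmp_dir, "file.txt")  # hypothetical file name
with open(path, "w", encoding="utf-8") as f:
    f.write("first\nsecond\nthird\n")

# Iterating over the file object yields one line at a time,
# so the whole file never has to sit in memory at once.
lines = []
with open(path, "r", encoding="utf-8") as f:
    for line in f:
        lines.append(line.rstrip("\n"))  # drop the trailing newline

print(lines)  # ['first', 'second', 'third']
```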

Specifying the encoding

with open("./file.txt", "r", encoding="gbk", errors="ignore") as file:  # decode as GBK; errors="ignore" silently drops undecodable bytes
    content = file.read()
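
To see why the encoding matters, the sketch below writes GBK text and then reads it back with the right and the wrong codec. The file name is hypothetical, and errors="replace" is used here instead of errors="ignore" only because it makes the damage visible rather than dropping bytes silently.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "gbk.txt")  # hypothetical file name

# Write Chinese text encoded as GBK.
with open(path, "w", encoding="gbk") as f:
    f.write("网站")

# Decoding with the matching codec round-trips cleanly.
with open(path, "r", encoding="gbk") as f:
    same = f.read()

# Decoding the GBK bytes as UTF-8 raises unless an error handler is given.
try:
    with open(path, "r", encoding="utf-8") as f:
        f.read()
    strict_failed = False
except UnicodeDecodeError:
    strict_failed = True

# errors="replace" keeps going and marks undecodable bytes with U+FFFD.
with open(path, "r", encoding="utf-8", errors="replace") as f:
    replaced = f.read()

print(same, strict_failed, "\ufffd" in replaced)
```

With an error handler the read succeeds, but the text that comes back is no longer the original, so an error handler is best treated as a last resort rather than a substitute for knowing the real encoding.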

Reading a binary file

with open("./file.txt", "rb") as file:  # mode "rb": read as bytes
    content = file.read()  # a bytes object, e.g. b"\xff\xe1"
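
Binary files are often too large to read in one call. A common pattern, sketched here with a hypothetical file name and a deliberately tiny chunk size, is to read fixed-size chunks until read() returns an empty bytes object.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "data.bin")  # hypothetical file name
with open(path, "wb") as f:
    f.write(bytes(range(10)))  # 10 bytes: 0x00 .. 0x09

CHUNK_SIZE = 4  # illustrative; real code usually picks a much larger value
chunks = []
with open(path, "rb") as f:
    while True:
        chunk = f.read(CHUNK_SIZE)  # returns at most CHUNK_SIZE bytes
        if not chunk:               # b"" signals end of file
            break
        chunks.append(chunk)

print([len(c) for c in chunks])  # [4, 4, 2]
```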

Writing a text file

with open("./file.txt", "w") as file:
    text = "text"
    count = file.write(text)  # write() returns the number of characters written

Specifying the encoding

with open("./file.txt", "w", encoding="gbk") as file:  # encode as GBK
    count = file.write("text")  # write() requires the string to write

Writing a binary file

with open("./file.txt", "wb") as file:
    data = b"text"  # binary mode accepts bytes, not str
    count = file.write(data)  # number of bytes written

Mode w truncates the file and overwrites any existing content; to append instead, use mode a.
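
The difference between the two modes can be checked with a short experiment. This is a minimal sketch; the file name is hypothetical and lives in a temporary directory.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "log.txt")  # hypothetical file name

with open(path, "w", encoding="utf-8") as f:
    f.write("one\n")
with open(path, "w", encoding="utf-8") as f:  # "w" truncates the file first
    f.write("two\n")
with open(path, "a", encoding="utf-8") as f:  # "a" writes at the end
    f.write("three\n")

with open(path, "r", encoding="utf-8") as f:
    result = f.read()

print(repr(result))  # 'two\nthree\n'
```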

In-memory IO

Create, read, and write str data in memory: io.StringIO.

In [1]: from io import StringIO
   ...: str_io = StringIO()

In [2]: str_io.write("text1")
Out[2]: 5

In [3]: str_io.write("text2")
Out[3]: 5

In [4]: str_io.getvalue()
Out[4]: 'text1text2'

Passing the initial content at construction

In [1]: from io import StringIO

In [2]: str_io = StringIO("text1\ntext2\ntext3")

In [3]: for line in str_io.readlines():
   ...:     print(line)
   ...: 
text1

text2

text3
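
A StringIO object keeps a current position just like a real file, which is easy to trip over: after writing, the position sits at the end, so read() returns an empty string until the buffer is rewound. A small sketch of that behavior:

```python
from io import StringIO

buf = StringIO()
buf.write("text1\n")
buf.write("text2\n")

after_write = buf.read()  # '' because the position is at the end
buf.seek(0)               # rewind to the beginning
after_seek = buf.read()   # 'text1\ntext2\n'
whole = buf.getvalue()    # full contents, regardless of position

print(repr(after_write), repr(after_seek), repr(whole))
```

getvalue() is the convenient choice when only the final contents matter; seek(0) followed by read() or iteration is the one to use when the buffer is handed to code that expects a file.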

Create, read, and write binary data in memory: io.BytesIO

In [1]: from io import BytesIO

In [2]: bytes_io = BytesIO()

In [3]: bytes_io.write("网站".encode("utf-8"))
Out[3]: 6

In [4]: bytes_io.getvalue()
Out[4]: b'\xe7\xbd\x91\xe7\xab\x99'

Passing the initial content at construction

In [1]: from io import BytesIO

In [2]: bytes_io = BytesIO(b'\xe7\xbd\x91\xe7\xab\x99')

In [3]: bytes_io.read()
Out[3]: b'\xe7\xbd\x91\xe7\xab\x99'

In [4]: bytes_io.getvalue()
Out[4]: b'\xe7\xbd\x91\xe7\xab\x99'
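
The session above hints at the difference between read() and getvalue(): read() consumes from the current position, while getvalue() always returns the whole buffer. A short sketch that makes the position explicit, reusing the same 6-byte UTF-8 string:

```python
from io import BytesIO

buf = BytesIO("网站".encode("utf-8"))  # 6 bytes, 3 per character

first = buf.read(3)          # consumes the bytes of the first character
position = buf.tell()        # 3
rest = buf.read()            # the remaining 3 bytes
at_end = buf.read()          # b'' once the end is reached
everything = buf.getvalue()  # all 6 bytes; the position is ignored

print(first.decode("utf-8"), position, rest.decode("utf-8"), at_end)
```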

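Serialization and deserialization

Serialization converts an in-memory object into a form that can be stored or transmitted, such as a byte stream or a JSON string; deserialization rebuilds the object from that form.
In the serializers of Django REST framework, turning data into JSON to return to the front end is serialization, and turning JSON sent by the front end back into ORM objects for processing is deserialization.

Serialization goes by several names: pickling, serialization, marshalling, flattening. In this context they all mean the same thing, despite the fancy variety of terms.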

import pickle
data = {
    "a":1,
    "b":2,
    "c":3,
}
ser_data = pickle.dumps(data)
print(ser_data) # b'\x80\x04\x95\x17\x00\x00\x00\x00\x00\x00\x00}\x94(\x8c\x01a\x94K\x01\x8c\x01b\x94K\x02\x8c\x01c\x94K\x03u.'
unser_data = pickle.loads(ser_data)
print(unser_data) # {'a': 1, 'b': 2, 'c': 3}
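
For comparison, the JSON case mentioned for Django REST framework can be sketched with the standard-library json module. This is plain json, not the framework's own serializer API; it only illustrates the same round trip with a text format instead of pickle's binary one.

```python
import json

data = {"a": 1, "b": 2, "c": 3}

text = json.dumps(data)      # dict -> JSON string (serialization)
restored = json.loads(text)  # JSON string -> dict (deserialization)

print(text)      # {"a": 1, "b": 2, "c": 3}
print(restored)  # {'a': 1, 'b': 2, 'c': 3}
```

Unlike pickle output, the JSON string is readable and can be consumed by programs in other languages, but it only covers basic types such as dicts, lists, strings, numbers, booleans, and None.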

Writing serialized data to a file

import pickle
with open("./a.txt", "wb") as f:
    data = {
        "a":1,
        "b":2,
        "c":3,
    }
    pickle.dump(data, f)  # writes to f and returns None

Deserializing from a file

import pickle
with open("./a.txt", "rb") as f:
    data = pickle.load(f)  # the deserialized object, not serialized bytes
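
Putting the two halves together, a full round trip might look like the sketch below. The nested sample data and the file name are assumptions for illustration. Note that pickle.load should only be used on data from a trusted source, since unpickling can execute arbitrary code.

```python
import os
import pickle
import tempfile

data = {"a": 1, "b": [2, 3], "c": {"nested": True}}
path = os.path.join(tempfile.mkdtemp(), "data.pkl")  # hypothetical file name

with open(path, "wb") as f:
    pickle.dump(data, f)  # serialize straight into the file

with open(path, "rb") as f:
    loaded = pickle.load(f)  # rebuild an equal object from the file

print(loaded == data)  # True: same contents
print(loaded is data)  # False: a new, independent object
```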

(End of article)