File Handling

1. Files

f = open(r"file_path", mode="rt", encoding="utf-8")
data = f.read()  # or f.write(content) when the file is opened in a write mode
f.close()


with open('今日内容.txt',mode='rt',encoding='utf-8') as f1:
    data = f1.read()
    print(data)

When the with block ends, f1.close() is called automatically and the operating-system resource is released.
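As a rough sketch of what the with statement saves us from writing, the two reads below are meant to be equivalent. The file name demo_with.txt and the temporary directory are placeholders invented for this example, not files from the notes.

```python
import os
import tempfile

# Hypothetical scratch file, created only for this demonstration.
path = os.path.join(tempfile.mkdtemp(), "demo_with.txt")
with open(path, mode="wt", encoding="utf-8") as f:
    f.write("hello")

# Manual version: close() has to be called even if read() raises.
f1 = open(path, mode="rt", encoding="utf-8")
try:
    manual_data = f1.read()
finally:
    f1.close()

# with version: close() happens automatically when the block exits.
with open(path, mode="rt", encoding="utf-8") as f2:
    with_data = f2.read()

print(manual_data == with_data)  # both reads see the same text
print(f1.closed, f2.closed)      # both file objects are closed at this point
```

The with form is shorter and cannot forget the close() call, which is why the rest of these notes use it.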

with open('今日内容.txt', mode='rt', encoding='utf-8') as f1, \
        open('a.txt', mode='rt', encoding='utf-8') as f2:
    print('Contents of file 1'.center(50, '#'))
    data = f1.read()
    print(data)

    print('Contents of file 2'.center(50, '#'))
    data = f2.read()
    print(data)

When the with block ends, f1.close() and f2.close() are called automatically and the operating-system resources are released.

1 bytes and str: in t mode, read() returns a str

    with open('a.txt', mode='rt') as f:  # no encoding given, so the platform default is used
        data = f.read()
        print(data)
        print(type(data))

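2 t mode can only be used to read text files

    with open('a.jpg', mode='rt', encoding='utf-8') as f:
        data = f.read()  # typically raises UnicodeDecodeError: JPEG data is not UTF-8 text
        print(data)
        print(type(data))

image <--------- jpg --------- binary data

characters <--------- utf-8 --------- binary data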
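The difference between the two arrows above can be sketched as follows. The files demo.txt and demo.bin are hypothetical stand-ins created in a temporary directory; demo.bin holds only a few bytes of the kind a JPEG file typically starts with, not a real image.

```python
import os
import tempfile

workdir = tempfile.mkdtemp()
text_path = os.path.join(workdir, "demo.txt")  # hypothetical text file
blob_path = os.path.join(workdir, "demo.bin")  # hypothetical stand-in for an image

with open(text_path, mode="wb") as f:
    f.write("hello你好".encode("utf-8"))
with open(blob_path, mode="wb") as f:
    f.write(b"\xff\xd8\xff\xe0")  # not valid UTF-8

# Text file: t mode decodes the bytes on disk into a str.
with open(text_path, mode="rt", encoding="utf-8") as f:
    text = f.read()

# Binary file: t mode tries to decode and fails.
try:
    with open(blob_path, mode="rt", encoding="utf-8") as f:
        f.read()
    decode_failed = False
except UnicodeDecodeError:
    decode_failed = True

print(text, decode_failed)
```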

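3 b mode can be used to read any file

    with open('a.jpg', mode='rb') as f:
        data = f.read()
        print(data)
        print(type(data))  # <class 'bytes'>

    with open('a.jpg', mode='rb') as f:
        data = f.read()
        print(data.decode("utf-8"))  # typically fails for an image: its bytes are not UTF-8 text
        print(type(data))

In b mode, read() returns the raw binary data (bytes).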

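4 t mode does the decoding for us

characters <--------- utf-8 --------- binary data

Supplement: encoding and decoding strings

    # user = input('>>: ')  # e.g. the user types "林海峰"
    user = "林海峰"

Encoding:

str ==utf-8==> bytes

    res = user.encode("utf-8")
    print(res)
    print(type(res))

The bytes in res are what would be sent over a network.

5 Decoding:

bytes ==utf-8==> str

    print(res.decode("utf-8"))

Copying a file in b mode:

    with open('a.jpg', mode='rb') as src_f, \
            open('b.jpg', mode='wb') as dst_f:
        # Option 1: read everything at once
        # data = src_f.read()
        # dst_f.write(data)

        # Option 2: copy piece by piece
        for line in src_f:  # line is the bytes up to and including the next b'\n'
            dst_f.write(line)

    with open('b.txt', mode='wb') as f:
        user = "林海峰"
        res = user.encode('utf-8')
        f.write(res)  # b mode: we encode to bytes ourselves

    with open('b.txt', mode='wt', encoding="utf-8") as f:
        user = "林海峰"
        f.write(user)  # t mode: the str is encoded for us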

6 Read-write modes. The t can be omitted because t is the default; reads and writes operate on strings (str)

r+t
w+t
a+t

7 Read-write modes in b mode: reads and writes operate on bytes

r+b
w+b
a+b
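A small sketch of how the three read-write modes differ, shown in t mode (the b variants behave the same way but with bytes). The file modes.txt is a hypothetical scratch file created just for this demonstration.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "modes.txt")  # hypothetical scratch file

with open(path, mode="wt", encoding="utf-8") as f:
    f.write("abc")

# r+ : the file must already exist; its content is kept; the pointer starts at 0.
with open(path, mode="r+t", encoding="utf-8") as f:
    r_plus_read = f.read()   # "abc"
    f.write("def")           # the pointer is at the end after read(), so this appends

# a+ : the content is kept; writes always go to the end of the file.
with open(path, mode="a+t", encoding="utf-8") as f:
    f.write("ghi")
    f.seek(0)                # rewind, otherwise read() would return ""
    a_plus_read = f.read()

# w+ : the file is emptied as soon as it is opened.
with open(path, mode="w+t", encoding="utf-8") as f:
    w_plus_before = f.read() # "" because the old content is gone
    f.write("xyz")
    f.seek(0)
    w_plus_after = f.read()

print(r_plus_read, a_plus_read, repr(w_plus_before), w_plus_after)
```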

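with open('b.txt', mode='r+t', encoding='utf-8') as f:
    print(f.read())
    f.write("abcdefg")  # the pointer is at the end after read(), so this appends

with open('b.txt', mode='w+t', encoding='utf-8') as f:
    f.write("我爱你中国")
    print(f.read())  # prints an empty string: the pointer is already at the end

with open('b.txt', mode='a+t', encoding='utf-8') as f:
    f.write("我爱你中国")
    print(f.read())  # also empty, for the same reason

with open('b.txt', mode='rt', encoding='utf-8') as f:
    # Read one line per call
    line1 = f.readline()
    line2 = f.readline()
    line3 = f.readline()
    line4 = f.readline()
    print(line1, end="")
    print(line2, end="")
    print(line3, end="")
    print(line4, end="")

    # Iterate over the remaining lines
    for line in f:
        print(line)

    # Collect lines in a list by hand (empty here: the loop above consumed the file)
    l = []
    for line in f:
        l.append(line)

    # readlines() builds the same kind of list in one call
    l = f.readlines()
    print(l)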

with open('b.txt', mode='wt', encoding='utf-8') as f:
    f.write("1111 2222 333 ")

    lines = ["1111 ", "222 ", "333 "]

    for line in lines:
        f.write(line)

    f.writelines(lines)  # same effect as the loop above

    f.writelines({'k1': 111, 'k2': 222, "k3": 3333})  # iterates over the keys and writes k1k2k3
    # f.writelines({'k1': 111, 1: 44444, 'k2': 222, "k3": 3333})  # raises TypeError: the key 1 is not a str

    f.writelines("hello")  # writes the characters one by one
    f.write("hello")

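with open(r'b.txt', mode='wt', encoding='utf-8') as f:
    print(f.name)  # the path that was passed to open()
    f.write('哈哈哈 ')
    f.flush()  # push buffered data to the operating system right away

coding:utf-8 (the source-encoding declaration, needed when running under Python 2)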

Part 1: What is the unit of file-pointer movement?

String obtained by decoding the bytes read from disk: hello你好
Disk: 0101010101101010101011010101010

1. Only for read(n) in t mode does n count characters

with open('a.txt', mode='rt', encoding='utf-8') as f:
    data = f.read(6)  # six characters
    print(f.tell())
    print(data)
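To make the character-versus-byte distinction concrete, here is a minimal sketch that reads the same content in both modes. It assumes a hypothetical scratch file units.txt containing the sample string hello你好 encoded as UTF-8.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "units.txt")  # hypothetical scratch file
with open(path, mode="wb") as f:
    f.write("hello你好".encode("utf-8"))  # 5 one-byte characters + 2 three-byte characters

with open(path, mode="rt", encoding="utf-8") as f:
    chars = f.read(6)      # six characters
    # For this UTF-8 file tell() matches the byte offset; in general the value
    # returned in t mode is an opaque number meant to be handed back to seek().
    text_pos = f.tell()

with open(path, mode="rb") as f:
    raw = f.read(6)        # six bytes
    byte_pos = f.tell()

print(chars, text_pos)
print(raw, byte_pos)
```

Note that the six bytes read in b mode end in the middle of a character, so calling raw.decode("utf-8") on them would fail.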

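2. Background: disk capacity is simply the number of binary digits (bits) the disk can store

8 bit = 1 Byte
1024 Byte = 1 KB
1024 KB = 1 MB
1024 MB = 1 GB
1024 GB = 1 TB
1 GB = 1024 * 1024 * 1024 * 8 bit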

with open('a.txt', mode='rb') as f:
    data = f.read(8)  # in b mode n counts bytes
    print(type(data))
    print(len(data))
    print(data.decode("utf-8"))

with open('b.txt', mode='rb') as f:
    data = f.read(7)
    print(type(data))
    print(len(data))
    print(data.decode("gbk"))


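truncate(n) also counts bytes, and it needs a writable mode that keeps the existing content, such as r+ or a:

with open('a.txt', mode='r+t', encoding='utf-8') as f:
    f.truncate(7)  # keep only the first 7 bytes

Apart from read(n) in t mode, every movement of the file pointer, whether implicit or explicit, is measured in bytes.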
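A short sketch of truncate() working in bytes even when the file is opened in t mode. The file truncate.txt and its content are assumptions made for this example.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "truncate.txt")  # hypothetical scratch file
with open(path, mode="wb") as f:
    f.write("hello你好".encode("utf-8"))  # 11 bytes in total

with open(path, mode="r+t", encoding="utf-8") as f:
    f.truncate(8)          # keep 8 bytes: "hello" plus one three-byte character

with open(path, mode="rt", encoding="utf-8") as f:
    remaining = f.read()

size_after = os.path.getsize(path)
print(remaining, size_after)
```

With this content, truncating to 7 bytes instead would cut a character in half, and reading the file back as UTF-8 text would raise UnicodeDecodeError.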

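Part 2: Moving the file pointer explicitly

f.seek(x, y)
x is the number of bytes to move
y is the mode:

0: the reference point is the start of the file; usable in both t mode and b mode

Example:

with open('d.txt', mode='rt', encoding='utf-8') as f:
    f.read(3)
    print(f.tell())  # 5

    f.seek(3, 0)
    print(f.tell())  # 3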

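1: the reference point is the current position; usable only in b mode (t mode accepts it only with an offset of 0)

with open('d.txt', mode='rb') as f:
    f.read(1)
    print(f.tell())  # 1
    f.seek(2, 1)
    print(f.tell())  # 3

    print(f.read().decode("utf-8"))  # works only if offset 3 falls on a character boundary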

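2: the reference point is the end of the file; usable only in b mode (t mode accepts it only with an offset of 0)

with open('d.txt', mode='rb') as f:
    f.seek(3333, 2)
    print(f.tell())  # 14+3333=3347

    f.seek(-3, 2)
    print(f.tell())

    f.seek(0, 2)  # quickly move the pointer to the end of the file
    print(f.tell())

with open('d.txt', mode='a') as f:
    print(f.tell())  # in a mode the pointer starts at the end of the file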
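The three seek modes can be tried side by side on a file whose content makes the offsets easy to check. This is only a sketch: seek.bin is a hypothetical scratch file, and the named constants os.SEEK_SET, os.SEEK_CUR and os.SEEK_END are used in place of the bare numbers 0, 1 and 2.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "seek.bin")  # hypothetical scratch file
with open(path, mode="wb") as f:
    f.write(b"0123456789")

with open(path, mode="rb") as f:
    f.seek(3, os.SEEK_SET)     # same as seek(3, 0): 3 bytes from the start
    from_start = f.tell()

    f.seek(2, os.SEEK_CUR)     # same as seek(2, 1): 2 bytes forward from here
    from_current = f.tell()

    f.seek(-4, os.SEEK_END)    # same as seek(-4, 2): 4 bytes before the end
    from_end = f.tell()
    tail = f.read()

# In t mode, modes 1 and 2 are rejected unless the offset is 0.
with open(path, mode="rt", encoding="ascii") as f:
    try:
        f.seek(2, os.SEEK_CUR)
        text_relative_seek_ok = True
    except OSError:            # io.UnsupportedOperation is a subclass of OSError
        text_relative_seek_ok = False

print(from_start, from_current, from_end, tail, text_relative_seek_ok)
```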

Exercise: implement the following command:

tail -f access.log

import time

with open(r"/day10/代码/access.log", mode="rb") as f:
    f.seek(0, 2)  # quickly move the pointer to the end of the file

    while True:
        line = f.readline()
        if len(line) == 0:
            time.sleep(0.1)  # nothing new yet; wait briefly before polling again
        else:
            print(line.decode('utf-8'), end='')

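Introduction: data on disk is never really "edited"; new content always overwrites old content

with open('e.txt', mode="r+t", encoding='utf-8') as f:
    f.seek(9, 0)
    f.write("你好")  # overwrites the bytes starting at offset 9; nothing is inserted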
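A minimal sketch of the overwrite behaviour, using ASCII content so the result is easy to read. The file overwrite.txt is a hypothetical scratch file, not one of the files used in these notes.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "overwrite.txt")  # hypothetical scratch file
with open(path, mode="wt", encoding="utf-8") as f:
    f.write("abcdefghi")

with open(path, mode="r+t", encoding="utf-8") as f:
    f.seek(3, 0)     # move to byte offset 3
    f.write("XY")    # replaces "de"; nothing is shifted to make room

with open(path, mode="rt", encoding="utf-8") as f:
    result = f.read()

print(result)
```

The file keeps its original length: two characters were replaced, none were inserted.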

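Files can still be "modified", but the modification is simulated with the help of memory.
There are two ways to do it.

How approach 1 works:

1. Read the entire file from disk into memory.

2. Apply all the changes in memory in one go.

3. Write the modified result back over the original file.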

with open('a.txt', mode='rt', encoding='utf-8') as read_f:
    data = read_f.read()
with open('a.txt', mode='wt', encoding='utf-8') as write_f:
    write_f.write(data.replace('LIUGUIHAI','liuguihai'))

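Summary of approach 1:

Pro: saves disk space, because only one copy of the data exists on disk
Con: uses a lot of memory when the file is large

How approach 2 works:

1. Read the file from disk into memory a little at a time (line by line).
2. Modify each piece in memory and write it to a temporary file.
3. Replace the original file with the temporary file.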

import os

with open('f.txt', mode='rt', encoding='utf-8') as read_f, \
        open(".f.txt.swap", mode='wt', encoding='utf-8') as write_f:
    for line in read_f:
        write_f.write(line.replace("egon", '===>EGON<==='))

# Swap the temporary file in for the original
os.remove('f.txt')
os.rename('.f.txt.swap', 'f.txt')

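Summary of approach 2:

Pro: saves memory, because only one line of the file is in memory at any moment
Con: uses more disk space, because two copies of the data exist on disk during the modification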
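Approach 2 can be wrapped in a small reusable function. The sketch below is one possible variation, not the version from these notes: the helper name replace_in_file and the ".swap" suffix are made up for illustration, and os.replace is swapped in for the os.remove plus os.rename pair because it overwrites the destination in a single call.

```python
import os
import tempfile


def replace_in_file(path, old, new, encoding="utf-8"):
    """Rewrite path line by line, replacing old with new (hypothetical helper)."""
    swap_path = path + ".swap"  # assumed naming scheme for the temporary file
    with open(path, mode="rt", encoding=encoding) as read_f, \
            open(swap_path, mode="wt", encoding=encoding) as write_f:
        for line in read_f:     # only one line is held in memory at a time
            write_f.write(line.replace(old, new))
    # os.replace overwrites the destination, so no separate os.remove is needed.
    os.replace(swap_path, path)


# Usage on a scratch file created just for this demonstration.
demo_path = os.path.join(tempfile.mkdtemp(), "f.txt")
with open(demo_path, mode="wt", encoding="utf-8") as f:
    f.write("egon is here\nnothing to change\n")

replace_in_file(demo_path, "egon", "===>EGON<===")

with open(demo_path, mode="rt", encoding="utf-8") as f:
    updated = f.read()
print(updated)
```

Because the replacement is applied one line at a time, a search string that spans a line break would not be matched by this sketch.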

Original article: https://www.cnblogs.com/lgh8023/p/13092990.html