python基础(十)常用模块

模块

模块的定义

模块就是从逻辑上组织的python代码,是.py的python文件.将python文件的目录和目录下各个文件当作模块对象来处理(用import关键字引用,成为当前代码的一部分),而不是作为一般的文件对象来处理(例如通过open来作为单纯的字符串数据).
同时分为module模块和package包两个概念,module是文件对象对应的文件,package是路径对象(一个文件夹),对应导入的是文件夹中的__init__.py

模块分为三类:

1,标准库:python解释器内置的基础模块,python官方收录维护的代码.
2,开源模块: 由第三方组织提供的代码
3,自定义模块: 开发者自己编写维护的代码都称为自定义模块

导入方法

通常我们可以直接使用import module_name来导入一个模块.同时我们也可以使用from ... import ...来导入模块中的部分代码.

1, import module_name
将一个py文件作为模块引入当前代码,通常用来导入内置或者是已经安装过的第三方模块.(会导入整个.py文件)

#module_test/test1.py
name = 'test1'
def run_hello():
    print('hello world')

#module_test/test2.py
import test1
test1.run_hello()
print(test1.name)

2, from package import module_name
从一个package中引入py模块(会优先初始化路径下的__init__.py).通常用来导入自定义模块.(会导入整个.py文件)

#module_test/test1.py
name = 'test1'
def run_hello():
    print('hello world')

#module_test/test2.py
from module_test import test1
test1.run_hello()
print(test1.name)

3, from package.module_name import fun_name
从具体模块中引入指定的代码块,通常封装为函数或者是类

#module_test/test1.py
name = 'test1'
def run_hello():
    print('hello world')

def run_test1():
    print('in the test1')

#module_test/test2.py
from module_test.test1 import run_hello
run_hello()
print(locals())

4, from package.module_name import *
类似于from package import module_name的引用方法,但是呢有一个坏处因为不是用module_name来引用模块中的代码,而是直接在当前代码块中引用对象.所以会被当前代码块中的同名对象覆盖.

#module_test/test1.py
name = 'test1'
def run_hello():
    print('hello world')

def running_test():
    print('in the test1')

#module_test/test2.py
from module_test.test1 import *
running_test()
print(locals())
def running_test():
    print('in the test2')

running_test()

5, from package import module_name as alias_name / from package.module_name import fun_name as alias_name
使用as关键字给引入的代码块使用一个别名

#module_test/test2.py
from module_test import test1 as mytest

print(locals())
def running_test():
    print('in the test2')

running_test()
mytest.running_test()

or

#module_test/test2.py
from module_test.test1 import running_test as mytest

print(locals())
def running_test():
    print('in the test2')

running_test()
mytest()

6, 修改__init__.py 在导入包的时候,由于我们执行的是包的__init__.py所以需要在这个文件中对module进行一些定义

直接导入包会提示找不到模块

#module_test/test1.py
name = 'test1'
def run_hello():
    print('hello world')

def running_test():
    print('in the test1')

#module_test/test2.py 
import module_test

module_test.test1.running_test()

修改__init___.py 这一步相当于给包下的模块做索引.

#module_test/test1.py
name = 'test1'
def run_hello():
    print('hello world')

def running_test():
    print('in the test1')

#module_test/__init__.py
from . import test1

#module_test/test2.py 
import module_test

import module_test

module_test.test1.running_test()

import 本质

import的过程简单来说分为三步:

1,根据package,module_name搜索环境变量的路径中是否有module同名的文件,当前路径和sys.path.

2,将搜索到的同名文件或是文件中指定部分代码块载入内存.

3,根据是否使用as判断,默认内存中加载的代码块赋值给module_name或是fun_name.如果使用了as将内存中的代码块赋值给as给出的变量名.

python标准库

官方文档点击

time

time模块主要是对时间的一些输出格式处理,需要了解unix时间戳(timestamp)和时区等相关概念.time模块中时间是默认以一个元组的方式存储(python称为struct_time)

asctime 将时间元组返回为一个英语系标准时间格式输出

print(time.asctime())

clock 进程所用cpu时间
ctime 将unix时间戳转换为英系标准格式时间, 默认当前时间

print(time.ctime(987654321))

gmtime 将一个时间戳返回一个时间信息的元组,默认当前时间

print(time.gmtime(987654321))

localtime() 将当前时区的unix时间戳返回为一个时间元组的格式,默认当前时间

print(time.localtime(987654321))

mktime 将时间元组以时间戳方式返回

a = time.gmtime()
print(time.mktime(a))

process_time 用于分析的进程时间：内核和用户空间CPU时间之和。

print(time.process_time())

sleep 执行指定秒数的延迟

print(time.sleep(3))

strftime 将一个时间元组返回为一个指定格式的字符模式

a = time.gmtime()
print(time.strftime("%Y-%m-%d %H:%M:%S",a))

strptime strftime的反操作,格式要一致

a = time.gmtime()
b = time.strftime("%Y-%m-%d %H:%M:%S",a)
print(time.strptime(b,"%Y-%m-%d %H:%M:%S"))

time 以浮点数的方式返回当前时间

补充记住3种时间格式转换方式

datetime

time模块的增强功能, 有date,datetime,time,timedelta,tzinfo,timezone几个子模块

datetime.now 当前时间

print(datetime.datetime.now())

timedelta 对时间进行数学运算

print(datetime.datetime.now()+datetime.timedelta(-10))
print(datetime.datetime.now()+datetime.timedelta(hours=-10))
print(datetime.datetime.now()+datetime.timedelta(minutes=-10))
print(datetime.datetime.now()+datetime.timedelta(seconds=-10))

os

对操作系统进行调用的接口,

os.getcwd()			|	获取当前工作目录
os.chdir()			|	移动工作目录
os.pardir()			|	获取当前父目录
os.makedirs()		|	递归生成多级目录
os.removedirs()		|	删除多级目录
os.mkdir()			|	生成单级目录
os.rmdir()			|	删除单级目录
os.listdir()		|	列出指定目录下文件和子目录,相当于ls -a
os.remove()			|	删除文件
os.rename()			|	重命名文件或目录
os.stat()			|	文件信息
os.sep				|	操作系统的路径分隔符,x = os.sep.encode()
os.linesep			|	查看换行符 x = os.linesep.encode()
os.pathsep			|	多个文件路径字符串之间分隔符
os.name				|	操作系统类别,win为nt,linux为posix
os.system()			|	执行一个操作系统允许执行的命令:os.system('pwd')
os.environ()		|	系统环境变量

用的比较多的是路径查询

abspath 输出文件绝对路径

print(os.path.abspath(__file__))

dirname 返回路径

print(os.path.dirname(__file__))

os.path.split() 	|	返回分割目录和文件名的元组
os.path.basename()	|	返回文件名
os.path.exists()	|	判断文件是否存在,返回True或者False
os.path.isabs()		|	判断绝对路径
os.path.isdir()		|	判断目录
os.path.isfile()	|	判断是否为文件
os.path.join()		|	组合多个路径
os.path.getatime()	|	返回文件atime,相应还有ctime,mtime
os.path.getsize()	|	获取文件大小

补充: 字符串前加r申明字符串不需要转义,例如''符号,通常用在正则和文件地址目录等
字符串前加u申明字符串编码为unicode,通常用在申明非英文字符串编码
字符串前加b申明字符串为bytes类型,python2中没有意义只是为兼容python3的写法:b'str' == bytes('str',encoding='utf8')

sys

python解释器相关操作

官方文档点击

常用的两个

sys.argv 获取python执行时的参数,默认为列表,第一个值为文件名
sys.exit 退出python解释器.终止程序的运行

sys.getdefaultencoding()	|	获取解释器默认编码
sys.path					|	获取解释器默认环境变量path中的路径
sys.platform				|	获取解释器的操作系统
sys.api_version				|	获取解释器的c的api版本
sys.base_exec_prefix		|	解释器执行程序所在目录
sys.builtin_module_names	|	内置模块
sys.exc_info()				|	获取当前进程的信息,此外还有exc_traceback,exc_value等
sys.modules					|	解释器中已导入的模块信息
sys.stderr					|	标准错误对象信息,此外还有stdin,stdout

stdin,stdout,std.err

sys.stdout.write('please')

x = sys.stdin.readline()
print(x)

random

random 取0-1之间的浮点数,uniform方法可以指定范围

import random
print(random.random())
print(random.uniform(1,10))

随机取两个整数之间的值

print(random.randint(1,100)) 	#取值范围包含1和100

随机取两个整数之间的值

print(random.randrange(0,100))	#取值范围包含0,不包含100

随机取一个序列(列表,元组,字符串)中的一个值,

list_test = ['a','b','c','d']
print(random.choice(list_test))	#choice方法返回值是字符,choices返回值是一个列表

随机去一个序列中的多个值

list_test = ['a','b','c','d']
print(random.sample(list_test,3))

随机打乱序列顺序

list_test = ['a','b','c','d']
random.shuffle(list_test)
print(list_test)

验证码程序

def check_code():
    import random
    checkcode=''
    for i in range(6):							#6位验证码位置循环
        current = random.randrange(0,6)			#随机字母和数字的条件码
        if current == i :						#如果位置码和条件码一致生成字母
        	tmp = chr(random.randint(65,90))
        else:									#负责生成0-9数字
            tmp = random.randint(0,9)
        checkcode += str(tmp)
    return checkcode

shutil,

对内置函数open更高级的扩展,处理文件对象操作,复制,压缩

copyfileobj 复制文件对象

f1 = open('a.txt','rb')
f2 = open('b.txt','wb')
shutil.copyfileobj(f1,f2)

copyfile 复制文件,调用的copyfileobj方法

shutil.copyfile('a.txt','c.txt')

copy 复制文件调用copyfile和copymode,复制文件和权限
copy2 复制文件调用copyfile和copystat,复制文件和信息

shutil.copy('a.txt','d.txt')

move 重命名

shutil.move('a.txt','e.txt')

chown 修改属主

shutil.chown('a.txt',user='sylar',group='sylar')

make_archive 压缩,调用的ZipFile 和TarFile两个模块

shutil.make_archive('test','tar',root_dir='./')

import tarfile
t = tarfile.open('test.tar','r')
t.extractall()
t.close()

copymode			|	复制文件权限
copystat 			|	复制文件信息
copytree			|	复制目录树
rmtree				|	删除目录树
disk_usage()		|	获取磁盘空间
get_terminal_size() |	终端大小,

json 和pickle模块 shelve

1, json和pickle

因为在网络传输中,python3是默认是用bytes类型传输数据,而在文件中是使用str类型来保存数据,所以数据需要在bytes和str类型中进行转换.json是可跨语言使用的数据交互或序列化的数据标准,pickle是python自有的数据交互和序列化的数据标准.这两个模块的功能:

dump load 对数据进行处理并写如文件

d1 = {'name':'sylar',
      'age':18,}
with open('test.txt','wb') as f1:
    pickle.dump(d1,f1)

with open('test.txt','rb') as f2:
    data = pickle.load(f2)
    print(data)

dumps loads 只对数据进行处理

d1 = {'name':'sylar',
      'age':18,}
print(type(d1))
dump_obj = pickle.dumps(d1)
print(type(dump_obj))
load_obj = pickle.loads(dump_obj)
print(type(load_obj))

2, shelve模块
pickle对一个文件无论dumps多次只能loads一次,如果需要记录文件的多次dumps可以使用shelve模块.使用方法和open差不多

d1 = {'name':'sylar',
      'age':18,}

with shelve.open('test.shelve') as f:
    f['key'] = d1

with shelve.open('test.shelve') as f:
    for k,v in f.items():
        print(k,v)

3, 其它文档处理xml,pyyaml,confgparser

hashlib,hmac

用于加密相关的操作,python3里面hashlib整合了md5和sha模块.hmac多重加密

import hashlib
text = b'hello,world'
t1 = hashlib.md5(text)
t = hashlib.md5()
t.update(text)
print('二进制:',t.digest())
print('16进制:',t.hexdigest())
print('16进制:',t1.hexdigest())

import hmac
text = 'abcd'.encode(encoding='utf8')
text_cn = bytes('我是中文',encoding='utf8')
t = hmac.new(text,text_cn)
print('二进制:',t.digest())
print('16进制:',t.hexdigest())

re正则表达式

1,特殊匹配模式

re.I	|	re.IGNORECASE  启用后不区分大小写
re.M	|	re.MULTILINE 启用后匹配行首和行尾,不启用以字符串计算起始和结束位置
re.S	|	re.DOTALL	启用后不忽略换行符

2, 正则表达式匹配符:

.		|	默认匹配
之外任意一个字符,指定flag DOTALL 则可以匹配换行符
^		|	定位至字符串起始位置.指定flag MULTILINE,则定位行首
$		|	定位至字符串结束位置.指定flag MULTILINE,则定位行尾
*		|	匹配*号前字符任意次
+		|	匹配前一个字符多次
?		|	匹配前一个字符最多1次
{m}		|	匹配前一个条件m次
{n,m}	|	匹配前一个条件n到m次
|		|	匹配左右条件
()		|	配合{m}等使用组成一个条件
[a-z]	|	匹配指定范围小写字母
[A-Z]	|	匹配指定范围大写字母
[0-9]	|	匹配指定范围数字
A		|	从字符起始位置匹配 同^
		|	匹配字符结束 同$
d		| 	匹配数字
D		|	非数字
w		|	匹配[0-9a-zA-Z]数字和字母
W		|	非数字和字母,即特殊符号
s		|	匹配空白字符,换行等 	, 
, 
(?P<name>)	|	分组匹配,将结果以name为key存储为字典

3, 常用方法

re.match() 从字符串起始位置开始匹配,也就是说默认是带了^,下面第二例子就不能匹配

import re
text = 'abcdddddcba0022abc'

res = re.match('a.+d',text)
print(res.group())

res = re.match('cbad+',text)

re.search() 在整个字符串中匹配一次上面错误的例子就可以使用search进行匹配

res = re.search('cbad+',text)
print(res.group())

re.findall 匹配字符串中所有符合条件的字符,不需要group方法来返回输出

#res = re.findall('[a-z]{3}',text) 这两种结果是不一致的.
res = re.findall('[a-z]{1,3}',text)
print(res)

split() 按照匹配规则分割字符串

res = re.split('c',text)
print(res)

sub() 替换

res = re.sub('c','_',text,count=2)
print(res)

groupdict和分组匹配

id_card = '510703200004010517'
id_info = re.search("(?P<city>[0-9]{6})(?P<birth_year>[0-9]{4})(?P<birth_day>[0-9]{4})",id_card).groupdict()
print(id_info)
#返回结果为:{'city': '510703', 'birth_year': '2000', 'birth_day': '0401'}