Python成长之路（常用模块学习）

Python 拥有很多很强大的模块主要写一下常用的几个吧

大概就是这些内容了

模块介绍
time &datetime模块
random
os
sys
shutil
json & picle
shelve
xml处理
yaml处理
configparser
hashlib
subprocess
logging模块
re正则表达式

1.模块简介

模块，就是用一段代码实现了某个功能的代码集合。

类似于函数式编程和面向过程编程，函数式编程则完成一个功能，其他代码用来调用即可，提供了代码的重用性和代码间的耦合。而对于一个复杂的功能来，可能需要多个函数才能完成（函数又可以在不同的.py文件中），n个 .py 文件组成的代码集合就称为模块。

如：os 是系统相关的模块；file是文件操作相关的模块

模块分为三种：

自定义模块
内置标准模块（又称标准库）
开源模块

自定义模块和开源模块的使用参考 http://www.cnblogs.com/wupeiqi/articles/4963027.html

2.time & datetime

 1 #_*_coding:utf-8_*_
 2 __author__ = 'Alex Li'
 3 
 4 import time
 5 
 6 
 7 # print(time.clock()) #返回处理器时间,3.3开始已废弃 , 改成了time.process_time()测量处理器运算时间,不包括sleep时间,不稳定,mac上测不出来
 8 # print(time.altzone)  #返回与utc时间的时间差,以秒计算
 9 # print(time.asctime()) #返回时间格式"Fri Aug 19 11:14:16 2016",
10 # print(time.localtime()) #返回本地时间 的struct time对象格式
11 # print(time.gmtime(time.time()-800000)) #返回utc时间的struc时间对象格式
12 
13 # print(time.asctime(time.localtime())) #返回时间格式"Fri Aug 19 11:14:16 2016",
14 #print(time.ctime()) #返回Fri Aug 19 12:38:29 2016 格式, 同上
15 
16 
17 
18 # 日期字符串 转成  时间戳
19 # string_2_struct = time.strptime("2016/05/22","%Y/%m/%d") #将 日期字符串 转成 struct时间对象格式
20 # print(string_2_struct)
21 # #
22 # struct_2_stamp = time.mktime(string_2_struct) #将struct时间对象转成时间戳
23 # print(struct_2_stamp)
24 
25 
26 
27 #将时间戳转为字符串格式
28 # print(time.gmtime(time.time()-86640)) #将utc时间戳转换成struct_time格式
29 # print(time.strftime("%Y-%m-%d %H:%M:%S",time.gmtime()) ) #将utc struct_time格式转成指定的字符串格式
30 
31 
32 
33 
34 
35 #时间加减
36 import datetime
37 
38 # print(datetime.datetime.now()) #返回 2016-08-19 12:47:03.941925
39 #print(datetime.date.fromtimestamp(time.time()) )  # 时间戳直接转成日期格式 2016-08-19
40 # print(datetime.datetime.now() )
41 # print(datetime.datetime.now() + datetime.timedelta(3)) #当前时间+3天
42 # print(datetime.datetime.now() + datetime.timedelta(-3)) #当前时间-3天
43 # print(datetime.datetime.now() + datetime.timedelta(hours=3)) #当前时间+3小时
44 # print(datetime.datetime.now() + datetime.timedelta(minutes=30)) #当前时间+30分
45 
46 
47 #
48 # c_time  = datetime.datetime.now()
49 # print(c_time.replace(minute=3,hour=2)) #时间替换

Directive	Meaning	Notes
`%a`	Locale’s abbreviated weekday name.
`%A`	Locale’s full weekday name.
`%b`	Locale’s abbreviated month name.
`%B`	Locale’s full month name.
`%c`	Locale’s appropriate date and time representation.
`%d`	Day of the month as a decimal number [01,31].
`%H`	Hour (24-hour clock) as a decimal number [00,23].
`%I`	Hour (12-hour clock) as a decimal number [01,12].
`%j`	Day of the year as a decimal number [001,366].
`%m`	Month as a decimal number [01,12].
`%M`	Minute as a decimal number [00,59].
`%p`	Locale’s equivalent of either AM or PM.	(1)
`%S`	Second as a decimal number [00,61].	(2)
`%U`	Week number of the year (Sunday as the first day of the week) as a decimal number [00,53]. All days in a new year preceding the first Sunday are considered to be in week 0.	(3)
`%w`	Weekday as a decimal number [0(Sunday),6].
`%W`	Week number of the year (Monday as the first day of the week) as a decimal number [00,53]. All days in a new year preceding the first Monday are considered to be in week 0.	(3)
`%x`	Locale’s appropriate date representation.
`%X`	Locale’s appropriate time representation.
`%y`	Year without century as a decimal number [00,99].
`%Y`	Year with century as a decimal number.
`%z`	Time zone offset indicating a positive or negative time difference from UTC/GMT of the form +HHMM or -HHMM, where H represents decimal hour digits and M represents decimal minute digits [-23:59, +23:59].
`%Z`	Time zone name (no characters if no time zone exists).
`%%`	A literal `'%'` character.

3.random模块（随机数模块）

1 import random
2 print(random.random())   
3 print(random.randint(1,2)) #在1-2之间取一个随机数字
4 print(random.randrange(1,10)) #在1-10之间取一个随机数字

高逼格应用（生成随机验证码）

 1 import random
 2 checkcode = ''
 3 for i in range(4):
 4     current = random.randrange(0,4)
 5     if current != i:
 6         temp = chr(random.randint(65,90))
 7     else:
 8         temp = random.randint(0,9)
 9         checkcode += str(temp)
10 print (checkcode)

4.os模块***【灰常重要】

提供对操作系统进行调用的接口

 1 os.getcwd() 获取当前工作目录，即当前python脚本工作的目录路径
 2 os.chdir("dirname")  改变当前脚本工作目录；相当于shell下cd
 3 os.curdir  返回当前目录: ('.')
 4 os.pardir  获取当前目录的父目录字符串名：('..')
 5 os.makedirs('dirname1/dirname2')    可生成多层递归目录
 6 os.removedirs('dirname1')    若目录为空，则删除，并递归到上一级目录，如若也为空，则删除，依此类推
 7 os.mkdir('dirname')    生成单级目录；相当于shell中mkdir dirname
 8 os.rmdir('dirname')    删除单级空目录，若目录不为空则无法删除，报错；相当于shell中rmdir dirname
 9 os.listdir('dirname')    列出指定目录下的所有文件和子目录，包括隐藏文件，并以列表方式打印
10 os.remove()  删除一个文件
11 os.rename("oldname","newname")  重命名文件/目录
12 os.stat('path/filename')  获取文件/目录信息
13 os.sep    输出操作系统特定的路径分隔符，win下为"\",Linux下为"/"
14 os.linesep    输出当前平台使用的行终止符，win下为"	
",Linux下为"
"
15 os.pathsep    输出用于分割文件路径的字符串
16 os.name    输出字符串指示当前使用平台。win->'nt'; Linux->'posix'
17 os.system("bash command")  运行shell命令，直接显示
18 os.environ  获取系统环境变量
19 os.path.abspath(path)  返回path规范化的绝对路径
20 os.path.split(path)  将path分割成目录和文件名二元组返回
21 os.path.dirname(path)  返回path的目录。其实就是os.path.split(path)的第一个元素
22 os.path.basename(path)  返回path最后的文件名。如何path以／或结尾，那么就会返回空值。即os.path.split(path)的第二个元素
23 os.path.exists(path)  如果path存在，返回True；如果path不存在，返回False
24 os.path.isabs(path)  如果path是绝对路径，返回True
25 os.path.isfile(path)  如果path是一个存在的文件，返回True。否则返回False
26 os.path.isdir(path)  如果path是一个存在的目录，则返回True。否则返回False
27 os.path.join(path1[, path2[, ...]])  将多个路径组合后返回，第一个绝对路径之前的参数将被忽略
28 os.path.getatime(path)  返回path所指向的文件或者目录的最后存取时间
29 os.path.getmtime(path)  返回path所指向的文件或者目录的最后修改时间

5.sys模块

1 sys.argv           命令行参数List，第一个元素是程序本身路径
2 sys.exit(n)        退出程序，正常退出时exit(0)
3 sys.version        获取Python解释程序的版本信息
4 sys.maxint         最大的Int值
5 sys.path           返回模块的搜索路径，初始化时使用PYTHONPATH环境变量的值
6 sys.platform       返回操作系统平台名称
7 sys.stdout.write('please:')
8 val = sys.stdin.readline()[:-1]

6.json 和 pickle

7.shelve 模块模块

shelve模块是一个简单的k,v将内存数据通过文件持久化的模块，可以持久化任何pickle可支持的python数据格式

 1 import shelve
 2 d = shelve.open('shelve_test') #打开一个文件 
 3 class Test(object):
 4     def __init__(self,n):
 5         self.n = n
 6 t = Test(123)  
 7 t2 = Test(123334)
 8 name = ["alex","rain","test"] 
 9 d["test"] = name #持久化列表
10 d["t1"] = t      #持久化类
11 d["t2"] = t2
12 d.close()

8.xml处理模块

xml是实现不同语言或程序之间进行数据交换的协议，跟json差不多，但json使用起来更简单，不过，古时候，在json还没诞生的黑暗年代，大家只能选择用xml呀，至今很多传统公司如金融行业的很多系统的接口还主要是xml。

xml的格式如下，就是通过<>节点来区别数据结构的:

 1 <?xml version="1.0"?>
 2 <data>
 3     <country name="Liechtenstein">
 4         <rank updated="yes">2</rank>
 5         <year>2008</year>
 6         <gdppc>141100</gdppc>
 7         <neighbor name="Austria" direction="E"/>
 8         <neighbor name="Switzerland" direction="W"/>
 9     </country>
10     <country name="Singapore">
11         <rank updated="yes">5</rank>
12         <year>2011</year>
13         <gdppc>59900</gdppc>
14         <neighbor name="Malaysia" direction="N"/>
15     </country>
16     <country name="Panama">
17         <rank updated="yes">69</rank>
18         <year>2011</year>
19         <gdppc>13600</gdppc>
20         <neighbor name="Costa Rica" direction="W"/>
21         <neighbor name="Colombia" direction="E"/>
22     </country>
23 </data>

xml协议在各个语言里的都是支持的，在python中可以用以下模块操作xml

 1 import xml.etree.ElementTree as ET
 2 tree = ET.parse("xmltest.xml")
 3 root = tree.getroot()
 4 print(root.tag)
 5 #遍历xml文档
 6 for child in root:
 7     print(child.tag, child.attrib)
 8     for i in child:
 9         print(i.tag,i.text)
10 #只遍历year 节点
11 for node in root.iter('year'):
12     print(node.tag,node.text)

修改和删除xml文档内容

 1 import xml.etree.ElementTree as ET
 2 tree = ET.parse("xmltest.xml")
 3 root = tree.getroot()
 4  
 5 #修改
 6 for node in root.iter('year'):
 7     new_year = int(node.text) + 1
 8     node.text = str(new_year)
 9     node.set("updated","yes")
10 tree.write("xmltest.xml")
11 
12 #删除node
13 for country in root.findall('country'):
14    rank = int(country.find('rank').text)
15    if rank > 50:
16      root.remove(country)
17 tree.write('output.xml')

自己创建xml文档

 1 import xml.etree.ElementTree as ET
 2 new_xml = ET.Element("namelist")
 3 name = ET.SubElement(new_xml,"name",attrib={"enrolled":"yes"})
 4 age = ET.SubElement(name,"age",attrib={"checked":"no"})
 5 sex = ET.SubElement(name,"sex")
 6 sex.text = '33'
 7 name2 = ET.SubElement(new_xml,"name",attrib={"enrolled":"no"})
 8 age = ET.SubElement(name2,"age")
 9 age.text = '19'
10 et = ET.ElementTree(new_xml) #生成文档对象
11 et.write("test.xml", encoding="utf-8",xml_declaration=True)
12 ET.dump(new_xml) #打印生成的格式

9.ConfigParser模块(配置文件操作)

用于生成和修改常见配置文档，当前模块的名称在 python 3.x 版本中变更为 configparser。

来看一个好多软件的常见文档格式如下

 1 [DEFAULT]
 2 ServerAliveInterval = 45
 3 Compression = yes
 4 CompressionLevel = 9
 5 ForwardX11 = yes
 6 
 7 [bitbucket.org]
 8 User = hg
 9 
10 [topsecret.server.com]
11 Port = 50022
12 ForwardX11 = no

如何使用configparser模块生成一个配置文件文档

 1 import configparser
 2 config = configparser.ConfigParser()
 3 config["DEFAULT"] = {'ServerAliveInterval': '45',
 4                       'Compression': 'yes',
 5                      'CompressionLevel': '9'}
 6 config['bitbucket.org'] = {}
 7 config['bitbucket.org']['User'] = 'hg'
 8 config['topsecret.server.com'] = {}
 9 topsecret = config['topsecret.server.com']
10 topsecret['Host Port'] = '50022'     # mutates the parser
11 topsecret['ForwardX11'] = 'no'  # same here
12 config['DEFAULT']['ForwardX11'] = 'yes'
13 with open('example.ini', 'w') as configfile:
14    config.write(configfile)

读配置文件

 1 >>> import configparser
 2 >>> config = configparser.ConfigParser()
 3 >>> config.sections()
 4 []
 5 >>> config.read('example.ini')
 6 ['example.ini']
 7 >>> config.sections()
 8 ['bitbucket.org', 'topsecret.server.com']
 9 >>> 'bitbucket.org' in config
10 True
11 >>> 'bytebong.com' in config
12 False
13 >>> config['bitbucket.org']['User']
14 'hg'
15 >>> config['DEFAULT']['Compression']
16 'yes'
17 >>> topsecret = config['topsecret.server.com']
18 >>> topsecret['ForwardX11']
19 'no'
20 >>> topsecret['Port']
21 '50022'
22 >>> for key in config['bitbucket.org']: print(key)
23 ...
24 user
25 compressionlevel
26 serveraliveinterval
27 compression
28 forwardx11
29 >>> config['bitbucket.org']['ForwardX11']
30 'yes'

configparser增删改查语法

 1 [section1]
 2 k1 = v1
 3 k2:v2
 4 [section2]
 5 k1 = v1
 6  
 7 import ConfigParser
 8 config = ConfigParser.ConfigParser()
 9 config.read('i.cfg')
10   
11 # ########## 读 ##########
12 #secs = config.sections()
13 #print secs
14 #options = config.options('group2')
15 #print options
16 #item_list = config.items('group2')
17 #print item_list
18 #val = config.get('group1','key')
19 #val = config.getint('group1','key')
20 
21 # ########## 改写 ##########
22 #sec = config.remove_section('group1')
23 #config.write(open('i.cfg', "w"))
24 #sec = config.has_section('wupeiqi')
25 #sec = config.add_section('wupeiqi')
26 #config.write(open('i.cfg', "w"))
27 #config.set('group2','k1',11111)
28 #config.write(open('i.cfg', "w"))
29 #config.remove_option('group2','age')
30 #config.write(open('i.cfg', "w"))

hashlib模块

用于加密相关的操作，3.x里代替了md5模块和sha模块，主要提供 SHA1, SHA224, SHA256, SHA384, SHA512 ，MD5 算法

import hashlib
m = hashlib.md5()
m.update(b"Hello")
m.update(b"It's me")
print(m.digest())
m.update(b"It's been a long time since last time we ...")
print(m.digest()) #2进制格式hash
print(len(m.hexdigest())) #16进制格式hash
'''
def digest(self, *args, **kwargs): # real signature unknown
    """ Return the digest value as a string of binary data. """
    pass
def hexdigest(self, *args, **kwargs): # real signature unknown
    """ Return the digest value as a string of hexadecimal digits. """
    pass
'''
import hashlib

# ######## md5 ########
hash = hashlib.md5()
hash.update('admin')
print(hash.hexdigest())

# ######## sha1 ########
hash = hashlib.sha1()
hash.update('admin')
print(hash.hexdigest())

# ######## sha256 ########
hash = hashlib.sha256()
hash.update('admin')
print(hash.hexdigest())

# ######## sha384 ########
hash = hashlib.sha384()
hash.update('admin')
print(hash.hexdigest())

# ######## sha512 ########
hash = hashlib.sha512()
hash.update('admin')
print(hash.hexdigest())

python 还有一个 hmac 模块，它内部对我们创建 key 和内容再进行处理然后再加密

散列消息鉴别码，简称HMAC，是一种基于消息鉴别码MAC（Message Authentication Code）的鉴别机制。使用HMAC时,消息通讯的双方，通过验证消息中加入的鉴别密钥K来鉴别消息的真伪；

一般用于网络通信中消息加密，前提是双方先要约定好key,就像接头暗号一样，然后消息发送把用key把消息加密，接收方用key ＋消息明文再加密，拿加密后的值跟发送者的相对比是否相等，这样就能验证消息的真实性，及发送者的合法性了。

1 import hmac
2 h = hmac.new(b'天王盖地虎', b'宝塔镇河妖')
3 print h.hexdigest()

更多关于md5,sha1,sha256等介绍的文章看这里https://www.tbs-certificates.co.uk/FAQ/en/sha256.html

Subprocess模块

常用subprocess方法示例

#执行命令，返回命令执行状态， 0 or 非0 >>> retcode = subprocess.call(["ls", "-l"])

#执行命令，如果命令结果为0，就正常返回，否则抛异常 >>> subprocess.check_call(["ls", "-l"]) 0

#接收字符串格式命令，返回元组形式，第1个元素是执行状态，第2个是命令结果 >>> subprocess.getstatusoutput('ls /bin/ls') (0, '/bin/ls')

#接收字符串格式命令，并返回结果 >>> subprocess.getoutput('ls /bin/ls') '/bin/ls'

#执行命令，并返回结果，注意是返回结果，不是打印，下例结果返回给res >>> res=subprocess.check_output(['ls','-l']) >>> res b'total 0 drwxr-xr-x 12 alex staff 408 Nov 2 11:05 OldBoyCRM '

#上面那些方法，底层都是封装的subprocess.Popen poll() Check if child process has terminated. Returns returncode

wait() Wait for child process to terminate. Returns returncode attribute.

terminate() 杀掉所启动进程 communicate() 等待任务结束

stdin 标准输入
stdout 标准输出
stderr 标准错误

pid The process ID of the child process.

#例子

>>> p = subprocess.Popen("df -h|grep disk",stdin=subprocess.PIPE,stdout=subprocess.PIPE,shell=True)
>>> p.stdout.read()
b'/dev/disk1 465Gi 64Gi 400Gi 14% 16901472 104938142 14% / '

1 >>> subprocess.run(["ls", "-l"])  # doesn't capture output
2 CompletedProcess(args=['ls', '-l'], returncode=0)
3 >>> subprocess.run("exit 1", shell=True, check=True)
4 Traceback (most recent call last):
5   ...
6 subprocess.CalledProcessError: Command 'exit 1' returned non-zero exit status 1
7 >>> subprocess.run(["ls", "-l", "/dev/null"], stdout=subprocess.PIPE)
8 CompletedProcess(args=['ls', '-l', '/dev/null'], returncode=0,
9 stdout=b'crw-rw-rw- 1 root root 1, 3 Jan 23 16:23 /dev/null
')