Serialization, Common Modules, and Object-Oriented Basics

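I. JSON Serialization

1. Why use JSON: exchanging in-memory data between different programs or different languages

When programs, possibly written in different languages, exchange data, the data has to travel as a string (or bytes) in a format both sides understand. Most real data is structured, for example a Python dictionary, and an in-memory Python dict means nothing to a program written in another language. An intermediate format is therefore needed to translate between the two sides. JSON is such a format, much like the older XML, and converting in-memory data into it is called serialization.

After JSON serialization, any supported data becomes a string, which can be sent over a network (a network carries only bytes, so the string is encoded before it is sent). Every programming language has strings.

Writing data to disk (to a file) likewise requires a string or bytes, so JSON-serialized data can be written straight to a file.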

dumps

import json
name = {'name':'Charles'}
f = open('data_to_qq','wb')
name_after_transfer = json.dumps(name)    #dumps takes only the object and returns a string
print type(name_after_transfer)
f.write(name_after_transfer)
f.close()

E:\python\python.exe E:/python_scripts/11S_06day/json_file.py
<type 'str'>      #json.dumps returns a string

File: data_to_qq
{"name": "Charles"}

f = open('data_to_qq','rb')
import json
name = json.loads(f.read())
f.close()
print name
print name['name']


E:\python\python.exe E:/python_scripts/11S_06day/qq_app.py
{u'name': u'Charles'}
Charles

loads

Limitation of JSON: out of the box it cannot serialize more complex Python objects, such as the datetime object returned by datetime.datetime.now().
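One common way around this limitation is the default argument of json.dumps, a callback that is invoked for any object the encoder does not know. The following is a minimal sketch written for Python 3; the helper name encode_extra and the sample record are made up for illustration, and storing the datetime as an ISO 8601 string is just one possible convention.

```python
import datetime
import json


def encode_extra(obj):
    """Fallback called by json.dumps for objects it cannot serialize itself."""
    if isinstance(obj, datetime.datetime):
        return obj.isoformat()
    raise TypeError("not JSON serializable: %r" % (obj,))


record = {"name": "Charles", "created": datetime.datetime(2016, 1, 17, 10, 2, 41)}

# Without a fallback, the datetime value makes json.dumps raise TypeError.
try:
    json.dumps(record)
    plain_dumps_failed = False
except TypeError:
    plain_dumps_failed = True

# With the fallback, the datetime is written as a string.
encoded = json.dumps(record, default=encode_extra, sort_keys=True)
decoded = json.loads(encoded)

print(plain_dumps_failed)
print(encoded)
```

Note that the value comes back from json.loads as a plain string; turning it into a datetime again is up to the receiving side.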

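2. The difference between dump/load and dumps/loads:

dump serializes the data and writes it directly to a file object, whereas dumps returns the serialized string, which you then write yourself with f.write(). load and loads mirror this: load reads from a file object, loads parses a string.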

import json
name = {'name':'Charles'}
with open('data_to_qq','wb') as f:
    json.dump(name,f)    #serialize and write in one step

with open('data_to_qq','wb') as f:
    name_after_transfer = json.dumps(name)    #serialize to a string first
    f.write(name_after_transfer)


print json.dumps(name)
print type(json.dumps(name))

dump&dumps

import json
with open('data_to_qq','rb') as f:
    name = json.loads(f.read())
print name
print name['name']

import json
with open('data_to_qq','rb') as f:
    name = json.load(f)
print name
print name['name']

load&loads

3. pickle serialization

pickle is specific to Python and can serialize almost any Python data type.
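To make the contrast with JSON concrete, here is a small round trip through pickle, written for Python 3. The sample dictionary is invented for illustration; it deliberately contains a datetime, a set, and a tuple, none of which JSON preserves as such.

```python
import datetime
import pickle

data = {
    "name": "Charles",
    "created": datetime.datetime(2016, 1, 17, 10, 2, 41),
    "tags": {"admin", "dev"},   # a set
    "point": (1, 2),            # a tuple
}

blob = pickle.dumps(data)       # bytes, suitable for a file opened in 'wb' mode
restored = pickle.loads(blob)   # an equal but separate object

print(type(blob))
print(restored == data)
```

pickle.dump and pickle.load work with file objects in the same way as json.dump and json.load. Because unpickling can execute arbitrary code, only load pickles that come from a source you trust.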

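II. The subprocess module

import subprocess
cmd = subprocess.check_output(["dir"],shell=True)   #like call, but returns the command's output and raises CalledProcessError on a non-zero exit status, whereas call only returns the status
cmd_res = subprocess.call(["dir"],shell=True)  #similar to os.system(); returns the exit status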

####################### subprocess.call############################

>>> res = subprocess.call("ls -l",shell=True) #shell=True runs the command through the shell, so the whole command line is passed as a single string
total 13296
-rw-r--r-- 1 root root 4 Dec  5 06:07 456
>>> print res
0

>>> res = subprocess.call(["ls", "-l"],shell=False) #shell=False: the program and its arguments are passed as a list and no shell is involved
total 13296
-rw-r--r-- 1 root root 4 Dec  5 06:07 456
>>> print res
0

#######################subprocess.check_call######################

>>> subprocess.check_call(["ls","-l"])
total 13296
-rw-r--r-- 1 root root 4 Dec  5 06:07 456

0
>>> subprocess.check_call("exit 1",shell=True) #runs the command; returns 0 if the exit status is 0, otherwise raises CalledProcessError
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python2.7/subprocess.py", line 511, in check_call
    raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command 'exit 1' returned non-zero exit status 1

>>> subprocess.check_call("exit 0",shell=True)
0
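The differences between call, check_call, and check_output can be seen side by side in a portable sketch. It is written for Python 3 and, as an assumption made only to avoid depending on ls or dir, runs the current Python interpreter (sys.executable) as the child process.

```python
import subprocess
import sys

# Child commands: one succeeds and prints, the other exits with status 3.
ok_cmd = [sys.executable, "-c", "print('hello')"]
bad_cmd = [sys.executable, "-c", "import sys; sys.exit(3)"]

# check_output returns what the command wrote to stdout (bytes on Python 3).
output = subprocess.check_output(ok_cmd)

# call never raises for a failing command; it just returns the exit status.
status = subprocess.call(bad_cmd)

# check_call raises CalledProcessError when the exit status is non-zero.
try:
    subprocess.check_call(bad_cmd)
    raised_code = None
except subprocess.CalledProcessError as exc:
    raised_code = exc.returncode

print(output, status, raised_code)
```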

#######################subprocess.Popen############################

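Commands typed at a terminal fall into two categories:

1. Commands that print their output immediately, such as ifconfig

2. Commands that enter an interactive environment (such as python) and produce output only from within it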

import subprocess

obj = subprocess.Popen(["python"], stdin=subprocess.PIPE, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
obj.stdin.write('print 1 \n')
obj.stdin.write('print 2 \n')
obj.stdin.write('print 3 \n')
obj.stdin.write('print 4 \n')
obj.stdin.close()

cmd_out = obj.stdout.read()
obj.stdout.close()
cmd_error = obj.stderr.read()
obj.stderr.close()

print cmd_out
print cmd_error

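import subprocess

obj = subprocess.Popen(["python"], stdin=subprocess.PIPE, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
out_error_list = obj.communicate() #communicate() waits for the process to finish and returns a (stdout, stderr) tuple
print out_error_list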

import subprocess

obj = subprocess.Popen(["python"], stdin=subprocess.PIPE, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
out_error_list = obj.communicate('print "hello"')   #communicate() can also send input to the process's stdin
print out_error_list

>>> obj = subprocess.Popen(["python"],stdin = subprocess.PIPE,stdout=subprocess.PIPE,stderr=subprocess.PIPE)

>>> obj.stdin.write('print 1 \n')
>>> obj.stdin.write('print 2 \n')
>>> obj.stdin.write('print 3 \n')
>>> obj.stdin.write('print 4 \n')
>>> obj.stdin.close()

>>> cmd_out = obj.stdout.read()
>>> obj.stdout.close()
>>> cmd_error = obj.stderr.read()
>>> obj.stderr.close()

>>> print cmd_out
1
2
3
4
>>> print cmd_error


>>> obj = subprocess.Popen(["python"],stdin = subprocess.PIPE,stdout=subprocess.PIPE,stderr=subprocess.PIPE)
>>> out_error_list = obj.communicate('print "hello"')
>>> print out_error_list
('hello\n', '')


III. The shutil module

A high-level module for working with files, directories, and archives.

shutil handles archives by calling the zipfile and tarfile modules.
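A short sketch of typical shutil operations follows (Python 3). Everything happens inside a temporary directory so nothing on disk is left behind; the file and directory names (a.txt, b.txt, backup, src_archive) are placeholders chosen for this example.

```python
import os
import shutil
import tempfile

work_dir = tempfile.mkdtemp()
try:
    src_dir = os.path.join(work_dir, "src")
    os.mkdir(src_dir)
    with open(os.path.join(src_dir, "a.txt"), "w") as f:
        f.write("hello")

    # Copy a single file, then copy the whole directory tree.
    shutil.copy(os.path.join(src_dir, "a.txt"), os.path.join(src_dir, "b.txt"))
    backup_dir = os.path.join(work_dir, "backup")
    shutil.copytree(src_dir, backup_dir)

    # Pack the directory into src_archive.zip; "gztar" would produce a .tar.gz instead.
    archive_path = shutil.make_archive(
        os.path.join(work_dir, "src_archive"), "zip", root_dir=src_dir)

    copied_files = sorted(os.listdir(backup_dir))
    archive_exists = os.path.exists(archive_path)
finally:
    # Remove the temporary directory and everything in it.
    shutil.rmtree(work_dir)

print(copied_files, archive_exists)
```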

IV. sys

import sys
import time


def view_bar(num, total):
    rate = float(num) / float(total)
    rate_num = int(rate * 100)
    r = '\r' + '#' * rate_num + '-->' + '%d%%' % (rate_num, )
    sys.stdout.write(r)
    sys.stdout.flush()


if __name__ == '__main__':
    for i in range(0, 100):
        time.sleep(0.1)
        view_bar(i, 100)

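IV. Date and time modules

There are three ways to represent a time:

import time
print time.time()    #timestamp: seconds since January 1, 1970
print time.strftime("%Y-%m-%d")    #formatted string
print time.localtime()    #struct_time: a structure with year, month, day, weekday, etc.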

E:\python\python.exe E:/python_scripts/11S_06day/time_file.py
1452996161.57
2016-01-17
time.struct_time(tm_year=2016, tm_mon=1, tm_mday=17, tm_hour=10, tm_min=2, tm_sec=41, tm_wday=6, tm_yday=17, tm_isdst=0)

import time
print time.time()  #print the timestamp

import datetime
print datetime.datetime.now()

E:\python\python.exe E:/python_scripts/11S_06day/time_file.py
1452695581.91
2016-01-13 22:33:01.911000

#############################
Format the time as a string; the format can be customized
print time.strftime("%Y-%m-%d %H-%S")
2016-01-13 22-51
#############################
Parse a string into a time structure
t = time.strptime("2015-09-19","%Y-%m-%d")

############################
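Date arithmetic (timedelta accepts weeks, days, hours, minutes, seconds, and smaller units, but not months or years)
print datetime.datetime.now() - datetime.timedelta(days=3)#same effect as the next line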
print datetime.datetime.now() + datetime.timedelta(days=-3)
print datetime.datetime.now() - datetime.timedelta(hours=3)
print datetime.datetime.now() - datetime.timedelta(minutes=3)


V. The logging module

CRITICAL = 50                      #log levels
FATAL = CRITICAL
ERROR = 40
WARNING = 30
WARN = WARNING
INFO = 20
DEBUG = 10
NOTSET = 0

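The logging module makes it convenient to record logs and is thread-safe (several threads can write log records at the same time without corrupting the output).

By default the logging module only handles messages at WARNING level and above.

Log levels, in increasing severity: DEBUG-->INFO-->WARNING-->ERROR-->CRITICAL

By default, log messages are printed to the screen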

import logging
logging.info('So should this')    #not shown: INFO is below the default WARNING level
logging.critical("This is critical message...")
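The effect of the default level can be checked without touching the root logger's configuration. This sketch (Python 3) attaches a handler that writes into an in-memory buffer; the logger name level_demo is arbitrary, and the result assumes the root logger still has its default WARNING level.

```python
import io
import logging

stream = io.StringIO()
handler = logging.StreamHandler(stream)
handler.setFormatter(logging.Formatter("%(levelname)s:%(message)s"))

logger = logging.getLogger("level_demo")   # arbitrary name for this example
logger.propagate = False                   # keep records away from the root handlers
logger.addHandler(handler)

# No level is set on this logger, so it inherits the root logger's level.
logger.info("dropped")
logger.warning("kept")
default_output = stream.getvalue()

# After lowering the threshold, INFO records get through as well.
logger.setLevel(logging.INFO)
logger.info("now kept")
final_output = stream.getvalue()

print(final_output)
```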

Write logs to a file; the log level can be specified

import logging
logging.basicConfig(filename='example.log',level=logging.INFO)
logging.info('So should this')
logging.critical("This is critical message...")

Add a timestamp to each log record

import logging
logging.basicConfig(format='%(asctime)s %(message)s',datefmt='%m/%d/%Y %I:%M:%S %p')
logging.info('So should this')    #not shown unless level=logging.INFO is also passed to basicConfig
logging.critical("This is critical message...")

Print logs to the screen and write them to a file at the same time

import logging
logger = logging.getLogger('TEST_LOG')
logger.setLevel(logging.DEBUG)

ch = logging.StreamHandler()
ch.setLevel(logging.DEBUG)

fh = logging.FileHandler("access.log")
fh.setLevel(logging.WARNING)

formatter = logging.Formatter('%(asctime)s-%(name)s-%(levelname)s-%(message)s')

ch.setFormatter(formatter)
fh.setFormatter(formatter)

logger.addHandler(ch)
logger.addHandler(fh)

logger.debug('debug message...')
logger.info('info message...')
logger.warning('warn message...')
logger.error('error message...')
logger.critical('critical message...')

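VI. The re module

###################
match: matches from the start of the string
>>> import re
>>> re.match(r"\d","abc123def") #no match: returns None, so nothing is printed
>>> re.match(".","abc123def")  #on a match, a match object is returned
<_sre.SRE_Match object at 0x0047EA30>
>>> re.match(".","abc123def").group()  #group() returns the matched text
'a'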

###################
search: looks for a match anywhere in the string
>>> re.search(r"\d","abc123def")  #\d matches a digit
<_sre.SRE_Match object at 0x0047EA30>
>>> re.search(r"\d","abc123def").group()
'1'

>>> re.search(r"\d+","abc123def456ghi").group()  #+ means one or more repetitions
'123'

###################
findall: find all matching substrings
>>> re.findall(r"\d+","abc123def456ghi_*789dd")
['123', '456', '789']
>>> re.findall(r"[^\d+]","abc123def456ghi_*789dd")  #every character that is not a digit (inside [], + is a literal plus sign)
['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', '_', '*', 'd', 'd']

###################
split: split a string
>>> re.split(r"[\d+]","abc123def456ghi_*789dd")  #inside [], each single digit is a separator, hence the empty strings
['abc', '', '', 'def', '', '', 'ghi_*', '', '', 'dd']
>>> re.split(r"\d+","abc123def456ghi_*789dd")  #split on runs of digits
['abc', 'def', 'ghi_*', 'dd']

>>> re.split(r"[\d+,*]","abc123def456ghi_*789dd")  #split on a digit or *
['abc', '', '', 'def', '', '', 'ghi_', '', '', '', 'dd']

###################
sub: replace
>>> re.sub("ab","YUE","abc123def456ghi_*789ddabc") #replace ab with YUE; by default every occurrence is replaced
'YUEc123def456ghi_*789ddYUEc'

>>> re.sub("ab","YUE","abc123def456ghi_*789ddabc",count=1) #the count argument limits the number of replacements
'YUEc123def456ghi_*789ddabc'
>>> re.sub("ab","YUE","abc123def456ghi_*789ddabc",count=2)
'YUEc123def456ghi_*789ddYUEc'
re.sub covers everything str.replace does; str.replace, however, cannot replace text matched by a regular expression.
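The difference between the two shows up as soon as the text to replace is described by a pattern rather than a fixed string. A brief sketch (Python 3), reusing the sample string from the session above; the variable names are chosen for this example only:

```python
import re

text = "abc123def456ghi_*789ddabc"

# str.replace only handles a fixed substring.
literal = text.replace("ab", "YUE")

# re.sub replaces whatever the pattern matches, here every run of digits.
masked = re.sub(r"\d+", "#", text)

# The replacement may also be a function that receives the match object.
doubled = re.sub(r"\d+", lambda m: str(int(m.group()) * 2), text)

# re.subn additionally reports how many replacements were made.
masked_once, replacements = re.subn(r"\d+", "#", text, count=1)

print(literal)
print(masked)
print(doubled)
print(masked_once, replacements)
```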

###################
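Matching an IP address
>>> # t is assumed to hold text containing an IP address such as 192.168.72.1 (its definition is not shown)
>>> re.search(r"(\d+\.){3}(\d+)",t)
<_sre.SRE_Match object at 0x004BB020>
>>> re.search(r"(\d+\.){3}(\d+)",t).group()
'192.168.72.1'

Matching an IP address precisely (each part limited to 0-255)

>>> re.search(r"(\d{1,2}|1\d\d|2[0-4]\d|25[0-5])\.(\d{1,2}|1\d\d|2[0-4]\d|25[0-5])\.(\d{1,2}|1\d\d|2[0-4]\d|25[0-5])\.(\d{1,2}|1\d\d|2[0-4]\d|25[0-5])",t).group()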
'192.168.72.1'

###################
group and groups

>>> name = 'Charles Chang'
>>> re.search(r"(\w+)\s(\w+)",name)
<_sre.SRE_Match object at 0x004BB020>

>>> re.search(r"(\w+)\s(\w+)",name).group()
'Charles Chang'

>>> re.search(r"(\w+)\s(\w+)",name).groups()
('Charles', 'Chang')

>>> re.search(r"(\w+)\s(\w+)",name).groups()[0]
'Charles'
>>>
>>> re.search(r"(?P<name>\w+)\s(?P<last_name>\w+)",name) #named groups
<_sre.SRE_Match object at 0x004BB020>

>>> res = re.search(r"(?P<name>\w+)\s(?P<last_name>\w+)",name)

>>> res.group("name")
'Charles'
>>> res.group("last_name")
'Chang'

#####################
Escaping
>>> t = "\n\tabc"
>>> print t

        abc
>>> t = r"\n\tabc"
>>> print t
\n\tabc
####################
compile: precompile a pattern, which improves efficiency when it is used repeatedly, for example in a loop

>>> p = re.compile(r"\d+")
>>> re.search(p,'12223')
<_sre.SRE_Match object at 0x004505D0>
>>> re.search(p,'12223').group()
'12223'
>>> p.match('123')
<_sre.SRE_Match object at 0x004505D0>
>>> p.match('123').group()
'123'
>>> p.search('12223').group()
'12223'
>>>
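A typical place where compile pays off is a loop over many lines. The sketch below (Python 3) compiles a deliberately simple IP-like pattern once and reuses it; the sample lines are invented for illustration.

```python
import re

# Compiled once, outside the loop.
ip_pattern = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")

lines = [
    "ping 192.168.72.1 ok",
    "no address here",
    "gateway 10.0.0.254 dns 8.8.8.8",
]

found = []
for line in lines:
    # A compiled pattern offers the same methods as the re module functions.
    found.extend(ip_pattern.findall(line))

print(found)
```

The re module also keeps an internal cache of recently used patterns, so the gain is most noticeable in tight loops or when many different patterns are in play.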

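Difference between split and search:
inpp = '1-2*((60-30 +(-40-5)*(9-2*5/3 + 7 /3*99/4*2998 +10 * 568/14 )) - (-4*3)/ (16-3*2))'
inpp = re.sub(r'\s+','',inpp)    #assumption: the output below contains no spaces, so whitespace is presumably stripped first

content = re.search(r'\(([\+\-\*\/]*\d+\.*\d*){2,}\)',inpp).group()    #search gives the matched text itself
before,nothing,after = re.split(r'\(([\+\-\*\/]*\d+\.*\d*){2,}\)',inpp,1)    #split gives the text around the match plus the captured group
print before
print nothing
print content
print after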

Result:
E:\python\python.exe E:/python_scripts/11S_07day/re_file.py
['1-2*((60-30+', '-40-5', '*(9-2*5/3+7/3*99/4*2998+10*568/14))-(-4*3)/(16-3*2))']
1-2*((60-30+
-5
(-40-5)
*(9-2*5/3+7/3*99/4*2998+10*568/14))-(-4*3)/(16-3*2))

Regular expression flags

re.I: case-insensitive matching;

string2="Charles"
m= re.search("[a-z]",string2,flags=re.I)
print m.group()

re.M (multi-line mode) is used less often and is not covered here.

Common examples

Matching a mobile phone number

string2="hey this is my phone number 18611654950,please call me"
phone_str=re.search(r"(1)([3458])(\d{9})",string2)
print phone_str.group()

Matching an IP address

string2="hey this is my ip address is  192.168.72.100 please ping me"
ip_str=re.search(r"\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}",string2)
print ip_str.group()

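Regex match groups

The differences between group(), group(n), groups(), and groupdict(): group() returns the entire match and is equivalent to group(0); it is also the only form available when the pattern has no groups. group(1), group(2), ... return the first, second, ... group. groups() returns a tuple whose elements are the contents of each group. groupdict() applies when the groups are named and returns a dictionary mapping each name to the text it matched.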

import re
a = "123abc456"
print re.search("([0-9]*)([a-z]*)([0-9]*)", a).group()
print re.search("([0-9]*)([a-z]*)([0-9]*)", a).group(0)
print re.search("(?P<first>[0-9]*)(?P<middle>[a-z]*)(?P<last>[0-9]*)", a).group("last")
print re.search("(?P<first>[0-9]*)(?P<middle>[a-z]*)(?P<last>[0-9]*)", a).groupdict()
print re.search("(?P<first>[0-9]*)(?P<middle>[a-z]*)(?P<last>[0-9]*)", a).groups()
print re.search("([0-9]*)([a-z]*)([0-9]*)", a).group(2)


Results:
123abc456
123abc456
456
{'middle': 'abc', 'last': '456', 'first': '123'}
('123', 'abc', '456')
abc

Reference blog: http://www.cnblogs.com/alex3714/articles/5143440.html

Adding the outermost project directory to Python's import path:

import os
import sys
base_dir = os.path.dirname(os.path.dirname(os.path.dirname(os.path.abspath(__file__))))
sys.path.append(base_dir)

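VII. The shelve module

shelve is a simple key/value module that persists in-memory data to a file. It can persist any data type that pickle supports; in effect it is a wrapper around pickle that makes it easier to use.

shelve dump

import shelve

d = shelve.open('shelve_test') #open a file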

class Test(object):
    def __init__(self,n):
        self.n = n

t = Test(123)
t2 = Test(123334)

name = ["alex","rain","test"]
d["test"] = name #persist a list
d["t1"] = t      #persist a class instance
d["t2"] = t2

d.close()

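shelve load

import shelve
d = shelve.open('shelve_test') #open the file
#the Test class must be defined (or importable) in this script too, otherwise unpickling d["t1"] fails

print d.get("test")
print d.get("t1")
print d.get("t1").n    #read the instance's data through its attribute n

Advantage of shelve over pickle/json: if you dump to one file several times, you have to load the same number of times, in order, and there is no way to jump straight to the item you need; shelve lets you fetch exactly that item by key with get.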
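The keyed access described here can be sketched as follows (Python 3). The shelf lives in a temporary directory that is removed afterwards; the file name shelve_demo and the stored values are placeholders, and contextlib.closing is used simply to make sure the shelf is closed.

```python
import contextlib
import os
import shelve
import shutil
import tempfile

work_dir = tempfile.mkdtemp()
db_path = os.path.join(work_dir, "shelve_demo")   # placeholder file name
try:
    # Store several independent objects under their own keys.
    with contextlib.closing(shelve.open(db_path)) as db:
        db["names"] = ["alex", "rain", "test"]
        db["config"] = {"retries": 3}

    # Later, fetch exactly the item that is needed, in any order.
    with contextlib.closing(shelve.open(db_path)) as db:
        config = db["config"]
        names = db.get("names")
        missing = db.get("no_such_key")   # get() returns None for unknown keys
        keys = sorted(db.keys())
finally:
    shutil.rmtree(work_dir)

print(config, names, missing, keys)
```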

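VIII. XML processing

XML is a format for exchanging data between different languages or programs. It serves much the same purpose as JSON, but JSON is simpler to use.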

<?xml version="1.0"?>
<data>
    <country name="Liechtenstein">
        <rank updated="yes">2</rank>
        <year>2008</year>
        <gdppc>141100</gdppc>
        <neighbor name="Austria" direction="E"/>
        <neighbor name="Switzerland" direction="W"/>
    </country>
    <country name="Singapore">
        <rank updated="yes">5</rank>
        <year>2011</year>
        <gdppc>59900</gdppc>
        <neighbor name="Malaysia" direction="N"/>
    </country>
    <country name="Panama">
        <rank updated="yes">69</rank>
        <year>2011</year>
        <gdppc>13600</gdppc>
        <neighbor name="Costa Rica" direction="W"/>
        <neighbor name="Colombia" direction="E"/>
    </country>
</data>
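A document like the one above can be read with the standard library's xml.etree.ElementTree. The sketch below (Python 3) embeds a shortened copy of the sample so that it is self-contained; the dictionaries ranks and neighbors are names chosen for this example.

```python
import xml.etree.ElementTree as ET

xml_text = """<?xml version="1.0"?>
<data>
    <country name="Liechtenstein">
        <rank updated="yes">2</rank>
        <year>2008</year>
        <neighbor name="Austria" direction="E"/>
        <neighbor name="Switzerland" direction="W"/>
    </country>
    <country name="Singapore">
        <rank updated="yes">5</rank>
        <year>2011</year>
        <neighbor name="Malaysia" direction="N"/>
    </country>
</data>"""

root = ET.fromstring(xml_text)   # ET.parse(path).getroot() would read from a file

ranks = {}
neighbors = {}
for country in root.findall("country"):
    name = country.get("name")                       # attribute value
    ranks[name] = int(country.find("rank").text)     # text of a child element
    neighbors[name] = [n.get("name") for n in country.findall("neighbor")]

print(root.tag)
print(ranks)
print(neighbors)
```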

Reference blog: http://www.cnblogs.com/alex3714/articles/5161349.html

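X. The optparse module

optparse is a powerful module for handling command-line arguments. It generates the command-line help text automatically and lets you read the values of the arguments by name.

This section is based on: http://wolfchen.blog.51cto.com/2211749/1230061

parser.parse_args() returns two values. The first, options, is an object (not a dict) whose attributes hold the values of the options declared with parser.add_option. The attribute name is taken from dest; if dest is omitted, it is derived from the long option name (file for --file).

An option's value is read as options.filename (or options.file when dest is not given). If the option was not supplied on the command line and has no default, the value is None; accessing an attribute for which no option was ever declared raises AttributeError.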

#!/usr/bin/env python
from optparse import OptionParser
parser = OptionParser()
parser.add_option("-f", "--file", dest="filename",
                  help="write report to FILE", metavar="FILE")
parser.add_option("-q", "--quiet",
                  action="store_false", dest="verbose", default=True,
                  help="don't print status messages to stdout")
                                                                                                                                                             
(options, args) = parser.parse_args()    #args is a list of the leftover positional arguments that no option consumed

parser.parse_args(args) can also be given the arguments explicitly, where args is a list:

from optparse import OptionParser
parser = OptionParser()
parser.add_option("-f", "--file",
                  action="store", type="string", dest="filename")
args = ["-f", "foo.txt"]
(options, args) = parser.parse_args(args)
print options.filename      #prints foo.txt

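Flag options such as --verbose and --quiet take no value. (An option with the default store action, by contrast, causes an error if its value is missing.) In the following, passing -v sets options.verbose to True and passing -q sets it to False; if neither is passed it stays None:

parser.add_option("-v", action="store_true", dest="verbose")
parser.add_option("-q", action="store_false", dest="verbose")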

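The metavar argument of add_option tells the user what kind of value the option expects; it is shown in the help text. For example, with metavar="MODE":

-m MODE, --mode=MODE

add_option also accepts a default argument, which sets the value used when the option is not supplied.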
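The pieces described in this section (dest, metavar, default, and the store_true/store_false flags) fit together as in this sketch (Python 3). The argument lists are passed to parse_args explicitly so the example does not depend on the real command line; the -m/--mode option and its default value basic are invented for illustration.

```python
from optparse import OptionParser

parser = OptionParser()
parser.add_option("-f", "--file", dest="filename", metavar="FILE",
                  help="write report to FILE")
parser.add_option("-m", "--mode", dest="mode", default="basic", metavar="MODE",
                  help="run mode [default: %default]")
parser.add_option("-v", action="store_true", dest="verbose", default=False)
parser.add_option("-q", action="store_false", dest="verbose")

# Options are consumed; anything left over ends up in args.
options, args = parser.parse_args(["-f", "foo.txt", "-v", "extra"])

# With no arguments at all, every option falls back to its default (or None).
defaults, _ = parser.parse_args([])

print(options.filename, options.mode, options.verbose, args)
print(defaults.filename, defaults.mode, defaults.verbose)
```

For new code the standard library documentation recommends argparse over optparse; the concepts shown here carry over almost unchanged.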

IX. Object-oriented programming

################################
Two ways to give an instance its attributes:
1.
class Person(object):
    def __init__(self,name):
        self.name = name
        print "--->create:",name
    def say_name(self):
        print "My name is %s" %self.name
    def eat(self):
        print "%s is eating...." %self.name

p1 = Person("gf1")
p2 = Person("gf2")
p1.eat()

E:\python\python.exe E:/python_scripts/11S_06day/class_sample.py
--->create: gf1
--->create: gf2
gf1 is eating....

2.
class Person(object):
    def __init__(self,name):
        #self.name = name
        print "--->create:",name
    def say_name(self):
        print "My name is %s" %self.name
    def eat(self):
        print "%s is eating...." %self.name

p1 = Person("gf1")
p2 = Person("gf2")
p1.name = "GF"
p1.eat()

E:\python\python.exe E:/python_scripts/11S_06day/class_sample.py
--->create: gf1
--->create: gf2
GF is eating....
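The second approach works only for instances whose attribute has been assigned from outside, which is why setting attributes in __init__ is usually preferred. A small sketch (Python 3) of what happens to p2, which never received a name; the methods return strings instead of printing so the results can be inspected.

```python
class Person(object):
    def __init__(self, name):
        # As in the second approach: the attribute is not set here.
        pass

    def eat(self):
        return "%s is eating...." % self.name


p1 = Person("gf1")
p2 = Person("gf2")

p1.name = "GF"           # attribute attached to this one instance only
p1_result = p1.eat()

try:
    p2.eat()             # p2 has no name attribute
    p2_error = None
except AttributeError as exc:
    p2_error = type(exc).__name__

print(p1_result)
print(p2_error)
```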

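XI. The glob module

The glob module finds files by matching file names against shell-style wildcards. The supported wildcards are "*", "?", and "[]":

"*" matches zero or more characters;

"?" matches exactly one character;

"[]" matches any one character in the given set or range, e.g. [0-9] matches a digit.

import glob
print glob.glob("/root/*ldap*") #returns a list of the matching paths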

['/root/python-ldap-2.4.28.tar.gz', '/root/ldaptest', '/root/ldap5.py', '/root/ldap1.py.ori', '/root/python-ldap-2.4.28', '/root/ldap1.py', '/root/ldap2.py', '/root/ldap3.py']

iglob, by contrast, returns an iterator, so the file names are produced one at a time as you loop over it

>>> for n in glob.iglob("/root/*ldap*"):
...     print n
... 
/root/python-ldap-2.4.28.tar.gz
/root/ldaptest
/root/ldap5.py
/root/ldap1.py.ori
/root/python-ldap-2.4.28
/root/ldap1.py
/root/ldap2.py
/root/ldap3.py
>>>
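The three wildcards can be tried out safely in a temporary directory. In this sketch (Python 3) the file names are made up to resemble the listing above, and names is a small helper defined only for this example.

```python
import glob
import os
import shutil
import tempfile

work_dir = tempfile.mkdtemp()
try:
    for filename in ["ldap1.py", "ldap2.py", "ldap10.py", "ldaptest", "notes.txt"]:
        open(os.path.join(work_dir, filename), "w").close()

    def names(pattern):
        """Return the sorted base names matching pattern inside work_dir."""
        paths = glob.glob(os.path.join(work_dir, pattern))
        return sorted(os.path.basename(p) for p in paths)

    star = names("*ldap*")            # * matches any run of characters
    question = names("ldap?.py")      # ? matches exactly one character
    bracket = names("ldap[0-9]*.py")  # [0-9] matches one digit

    # iglob yields the same paths lazily, one at a time.
    lazy = sorted(os.path.basename(p)
                  for p in glob.iglob(os.path.join(work_dir, "*.txt")))
finally:
    shutil.rmtree(work_dir)

print(star)
print(question)
print(bracket)
print(lazy)
```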

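XII. The first module

first is not part of the standard library and must be installed separately. By default it returns the first element of an iterable whose truth value is not False, or None if there is no such element. If key is given, it returns the first element for which key(element) is true.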

from first import first
first([0,False,None,[],(),42])
first([-1,0,1,2],key=lambda x:x>1)

If the logic of the key function does not fit in a single lambda expression, define a named function instead

from first import first

def first_than_zero(number):
    return number>0

first([-1,0,1,2],key=first_than_zero)

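With this approach, if the threshold is not known in advance, a separate function has to be written for every threshold, which is tedious. functools.partial solves this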

from functools import partial

def greater_than(number,min=0):
    return number>min
first([-1,0,1,2],key=partial(greater_than,min=1))

If the function you need corresponds to a built-in operator, the operator module provides it ready-made.
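As an illustration of combining operator with partial, the sketch below (Python 3) avoids the third-party dependency by defining a small stand-in named first that follows the behavior described in this section; it is not the package's actual implementation.

```python
import operator
from functools import partial


def first(iterable, default=None, key=None):
    """Stand-in for the third-party first(): first item that passes the test."""
    test = bool if key is None else key
    for item in iterable:
        if test(item):
            return item
    return default


# operator.lt(1, x) means 1 < x, so this is "greater than one".
greater_than_one = partial(operator.lt, 1)

first_truthy = first([0, False, None, [], (), 42])
first_above_one = first([-1, 0, 1, 2], key=greater_than_one)
first_zero = first([-1, 0, 1, 2], key=partial(operator.eq, 0))
nothing_found = first([0, None, ""])

print(first_truthy, first_above_one, first_zero, nothing_found)
```

Because partial fills in the left-hand operand, operator.lt is used to express "greater than"; operator.gt with the arguments in that position would test the opposite.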

Dynamically importing a module

import importlib

aa = importlib.import_module('lib.aa')
print(aa.C().name)

__import__('lib.aa') also works, but note that it returns the top-level package lib rather than the submodule lib.aa;
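The difference between the two calls is easy to check with standard-library modules. In this sketch (Python 3), json and xml.dom stand in for lib.aa, which exists only in the author's project.

```python
import importlib

# The module name is an ordinary string, so it can come from configuration.
module_name = "json"
mod = importlib.import_module(module_name)
encoded = mod.dumps({"name": "Charles"})

# For a dotted name, import_module returns the submodule itself ...
sub = importlib.import_module("xml.dom")

# ... while __import__ returns the top-level package.
pkg = __import__("xml.dom")

print(mod.__name__, sub.__name__, pkg.__name__)
```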

Assertions:

assert isinstance(aa.C().name,str)
print('succ')    #reached only if the assertion holds; otherwise AssertionError is raised