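Python Self-Study, Day 3: Python Basics

In this section

Set operations
File operations
Character encoding and transcoding
Basic function syntax and features
Parameters and local variables
Return values
Nested functions
Recursion
Anonymous functions
Introduction to functional programming
Higher-order functions
Built-in functions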

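1. Sets

A set is an unordered collection of unique elements. Its main uses are:

Deduplication: converting a list to a set removes duplicate elements automatically
Relationship tests: computing the intersection, difference, union, and similar relationships between two groups of data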
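The deduplication use can be sketched as follows. The list contents are made-up sample data, and the dict.fromkeys variant is only one possible way to keep the original order.

```python
# Hypothetical sample data: a list with repeated values.
visits = ["ace", "shang", "ace", "quan", "shang"]

# Converting to a set drops duplicates (element order is not preserved).
unique_visitors = set(visits)
print(len(visits), len(unique_visitors))  # 5 3

# If the original order matters, dict.fromkeys keeps the first occurrence of each item.
ordered_unique = list(dict.fromkeys(visits))
print(ordered_unique)  # ['ace', 'shang', 'quan']
```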

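Common operations

Defining a set

# Create a set of numbers
num = set([1, 2, 3, 4])
print("1:", num)

# Create a set of the unique characters in a string
letters = set("hello")
print("2:", letters)  # the string is split into its distinct letters

# Numbers and strings can be mixed
name = set(["ace", 1, 3, 5])
print("3:", name)

# Output (element order may differ between runs)
1: {1, 2, 3, 4}
2: {'l', 'e', 'o', 'h'}
3: {3, 1, 'ace', 5}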

Relationship tests

num = set([1, 2, 3, 4])
name = set(["ace", 1, 3, 5])
sub = set([1, 3])

# Union
print("1:", num.union(name))  # method form
print("2:", num | name)       # operator form

# Intersection
print("3:", num.intersection(name))
print("4:", num & name)

# Difference
print("5:", num.difference(name))  # in num but not in name
print("6:", num - name)

# Subset
print("7:", sub.issubset(name))  # True when every element of sub is also in name
print("8:", sub <= name)

# Superset
print("9:", name.issuperset(sub))
print("10:", name >= sub)

# Symmetric difference
print("11:", num.symmetric_difference(name))  # in num or in name, but not in both
print("12:", num ^ name)

# Disjoint test
print("13:", num.isdisjoint(name))  # True when the two sets share no elements, otherwise False

# Output
1: {1, 2, 3, 4, 5, 'ace'}
2: {1, 2, 3, 4, 5, 'ace'}
3: {1, 3}
4: {1, 3}
5: {2, 4}
6: {2, 4}
7: True
8: True
9: True
10: True
11: {2, 4, 5, 'ace'}
12: {2, 4, 5, 'ace'}
13: False
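One practical difference between the method form and the operator form is worth a short sketch: the methods accept any iterable, while the operators expect a set on both sides. The values below are illustrative only.

```python
num = {1, 2, 3, 4}
operator_error = None

# The method form accepts any iterable, for example a list or a tuple.
merged = num.union([4, 5])
common = num.intersection((1, 9))
print(sorted(merged), sorted(common))

# The operator form requires a set on both sides.
try:
    num | [4, 5]
except TypeError as exc:
    operator_error = str(exc)
    print("TypeError:", operator_error)
```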

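Adding, updating, and removing elements

num = set([1, 2, 3, 4])
name = set(["ace", 1, 3, 5])
sub = set([1, 3])

# Add a single element
num.add(5)
print("1:", num)

# Add several elements
sub.update(["shang", 2, 4])
print("2:", sub)

# Remove a specific element
name.remove(1)  # raises KeyError if the element does not exist
print("3:", name)

# Remove and return an arbitrary element
print("4:", name.pop())

# Remove a specific element
name.discard(3)  # does nothing if the element does not exist
print("5:", name)

# Output (pop() removes an arbitrary element, so outputs 4 and 5 may differ between runs)
1: {1, 2, 3, 4, 5}
2: {1, 2, 3, 4, 'shang'}
3: {'ace', 3, 5}
4: ace
5: {5}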
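The difference between remove() and discard() shows up only when the element is missing. A minimal sketch, using a made-up element name:

```python
name = {"ace", 3, 5}

# discard() silently ignores a missing element.
name.discard("missing")

# remove() raises KeyError for a missing element, so guard it when unsure.
try:
    name.remove("missing")
    removed = True
except KeyError:
    removed = False

print(removed, name == {"ace", 3, 5})  # False True
```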

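2. File operations

Workflow for working with a file

Open the file to get a file handle, and assign the handle to a variable
Operate on the file through the handle
Close the file

Creating a file

First create a new file of type Text, then add some content to it

File: over_the_sea

Contents:

为你我用了半年的积蓄
漂洋过海来看你
为了这次相聚
连见面时的呼吸
都曾反复练习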

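Reading a file

f = open("over_the_sea", 'r', encoding="utf-8")  # decode the file contents as UTF-8
data = f.read()  # f is the file handle; read() returns the whole file as a string
print(data)

# Output
为你我用了半年的积蓄
漂洋过海来看你
为了这次相聚
连见面时的呼吸
都曾反复练习

Notes:

To open a file: open("filename", 'mode', encoding="character encoding")

r is read-only mode. It is the default, so it can be omitted.

f.read() reads the contents of the file

Reading more than once

f = open("over_the_sea", 'r', encoding="utf-8")  # decode the file contents as UTF-8
data = f.read()
print(data)
data2 = f.read()
print('---------data2---------', data2)

# Output
为你我用了半年的积蓄
漂洋过海来看你
为了这次相聚
连见面时的呼吸
都曾反复练习
---------data2---------

Note: the output shows that data2 is empty.

read() advances the file's position pointer (think of it like a pointer in C). Once data has been read, the pointer sits at the end of the file, so the second read() finds nothing left and data2 is an empty string.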
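The pointer behaviour can be checked with a small self-contained sketch. It writes a throwaway file to a temporary directory (the file name and contents are assumptions for the demo) and uses seek(0) to rewind before reading again.

```python
import os
import tempfile

# Hypothetical sample file created in a temporary directory for the demo.
path = os.path.join(tempfile.mkdtemp(), "over_the_sea_demo")
with open(path, "w", encoding="utf-8") as f:
    f.write("line one\nline two\n")

with open(path, "r", encoding="utf-8") as f:
    first = f.read()    # reads everything; the pointer is now at the end
    second = f.read()   # nothing left to read
    f.seek(0)           # move the pointer back to the start
    third = f.read()    # the whole file again

print(repr(second))     # ''
print(first == third)   # True
```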

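Writing a file

f = open("across_the_ocean_to_meet_you", 'w', encoding="utf-8")  # encode the text as UTF-8
f.write("To meet you I’ve saved every penny\n"
        "To travel far across the sea\n"
        "To be free-and-easy\n"
        "Many times I rehearsed the meet\n"
        "Hoped to impress you deep\n"
        )
f.close()  # closing the file flushes the buffered text to disk

Result: a new file named across_the_ocean_to_meet_you is created, containing the text passed to write().

File contents:

To meet you I’ve saved every penny
To travel far across the sea
To be free-and-easy
Many times I rehearsed the meet
Hoped to impress you deep

Each line of the text ends with \n. Adjacent string literals are joined into a single string, so every line can be written as its own "..." literal.

w mode is write-only, so the file cannot be read through this handle. Note that w creates a new file; if a file with the same name already exists, its contents are overwritten.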
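The overwrite behaviour of w can be seen in a short sketch that uses a throwaway file in a temporary directory (the path is an assumption for the demo). It also tries the x mode, which refuses to open a file that already exists.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "demo.txt")  # hypothetical file name

with open(path, "w", encoding="utf-8") as f:
    f.write("first version\n")

# Opening with 'w' again discards the old contents before writing.
with open(path, "w", encoding="utf-8") as f:
    f.write("second version\n")

with open(path, encoding="utf-8") as f:
    contents = f.read()
print(repr(contents))  # 'second version\n'

# 'x' refuses to touch an existing file.
try:
    open(path, "x", encoding="utf-8")
    overwritten = True
except FileExistsError:
    overwritten = False
print(overwritten)  # False
```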

Appending to a file (append)

f = open("across_the_ocean_to_meet_you", 'a', encoding="utf-8")
data = f.write("Words said never convey what I wanted to say\n"
               "What in my mind so clearly\n")  # write() returns the number of characters written
f.close()

# File contents
To meet you I’ve saved every penny
To travel far across the sea
To be free-and-easy
Many times I rehearsed the meet
Hoped to impress you deep
Words said never convey what I wanted to say
What in my mind so clearly

Note: mode a appends to the end of an existing file. If the file does not exist, it is created first, just as in w mode, and the content is then written to it.

Reading line by line

f = open("across_the_ocean_to_meet_you", 'r', encoding="utf-8")

print(f.readlines())  # read all lines at once and print them

# Output
['To meet you I’ve saved every penny\n', 'To travel far across the sea\n', 'To be free-and-easy\n', 'Many times I rehearsed the meet\n', 'Hoped to impress you deep\n', 'Words said never convey what I wanted to say\n', 'What in my mind so clearly\n']

Note: the result is a list

Printing the first 5 lines

f = open("across_the_ocean_to_meet_you", 'r', encoding="utf-8")

# Print the first 5 lines
for _ in range(5):
    print(f.readline())  # readline() reads one line at a time

# Output
To meet you I’ve saved every penny

To travel far across the sea

To be free-and-easy

Many times I rehearsed the meet

Hoped to impress you deep

Printing every line

f = open("across_the_ocean_to_meet_you", 'r', encoding="utf-8")

for line in f.readlines():
    print(line)

# Output
To meet you I’ve saved every penny

To travel far across the sea

To be free-and-easy

Many times I rehearsed the meet

Hoped to impress you deep

Words said never convey what I wanted to say

What in my mind so clearly

Note: readlines() returns a list of lines, and the loop prints that list one line at a time

Improvement: every line read from the file still ends with a newline, and print() adds another, which produces the blank lines in the output. Use strip() to remove the trailing newline

print(line.strip())

Skipping the fourth line

f = open("across_the_ocean_to_meet_you", 'r', encoding="utf-8")
data = f.readlines()
for line in data:
    if data.index(line) == 3:
        print("------separator------")
        continue
    print(data.index(line), line.strip())

# Output
0 To meet you I’ve saved every penny
1 To travel far across the sea
2 To be free-and-easy
------separator------
4 Hoped to impress you deep
5 Words said never convey what I wanted to say
6 What in my mind so clearly

Another approach

f = open("across_the_ocean_to_meet_you", 'r', encoding="utf-8")

for index, line in enumerate(f.readlines()):  # enumerate() supplies the index
    if index == 3:
        print("------separator------")
        continue
    print(index, line.strip())

Note: the output is the same as with the previous method

Drawback of readlines(): it reads the entire file into memory at once, so a very large file leads to high memory usage

Improvement: read one line, process it, then read the next, so that only the current line is held in memory. For that reason, avoid the readlines() approaches above for large files

An efficient loop

f = open("across_the_ocean_to_meet_you", 'r', encoding="utf-8")

count = 0  # counts the lines seen so far
for line in f:
    if count == 3:
        print("------separator------")
        count += 1
        continue
    print(line.strip())
    count += 1

# Output
To meet you I’ve saved every penny
To travel far across the sea
To be free-and-easy
------separator------
Hoped to impress you deep
Words said never convey what I wanted to say
What in my mind so clearly

Note: this approach reads the file one line at a time, and each new line replaces the previous one in memory
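The manual counter in the efficient loop can also be replaced by enumerate() applied directly to the file object, which still reads lazily. A minimal sketch, assuming a small throwaway file created just for the demo:

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "lyrics_demo.txt")  # hypothetical file
with open(path, "w", encoding="utf-8") as f:
    f.write("line 0\nline 1\nline 2\nline 3\nline 4\n")

printed = []
with open(path, encoding="utf-8") as f:
    # enumerate(f) pulls lines from the file lazily, one at a time.
    for index, line in enumerate(f):
        if index == 3:
            print("------separator------")
            continue
        printed.append(line.strip())
        print(index, line.strip())
```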

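tell() returns the current position of the file pointer
seek() sets the position of the file pointer

Contents of over_the_sea

To meetyou I have saved every penny
To travel far across the sea
To be free-and-easy
Many times I rehearsed the meet

f = open("over_the_sea", 'r', encoding="utf-8")  # decode the file contents as UTF-8
data = f.readline()  # read the first line
print(data.strip())
print(f.tell())  # get the current pointer position
f.seek(0)  # move the pointer back to position 0, the start of the file
print(f.readline())

# Output
To meetyou I have saved every penny
36
To meetyou I have saved every penny

Note: the position reported by tell() here corresponds to a byte offset. The first line has 35 ASCII characters plus a newline, one byte each, so the printed value is 36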
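That positions count bytes rather than characters becomes visible once the text contains multi-byte characters. The sketch below opens a throwaway file in binary mode, where the offsets are plain byte counts; the file name and contents are assumptions for the demo.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "offset_demo.txt")  # hypothetical file
with open(path, "w", encoding="utf-8", newline="\n") as f:
    f.write("你好\nabc\n")

# Binary mode makes the byte-based positions explicit.
with open(path, "rb") as f:
    first_line = f.readline()
    position = f.tell()

print(len(first_line.decode("utf-8")))  # 3 characters: two Chinese characters and "\n"
print(position)                         # 7 bytes: 3 + 3 + 1
```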

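flush

Forces buffered content to be written out immediately (to disk for a file, to the terminal for standard output)

import sys, time

for i in range(20):
    sys.stdout.write("#")  # write to standard output (the terminal)
    sys.stdout.flush()     # push the buffered character to the terminal right away
    time.sleep(0.1)        # sleep for 0.1 seconds

# Output
####################

# Note: the # characters appear one at a time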

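truncate

f = open("across_the_ocean_to_meet_you", 'a', encoding="utf-8")

f.truncate(10)  # keep only the first 10 bytes of the file
f.close()

Keeps the file from its start up to the 10th byte and discards the rest (for this ASCII text, 10 bytes are 10 characters)

File contents after truncation

To meet yo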
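A self-contained version of the truncation, run against a throwaway file so that the real lyrics file is left alone (the path and the sample line are assumptions for the demo):

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "truncate_demo.txt")  # hypothetical file
with open(path, "w", encoding="utf-8") as f:
    f.write("To meet you I have saved every penny\n")

with open(path, "a", encoding="utf-8") as f:
    f.truncate(10)  # the size argument is a number of bytes

with open(path, encoding="utf-8") as f:
    remaining = f.read()
print(repr(remaining))  # 'To meet yo'
```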

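r+ read and write (reading plus, as observed here, appending)

f = open("across_the_ocean_to_meet_you", 'r+', encoding="utf-8")  # r+ mode

print(f.readline().strip())
print(f.readline().strip())
print(f.readline().strip())
f.write("------appended at line 4-------")
print("-------separator---------")
print(f.read())

# Output
To meet you I’ve saved every penny
To travel far across the sea
To be free-and-easy
-------separator---------
Many times I rehearsed the meet
Hoped to impress you deep
Words said never convey what I wanted to say
What in my mind so clearly------appended at line 4-------

Note: even though the first three lines have been printed and the pointer appears to be at the fourth line, the write() lands at the end of the file. A likely explanation is that Python reads the file in buffered chunks, so a small file has already been read to its end. To control where a write goes, call seek() first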

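w+ write and read (the file is created first and can then be both read and written; rarely needed)

f = open("across_the_ocean_to_meet_you", 'w+', encoding="utf-8")  # w+ mode

print(f.readline().strip())
print(f.readline().strip())
print(f.readline().strip())
f.write("------第四行追加-------")  # 13 hyphens and 5 Chinese characters
print("-------separator---------")
print(f.read())
print(f.tell())  # position after the write, in bytes

# Output



-------separator---------

28

Note: in UTF-8 each Chinese character takes 3 bytes, so the 13 hyphens and 5 Chinese characters occupy 13 + 15 = 28 bytes, which is the position printed at the end. Because w+ creates (or empties) the file first, the first three reads return nothing. After the write the pointer is at the end of the file, so the final read() also returns an empty string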

a+ append and read

f = open("across_the_ocean_to_meet_you", 'a+', encoding="utf-8")  # a+ mode

print(f.readline().strip())
print(f.readline().strip())
print(f.readline().strip())
f.write("\n------appended item-------")
print("-------separator---------")
f.seek(0)
print(f.read())

# Output



-------separator---------
To meet you I’ve saved every penny
To travel far across the sea
To be free-and-easy
Many times I rehearsed the meet
Hoped to impress you deep
Words said never convey what I wanted to say
What in my mind so clearly
------appended item-------

Note: a+ positions the pointer at the end of the file when it is opened, so the first three reads return nothing. The new content is appended at the end, and the pointer stays at the end afterwards. Moving the pointer back to position 0 with seek(0) then makes it possible to read the whole file

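Modifying file contents

Requirement: change specific text in a file and save the result to a new file

# Requirement: change text in a file and save the result to a new file

f = open("across_the_ocean_to_meet_you", "r", encoding="utf-8")
f_new = open("across_the_ocean_to_meet_you.bak", "w", encoding="utf-8")  # create the new file

for line in f:
    if "across the sea" in line:
        line = line.replace("across the sea", "across the ocean")  # replace the text
    f_new.write(line)  # write every line to the new file, whether or not it was changed

f.close()
f_new.close()

The file is read and checked line by line. Lines that do not need changing are written to the new file as they are; lines that do are modified with a string replacement and then written to the new file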

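The with statement

Normally a file must be closed after it has been opened and used; the statement is f.close(). Python usually closes any files that are still open when the program exits

It is easy to forget, however, and a running program may hold many files open at once and tie up a lot of resources

To avoid forgetting to close a file, use the with statement, which manages the file as a context:

with open("filename", "r", encoding="utf-8") as f:
    ...
    ...
    ...

When the with block finishes, the file is closed automatically and its resources are released.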
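The automatic close can be observed through the file object's closed attribute. The sketch below uses a throwaway file (the path is an assumption for the demo) and also shows that the file is closed when the block is left through an exception.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "with_demo.txt")  # hypothetical file

with open(path, "w", encoding="utf-8") as f:
    f.write("hello\n")
    inside = f.closed   # False: the file is still open inside the block

outside = f.closed      # True: leaving the block closed the file
print(inside, outside)  # False True

# The file is closed even when the block exits through an exception.
try:
    with open(path, encoding="utf-8") as g:
        raise ValueError("simulated failure")
except ValueError:
    pass
print(g.closed)         # True
```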

Taking the file-modification scenario as an example:

with open("across_the_ocean_to_meet_you", "r", encoding="utf-8") as f, \
     open("across_the_ocean_to_meet_you.bak", "w", encoding="utf-8") as f_new:
    for line in f:
        if "across the sea" in line:
            line = line.replace("across the sea", "across the ocean")  # replace the text
        f_new.write(line)

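Summary

The file modes are:

r: read-only mode [default]
w: write-only mode [not readable; creates the file if it does not exist; empties it if it does]
x: write-only mode [not readable; creates the file if it does not exist; raises an error if it does]
a: append mode [not readable; creates the file if it does not exist; only appends if it does]

"+" means the file can be read and written at the same time

r+: read and write [readable, writable]
w+: write and read [readable, writable]
x+: write and read [readable, writable]
a+: append and read [readable, writable]

"b" means the file is handled as bytes

rb or r+b
wb or w+b
xb or x+b
ab or a+b

Note: a file opened with b returns bytes when read, and bytes must be supplied when writing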
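A short sketch of the b modes, assuming a throwaway file in a temporary directory: text has to be encoded before it is written, and what comes back from read() is a bytes object.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "bytes_demo.bin")  # hypothetical file

with open(path, "wb") as f:            # no encoding argument in binary mode
    f.write("你好".encode("utf-8"))    # a str must be encoded to bytes first

with open(path, "rb") as f:
    raw = f.read()

print(type(raw).__name__, len(raw))    # bytes 6
print(raw.decode("utf-8"))             # 你好

# Writing a str to a binary file is an error.
try:
    with open(path, "wb") as f:
        f.write("text")
except TypeError:
    print("binary files only accept bytes")
```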

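3. Character encoding and transcoding

Detailed articles:

http://www.cnblogs.com/yuanchenqi/articles/5956943.html

http://www.diveintopython3.net/strings.html

Note to self: read these when there is time; for now, grab the key points and move on.

Things to know:

1. The default source encoding is ASCII in Python 2 and UTF-8 in Python 3, and every Python 3 str is Unicode text

2. Unicode has several encodings: UTF-32 (4 bytes per character), UTF-16 (2 or 4 bytes per character), and UTF-8 (1 to 4 bytes per character). Files are usually stored as UTF-8, because it saves space

3. In Python 3, encode() converts a str to bytes while encoding it, and decode() converts bytes back to a str while decoding

decode() decodes bytes into a Unicode str

encode() encodes a str into bytes, for example as UTF-8 or GBK

#-*-coding:gbk-*-
# Declares that this source file is encoded as GBK
# Note: the file itself must also be saved as GBK

import sys
print(sys.getdefaultencoding())  # the default encoding is still utf-8


s = "你好"  # a Python 3 str is Unicode, whatever the encoding of the source file
print(s)
print(s.encode("gbk"))  # the bytes of the GBK encoding
print(s.encode("utf-8"))
print(s.encode("utf-8").decode("utf-8"))
print(s.encode("gb2312"))  # GBK is backward compatible with GB2312
print(s.encode("gb2312").decode("gb2312"))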
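As a sketch of how transcoding works in Python 3, the example below converts UTF-8 bytes to GBK bytes by decoding to str first and then encoding again. The variable names are illustrative.

```python
s = "你好"

utf8_bytes = s.encode("utf-8")   # str -> bytes
gbk_bytes = s.encode("gbk")

print(utf8_bytes)  # b'\xe4\xbd\xa0\xe5\xa5\xbd' (3 bytes per character)
print(gbk_bytes)   # b'\xc4\xe3\xba\xc3' (2 bytes per character)

# Transcoding UTF-8 bytes to GBK bytes goes through str: decode first, then encode.
transcoded = utf8_bytes.decode("utf-8").encode("gbk")
print(transcoded == gbk_bytes)   # True

# Decoding with a codec that cannot represent the bytes raises an error.
try:
    utf8_bytes.decode("ascii")
except UnicodeDecodeError:
    print("not valid ASCII")
```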

4. Functions and functional programming

Defining a function in Python

def function_name(parameters):

    "docstring"

    function body

    return value

# A function (returns a value)
def fun(x):
    '''Increment x by one'''
    x += 1
    print(x)
    return x

# A procedure (a function with no return value)
def fun2(x):
    '''Procedure'''
    print(x)

x = fun(1)
y = fun2(1)

print("x:", x)  # x holds the returned value
print("y:", y)  # y is None


# Output
2
1
x: 2
y: None

Function example:

import time

def logger():  # write a log entry
    time_format = '%Y-%m-%d %X'  # define the time format
    time_current = time.strftime(time_format)  # get the current time
    with open("log.txt", "a+") as f:
        f.write("%s error.log\n" % time_current)


def test1():
    print("in the test1")
    logger()

def test2():
    print("in the test2")
    logger()

def test3():
    print("in the test3")
    logger()

test1()
test2()
test3()

# Output
in the test1
in the test2
in the test3


# Contents of log.txt
2018-08-16 07:01:02 error.log
2018-08-16 07:01:02 error.log
2018-08-16 07:01:02 error.log

Advantages of using functions:

1. Code reuse

2. Consistency

3. Extensibility

Return values (return)

def test1():
    print("in the test1")

def test2():
    print("in the test2")
    return 0

def test3():
    print("in the test3")
    return 1, 'hello', ['shang', 'quan'], {'shang', 'quan'}

x = test1()  # x receives the return value
y = test2()
z = test3()

print(x)  # None
print(y)  # 0
print(z)  # a tuple


# Output
in the test1
in the test2
in the test3
None
0
(1, 'hello', ['shang', 'quan'], {'shang', 'quan'})

Summary:

No return value: the call returns None

One return value: the call returns that object

More than one return value: the call returns them packed in a tuple
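Because several return values arrive as one tuple, the caller can unpack them straight into separate names. A minimal sketch with a hypothetical helper, min_max:

```python
def min_max(values):
    """Return the smallest and largest item (hypothetical helper for this example)."""
    return min(values), max(values)

result = min_max([3, 1, 4, 1, 5])
print(result)            # (1, 5)

# The tuple can be unpacked directly into separate names.
lowest, highest = min_max([3, 1, 4, 1, 5])
print(lowest, highest)   # 1 5
```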

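Defining parameters

def test(x, y):  # x and y are formal parameters
    print(x)
    print(y)

test(1, 2)  # 1 and 2 are actual arguments (positional arguments)
test(y='A', x='B')  # call with keyword arguments
test(5, y=7)  # keyword arguments must never come before positional arguments

# Output
1
2
B
A
5
7

Default parameters

When an argument is omitted in the call, the parameter's default value is used

def test(x, y=2):
    print(x)
    print(y)


test(1)
test(1, 3)
test(1, y=4)

# Output
1
2
1
3
1
4

Characteristic of default parameters: they do not have to be passed when the function is called

Uses: default installation settings for software, database port numbers, and so on

Parameter groups

When the number of arguments is not fixed, pass them as a parameter group

def test(x, *args):  # *args collects the extra positional arguments
    print(x)
    print(args)

test(1, 2, 3, 4, 5)

# Output
1
(2, 3, 4, 5)

With a variable number of arguments, *args receives them and packs them into a tuple

Note: *args accepts N positional arguments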
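As a small sketch of *args in use, the hypothetical helper total below accepts any number of positional arguments. The same * syntax at the call site spreads a list into separate arguments.

```python
def total(*numbers):
    """Add up any number of positional arguments (hypothetical helper)."""
    result = 0
    for n in numbers:
        result += n
    return result

print(total())           # 0, because numbers is the empty tuple ()
print(total(1, 2, 3))    # 6

# A list can be spread into positional arguments with * at the call site.
values = [4, 5, 6]
print(total(*values))    # 15
```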

Dictionary form

# Dictionary: N keyword arguments are collected into a dictionary
def test2(**kwargs):
    print(kwargs)

test2(names='shang', age=29, sex='M')

def test3(name, **kwargs):
    print(name)
    print(kwargs)

test3('shang', age=29, sex='M')

def test4(name, age=19, **kwargs):
    print(name)
    print(age)
    print(kwargs)

test4('shang', sex='M', hobby='Make Money')


# Output
{'names': 'shang', 'age': 29, 'sex': 'M'}
shang
{'age': 29, 'sex': 'M'}
shang
19
{'sex': 'M', 'hobby': 'Make Money'}

Combined form

def test1(name, age=19, *args, **kwargs):
    print(name)
    print(age)
    print(args)
    print(kwargs)

test1('shang', 29, 'ace', 'python', 10, sex='M', hobby='Make Money')


# Output
shang
29
('ace', 'python', 10)
{'sex': 'M', 'hobby': 'Make Money'}

Note: extra positional arguments go to *args; if there are none, args is an empty tuple. Extra keyword arguments go to **kwargs
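The binding rules can be checked with a function that simply returns what it received. The function name connect and the port numbers are assumptions for illustration, echoing the database-port use case for default parameters.

```python
def connect(host, port=3306, *args, **kwargs):
    """Hypothetical helper: just reports how its arguments were bound."""
    return host, port, args, kwargs

print(connect("localhost"))
# ('localhost', 3306, (), {})

print(connect("localhost", 5432, "extra", timeout=10))
# ('localhost', 5432, ('extra',), {'timeout': 10})

# A dictionary can be spread into keyword arguments with ** at the call site.
options = {"port": 6379, "timeout": 5}
print(connect("localhost", **options))
# ('localhost', 6379, (), {'timeout': 5})
```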

Local variables

A variable defined inside a function is local; the function is its scope

def chang_name(name):
    print("before_name", name)
    name = 'ace'  # local variable, only takes effect inside the function
    print("after_name", name)


name = 'shang'
chang_name(name)
print(name)

# Result
before_name shang
after_name ace
shang

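Global variables

A variable that is visible throughout the whole program, defined at the top level of the program

name = 'shang'  # global variable
age = 29  # global variable

def chang_name(name):
    global age  # declare that age refers to the global variable
    age = 30  # modify the global variable
    print("before_name", name)
    name = 'ace'  # local variable, only takes effect inside the function
    print("after_name", name)

chang_name(name)
print(name)
print(age)

# Result
before_name shang
after_name ace
shang
30

Note: a function can read a global variable, but assigning to the same name inside the function creates a new local variable instead of changing the global. When a change really is needed, declare the name with global inside the function first; after that, the assignment modifies the global variable.

Also, do not use global inside a function to create a brand-new global variable; it goes against good coding style

Take note:

When the global variable is a list, dictionary, or set, a function can modify its contents directly

resume = ['shang', 29, 'pm']

def change_resume():
    resume[1] = 30  # a function can directly modify the contents of a global list, dictionary, set, or class instance

change_resume()
print(resume)

# Result
['shang', 30, 'pm']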
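The distinction between modifying a global object in place and assigning a new object to the name can be sketched as follows. The function names mutate and rebind are chosen for illustration.

```python
resume = ["shang", 29, "pm"]

def mutate():
    resume[1] = 30             # changes the existing global list in place

def rebind():
    resume = ["ace", 1, "qa"]  # creates a local name; the global is untouched
    return resume

mutate()
local_copy = rebind()
print(resume)      # ['shang', 30, 'pm']
print(local_copy)  # ['ace', 1, 'qa']
```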

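Recursion

Inside a function, other functions can be called. If a function calls itself, it is a recursive function

Characteristics of recursion:

1. There must be a clear termination condition

2. Each deeper level of recursion should work on a smaller problem than the previous level

3. Recursion is not efficient, and too many levels of recursion cause a stack overflow

def calc(n):
    print(n)
    n = int(n)  # convert the input string (or the halved float) to an integer
    if n > 1:
        return calc(n / 2)

n = input('input the number')
calc(n)

# Result
input the number45
45
22.5
11.0
5.5
2.5
1.0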
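A variant of the same idea with integer division, written as a sketch that returns the values instead of printing them, makes the termination condition and the shrinking problem size easier to see. The function names are illustrative, and the second part shows what Python does when the recursion never stops.

```python
import sys

def halve(n):
    """Return the sequence n, n // 2, ... down to 1 using recursion."""
    if n <= 1:                      # termination condition
        return [n]
    return [n] + halve(n // 2)      # the problem shrinks on every call

print(halve(45))   # [45, 22, 11, 5, 2, 1]

# Python limits the recursion depth (commonly 1000 by default).
print(sys.getrecursionlimit())

def endless(n):
    return endless(n + 1)           # no termination condition

try:
    endless(0)
except RecursionError:
    print("recursion limit reached")
```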

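Higher-order functions

A variable can refer to a function, and a function's parameters can receive variables, so one function can accept another function as an argument. Such a function is called a higher-order function.

def add(x, y, func):
    return func(x) + func(y)

result = add(3, -1, abs)  # pass the built-in absolute-value function abs as func
print(result)

# Result
4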
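Several built-in functions are themselves higher-order functions. The sketch below, using made-up numbers, passes abs and an anonymous (lambda) function to sorted(), map(), and max().

```python
numbers = [3, -7, 1, -2]

# sorted() takes a function through its key parameter.
by_magnitude = sorted(numbers, key=abs)
print(by_magnitude)            # [1, -2, 3, -7]

# map() applies a function to every item; an anonymous (lambda) function works too.
squares = list(map(lambda n: n * n, numbers))
print(squares)                 # [9, 49, 1, 4]

# max() accepts key as well.
largest_magnitude = max(numbers, key=abs)
print(largest_magnitude)       # -7
```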