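Using Function Names, Closures, and Iterators

Using Function Names

A function name is a special kind of variable: it refers to a function object, and adding parentheses after it calls the function. Beyond that, a function name supports the following operations:

1. Assigning it to a variable

def func():
    print(666)


f1 = func
f2 = f1
f2()   # prints 666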

2. Storing it as an element of a container

Use case: several functions need to be called.

Approach 1:

def func1():
    print(111)


def func2():
    print(222)


def func3():
    print(333)


func_list = [func1, func2, func3]
for func in func_list:
    func()

Output

111
222
333

Approach 1 runs the functions in order. To run only specific functions, see Approach 2.

Approach 2:

def func1():
    print(111)


def func2():
    print(222)


def func3():
    print(333)


dic = {
    1: func1,
    2: func2,
    3: func3
}

for k in dic.keys():
    dic[k]()

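With the functions stored in a dictionary, you can run whichever one you want by looking it up by key, for example dic[2]().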
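The loop above still calls every function, so here is a minimal sketch of running only selected ones. The helper run, the name dispatch, and the chosen keys (1 and 3) are assumptions made for illustration, and the functions return their values instead of printing so the result is easy to inspect.

```python
def func1():
    return 111


def func2():
    return 222


def func3():
    return 333


# Map each key to a function object (no parentheses, so nothing runs yet).
dispatch = {1: func1, 2: func2, 3: func3}


def run(key):
    """Look up the function stored under key and call it (hypothetical helper)."""
    handler = dispatch.get(key)
    if handler is None:
        raise KeyError(f"no function registered for key {key!r}")
    return handler()


# Run only the functions stored under keys 1 and 3, skipping func2.
results = [run(k) for k in (1, 3)]
print(results)
```

Using dict.get and raising a clear error for an unknown key is one design choice among several; a default handler would work just as well.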

3. Passing it as a function argument

def func(f):
    f()


def func1():
    print(222)


func(func1)    # 222

4. Using it as a function return value

def func(x):
    return x


def func1():
    print("in func1")


func(func1)()   # in func1

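Strictly speaking, functions are first-class objects, and a function name is simply a reference to one. First-class objects have these properties:

1. They can be created at run time

2. They can be passed as function arguments or used as return values

3. They can be stored in variables

Roughly speaking, a first-class object can be handled like any ordinary value.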
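The three properties can be seen together in one short sketch. The names shout, whisper, choose, and apply are hypothetical and exist only for this illustration.

```python
def shout(text):
    return text.upper() + "!"


def apply(func, value):
    # Property 2: a function received as an argument is called like any other.
    return func(value)


def choose(loud):
    # Property 1: the def statement below runs (and creates a new function
    # object) each time choose is called.
    def whisper(text):
        return text.lower() + "..."

    # Property 2 again: a function is used as the return value.
    return shout if loud else whisper


speak = choose(loud=False)          # Property 3: stored in a variable
print(apply(speak, "Hello"))
print(apply(choose(loud=True), "Hello"))
```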

Closures

Start with a question: why do we need closures? Consider two examples.

Example 1:

def func(step):
    num = 1
    num += step
    print(num)


j = 0
while j < 5:
    func(3)
    j += 1

Output

4
4
4
4
4

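That is the version without a closure: num is reset to 1 on every call, so the result never changes. Now the version with a closure.

Example 2:

def wrapper(step):
    num = 1

    def inner():
        nonlocal num    # refer to num in wrapper; together with return inner this forms a closure
        num += step     # num is not discarded when inner finishes; it is the starting value next time
        print(num)
    return inner


f = wrapper(3)
j = 0
while j < 5:
    f()
    j += 1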

Output

4
7
10
13
16

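Look at the structure first. This is a nested function: wrapper contains inner. An inner function cannot be called directly from outside its enclosing function, so how do we reach it? Through return: the outer function returns the inner function's name, and by returning layer by layer, an inner function at any depth can be called from outside.

What does this have to do with closures? To let inner operate on num, wrapper has to return inner, and inner has to declare num as nonlocal, otherwise it cannot rebind num. When an inner function refers to a variable of its enclosing function and the enclosing function returns that inner function, the result is a closure. To summarize:

1. A closure is an inner function's reference to a variable of an enclosing (non-global) function

2. Closures exist only in inner functions

3. Each function is returned layer by layer until it reaches the caller of the outermost function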
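One consequence worth seeing in code: every call to the outer function creates a fresh set of captured variables, so two closures built from the same outer function do not share state. A minimal sketch, with make_counter, by_two, and by_ten as hypothetical names:

```python
def make_counter(step):
    total = 0                 # a new total is created on every make_counter call

    def increment():
        nonlocal total        # rebind the captured variable, not a new local
        total += step
        return total

    return increment


by_two = make_counter(2)
by_ten = make_counter(10)

print(by_two(), by_two(), by_two())   # 2 4 6
print(by_ten())                       # 10
print(by_two())                       # 8, unaffected by by_ten
```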

Here is another example:

def func(n):  # equivalent to n = name
    def inner():
        print(n)

    return inner


name = "Hanser"
f = func(name)

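Is this a closure? Yes. When n is passed in, it lives in func's namespace; print(n) inside inner is an inner function's reference to a variable of the outer function, and func returns inner, so this is a closure. Checking by hand like this is tedious. Is there a simpler way? There is: Python exposes the __closure__ attribute on functions, which can be used for the check. Here is the code: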

 1 def func1(a):
 2     n = 1
 3
 4     def func2():
 5         nonlocal n
 6         n += a
 7
 8         def func3():
 9             nonlocal n
10             n *= a
11             print(n)
12         return func3
13     return func2
14
15
16 f = func1(3)  # f = func2
17 print(f.__closure__[0].cell_contents)    # read a captured outer variable; if this works, f is a closure
18 print(f.__closure__[1].cell_contents)
19 # print(f.__closure__[2].cell_contents)    # IndexError: there is no third cell
20
21 print("------ divider ------")
22
23 f1 = func1(3)()     # f1 = func3
24 print(f1.__closure__[0].cell_contents)
25 print(f1.__closure__[1].cell_contents)
26 # print(f1.__closure__[2].cell_contents)     # IndexError: there is no third cell

Output

3
1
------ divider ------
3
4

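To summarize, the steps for checking are:

(1) Find the inner function to check (lines 16 and 23)

(2) Read the outer-function variables that the inner function refers to

(3) If they can be read, it is a closure; otherwise it is not (for a function that captures nothing, __closure__ is None)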

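One detail: the cells in __closure__ are in the same order as the names in f.__code__.co_freevars. In this example a comes before n; CPython stores the captured names in sorted order, so this should not be read as "parameters first, then variables defined in the outer function". Do not rely on a fixed position.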
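Instead of indexing cells by position, the captured names can be paired with their values. The sketch below uses hypothetical functions outer, inner, and plain; the parameter is deliberately named z and the local variable b so that the parameter does not come first alphabetically.

```python
def outer(z):
    b = 1

    def inner():
        return z + b

    return inner


f = outer(10)

# co_freevars lists the captured names; __closure__ holds the matching cells
# in the same order, so zip pairs each name with its current value.
names = f.__code__.co_freevars
values = [cell.cell_contents for cell in f.__closure__]
captured = dict(zip(names, values))
print(names)
print(captured)


def plain():
    return 1


# A function that captures nothing has no cells at all.
print(plain.__closure__)   # None
```

Looking values up by name in captured keeps the check independent of whatever order the interpreter happens to use.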

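What Closures Are For

Normally, when a function is called, a temporary namespace is created and then discarded when the function returns. A closure keeps the captured variables alive after the outer function has returned, so later calls can keep using them. This is very useful. Typical uses of closures:

1. Decorators

2. Web scrapers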
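For the first use case, here is a minimal sketch of a decorator built on a closure. The decorator count_calls and the function add are hypothetical examples; functools.wraps is the standard-library helper that copies the wrapped function's name and docstring onto the wrapper.

```python
import functools


def count_calls(func):
    calls = 0                      # captured by wrapper, survives between calls

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        nonlocal calls
        calls += 1
        print(f"{func.__name__} call #{calls}")
        return func(*args, **kwargs)

    return wrapper


@count_calls
def add(a, b):
    return a + b


print(add(1, 2))
print(add(3, 4))
```

Here wrapper closes over both func and calls, which is exactly the "inner function refers to outer variables, outer function returns the inner function" pattern described above.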

Here is a scraper example:

from urllib.request import urlopen


def but():
    content = urlopen("https://book.douban.com/annual/2018?source=navigation#1").read()    # fetch the page source

    def get_content():
        return content
    return get_content


fn = but()
print(id(fn()))    # 2505706231536
print(id(fn()))    # 2505706231536  same id both times: the content fetched the first time is still there and is reused
content1 = fn()   # get the content
print(content1.decode("utf-8"))   # decode to text

Output

2505706231536
2505706231536
<!doctype html>

<html lang="zh-cmn-Hans">

    <head>
        <meta charset="utf-8">
        <meta http-equiv="Content-Type" content="text/html; charset=utf-8">
        <meta name="viewport" content="width=device-width, initial-scale=1, maximum-scale=1, user-scalable=no, viewport-fit=cover">
        <meta name="apple-mobile-web-app-capable" content="yes">
        <link rel="shortcut icon" href="https://img3.doubanio.com/favicon.ico">
        <meta name="format-detection" content="telephone=no">
        <meta name="url_name" content="book_annual2018">
        <meta name="user_id" content="">
        <meta property="og:site_name" content="豆瓣" />
        <meta property="og:title" content="豆瓣2018年度读书榜单" />
        <meta property="og:description" content="这一年不可错过的好书都在这里了" />
        <meta property="og:url" content="https://book.douban.com/annual/2018?source=broadcast" />
        <meta property="og:image" content="https://img3.doubanio.com/img/files/file-1545618075.jpg" />
        <title>豆瓣2018年度读书榜单</title>
        <script>
            window.ITHIL = {};
            ITHIL.isFrodo = 'False' === 'True';
            ITHIL.isWechat = 'False' === 'True';
        </script>
        <script>
            var _hmt = _hmt || [];
            (function() {
                var hm = document.createElement("script");
                var hash = '2018' === '2018' ? '6e5dcf7c287704f738c7febc2283cf0c' : '16a14f3002af32bf3a75dfe352478639'
                hm.src = "https://hm.baidu.com/hm.js?" + hash;
                var s = document.getElementsByTagName("script")[0]; 
                s.parentNode.insertBefore(hm, s);
            })();
        </script>
    </head>
    <body>
        <div id="app"></div>
        <script src="https://img3.doubanio.com/f/ithil/31683c94fc5c3d40cb6e3d541825be4956a1220d/js/lib/es5-shim.min.js"></script>
        <script src="https://img3.doubanio.com/f/ithil/a7de8db438da176dd0eeb59efe46306b39f1261f/js/lib/es6-shim.min.js"></script>
            <script src="https://img3.doubanio.com/dae/cdnlib/libs/jweixin/1.0.0/jweixin.js"></script>
                <script src="https://img3.doubanio.com/f/ithil/b92012acc8222b31e7f1307c154fdb90b56d64d1/gen/ithil2018.bundle.js"></script>
            <div alt="main-pic" style="display: none">
                <img type="hidden" alt="cover" src="https://img3.doubanio.com/img/files/file-1545618075.jpg">
            </div>
    </body>
</html>

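The content does not disappear when but returns. This is very useful: after fetching the page you still need to filter and process it, and each request is slow because of network latency. With a closure, the content does not have to be requested again, which saves a lot of time.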
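The same idea can be tried without network access. The sketch below is a variation, not the code above: it fetches lazily on the first call rather than when the outer function runs, and fake_fetch is a hypothetical stand-in for urlopen(...).read() that records how many times it was called.

```python
def make_cached_fetcher(fetch):
    """Return a function that calls fetch() once and reuses the result."""
    content = None
    loaded = False

    def get_content():
        nonlocal content, loaded
        if not loaded:
            content = fetch()      # the slow step happens only on the first call
            loaded = True
        return content

    return get_content


calls = []                         # one entry per simulated request


def fake_fetch():
    # Hypothetical stand-in for a real network request.
    calls.append("request")
    return b"<html>demo</html>"


fn = make_cached_fetcher(fake_fetch)
first = fn()
second = fn()
print(first is second)             # True: the same object is returned
print(len(calls))                  # 1: only one request was made
```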

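Iterators

Before discussing iterators, look at iterable objects first. An iterable object is defined as an object that has an __iter__ method.

How to check

Approach 1:

s1 = "hello"
print(dir(s1))   # if __iter__ is among the returned names, the object is iterable
print("__iter__" in dir(s1))   # True

Output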

['__add__', '__class__', '__contains__', '__delattr__', '__dir__', '__doc__', '__eq__', '__format__', '__ge__', '__getattribute__', '__getitem__', '__getnewargs__', '__gt__', '__hash__', '__init__', '__init_subclass__', '__iter__', '__le__', '__len__', '__lt__', '__mod__', '__mul__', '__ne__', '__new__', '__reduce__', '__reduce_ex__', '__repr__', '__rmod__', '__rmul__', '__setattr__', '__sizeof__', '__str__', '__subclasshook__', 'capitalize', 'casefold', 'center', 'count', 'encode', 'endswith', 'expandtabs', 'find', 'format', 'format_map', 'index', 'isalnum', 'isalpha', 'isdecimal', 'isdigit', 'isidentifier', 'islower', 'isnumeric', 'isprintable', 'isspace', 'istitle', 'isupper', 'join', 'ljust', 'lower', 'lstrip', 'maketrans', 'partition', 'replace', 'rfind', 'rindex', 'rjust', 'rpartition', 'rsplit', 'rstrip', 'split', 'splitlines', 'startswith', 'strip', 'swapcase', 'title', 'translate', 'upper', 'zfill']
True

f = open("goods.txt", encoding="utf-8", mode="r")
print(dir(f))
print("__iter__" in dir(f))

Output

['_CHUNK_SIZE', '__class__', '__del__', '__delattr__', '__dict__', '__dir__', '__doc__', '__enter__', '__eq__', '__exit__', '__format__', '__ge__', '__getattribute__', '__getstate__', '__gt__', '__hash__', '__init__', '__init_subclass__', '__iter__', '__le__', '__lt__', '__ne__', '__new__', '__next__', '__reduce__', '__reduce_ex__', '__repr__', '__setattr__', '__sizeof__', '__str__', '__subclasshook__', '_checkClosed', '_checkReadable', '_checkSeekable', '_checkWritable', '_finalizing', 'buffer', 'close', 'closed', 'detach', 'encoding', 'errors', 'fileno', 'flush', 'isatty', 'line_buffering', 'mode', 'name', 'newlines', 'read', 'readable', 'readline', 'readlines', 'seek', 'seekable', 'tell', 'truncate', 'writable', 'write', 'writelines']
True

Approach 2:

from collections.abc import Iterable   # importing these from collections directly fails on Python 3.10+
from collections.abc import Iterator
l1 = [1, 2, 3]
print(isinstance(l1, Iterable))  # is it an iterable object?
print(isinstance(l1, Iterator))  # is it an iterator?

Output

True
False
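The same isinstance check can be run across several built-in types at once. In this sketch io.StringIO is an assumption standing in for an open file handle so that no file on disk is needed, and samples and report are hypothetical names.

```python
import io
from collections.abc import Iterable, Iterator

samples = {
    "str": "hello",
    "list": [1, 2, 3],
    "tuple": (1, 2),
    "dict": {"a": 1},
    "set": {1, 2},
    "range": range(3),
    "file-like": io.StringIO("line\n"),   # stand-in for an open file
    "int": 42,
}

# For each sample, record (is it iterable?, is it an iterator?).
report = {
    name: (isinstance(obj, Iterable), isinstance(obj, Iterator))
    for name, obj in samples.items()
}

for name, (iterable, iterator) in report.items():
    print(f"{name:10} iterable={iterable!s:5} iterator={iterator}")
```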

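So what is an iterator? An iterator is an iterable object that also has a __next__ method, which is used to fetch values. An iterable object can be converted into an iterator, and as you may have guessed, this is done with the __iter__ method.

s1 = "abcd"
obj = s1.__iter__()    # option 1: convert the iterable into an iterator
# obj = iter(s1)         # option 2: convert the iterable into an iterator
print(obj.__next__())   # a
print(obj.__next__())   # b
print(obj.__next__())   # c
print(obj.__next__())   # d
print(obj.__next__())   # raises StopIteration

Output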

a
b
c
d

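The pattern in the code above is that __next__ returns one value per call and raises StopIteration once the values run out. The built-in next() function can be used in place of calling __next__() directly.

s2 = [1, 2, 3]
obj = s2.__iter__()
print(obj.__next__())
print(next(obj))   # same as obj.__next__()

Output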

1
2
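The built-in next() also accepts an optional second argument, which is returned instead of raising StopIteration when the iterator is exhausted. A short sketch, with letters as a hypothetical name:

```python
letters = iter("ab")

print(next(letters, None))   # a
print(next(letters, None))   # b
print(next(letters, None))   # None: exhausted, so the default is returned
```

This is handy when running out of values is an expected case rather than an error.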

An iterator plus a while loop can also emulate a for loop. Here is the code:

lst = [1, 2, 3, 4, 5]
obj = iter(lst)
while True:
    try:
        print(next(obj))
    except StopIteration:
        break

Output

1
2
3
4
5

Now try a dictionary:

dic = {"name": "hanser", "age": 18, "height": 148}
obj = iter(dic)
while True:
    try:
        print(next(obj))
    except StopIteration:
        break

Output

name
age
height

Iterating over a dictionary directly yields its keys. To get the values or the key-value pairs, replace dic with dic.values() or dic.items().

dic = {"name": "hanser", "age": 18, "height": 148}
obj = iter(dic.values())
while True:
    try:
        print(next(obj))
    except StopIteration:
        break

Output

hanser
18
148
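The while/next pattern used above works with any object that implements the two methods, not only built-in types. As a sketch, here is a hand-written iterator; the class Countdown is a hypothetical example, not part of the standard library.

```python
class Countdown:
    """Iterator that produces start, start - 1, ..., 1 (hypothetical example)."""

    def __init__(self, start):
        self.current = start

    def __iter__(self):
        # An iterator returns itself from __iter__.
        return self

    def __next__(self):
        if self.current <= 0:
            raise StopIteration    # signals that no values are left
        value = self.current
        self.current -= 1
        return value


obj = iter(Countdown(3))
while True:
    try:
        print(next(obj))
    except StopIteration:
        break

# A real for loop (or list) drives the same two methods.
print(list(Countdown(3)))
```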

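To summarize:

Iterable object: an object that has an __iter__ method

    str, list, tuple, dict, set, range(), and file handles are all iterable objects

Iterator: an object that has both an __iter__ method and a __next__ method

    str, list, tuple, dict, set, and range() are not iterators

    A file handle is an iterator

Checking for iterable objects and iterators

    "__iter__" in dir(s), "__next__" in dir(s)

    isinstance(s, Iterable), isinstance(s, Iterator)

Converting an iterable object into an iterator

    __iter__(), iter()

Characteristics of iterators

    Memory-efficient: an iterator only tracks what it needs to produce the next value instead of building every value up front

    Lazy: each call to next() fetches exactly one value

    One-directional: values are consumed front to back and cannot be revisited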
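The last two characteristics are easy to observe directly. A minimal sketch with hypothetical variable names; the very large range is an assumption chosen only to show that no values are computed ahead of time.

```python
numbers = [1, 2, 3]
it = iter(numbers)

first_pass = list(it)     # consumes every value
second_pass = list(it)    # nothing left: the iterator does not rewind
print(first_pass, second_pass)

# The list itself is only iterable, so each iter() call starts from the beginning.
print(list(iter(numbers)))

# Laziness: creating this iterator is immediate because values are produced
# one at a time, only when next() asks for them.
big = iter(range(10**12))
print(next(big), next(big), next(big))
```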