Python Iterators and Generators

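Iterators
Iteration is one of Python's most powerful features: it is the standard way to access the elements of a collection one at a time.
An iterator is an object that remembers its position during a traversal.
An iterator starts at the first element of a collection and continues until every element has been visited. It only moves forward, never backward.
Two built-in functions are used to work with iterators: iter() creates one from an iterable, and next() fetches its next element. (Under the hood they call the object's __iter__() and __next__() methods.)
Strings, lists, and tuples can all be used to create iterators: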

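>>> nums = [1, 2, 3, 4]      # named nums so the built-in list type is not shadowed
>>> it = iter(nums)          # create an iterator object
>>> print(next(it))          # print the next element of the iterator
1
>>> print(next(it))
2
>>> print(next(it))
3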
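
To illustrate what "remembering the position" means, here is a minimal sketch of a hand-written iterator. The class name Countdown is hypothetical and not part of the original text; the only thing it relies on is the standard iterator protocol, __iter__() and __next__().

```python
class Countdown:
    """Hypothetical iterator that counts down from start to 1."""

    def __init__(self, start):
        self.current = start      # the remembered position

    def __iter__(self):
        # An iterator returns itself from __iter__().
        return self

    def __next__(self):
        # Signal exhaustion by raising StopIteration.
        if self.current <= 0:
            raise StopIteration
        value = self.current
        self.current -= 1
        return value


it = Countdown(3)
first = next(it)        # 3
remaining = list(it)    # [2, 1], continuing from the remembered position
again = list(it)        # [], the iterator cannot go back or restart
```

Once exhausted, the object keeps raising StopIteration, which is why the last list() call returns an empty list.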

An iterator can be traversed with an ordinary for statement:

[root@localhost ~]# vi test.py 
#!/usr/bin/env python3
nums = [1, 2, 3, 4]
it = iter(nums)    # create an iterator object
for x in it:
    print(x, end=" ")
[root@localhost ~]# ./test.py 
1 2 3 4

Alternatively, call next() explicitly:

[root@localhost ~]# vi test.py 
#!/usr/bin/env python3
nums = [1, 2, 3, 4]
it = iter(nums)    # create an iterator object
while True:
    try:
        print(next(it))
    except StopIteration:
        break      # the iterator is exhausted, so leave the loop
[root@localhost ~]# ./test.py 
1
2
3
4
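
As a possible alternative to the try/except loop above, the built-in next() accepts a second argument that is returned instead of raising StopIteration. The sketch below assumes a private sentinel object (the name sentinel is illustrative) so that no real element can be mistaken for the end marker.

```python
nums = [1, 2, 3, 4]
it = iter(nums)

collected = []
sentinel = object()   # unique marker that cannot collide with a real element
while True:
    x = next(it, sentinel)   # returns sentinel once the iterator is exhausted
    if x is sentinel:
        break
    collected.append(x)

print(collected)   # [1, 2, 3, 4]
```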

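Generators
In Python, a function that contains yield is called a generator function; calling it returns a generator.
A list comprehension builds a whole list at once, but a list's capacity is limited by memory. A list with one million elements takes a lot of space, and if only the first few elements are ever used, the space occupied by the rest is wasted.
If the elements can be computed by some rule, they can instead be produced one by one while looping, so the full list never has to be built and a great deal of memory is saved. In Python, this compute-as-you-iterate mechanism is called a generator.
There are several ways to create a generator. The simplest is to change the [] of a list comprehension to (), which gives a generator expression: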

>>> L = [x * x for x in range(10)]
>>> L
[0, 1, 4, 9, 16, 25, 36, 49, 64, 81]
>>> g = (x * x for x in range(10))
>>> g
<generator object <genexpr> at 0x7f3a99998150>
>>>

The only difference between creating L and g is the outer [] versus (): L is a list, while g is a generator.
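
The memory argument can be checked with a rough sketch like the one below. It assumes sys.getsizeof is an acceptable proxy: that function reports only the size of the container object itself (not the integers it refers to), and the exact numbers vary between Python versions, so none are shown. The variable names are illustrative.

```python
import sys
from itertools import islice

n = 1_000_000
squares_list = [x * x for x in range(n)]   # every element is built up front
squares_gen = (x * x for x in range(n))    # nothing has been computed yet

list_size = sys.getsizeof(squares_list)    # grows with n
gen_size = sys.getsizeof(squares_gen)      # small, independent of n

# Take only the first five values; the remaining ones are never computed.
first_five = list(islice(squares_gen, 5))
print(list_size, gen_size, first_five)
```

Here itertools.islice is used to take a prefix of the generator without consuming the rest.
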
Every element of a list can be printed directly, but how can the elements of a generator be printed?
To get them one at a time, call next(), which returns the generator's next value:

>>> next(g)
0
>>> next(g)
1
>>> next(g)
4
>>> next(g)
9
>>> next(g)
16
>>> next(g)
25
>>> next(g)
36
>>> next(g)
49
>>> next(g)
64
>>> next(g)
81
>>> next(g)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
StopIteration

As noted above, a generator stores the rule rather than the values. Each call to next(g) computes the next element, and once there are no more elements, StopIteration is raised.

Calling next(g) over and over is tedious, of course. The usual way is a for loop, since a generator is also an iterable:

>>> g = (x * x for x in range(10))
>>> for n in g:
...     print(n)
... 
0
1
4
9
16
25
36
49
64
81

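So once a generator has been created, there is rarely any need to call next() directly; iterate over it with a for loop, which handles StopIteration automatically.
Generators are very powerful. When the rule is too complex for a comprehension-style for loop, a function can be used instead.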

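As a simple example, the generator function below yields the numbers 1, 3, and 5 in turn.

To use it, first call the function to create a generator object, then call next() repeatedly to get each value: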

>>> def odd():
...     print('step 1')
...     yield 1
...     print('step 2')
...     yield 3
...     print('step 3')
...     yield 5
... 
>>> o = odd()
>>> next(o)
step 1
1
>>> next(o)
step 2
3
>>> next(o)
step 3
5
>>> next(o)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
StopIteration
>>>

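As the session shows, odd is not an ordinary function but a generator function. Execution pauses at each yield and resumes from there on the next call. After three yields there is nothing left to yield, so the fourth call to next(o) raises StopIteration.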
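
The pause-and-resume behaviour can also be observed with a for loop. The sketch below is a variant of odd() that records its steps in a list named events (an illustrative name) instead of printing them, so the interleaving of generator code and loop body is visible in one place.

```python
events = []

def odd():
    events.append('step 1')
    yield 1
    events.append('step 2')
    yield 3
    events.append('step 3')
    yield 5

o = odd()                  # creating the generator runs none of the body
events.append('created')
for n in o:                # the for loop handles StopIteration itself
    events.append(n)

print(events)
# ['created', 'step 1', 1, 'step 2', 3, 'step 3', 5]

leftover = list(o)         # [], an exhausted generator cannot be reused
```

To iterate again, call odd() once more to get a fresh generator.
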

yield and return
If a generator has no return statement, StopIteration is raised when execution reaches the end of the function body:

>>> def g1():
...     yield 1
... 
>>> g=g1()
>>> next(g)      # the first next(g) runs up to the yield and suspends there, so the function has not finished yet
1
>>> next(g)      # execution resumes after the yield, reaches the end of the function, and StopIteration is raised
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
StopIteration

If a return statement is reached during execution, StopIteration is raised immediately and the iteration ends:

>>> def g2():
...     yield 'a'
...     return
...     yield 'b'
... 
>>> g=g2()
>>> next(g)            # execution is now suspended just after yield 'a'
'a'
>>> next(g)            # the next statement is return, so StopIteration is raised and yield 'b' never runs
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
StopIteration

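If return is followed by a value, that value is not yielded to the caller. It is attached to the StopIteration exception as its value attribute and appears in the traceback message.

A for loop discards this value, so it can only be obtained by catching StopIteration or by delegating with yield from.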

>>> def g3():
...     yield 'hello'
...     return 'world'
... 
>>> g=g3()
>>> next(g)
'hello'
>>> next(g)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
StopIteration: world
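
For completeness, here is a short sketch of two ways the value after return can be recovered in Python 3.3 and later. The function wrapper and the variable names are illustrative and not part of the original text.

```python
def g3():
    yield 'hello'
    return 'world'

# Option 1: catch StopIteration and read its value attribute.
g = g3()
first = next(g)               # 'hello'
try:
    next(g)
except StopIteration as exc:
    returned = exc.value      # 'world'

# Option 2: delegate with yield from, which evaluates to the return value.
def wrapper():
    result = yield from g3()
    yield 'returned: ' + result

items = list(wrapper())       # ['hello', 'returned: world']
```

A plain for loop over g3() would see only 'hello'; the return value is silently dropped.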

The following example uses yield to implement the Fibonacci sequence:

[root@localhost ~]# vi test.py 
#!/usr/bin/env python3
def fibonacci(n):      # generator function: Fibonacci numbers F(0) through F(n)
    a, b, counter = 0, 1, 0
    while True:
        if counter > n:
            return
        yield a
        a, b = b, a + b
        counter += 1
f = fibonacci(10)      # f is a generator (an iterator) returned by the generator function
while True:
    try:
        print(next(f), end=" ")
    except StopIteration:
        break          # the generator is exhausted
[root@localhost ~]# ./test.py 
0 1 1 2 3 5 8 13 21 34 55
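
Because a generator computes values on demand, the counter is not strictly necessary. One possible variant, sketched below under the assumption that the caller limits the output, is an unbounded generator combined with itertools.islice. The name fib is hypothetical.

```python
from itertools import islice

def fib():
    """Hypothetical unbounded variant: yields Fibonacci numbers forever."""
    a, b = 0, 1
    while True:
        yield a
        a, b = b, a + b

# Take the first 11 values; the infinite loop is never run past that point.
first_11 = list(islice(fib(), 11))
print(*first_11)   # 0 1 1 2 3 5 8 13 21 34 55
```

Iterating over fib() directly in a for loop without a break would never terminate, so the limit has to come from the caller.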
