Understanding Python Iterables and Iterators

The for loop, like much of Python, is simple to use. For a wide range of containers you can just write for i in container: do something. How does this work? And more importantly, if you create your own container, how can you make sure that it supports this syntax?

for loop under the hood

First let’s look at the for loop under the hood. When Python executes the for loop, it first invokes the __iter__() method of the container to get the iterator of the container. It then repeatedly calls the next() method (__next__() method in Python 3.x) of the iterator until the iterator raises a StopIteration exception. Once the exception is raised, the for loop ends.
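The steps above can be written out by hand. Here is a minimal sketch in Python 3 spelling; the helper name manual_for is made up for this illustration, and the real loop is implemented inside the interpreter rather than as a function like this.

```python
def manual_for(container, body):
    """Roughly what `for item in container: body(item)` does."""
    iterator = iter(container)       # calls container.__iter__()
    while True:
        try:
            item = next(iterator)    # calls iterator.__next__() (iterator.next() in Python 2)
        except StopIteration:
            break                    # the iterator is exhausted, so the loop ends
        body(item)


collected = []
manual_for([1, 2, 3], collected.append)
print(collected)  # [1, 2, 3]
```

The built-in functions iter() and next() are just the public way to call __iter__ and __next__, so the same sketch works for any iterable.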

A couple of definitions

Time for a couple of definitions ...
  • Iterable - A container is said to be iterable if it has the __iter__ method defined.
  • Iterator - An iterator is an object that supports the iterator protocol which basically means that the following two methods need to be defined.
    • It has an __iter__ method defined which returns itself.
    • It has a next method defined (__next__ in Python 3.x) which returns the next value every time the next method is invoked on it.
      • As we can see, any object with an __iter__ method is an iterable; the requirements for an iterator are stricter.
For example consider a list. A list is iterable, but a list is not its own iterator.
>>> a = [1, 2, 3, 4]
>>> # a list is iterable because it has the __iter__ method
>>> a.__iter__
<method-wrapper '__iter__' of list object at 0x014E5D78>
>>> # However a list does not have the next method, so it's not an iterator
>>> a.next
AttributeError: 'list' object has no attribute 'next'
>>> # a list is not its own iterator
>>> iter(a) is a
False
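On Python 3 the same distinction can be checked without poking at attributes. This is a small sketch that assumes Python 3.3 or later, where the abstract base classes Iterable and Iterator live in collections.abc.

```python
from collections.abc import Iterable, Iterator

a = [1, 2, 3, 4]
ia = iter(a)

list_is_iterable = isinstance(a, Iterable)    # list defines __iter__
list_is_iterator = isinstance(a, Iterator)    # list has no __next__
iter_is_iterable = isinstance(ia, Iterable)   # the list iterator defines __iter__
iter_is_iterator = isinstance(ia, Iterator)   # ... and __next__ as well

print(list_is_iterable, list_is_iterator, iter_is_iterable, iter_is_iterator)
# True False True True
```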

The official documentation describes the iter() function as follows:
iter(o[, sentinel])

Return an iterator object. The first argument is interpreted very differently depending on the presence of the second argument. Without a second argument, o must be a collection object which supports the iteration protocol (the __iter__() method), or it must support the sequence protocol (the __getitem__() method with integer arguments starting at 0). If it does not support either of those protocols, TypeError is raised. If the second argument, sentinel, is given, then o must be a callable object. The iterator created in this case will call o with no arguments for each call to its next() method; if the value returned is equal to sentinel, StopIteration will be raised, otherwise the value will be returned.

One useful application of the second form of iter() is to read lines of a file until a certain line is reached. The following example reads a file until the readline() method returns an empty string:

with open('mydata.txt') as fp:
    for line in iter(fp.readline, ''):
        process_line(line)

New in version 2.2.
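To try the two-argument form without a real file, here is a self-contained sketch in Python 3. The in-memory io.StringIO object stands in for an open file, and both its contents and the 'END' marker line are made up for this illustration.

```python
import io

# In-memory stand-in for a real file; contents and the END marker are hypothetical.
fp = io.StringIO("alpha\nbeta\nEND\ngamma\n")

lines = []
# readline is called with no arguments until it returns exactly "END\n".
for line in iter(fp.readline, "END\n"):
    lines.append(line.rstrip("\n"))

print(lines)  # ['alpha', 'beta']
```

One thing to watch for: if the marker line never appears, readline() keeps returning an empty string at end of file and this loop never stops, so the empty-string sentinel from the documentation example is the safer choice when you want to read to the end.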

 
The iterator of a list is actually a listiterator object. A listiterator is its own iterator.
>>> # an iterator for a list is actually a 'listiterator' object
>>> ia = iter(a)
>>> ia
<listiterator object at 0x014DF2F0>
>>> # a listiterator object is its own iterator
>>> iter(ia) is ia
True
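A practical consequence of an iterator returning itself is that it can only be walked once, while the list hands out a fresh iterator every time. A short sketch in Python 3 spelling:

```python
a = [1, 2, 3, 4]
ia = iter(a)

first_pass = list(ia)    # consumes the iterator
second_pass = list(ia)   # iter(ia) is ia, and ia is already exhausted
fresh_pass = list(a)     # the list creates a new iterator each time

print(first_pass, second_pass, fresh_pass)
# [1, 2, 3, 4] [] [1, 2, 3, 4]
```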

How to make your object an iterable

Let us try and define a subclass of the list class called MyList with a custom iterator.
class MyList(list):
    def __iter__(self):
        return MyListIter(self)
    
class MyListIter(object):
    """ A sample implementation of a list iterator. NOTE: This is just a 
    demonstration of concept!!! YOU SHOULD NEVER IMPLEMENT SOMETHING LIKE THIS!
    Even if you have to (for any reason), there are many better ways to 
    implement this."""
    def __init__(self, lst):
        self.lst = lst
        self.i = -1
    def __iter__(self):
        return self
    def next(self):
        if self.i<len(self.lst)-1:
            self.i += 1         
            return self.lst[self.i]
        else:
            raise StopIteration

if __name__ == '__main__':
    a = MyList([1, 2, 3, 4])
    ia = iter(a)
    print 'type(a): %r, type(ia): %r' %(type(a), type(ia))
    for i in a: 
        print i,
Here is the output of the program above.
type(a): <class '__main__.MyList'>, type(ia): <class '__main__.MyListIter'>
1 2 3 4
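The program above is written for Python 2. For readers on Python 3, here is a sketch of the same demonstration; the only change to the protocol itself is that the method is called __next__ instead of next, and print becomes a function. As with the original, this is a demonstration of the concept and not something to use in real code.

```python
class MyListIter:
    """Demonstration-only list iterator, Python 3 spelling."""

    def __init__(self, lst):
        self.lst = lst
        self.i = -1

    def __iter__(self):
        return self                      # an iterator returns itself

    def __next__(self):                  # was `next` in Python 2
        if self.i < len(self.lst) - 1:
            self.i += 1
            return self.lst[self.i]
        raise StopIteration


class MyList(list):
    def __iter__(self):
        return MyListIter(self)


a = MyList([1, 2, 3, 4])
ia = iter(a)
print(type(ia).__name__)  # MyListIter

seen = []
for i in a:
    seen.append(i)
print(seen)  # [1, 2, 3, 4]
```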
Now for a more practical example. Let us say you are implementing a game of cards and you have defined a card and a deck as follows.
class Card(object):
    def __init__(self, rank, suit):
        FACE_CARD = {11: 'J', 12: 'Q', 13: 'K'}
        self.suit = suit
        self.rank = rank if rank <=10 else FACE_CARD[rank]
    def __str__(self):
        return "%s%s" % (self.rank, self.suit)
    
class Deck(object):
    def __init__(self):
        self.cards = []
        for s in ['S', 'D', 'C', 'H']:
            for r in range(1, 14):
                self.cards.append(Card(r, s))
Now to iterate over the cards in the deck, you have to do ...
>>> for c in Deck().cards: print c
...
1S
2S
#... snip ...#
But Deck is a container that has multiple cards. Wouldn’t it be nice if you could just write for c in Deck() instead of writing for c in Deck().cards? Let’s try that!
>>> for c in Deck(): print c
...
TypeError: 'Deck' object is not iterable
Oops! It doesn’t work. For the syntax to work, we need to make Deck an iterable. It is in fact very easy. We just need to add an __iter__ method to our class that returns an iterator.
class Deck(object):
    def __init__(self):
        self.cards = []
        for s in ['S', 'D', 'C', 'H']:
            for r in range(1, 14):
                self.cards.append(Card(r, s))
    def __iter__(self):
        return iter(self.cards)
Let’s try the syntax again.
>>> for c in Deck(): print c
...
1S
2S
#... snip ...#
Works perfectly. That’s it!
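Returning iter(self.cards) is not the only option. As an alternative (not the approach used in this article), __iter__ can be written as a generator function, which is handy when you want to do some work per item. A sketch in Python 3, with Card and Deck restated so the block runs on its own:

```python
class Card:
    FACE_CARD = {11: 'J', 12: 'Q', 13: 'K'}

    def __init__(self, rank, suit):
        self.suit = suit
        self.rank = rank if rank <= 10 else self.FACE_CARD[rank]

    def __str__(self):
        return "%s%s" % (self.rank, self.suit)


class Deck:
    def __init__(self):
        self.cards = [Card(r, s) for s in ['S', 'D', 'C', 'H'] for r in range(1, 14)]

    def __iter__(self):
        # Because of `yield`, calling __iter__ returns a new generator object,
        # and a generator is an iterator.
        for card in self.cards:
            yield card


names = [str(c) for c in Deck()]
print(len(names), names[0], names[12], names[-1])  # 52 1S KS KH
```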

Summary
  • If you define a custom container class, think about whether it should also be an iterable. 
  • It is quite easy to make a class iterable: define an __iter__ method that returns an iterator.
  • Doing so will make the syntax more natural.
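The for loop is not the only syntax that benefits. A small sketch in Python 3, using a made-up Hand class that exists only for this illustration:

```python
class Hand:
    """Hypothetical container of card names, used only for this illustration."""

    def __init__(self, cards):
        self.cards = list(cards)

    def __iter__(self):
        return iter(self.cards)


hand = Hand(['1S', 'KH', '7D'])

as_list = list(hand)            # built-in constructors accept any iterable
has_king = 'KH' in hand         # `in` falls back to iteration when __contains__ is missing
first, second, third = hand     # unpacking works too
ordered = sorted(hand)          # and so does sorted()

print(as_list, has_king, first, ordered)
# ['1S', 'KH', '7D'] True 1S ['1S', '7D', 'KH']
```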
Original article: https://www.cnblogs.com/youxin/p/3060473.html