Study Guide: Iterators
Instructions
This is a study guide with links to past lectures, assignments, and handouts, as well as additional practice problems to assist you in learning the concepts.
Assignments
Important: For solutions to these assignments once they have been released, see the main website
Handouts
Lectures
Readings
Guides
Iteration
Iterators generalize a common pattern in computation. Often, we want to get
all the values from a compound value like the list ['c', 's', '6', '1', 'a']
.
We can write a program that prints the values in the list with a while
loop.
lst = ['c', 's', '6', '1', 'a']
i = 0
while i < len(lst):
value = lst[i]
i += 1
print(value)
Or, as we've seen before as well, a for
loop.
lst = ['c', 's', '6', '1', 'a']
for value in lst:
print(value)
The for
loop has the advantage of being much more compact and easy to read as
we no longer need to keep track of the index as we iterate through the list.
If we think of the two bits of code above as being equivalent, we can see that
the for
loop actually hides a lot of complexity for us. It is somehow able to
initialize the starting index, i = 0
, then check the length of the list to
determine whether we've stepped past the list's bounds, and moving the index
forward while assign the next value in the list.
What's more, the for
loop can event iterate over other sequence types too,
like dictionaries and strings, not just lists.
>>> for key in {1: 'first value', 2: 'second value'}:
... print(key)
...
1
2
>>> for character in 'cs61a':
... print(character)
...
c
s
6
1
a
Iteration Abstraction
We can think of iterating with a for
loop as a kind of abstraction. We
don't really need to know how to index into a list, dictionary, or string to
for
loop over their values because the for
loop hides the exact
implementation details from us. It's also a generalization: we can store our
data in any sequence type and iterate over it using the same for
loop without
having to worry too much about the underlying implementation.
But the for
loop, like the Python interpreter, is not magic. We'd like to
understand how this structure works.
Iterators
In Python, iterators are the objects which help implement for
loops. In
the list example, iterators do the entire job of knowing where to start the
iteration, checking to see if there are any more elements in the list, and
moving the index forward while returning the next value.
Iterators do all of this work in a magic method called __next__
. Given an
iterator, we can ask for its next value by calling the built-in next
function
on it, which really just invokes the __next__
method in the iterator.
Iterables
The example of a list iterator only makes sense in the context of the underlying list of values. But we often work with structures that aren't just lists: we could work with dictionaries and strings as well. These structures are more generally categorized as iterables because they can be iterated over.
Iterables have an __iter__
method which we can call with the built-in iter
function to return an iterator.
>>> iterable = ['c', 's', '6', '1', 'a']
>>> iterator = iter(iterable)
>>> next(iterator)
'c'
How does the iterator know how to stop?
>>> next(iterator)
's'
>>> next(iterator)
'6'
>>> next(iterator)
'1'
>>> next(iterator)
'a'
>>> next(iterator)
StopIteration
A StopIteration
exception is raised when there are no more values to return
from calling next
. This is how the for
loop knows how to move on. We can
convert the full for
loop to a while
loop that looks roughly like this.
iterator = iter(iterable)
try:
while True:
item = next(iterator)
print(item)
except StopIteration:
pass
Remember that the iterator is responsible for determining the stop condition,
and that stopping condition is conveyed to the loop by the StopIteration
exception.
Generators
Generator functions allow us to easily declare iterators. In CS 61A, they'll be the primary means by which we define iterators.
While generator functions look a lot like normal functions in Python, they
behave quite differently. While a normal function has a return
statement to
define the value returned when the function is called, generator functions
always return generator objects. The generator object is the iterator which
keeps track of state and remembers its place during execution.
A commonly-used built-in that behaves like a generator function is zip
.
>>> zip
<class 'zip'> # zip is not a generator, but it behaves similar to one
>>> pairs = zip([1, 2, 3], [4, 5, 6])
>>> pairs
<zip object at 0x...> # zip returns a zip object (an iterator similar to a generator object)
>>> next(pairs)
(1, 4)
>>> for pair in pairs:
... print(pair)
...
(2, 5)
(3, 6)
Generator functions are defined using the def
statement, but we use the new
keyword, yield
, to tell them apart from regular functions. For example, we
can define a generator function that, given an iterable, yields the running sum
of values in the iterable.
def running_sum(iterable):
total = 0
for value in iterable:
total += value
yield total
We can then use the running_sum
program in the following way.
>>> rs = running_sum([1, 2, 3, 4, 5])
>>> next(rs)
1
>>> next(rs)
3
>>> for n in rs:
... print(n)
...
6
10
15
When the generator object is first created, we start at the top of the body of
the function. Then, when we call next
on the generator, we start executing
the body of the function until we yield
the first value back to the caller.
The generator object pauses, remembering its current state and waiting until
the caller asks for the next
value. After all the values are yielded from the
generator, the next time the caller calls next
will cause a StopIteration
.
Practice Problems
Easy
Q1: Character generator
Write a generator that outputs each character of a string.
def char_gen(s):
"""
>>> for char in char_gen("hello"):
... print(char)
...
h
e
l
l
o
"""
"*** YOUR CODE HERE ***"
for char in s:
yield char
Medium
Q2: Pairs (generator)
Write a generator function pairs
that takes a list and yields all the
possible pairs of elements from that list.
def pairs(lst):
"""
>>> type(pairs([3, 4, 5]))
<class 'generator'>
>>> for x, y in pairs([3, 4, 5]):
... print(x, y)
...
3 3
3 4
3 5
4 3
4 4
4 5
5 3
5 4
5 5
"""
"*** YOUR CODE HERE ***"
for i in lst:
for j in lst:
yield i, j