We have already come across situations where we are dealing with a series of items and need the next item or items in the series, but do not want to compute the entire series up to that point each time a new item is required. Some recursive series, such as the Fibonacci numbers, are a good example of such a situation. If each function call recursively generates the entire series up to the desired point, we end up generating the beginning of the series many times over.
Python generators are a way of producing just the next item in a series when it is needed, essentially running the generation process for the series only once (for a given execution of a program). They work mostly like normal functions, as they can be called and they return values, but the values a generator produces behave differently from those of a normal function. A normal function typically returns the same value every time it is called with the same arguments. A generator, on the other hand, remembers its current state and produces the next item in the series each time it is asked, and this item may be different from the previous one.
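To make the idea concrete, here is a minimal sketch of a generator for the Fibonacci numbers mentioned above. The function name `fibonacci` and its `count` parameter are chosen for this illustration only, and the `yield` keyword it relies on is explained later in this section.

```python
def fibonacci(count: int):
    """Produce the first `count` Fibonacci numbers, one at a time."""
    previous, current = 0, 1
    for _ in range(count):
        yield previous
        # move one step forward in the series; earlier items are never recomputed
        previous, current = current, previous + current


if __name__ == "__main__":
    numbers = fibonacci(10)
    print(next(numbers))    # first item
    print(next(numbers))    # second item
    print(list(numbers))    # the remaining eight items
```

Each item is computed exactly once, from the two values the generator remembers, instead of rebuilding the series from the start.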
Just as there are many ways of solving almost any programming problem, there are many ways of achieving functionality similar to generators, but generators can help make the program easier to understand, and can in certain situations save memory or other computational resources.
A generator function must contain the keyword `yield`, which marks out the value which the function returns. Let's take a look at a function which generates integer numbers, starting from zero and ending at a pre-determined maximum value:

```python
def counter(max_value: int):
    number = 0
    while number <= max_value:
        yield number
        number += 1
```
The generator returned by the `counter` function can be passed as an argument to the function `next()`:

```python
if __name__ == "__main__":
    numbers = counter(10)
    print("First value:")
    print(next(numbers))
    print("Second value:")
    print(next(numbers))
```

```
First value:
0
Second value:
1
```
As you can see from the example above, the keyword `yield` is similar to the keyword `return`: both are used to define a return value. The difference is that `yield` doesn't "close" the function in the same sense as `return`. A generator function with the `yield` keyword keeps track of its state, and the next time a value is requested with `next()`, it will continue from the same state.
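The pausing and resuming can be made visible with a few extra printouts. The following is only an illustrative sketch: `traced_counter` is a hypothetical variant of the `counter` function, with `print` calls added around the `yield` statement.

```python
def traced_counter(max_value: int):
    number = 0
    while number <= max_value:
        print(f"  (about to yield {number})")
        yield number
        # execution continues from here on the following next() call
        print(f"  (resumed after yielding {number})")
        number += 1


if __name__ == "__main__":
    numbers = traced_counter(10)
    print("generator created, nothing has been executed yet")
    print(next(numbers))
    print(next(numbers))
```

Nothing inside the function runs when the generator is created. Each `next()` call runs the code up to the next `yield`, and the second call picks up on the line right after the `yield` where the first call stopped.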
The `counter` generator also requires a maximum value, which was `10` in the example above. When the generator runs out of values, it will raise a `StopIteration` exception:

```python
if __name__ == "__main__":
    # creates a generator with maximum value 1
    numbers = counter(1)
    print(next(numbers))
    print(next(numbers))
    print(next(numbers))
```
The exception can be caught with a `try`-`except` block:

```python
if __name__ == "__main__":
    numbers = counter(1)
    try:
        print(next(numbers))
        print(next(numbers))
        print(next(numbers))
    except StopIteration:
        print("ran out of numbers")
```

```
0
1
ran out of numbers
```
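As an alternative to catching the exception, the built-in `next()` function accepts an optional second argument, which is returned instead of raising `StopIteration` when the generator is exhausted. Below is a minimal sketch; the `counter` function is repeated so that the block runs on its own, and the message string is just a placeholder chosen for this example.

```python
def counter(max_value: int):
    number = 0
    while number <= max_value:
        yield number
        number += 1


if __name__ == "__main__":
    numbers = counter(1)
    # the second argument is the default returned once the values run out
    print(next(numbers, "ran out of numbers"))
    print(next(numbers, "ran out of numbers"))
    print(next(numbers, "ran out of numbers"))
```

This should print the same three lines as the `try`-`except` version. Whether a default value or an exception is the better fit depends on how the rest of the program treats the end of the series.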
Traversing through all the items in a generator is easily done with a `for` loop:

```python
if __name__ == "__main__":
    numbers = counter(5)
    for number in numbers:
        print(number)
```

```
0
1
2
3
4
5
```
Generators do not have to have a defined maximum value or termination point. They can generate values infinitely (within other computational and physical constraints, naturally).
Keep in mind, though: traversing a generator with a `for` loop only works if the generator terminates at some point. If the generator is built on an infinite loop, trying to traverse it with a simple `for` loop will cause an endless execution, just like a `while` loop with no end or break condition would.
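An infinite generator can still be used safely as long as the code consuming it decides when to stop. The sketch below shows two common options: `itertools.islice` from the standard library, and a `break` condition inside the loop. The generator `even_numbers` is a made-up example for this purpose.

```python
from itertools import islice


def even_numbers():
    """An infinite generator: it never runs out of values on its own."""
    number = 0
    while True:
        yield number
        number += 2


if __name__ == "__main__":
    # option 1: take only the first five items
    for number in islice(even_numbers(), 5):
        print(number)

    # option 2: leave the loop explicitly
    for number in even_numbers():
        if number > 6:
            break
        print(number)
```

In both cases the generator itself stays infinite; the limit lives in the code that uses it.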
You do not necessarily need a function definition to create a generator. We can use a generator expression, a structure similar to a list comprehension, instead. This time we use round brackets to signify a generator, rather than the square brackets of a list or the curly brackets of a dictionary:
```python
# This generator returns squares of integers
squares = (x ** 2 for x in range(1, 64))

print(squares) # the printout of a generator object isn't too informative

for i in range(5):
    print(next(squares))
```

```
<generator object <genexpr> at 0x000002B4224EBFC0>
1
4
9
16
25
```
In the following example we print out substrings of the English alphabet, each three characters long. This prints out the first 10 items in the generator:
```python
substrings = ("abcdefghijklmnopqrstuvwxyz"[i : i + 3] for i in range(24))

# print out first 10 substrings
for i in range(10):
    print(next(substrings))
```

```
abc
bcd
cde
def
efg
fgh
ghi
hij
ijk
jkl
```
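One consequence of a generator remembering its state is that it can be traversed only once. The short sketch below illustrates this with a small generator of squares; the variable names are chosen for this example only.

```python
squares = (x ** 2 for x in range(1, 6))

total = sum(squares)        # consumes every item: 1 + 4 + 9 + 16 + 25
leftover = list(squares)    # the generator is exhausted, so nothing is left

print(total)
print(leftover)
```

If the same values are needed again, one option is to create a new generator; another is to store the items in a list, at the cost of holding them all in memory.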