Use of Generator Functions and Yield in Python

Syed Shahzaib Ali
4 min read · Aug 8, 2022

In this short article, I’ll skim through the primary uses of generator functions and the yield statement in Python. Python offers many concepts and conveniences that are rarely used because they are hard to grasp at first. Generator functions are one such utility, and they make things easier for us. We’ll briefly discuss their use later in the article.


What are Generator Functions and Yield Statements?

Generator functions look like regular functions in Python; the difference is that calling one returns an iterator object instead of running the function body straight away. This iterator object is called a generator.

  • A regular function becomes a generator function as soon as its body contains a yield statement.
  • When a regular function is called, its local scope and state are discarded once it finishes (i.e. after the “return” statement or the end of the body).
  • When a generator function is called, its body does not run yet; the call only returns a generator. On the first next() call (or the first loop iteration), the body runs until it reaches a yield statement. At that point it hands the yielded value and control back to the caller and remembers its position and local variables. Now, what's the magic? On the following next() call, it resumes from the point where it left off and runs until the next yield statement.
  • That’s why generator functions return an iterator object: the function body can be advanced one step at a time. The number of values a generator produces equals the number of times a yield statement is executed.
  • These iterator objects can be looped over just like lists, but unlike lists they produce each value on the fly during iteration instead of holding all of them in memory. They can also be iterated only once.
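To make the pause-and-resume behaviour described above easier to see, here is a minimal sketch. The function name two_steps and the events list are made up for illustration; the list only records the order in which things happen.

```python
events = []  # hypothetical log that records the order of execution

def two_steps():
    events.append("body started")
    yield "first"
    events.append("resumed after first yield")
    yield "second"
    events.append("body finished")

gen = two_steps()                   # creates the generator; the body has not run yet
events.append("generator created")

first = next(gen)                   # runs the body up to the first yield
second = next(gen)                  # resumes right after the first yield
remaining = list(gen)               # runs to the end; no further values are produced

print(first, second, remaining)
print(events)
```

Note that "generator created" is logged before "body started": calling the generator function does not execute any of its code until the first next() call.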

Example:

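In the following statement, list(range(0, 100)) builds all 100 values in memory before the loop starts, whereas a generator produces each value only when it is requested. (In Python 3, range itself is already lazy and does not store its values; in Python 2 it returned a full list.)

for i in list(range(0, 100)):
    print(i)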
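As a rough illustration of the memory difference, the sketch below compares a fully built list with a generator that produces the same values. The function name squares and the size n are assumptions chosen for the example, and sys.getsizeof only measures the container object itself, not the integers it refers to, so treat the numbers as indicative only.

```python
import sys

def squares(n):
    """Hypothetical generator function: yields i*i for i in 0..n-1."""
    for i in range(n):
        yield i * i

n = 100_000
as_list = [i * i for i in range(n)]   # all n values exist in memory at once
as_generator = squares(n)             # values are produced one at a time

list_size = sys.getsizeof(as_list)            # size of the list object itself
generator_size = sys.getsizeof(as_generator)  # small, and independent of n

# Both produce the same values, so their sums agree.
totals_match = sum(as_list) == sum(as_generator)

print(list_size, generator_size, totals_match)
```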

Why and how to use Generator Functions?

A baseline answer to the above question: if you need to perform a memory-intensive task and your system resources aren’t sufficient to hold all the data at once, consider using generator functions.

A simple example: you need to read large log files and perform some operation on each row. With a generator function you can process the whole file one line at a time, which is memory efficient. We will see this example in a while.

Using generator functions is quite easy: put a yield statement wherever you want to hand a value back to the caller and pause the function. See the example below:

Defined a Generator Function for range
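The original listing is not reproduced here, so the following is only a minimal sketch of what such a function might look like. The name my_range and its parameters are assumptions, and this version handles positive steps only.

```python
def my_range(start, stop, step=1):
    """Yield numbers from start up to (but not including) stop.

    Illustrative stand-in for range; assumes step is positive.
    """
    current = start
    while current < stop:
        yield current       # hand one value to the caller, then pause here
        current += step     # runs only when the next value is requested

values = my_range(0, 5)     # a generator object; nothing has been computed yet
for value in values:
    print(value)
```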

In the above use case, we defined a generator function that gives us a range of values, BUT one value at a time. Calling the function gives us an iterator object, and on every iteration the function runs until it reaches a yield statement and returns that value. Another simple example is as follows:

Generator Function iteration using next function
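Again, the original listing is not available, so this is a hedged sketch of the idea: a generator with exactly three yield statements, advanced by hand with the built-in next(). The name three_values is made up for the example.

```python
def three_values():
    yield 1
    yield 2
    yield 3

gen = three_values()
first = next(gen)
second = next(gen)
third = next(gen)

# A fourth call finds no further yield, so Python raises StopIteration.
try:
    next(gen)
    exhausted = False
except StopIteration:
    exhausted = True

print(first, second, third, exhausted)
```

A for loop handles StopIteration for you, which is why looping over a generator simply ends instead of raising.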

In the above example, we wanted to iterate 3 times. Thus, there are 3 yield statements, and the 4th call raises a StopIteration exception because the function has no more yield statements left to process.

I hope the concept is pretty clear up till now!

Some Generator Functions Use cases in Action

Infinite Sequence Generation

Ever wondered how you would generate an infinite sequence without exhausting your memory? You can’t use range, because it needs a finite end point. You could write a plain infinite loop, but that is not a practical approach on its own, because you need to stay in control of when it stops. Let's see how to achieve it using a generator function.

Infinite Sequence generation using generator functions
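Since the original listing is missing, here is one possible sketch. The name infinite_sequence and its start parameter are assumptions; itertools.islice from the standard library is used to take a finite number of values from the endless generator.

```python
from itertools import islice

def infinite_sequence(start=0):
    """Yield start, start + 1, start + 2, ... without ever finishing."""
    number = start
    while True:
        yield number    # pause here until the caller asks for another value
        number += 1

# Take only the first five values; the generator itself never ends.
first_five = list(islice(infinite_sequence(), 5))
print(first_five)

# The caller stays in control: stop whenever a condition is met.
for number in infinite_sequence(10):
    if number > 12:
        break
    print(number)
```

The infinite loop lives inside the generator, but it only advances when the caller requests a value, so memory use stays constant.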

Reading Large Files

As discussed earlier, generator functions come in really handy when you want to access and analyze large data files. In Python, open() returns a file object that is itself a lazy iterator, so you can loop through the file line by line. By contrast, calling read() on a file object loads all the data into memory at once, which can raise a MemoryError for very large files. The following code can be used to read the file line by line instead.

Reading log files line by line using a generator function
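The original listing is not available, so the following is a sketch under stated assumptions: read_log_lines is a hypothetical helper name, and a small temporary file with invented log lines stands in for a real, large log file so the example can run on its own.

```python
import os
import tempfile

def read_log_lines(path):
    """Yield the lines of a text file one at a time, without the newline."""
    with open(path, "r", encoding="utf-8") as log_file:
        for line in log_file:       # the file object reads lazily, line by line
            yield line.rstrip("\n")

# Create a tiny sample file (made-up content) to stand in for a large log.
with tempfile.NamedTemporaryFile(
    "w", suffix=".log", delete=False, encoding="utf-8"
) as sample:
    sample.write("INFO service started\nERROR disk full\nINFO service stopped\n")
    sample_path = sample.name

# Only one line is held in memory at a time while filtering.
error_lines = [line for line in read_log_lines(sample_path) if line.startswith("ERROR")]
print(error_lines)

os.remove(sample_path)  # clean up the sample file
```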

That's all folks! I hope it was brief and helpful for you.

Syed Shahzaib Ali

Lead AI Expert | Data Scientist | Natural Language Processing Expert | xIBM