What Does Yield Keyword Do in Python

Last updated on September 26th, 2024 at 05:08 am

Python developers rely heavily on functions and loops to get their work done. Most functions return data after doing their job. This is fine if you are working with small data sets. However, if you need to work with large data, then building and returning all of it at once takes a lot of memory. That is where the yield keyword comes in. When you are working with large data, instead of loading it all into memory, it is better to use the yield keyword to produce it one item at a time, which is memory efficient. In this article, we will learn what the yield keyword does, how to use it, and its advantages and disadvantages.

Why You Need Yield Keyword in Python

Typically, when a function returns a list or another collection, the entire collection has to be built in memory before it is returned, and it stays there while you loop through it. This is fine if your data is small, say, a list of 100 items. But this does not work well if your data has, say, a million items. In such cases, you can use the yield keyword, which makes the function hand back a generator: a lightweight object that produces the data on demand instead of holding all of it. You can then iterate through your data, one item at a time, instead of loading it completely in memory.
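To make the memory difference concrete, here is a minimal sketch. The function names squares_list and squares_gen are made up for this illustration, and sys.getsizeof only measures the container object itself, not the numbers inside it, so treat the comparison as a rough indication rather than a precise measurement.

```python
import sys

def squares_list(n):
    # Builds the complete list in memory before returning it.
    return [i * i for i in range(n)]

def squares_gen(n):
    # Produces one square at a time, only when the caller asks for it.
    for i in range(n):
        yield i * i

big_list = squares_list(1_000_000)
lazy = squares_gen(1_000_000)

# The list object grows with the number of items; the generator does not.
list_size = sys.getsizeof(big_list)
gen_size = sys.getsizeof(lazy)
print(list_size > gen_size)        # True

# Both approaches produce the same values.
print(sum(lazy) == sum(big_list))  # True
```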

What Does Yield Keyword Do in Python

The yield keyword turns the function that contains it into a generator function. Calling such a function does not run its body; it returns a generator object, which is a kind of iterator. The generator does not hold the entire result. Instead, each time you ask it for the next item, the function runs until it reaches a yield, hands the yielded value to the caller and pauses. If you want the entire result, you need to loop over the generator object. This is especially useful if you are working with large data, since you process it one item at a time.

In order to completely understand how yield works, we need to understand what iterators and generator functions do.

What Are Iterators

In Python, sequential data types such as Lists, Tuples, Strings and some container objects such as Dictionaries are called Iterables, since you can iterate over them. They generally contain a finite number of countable items that can be stored in memory.

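An iterator is an object that walks through an iterable such as a list. It returns the current item and keeps track of its position so that it knows which item comes next. Every iterator supports the __iter__() and __next__() methods out of the box. Calling __iter__() on an iterable returns an iterator for it (calling it on an iterator returns the iterator itself), and __next__() returns the next item. In practice you call them through the built-in iter() and next() functions.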
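As a small illustration of these two methods, the sketch below uses the built-in iter() and next() functions on an ordinary list. The variable names are arbitrary.

```python
numbers = [10, 20, 30]      # a list is an iterable
it = iter(numbers)          # iter() calls numbers.__iter__()

first = next(it)            # next() calls it.__next__()
second = next(it)
print(first, second)        # 10 20

# Calling iter() on an iterator returns the iterator itself.
print(iter(it) is it)       # True

# list() consumes whatever the iterator has not yet produced.
rest = list(it)
print(rest)                 # [30]
```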

What Are Generator Functions

Regular functions build the result data in memory and return it directly to the caller. A generator function is a special type of function that contains at least one yield statement and returns a generator object instead. The generator object is also known as a lazy iterator or generator iterator. It does not store the entire result in memory. Once you have it, you can fetch the next item by calling the next() function on it, or you can use it directly in a for loop.
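The following sketch shows how to tell a generator function from the object it returns, using the standard inspect module. The countdown function is a hypothetical example written for this article.

```python
import inspect

def countdown(n):
    # Hypothetical example generator: counts down from n to 1.
    while n > 0:
        yield n
        n -= 1

gen = countdown(3)

print(inspect.isgeneratorfunction(countdown))  # True
print(inspect.isgenerator(gen))                # True
print(next(gen))                               # 3
print(list(gen))                               # [2, 1]
```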

Yield Keyword

The yield keyword controls the way a generator function works. It is similar to the return keyword in regular functions but with a few key differences. Calling a generator function returns a generator iterator object without running any of the function body. When next() is called on that object, the function runs until the Python interpreter encounters a yield, at which point it pauses execution and hands the yielded value to the caller. It also saves the state of the function, including local variable bindings, the evaluation stack and the instruction pointer, so that execution can continue from the same place later.

The expression supplied to yield is evaluated, and its value is what the caller receives for that step. If you want to access the values a generator produces, you need to iterate over it. You can do this using the built-in next() function, or with a looping construct such as a for loop.

Here is a simple example to illustrate the use of yield keyword.

def generator(x):
    for i in range(x):
        yield i

generator(5)

Here is the output when you run this in the Python interactive interpreter. It is a generator object, not the result data you would get from a return statement.

<generator object generator at 0x02EEE4B8>

To access the values of this generator object, you need to save the generator object that is returned by the generator function and then call the built-in next() function on it (in Python 2 this was written as result.next()).

>>> result = generator(5)
>>> next(result)
0
>>> next(result)
1
>>> next(result)
2
>>> next(result)
3
>>> next(result)
4
>>> next(result)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
StopIteration

Once the generator function finishes and there are no more values to produce, next() raises the StopIteration exception.
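If you call next() by hand, you may want to deal with the end of the generator yourself. A minimal sketch of two common options follows: passing a default value to next(), or catching StopIteration. The variable exhausted is introduced only for this illustration.

```python
def generator(x):
    for i in range(x):
        yield i

result = generator(2)
print(next(result))          # 0
print(next(result))          # 1

# Option 1: supply a default so next() returns it instead of raising.
print(next(result, None))    # None

# Option 2: catch the exception explicitly.
exhausted = False
try:
    next(result)
except StopIteration:
    exhausted = True
print(exhausted)             # True
```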

Please note, the yield keyword allows you to pause the execution of function and resume it using next() function as and when you want.
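One way to see the pausing and resuming is to record what happens in a list. In this sketch, steps and events are hypothetical names. Note that nothing in the function body runs until the first call to next().

```python
events = []

def steps():
    events.append("started")
    yield 1
    events.append("resumed after first yield")
    yield 2
    events.append("finished")

gen = steps()
events.append("generator created")   # the body has not run yet

first = next(gen)      # runs up to the first yield
second = next(gen)     # resumes and runs up to the second yield
leftover = list(gen)   # runs to the end; nothing more is yielded

for event in events:
    print(event)
# generator created
# started
# resumed after first yield
# finished
```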

Alternatively, you can also use a for loop directly on the generator object to fetch all the values, one after another. In this case, you will not see the StopIteration exception, because the for loop handles it for you and simply stops when the generator is exhausted.

>>> for i in generator(5):
...     print(i)
...
0
1
2
3
4

On the other hand, see what happens if you use return statement instead of yield keyword.

>>> def generator(x):
...     for i in range(x):
...         return i
...
>>> generator(5)
0

In the above case, the function stopped executing as soon as it encountered the return statement, after returning the result of the first iteration.

Difference Between Return and Yield

Yield and return are similar and can confuse Python developers. A function that uses yield returns a generator object when called. Each time the generator is advanced, the function runs up to a yield, hands the yielded value to the caller, pauses and saves its state. When the generator is advanced again, the function resumes by running the statement immediately after the yield.

On the other hand, the return statement directly returns a value to the caller. It ends function execution without saving its state, and that call cannot be resumed. The return statement is simple to understand. Here are the key differences between the yield and return keywords.

Yield | Return
Calling the function returns a generator object; the function body starts running only when the generator is iterated | Calling the function runs the body and directly returns a specific value
Allows a single call to hand back many values, one per yield | Hands back a single object per call (which may itself be a collection)
The function pauses at each yield and resumes from that point on the next call to next() | Function execution terminates at the return statement
Multiple yield statements can execute during one call | Multiple return statements may appear, but only one executes per call
Values are produced one at a time, so the generator never holds the full result in memory | The full result must exist in memory before it is returned
Memory efficient, especially for large data | Best suited to results that fit comfortably in memory
The caller can start processing the first item before the rest are computed | The caller must wait until the entire result has been computed
A single call can be paused and resumed many times | A call runs once, from start to finish
Converts a regular Python function to a generator function | No change in function
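The two keywords can also appear in the same function. As a rough sketch, the hypothetical generator below uses yield to hand back values and a bare return to stop early. It also shows that a generator object can only be consumed once.

```python
def take_until_negative(values):
    for v in values:
        if v < 0:
            return      # ends the generator early; nothing is yielded here
        yield v

gen = take_until_negative([3, 1, -4, 1, 5])
print(list(gen))   # [3, 1]

# The same generator object is now exhausted and produces nothing more.
print(list(gen))   # []
```

To iterate again, call take_until_negative() again to get a fresh generator object.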

Advantages of Yield in Python

There are several advantages of using yield keyword in Python:

  1. Since a generator produces values on demand, it is memory efficient. A generator object occupies hardly any memory compared to the full data it can produce. Each value is created only when the caller asks for it.
  2. It allows you to pause and resume code execution, since the yield keyword saves the state of execution along with variable values and stack state.
  3. It is suitable for working with large datasets, because you can process data that would not fit in memory all at once.
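As an illustration of these advantages, generators can be chained so that each item flows through every stage before the next one is read. The sketch below uses io.StringIO as a stand-in for a large file; the stage names read_lines, non_empty and lengths are assumptions made for this example.

```python
import io

def read_lines(stream):
    # Yields one line at a time without the trailing newline.
    for line in stream:
        yield line.rstrip("\n")

def non_empty(lines):
    # Skips blank lines.
    for line in lines:
        if line:
            yield line

def lengths(lines):
    # Converts each line to its length.
    for line in lines:
        yield len(line)

# StringIO stands in for a real (possibly very large) file object.
stream = io.StringIO("alpha\n\nbeta\ngamma\n")

# No line is read until sum() starts pulling values through the chain.
pipeline = lengths(non_empty(read_lines(stream)))
total = sum(pipeline)
print(total)  # 14
```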

Disadvantages of Yield in Python

There are also certain disadvantages of using Yield:

  1. It is a little complicated since you need to learn about generators and iterators before you dive into the yield keyword. So it takes some time for developers to become comfortable with yield.
  2. Yield can make the code complex and difficult to understand since the flow of control is not straightforward.

Examples Using Yield in Python

We have already seen a simple example above to understand how the yield keyword works and how it is different from the return statement. Let us look at other common use cases.

Return Multiple Values from Function

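While a return statement ends the function and hands back one result per call, a function using yield can hand back many values over the course of a single call. Here is an example of a function with multiple return statements. It returns the first return statement’s value every time it is called, because the second return statement is never reached.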

>>> def hello():
...     return 'hello'
...     return 'world'
...
>>> hello()
'hello'
>>> hello()
'hello'

Let us look at the same function using yield keyword. Observe how it returns subsequent yield statement’s values on each call to next().

>>> def hello():
...     yield 'hello'
...     yield 'world'
...
>>> value = hello()
>>> next(value)
'hello'
>>> next(value)
'world'

Generate Infinite Sequence

It is impossible to return an entire infinite sequence from a function, since your system would run out of memory trying to store the values and the function would never finish building the result.

Here the yield statement is a very useful tool for generating an infinite sequence of values, since the generator does not store the sequence. It only keeps its current state and computes the next value when asked.

def infinity():
    num = 0
    while True:
        yield num
        num += 1

for i in infinity():
    print(i)

When you run the above code, your Python interpreter will keep displaying sequential numbers one after the other. You will need to use a keyboard interrupt (Ctrl+C) to stop execution.

Such capabilities are very useful if you need to deal with large amount of data or streaming data in your code.
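In practice you usually want only part of an infinite sequence. One possible approach, sketched here, is itertools.islice from the standard library, which stops pulling values after a given count. The infinity function is repeated so that the snippet runs on its own.

```python
from itertools import islice

def infinity():
    num = 0
    while True:
        yield num
        num += 1

# Take only the first five values; the generator is never asked for more.
first_five = list(islice(infinity(), 5))
print(first_five)  # [0, 1, 2, 3, 4]
```

A plain for loop with a break statement works as well when the stopping condition depends on the values themselves.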

Conclusion

In this article, we have learnt what the yield keyword is, how to use it, and its advantages and disadvantages. We have also learnt how it is different from the return keyword. Yield is a very powerful and useful feature of the Python programming language. It is memory efficient and lets you start processing results before all of them have been computed. It works well with large data sets and even data streaming. We have also looked at several use cases where the yield keyword can be very effective. We urge you to take some time to carefully learn how to use the yield keyword in Python, since generators take some practice to get comfortable with. It will be a great asset for your code as well as your software development career.
