Method Chaining
We are kicking off the Pandas section of the course with an important and widely used concept called method chaining. Method chaining is not unique to Pandas; it is a general Python technique that works with any object whose methods return an object. We introduce it now because it lets us express a sequence of method calls clearly and in order, which makes complex data cleaning and transformation tasks (which we will do using Pandas) much more readable and manageable. Because method chaining is so useful, we will use it throughout the rest of the course.
Method chaining calls each method on the object returned by the previous method, linking the calls into a single statement. In effect, the output of one step becomes the input of the next, which lets us express a sequence of operations clearly and concisely. Its usefulness becomes apparent when we have several steps that can be applied together. At first, method chaining might seem a bit tricky to grasp. However, as we work through more examples and use cases, it will become intuitive.
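To illustrate the idea with something built into Python, here is a small sketch that chains three standard string methods. The sample text and the variable names are made up for this illustration. Each method returns a new string, so the next method can be called directly on that result.

```python
text = "  Hello, World  "

# strip() removes the surrounding spaces, lower() converts to lowercase,
# and replace() swaps one substring for another. Each call returns a new
# string, which the next call then operates on.
cleaned = text.strip().lower().replace("world", "pandas")

print(cleaned)  # hello, pandas
```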
Let's review methods by making a class and embedding a method into this class.
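A minimal sketch of such a class follows. It assumes, based on the description below, that the initializer takes no arguments and sets value to 1, and that add takes no arguments and always adds 1; the exact code in the course editor may differ.

```python
class MyClass:
    def __init__(self):
        # Every new instance starts with a value of 1
        self.value = 1

    def add(self):
        # Shorthand for self.value = self.value + 1
        self.value += 1
        # Returning the instance itself is what makes chaining possible
        return self


# Create an instance of MyClass and assign it to the variable a
a = MyClass()
```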
In the code above, we defined a class called MyClass, which includes the initializer method __init__ and another method called add. We then created an instance of MyClass and assigned it to the variable a. This add method we defined here is more restrictive than other ways we could do addition, as we have designed this method to add a number to the current value of self.value, which is initialized to 1. We further used some shorthand notation self.value += 1, which is equivalent to self.value = self.value + 1.
Additionally, the add method is called on an instance of MyClass using the dot operator (.). This method can be invoked multiple times in succession, as it returns a reference to the instance on which it was called, enabling method chaining.
Let's see this method in action. Here, we call the add method on the instance of the class. Importantly, we can call the add method multiple times in succession, creating a chain of method calls. This chaining of method calls on line 11 is what we have been referring to as method chaining.
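The chained call might look like the sketch below. It repeats the class so that it runs on its own, under the same assumptions as before (value starts at 1, add takes no arguments and adds 1). The line numbers in this sketch do not necessarily match those in the course editor.

```python
class MyClass:
    def __init__(self):
        self.value = 1

    def add(self):
        self.value += 1
        return self  # hand the same instance back to the caller


a = MyClass()

# Each add() runs on the instance returned by the previous add()
a.add().add().add()

print(a.value)  # 4
```

Because add returns self, all three calls operate on the same instance, so the value grows from 1 to 4.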
Let's expand on this concept further. In the next code editor, we extend the previous class by adding an additional method called multiply. Once again, we utilize shorthand notation for multiplication with the expression x *= 2, which is equivalent to x = x * 2.
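One way the extended class could be written is sketched here. It assumes that multiply, like add, takes no arguments, that it doubles self.value, and that it also returns self so it can take part in a chain.

```python
class MyClass:
    def __init__(self):
        self.value = 1

    def add(self):
        self.value += 1
        return self

    def multiply(self):
        # Shorthand for self.value = self.value * 2
        self.value *= 2
        # Return the instance so that further methods can be chained
        return self
```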
In the below code snippet, we first call the add method on an instance of the class. Then, we proceed to call the multiply method on the same instance. It's important to note that this approach differs from calling a function because we are invoking the method on the instance of the class itself, rather than on the class itself. By doing so, we update the self.value that exists within the instance of the class.
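As a hedged sketch of this step, the snippet below chains add and multiply and compares the result with calling the two methods one at a time. The variable names chained, stepwise, and reordered are hypothetical, and the class is repeated under the same assumptions as before so that the snippet runs on its own.

```python
class MyClass:
    def __init__(self):
        self.value = 1

    def add(self):
        self.value += 1
        return self

    def multiply(self):
        self.value *= 2
        return self


# Chained: add first, then multiply, in a single statement
chained = MyClass()
chained.add().multiply()
print(chained.value)  # 4, that is (1 + 1) * 2

# The same two steps written as separate statements
stepwise = MyClass()
stepwise.add()
stepwise.multiply()
print(stepwise.value)  # 4

# The order of the chain matters: multiply first, then add
reordered = MyClass().multiply().add()
print(reordered.value)  # 3, that is (1 * 2) + 1
```

The chain reads from left to right in the order the operations are applied, which is the property that pays off in longer sequences.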
So, why use method chaining in this case? It might seem unnecessary from this example, as it takes up more space than writing something like (1 + 1) * 2 and doesn't appear to make the code clearer or faster. However, the true power of method chaining shines when performing a longer sequence of operations. It helps make complex data cleaning and transformation tasks much more readable and understandable, as we will see in the upcoming lessons.
Practice exercise
Copy the code from the previous code editor, and update the add and multiply methods to use arguments 2 and 3, respectively.