Assign

We have learned enough to start applying multiple concepts together such as method chaining, the methods in the previous lesson, and in this lesson, the assign method. We will also learn about white space and how to use it.

Say we want to add another column to the iris dataframe, which we will name Sepal_Length_2, that is equal to twice the Sepal.Length. We can do so with the following assign method. Note, one limitation of the assign method is that we cannot create the new column name as a string, as such, we need to use a naming convention that is a valid variable name. So with new column names from here on, we will use underscores instead of periods.

Just as the mean method, the assign method comes after the dataframe name, and within the method argument, the ordering goes: "new variable name" on the left of the equal sign and whatever operation on an existing column in the dataframe on the right of the equal sign.

We can make multiple variables at the same time, we need only separate the two with a comma.

White space

Unlike other programming languages such as R, Python uses white space to associate code with classes and functions. However, we can make code more readable by collecting code within parenthesis, pushing text to a new line and indenting so similar code is stacked on top of each other.

The design of not evaluating white space allows code to be presented in easer to read formats and avoids horizontal scrolling. There is no hard limit to what can be put on one line, but we have found keeping Python code to a maximum of 80 characters per line easier to read; the exact number that works for you depends on your screen size, resolution, and personal preference.

In the following code editor we have the same content as the previous code editor, we have just added white space to make it easier to read. We have added white space by pressing the enter (aka return) key to make new lines, using the tab key to indent, and lining up the closing parentheses with the tab before the indentation.

Method chaining multiple methods

Now, suppose we want to double the Sepal.Length and then also compute the mean of the modified variable. Naturally, we may accomplish this in two separate steps as illustrated below:

But there is a simpler way by chaining these two methods using method chaining. We added additional parenthesis around the method chain so we may freely add white space to make the code easier to read.

Here, we can finally see the efficiency of method chaining! By employing two method chains in a single operation, we've reduced the necessity for saving multiple objects and have streamlined the code overall.

The iris dataframe at the start of the chain, is channeled to the assign method. The assign method then appends a new column, Sepal_Length_2, to the dataframe. This modified dataframe is subsequently conveyed to the mean method, and we returned the mean of only the Sepal_Length_2 column.

Practice exercise

Fill in the rest of the code within the assign method to find the standard deviation of Petal.Length + 1 by calling the std method on a column you will create called Petal_Length_2.