Functional Programming with Functions

How we structure and design the flow of our code is up to us as programmers. However, most programming languages have evolved to support one paradigm more than others. For R, this is functional programming. Functional programming is a paradigm that treats functions as first-class citizens and avoids changing state and mutable data. It is one of several popular paradigms; two others are object-oriented programming and procedural programming, which focus on the creation of objects and procedures, respectively.

For this lesson, let's imagine we want to build the next Netflix. A crucial system that Netflix needs to have in place is a mechanism to consolidate all information related to a movie, such as its title and the year it was released, into a single bundle. To illustrate the strength and versatility of functional programming, we will write a few functions that create, bundle and update this information. Let's review our definition of a function once more:

A function is a self-contained block of code that encapsulates a set of instructions. It provides an abstraction over these instructions, promoting modularity and reusability. Typically, a function accepts input parameters, processes them, and returns an output without modifying any external state or the inputs themselves.

Think of a function as a way to transform something, much like the recipe analogy we made in the Functions lesson. To write a function in R, we use the keyword function, then specify its parameters, and finally define its body. Good naming conventions call for meaningful names for functions and variables, usually written in snake_case. Before we go further into the details, let's recall what a function looks like.
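The listing this passage refers to is not shown here, so the following is a minimal sketch reconstructed from the description in the paragraphs that follow. The exact layout is an assumption; the names movie_data, movie and title come from the text.

```r
# Template: a function that builds a list describing one movie.
movie_data <- function() {
  # Every movie created from this template carries the same title.
  movie <- list(title = "Digi Cafe: The Movie")
  return(movie)
}

# Calling the function (note the parentheses) creates the actual list.
movie <- movie_data()

# Access the named element title with the dollar sign.
movie$title
```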

Imagine we have a template for a paper notebook called movie_data. Picture a special printer that can create a notebook that precisely follows the template provided to it. The line movie_data <- function() is analogous to saying, "Here is our template named movie_data, which outlines the content that can be included in a notebook." Within this template, the line movie <- list(title = "Digi Cafe: The Movie") specifies that every notebook created should bear the title "Digi Cafe: The Movie". We save the named element title into the list movie; for a review of lists, please check out the Data Structures lesson. Finally, we return the variable movie with the return keyword.

Then, in the line movie <- movie_data(), it's as if we are initiating the actual printing process and creating the physical notebook from our movie_data template. We add the parentheses after the function name to indicate that we are constructing the notebook, not merely referencing the template, and we assign the result to the variable movie.

Finally, we can access the named element title by writing the list object returned by the function, a dollar sign, and then the element name, as we did in the movie$title line.

Let's write a second function that takes as input the output of the movie_data function. We'll name this function print_title. It inserts the title into some text using paste0 and prints the result. Adding a function that works on our notebook gives the notebook functionality of its own. To picture this, imagine that we are no longer making a notebook out of paper, but an electronic notebook such as a tablet or iPad. This electronic notebook can display the title on its screen, and any function we add for it is like adding a button to the screen that we can press to make something happen.
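As a rough sketch of what print_title could look like: the sentence wrapped around the title is an assumption made for illustration, since the text only says that the title is added to some text with paste0 and printed.

```r
# Same template as before, repeated so this block runs on its own.
movie_data <- function() {
  list(title = "Digi Cafe: The Movie")
}

# print_title takes the list returned by movie_data and prints a sentence.
# The surrounding wording is hypothetical.
print_title <- function(movie) {
  print(paste0("The title of this movie is: ", movie$title))
}

print_title(movie_data())
# [1] "The title of this movie is: Digi Cafe: The Movie"
```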

Function Parameterization

The function that we created above is rather simple in structure. In reality, functions can be much more complex. Let's start exploring how we can make our function more general by adding parameters to it.

We may notice that in the function the title is actually hard-coded as "Digi Cafe: The Movie". What if we want to catalogue any other movie? We can do so by adding a parameter to our function. We will call this parameter title and assign it to the named element title in our list. We also add a default value for the title parameter by writing an equal sign and the default value after the parameter name. This way, if we do not pass in a value for the title parameter, it will default to "My Catalogue". Allowing variability in the inputs facilitates code reuse and simplifies our work.
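A minimal sketch of the parameterized version, using the title parameter and the "My Catalogue" default described above (the variable names default_movie and custom_movie are illustrative):

```r
# title is now a parameter; "My Catalogue" is used when no value is passed.
movie_data <- function(title = "My Catalogue") {
  movie <- list(title = title)
  return(movie)
}

default_movie <- movie_data()                               # falls back to the default
custom_movie  <- movie_data(title = "Digi Cafe: The Movie") # overrides the default

default_movie$title
custom_movie$title
```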

In the next code block, we will update our initial function to take in two parameters, title and release_year. We will then assign these parameters to the named elements title and release_year respectively in the list.
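One way this could look is sketched below. The NA default for release_year is an assumption, as the text does not say whether the second parameter has a default, and the title and year passed in are illustrative values.

```r
# Two parameters, each stored under a named element of the same name.
# Assumption: release_year defaults to NA when it is not supplied.
movie_data <- function(title = "My Catalogue", release_year = NA) {
  movie <- list(title = title, release_year = release_year)
  return(movie)
}

# Illustrative values.
movie <- movie_data(title = "Harry Potter", release_year = 2001)

movie$title
movie$release_year
```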

We printed out the title with the print_title function, but what if we want to print out both the title and the release year? We can replace this function with a more general one that takes the movie object as input and combines the title and release year into one string, which can then be printed out using print. Within this function, which we will call movie_data_info, we will cast the release_year value to a string (check out the Datatypes lesson for a review on casting). Both values need to be strings to be combined; R's paste functions would perform this conversion implicitly, but the explicit cast makes the step visible.
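A possible shape for movie_data_info is sketched here. The "Title (year)" layout of the string and the sample movie are assumptions; the names movie_data_info, movie and s follow the text.

```r
# Repeated so this block runs on its own; the NA default is an assumption.
movie_data <- function(title = "My Catalogue", release_year = NA) {
  list(title = title, release_year = release_year)
}

movie_data_info <- function(movie) {
  # as.character casts the numeric year to a string before joining.
  s <- paste0(movie$title, " (", as.character(movie$release_year), ")")
  print(s)
}

movie <- movie_data(title = "Harry Potter", release_year = 2001)
movie_data_info(movie)
# [1] "Harry Potter (2001)"
```

The function only reads from movie, so the list is unchanged after the call.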

Oh no, the title is not quite right. Which movie in the Harry Potter series do we mean? One way to update the title value would be to create another list by running the function again. This would be akin to creating a completely new electronic notebook with the new title. However, for complex objects, rebuilding everything from scratch leaves more room for errors. A safer approach is to create a new notebook with the updated title while copying everything else from the original. This way, we adhere to the functional programming principle of immutability, which means not changing the state of the original object.

In the update_title function, we take in a parameter new_title and create a new list with the updated title and the original release year. This way, we create a new notebook with the updated title without modifying the original notebook.
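The sketch below shows one way update_title might be written. Passing the original movie as a first argument is an assumption (the text names only new_title), and the sample titles are illustrative.

```r
# Repeated so this block runs on its own; the NA default is an assumption.
movie_data <- function(title = "My Catalogue", release_year = NA) {
  list(title = title, release_year = release_year)
}

# Assumed signature: the original movie plus the new title.
# Builds a new list and leaves the original untouched.
update_title <- function(movie, new_title) {
  movie_data(title = new_title, release_year = movie$release_year)
}

movie <- movie_data(title = "Harry Potter", release_year = 2001)
updated_movie <- update_title(movie, "Harry Potter and the Philosopher's Stone")

movie$title          # still "Harry Potter"
updated_movie$title  # "Harry Potter and the Philosopher's Stone"
```

Keeping both movie and updated_movie around makes it easy to check that only the title differs between them.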

Great! We have created a way to collect information about our movies. In this lesson we demonstrated two aspects of functional programming: immutability and the absence of state changes. Not changing state means that functions have no side effects; they use only their inputs to compute the output and do not modify any external variables. The movie_data_info function demonstrated this by using only its input, movie, to compute the output string, s, without modifying movie or any external variables.

This covers most of the utilities we will use with functions in data science. There are further advanced functional programming techniques, such as function composition, higher-order functions, closures, currying and recursion. We will cover the first of these, function composition, shortly in the Pipe Operator lesson.