Digi Cafe - Beginner Data Science in R

What is a function? The definition we gave in the previous lesson was:

A function takes one or more inputs, follows a set of instructions, and produces an output.

Let's give a more precise answer:

A function is a self-contained block of code that encapsulates a set of instructions. It provides an abstraction over these instructions, promoting modularity and reusability. Typically, a function accepts input parameters, processes them, and returns an output without modifying any external state or the inputs themselves.

Great! However, this definition contains a few key terms that we need to define: encapsulates a set of instructions, abstraction, modularity, input parameters, and return output. We will define each of these five terms thoroughly in the following sections. Before that, let's use a baking analogy to help explain these concepts.

Suppose we want to make chocolate-chip cookies. We need four things: the ingredients, a recipe, cooking utensils, and you! The ingredients are flour, eggs, chocolate chips (yum), and anything else the recipe calls for. We follow the recipe to learn in what order to mix the ingredients together in a bowl, what temperature to set the oven to, and how long to leave the cookies in it.

[Image: three chocolate chip cookies and a recipe]

The recipe probably guided us through making the cookies step by step. A function works similarly, except that the computer is the one following the recipe, and you (the computer programmer) are the author of this delicious recipe!

Let's see what this cookie recipe would look like in pseudo-code. Pseudo-code is written to look like code, but it uses more general language so that it is not tied to any one programming language. This lets us, the readers, get an idea of what the code does without getting into the fine details.
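A minimal sketch of such pseudo-code, assuming the recipe has three steps (mixing, setting the oven, and baking). The function name make_cookies and the wording of each step are illustrative, not a fixed recipe:

```text
function make_cookies(flour, eggs, chocolate_chips):
    step 1: mix flour, eggs, and chocolate_chips together in a bowl
    step 2: set the oven to the temperature the recipe calls for
    step 3: bake the dough for the time the recipe calls for
    return cookies
```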

Encapsulates a Set of Instructions

The first key point of a function is that it encapsulates a set of instructions. In the cookie recipe example, the set of instructions was the three steps for making the cookies. Encapsulation, therefore, is the process of grouping a set of instructions together. This makes baking the cookies easier, because we can follow one recipe rather than three separate ones, which improves efficiency and readability.
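To make the idea concrete, here is a small sketch in R. It assumes the three steps are mixing, setting the oven, and baking. The helper names (mix_ingredients, bake), the temperature, and the baking time are all hypothetical, chosen only for illustration:

```r
# Hypothetical helpers; names and text are illustrative only.
mix_ingredients <- function(flour, eggs, chocolate_chips) {
  paste(flour, eggs, chocolate_chips, sep = " + ")
}

bake <- function(dough, temperature, minutes) {
  paste0("cookies from [", dough, "] baked at ", temperature,
         " degrees for ", minutes, " minutes")
}

# The recipe encapsulates the steps: we call one function instead of
# remembering each step and the order the steps go in.
make_cookies <- function(flour, eggs, chocolate_chips) {
  dough <- mix_ingredients(flour, eggs, chocolate_chips)  # step 1: mix
  temperature <- 180                                      # step 2: set the oven (assumed value)
  cookies <- bake(dough, temperature, minutes = 12)       # step 3: bake (assumed time)
  return(cookies)
}

make_cookies("flour", "eggs", "chocolate chips")
```

Whoever calls make_cookies only needs to know the one function name, not the individual steps inside it.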

Input Parameters and Return Output

In our cookie recipe function, flour, eggs, and chocolate chips are the inputs to the function. An input to a function is technically called an input parameter, or just a parameter. A parameter allows for variability in how the function operates. For example, we may pass white flour, whole-wheat flour, gluten-free flour, or any other type of flour into the flour parameter.
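As a rough illustration in R, the sketch below uses a hypothetical function, describe_cookies, with a single flour parameter. The function name and the text it produces are assumptions made for this example:

```r
# `flour` is the parameter; the value supplied in each call is what varies.
describe_cookies <- function(flour) {
  paste0("chocolate-chip cookies made with ", flour)
}

describe_cookies("white flour")
describe_cookies("gluten-free flour")
```

The instructions inside the function stay the same in both calls. Only the value passed into flour changes, and the result changes with it.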

Let's look at the most general form of a function in pseudo-code.

The format of a function varies by programming language. In R, a function starts with the keyword function, followed by the input parameters inside opening and closing parentheses ( ). The step-by-step instructions then go inside curly brackets { }. The last line inside these brackets typically calls return to hand back the output.
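The format described above can be sketched in R as follows. The names function_name, input_parameter, and output are placeholders, and doubling the input simply stands in for whatever instructions a real function would contain:

```r
# General shape: name <- function(input_parameters) { instructions; return(output) }
function_name <- function(input_parameter) {
  output <- input_parameter * 2   # the step-by-step instructions go here
  return(output)                  # hand the result back to the caller
}

function_name(21)
```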

Let's discuss the concept of the return statement. In our cookie recipe function, the output (what we get back) is the cookies. In programming, the output of a function is typically referred to as the return value. This return value can be a value of a specific data type, an object, or even nothing. In the context of our cookie recipe function, the return value is the tray of cookies, which the function provides to us once it has executed.
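Here is a brief sketch of both cases in R. The functions bake_cookies and preheat_oven are hypothetical. In R, "nothing" is usually represented by the special value NULL, which the second function returns invisibly:

```r
# Returns a value: a character vector standing in for a tray of cookies.
bake_cookies <- function(n_cookies) {
  tray <- rep("cookie", n_cookies)
  return(tray)
}

tray <- bake_cookies(3)
length(tray)

# Returns "nothing": the function is called only for its side effect (a message).
preheat_oven <- function(temperature) {
  message("Preheating the oven to ", temperature, " degrees")
  return(invisible(NULL))
}

result <- preheat_oven(180)
is.null(result)
```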

Abstraction

Next, let's talk about what we mean when we say a function is an abstraction.

The simplest way to think of an abstraction is as a generalization. Consider the flour parameter example: even if we introduce a different type of flour, the cookie recipe function still executes the same three steps. Yet, because the flour is different, the cookies would taste different too.

However, abstraction goes beyond generalization. It further allows us, as programmers, to delegate the minutiae of the step-by-step processes, trusting the computer to consistently execute the predefined steps. Provided that the steps are correctly defined to yield a cookie, our primary focus then shifts to ensuring the arguments passed into the function are of the appropriate types. Once this condition is met, we can confidently expect our function to deliver a cookie!
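One way to picture this in R is sketched below. The function make_cookies, its parameters, and the type checks are assumptions for illustration. The point is that the caller only has to supply arguments of the right types and can ignore the steps inside:

```r
make_cookies <- function(flour, n_eggs) {
  # Check that the arguments are of the appropriate types before baking.
  stopifnot(is.character(flour), is.numeric(n_eggs))
  batter <- paste0("batter of ", flour, " and ", n_eggs, " eggs")
  paste0("cookies baked from ", batter)
}

# The caller does not need to know how the batter is made.
make_cookies("gluten-free flour", 2)

# make_cookies(42, 2) would stop with an error, because 42 is not a character string.
```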

Modularity

Finally, let's talk about why a function is modular.

Representing our cookie recipe as a function introduces modularity. This means it's more straightforward to execute and can adapt to various contexts. For instance, imagine we have a programmable robot baker capable of following any recipe. By inputting our specific cookie recipe, the robot can replicate the baking process. Need a large batch for a party? We can instruct the robot to repeat the recipe until we have a sufficient quantity. Perhaps we fancy freshly baked cookies every Sunday morning; the robot can be scheduled for that. Even lending the robot to a friend becomes a means of sharing the joy of these cookies.
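The robot baker idea can be loosely sketched in R. Here the hypothetical function bake_batch plays the role of the recipe, and R's built-in replicate and vapply play the role of the robot repeating it in different contexts:

```r
bake_batch <- function(flour) {
  paste0("a batch of ", flour, " cookies")
}

# A party: repeat the same recipe three times.
party_batches <- replicate(3, bake_batch("white flour"))
length(party_batches)

# A different context: one batch for each type of flour.
flours <- c("white flour", "whole-wheat flour", "gluten-free flour")
assorted_batches <- vapply(flours, bake_batch, character(1))
assorted_batches
```

Because the recipe lives in one function, it can be reused in each setting without rewriting its steps.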

An Example Function

Let's delve into an example function in R. Functions are created using the keyword function and are typically assigned to a variable so that they can be referred to by name. In our case, we'll name the function hello_function, and it will accept a single parameter, name. The sequence of operations the function will execute is enclosed within curly brackets { }. Inside this function, the parameter name is used as a variable. This name variable is combined into a string (which we will cover in the datatypes lesson), hello_string, using R's built-in function paste0. Finally, hello_string is returned using R's return function.

In the sixth and final line of the code below, we call the function. Because the final line of the function returns hello_string, this is the value that is returned and printed out.
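A sketch of the function as described, laid out so that the call falls on the sixth line. The names hello_function, name, and hello_string come from the description above. The exact greeting text and the argument "World" are illustrative assumptions:

```r
hello_function <- function(name) {
  hello_string <- paste0("Hello, ", name, "!")  # greeting text is an assumption
  return(hello_string)
}

hello_function("World")  # "World" is an illustrative argument
```

With these assumed values, running the last line prints [1] "Hello, World!" in the R console.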