Object-Oriented Programming with Classes
How we structure and design the flow of our code is up to us as programmers. However, most programming languages have evolved to support one paradigm more than others. Python supports several paradigms, but everything in it is an object, and it is most often used in an object-oriented style. Object-oriented programming is a paradigm that focuses on creating objects from classes, where each object bundles data and functionality together. It is one of the most widely used paradigms today. Two other popular paradigms are functional programming and procedural programming, which organize code around functions and procedures, respectively.
For this lesson, let's imagine we want to build the next Netflix. A crucial system Netflix needs is a mechanism to consolidate all the information related to a movie, such as its title and the year it was released, into a single bundle. To illustrate the strength and versatility of object-oriented programming, we will write a class that creates, bundles, and updates this information.
A class is a template that defines properties (often called attributes) and behaviors (referred to as methods) for creating objects. These objects are individual instances of the class, embodying the defined attributes and methods.
Think of a class as a template for creating something, much like the recipe analogy we made in the Functions lesson. To write this template in Python, we use the keyword class, followed by a name for our template. By convention, each word in a class name starts with a capital letter, with no underscores between words, to differentiate classes from functions and variables. This naming convention is called CamelCase (PEP 8, Python's style guide, calls it CapWords). Before we go further into the details, let's see what a class looks like.
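A minimal sketch of such a class, using the names from the walkthrough that follows (MovieData, title, and movie), might look like this:

```python
class MovieData:
    # An attribute defined at the top level of the class.
    title = "Digi Cafe: The Movie"


# Calling the class with parentheses creates (instantiates) an object.
movie = MovieData()

# Access the attribute with a period between the object and the attribute name.
print(movie.title)  # prints: Digi Cafe: The Movie
```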
Imagine we have a template for a paper notebook called MovieData, and a special printer that can create a notebook that follows this template precisely. The line class MovieData: is analogous to saying, "Here is our template named MovieData, which outlines the content that can be included in a notebook." Within this template, the line title = "Digi Cafe: The Movie" specifies that every notebook created should bear the title "Digi Cafe: The Movie". When we associate a variable with a class, as we did with title, the variable is called an attribute.
Then, in the line movie = MovieData(), it's as if we start the actual printing process and create the physical notebook from our MovieData template. We add the parentheses after the class name to indicate that we are constructing the notebook, not merely referencing the template, and we assign the result to the variable movie. This action is referred to as instantiating an object.
Finally, we can access the title attribute by writing the object's name, a period, and then the attribute name, as we did in the line print(movie.title).
Inside this template, we can also have functions. Recall that a function definition starts with the keyword def. When a function is defined inside a class, it gets a special name, a method, to distinguish it from ordinary functions. Let's add a method called print_title to our MovieData template.
Adding a method to our notebook gives it functionality of its own. To picture this, imagine that we are no longer making a notebook out of paper but an electronic one, such as a tablet or iPad. This electronic notebook can display the title on its screen, and each method we add is like adding a button to the screen that we can press to make something happen.
In this method we will add a parameter called self, a special parameter that refers to the object the method is called on. Through self, the method can access the attributes and other methods that belong to that object. In this case, we will access the title attribute.
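As a rough sketch, the print_title method described above could be written as follows. The method body (a single print call) is an assumption based on the method's name.

```python
class MovieData:
    title = "Digi Cafe: The Movie"

    def print_title(self):
        # self is the object the method was called on,
        # so self.title is that object's title attribute.
        print(self.title)


movie = MovieData()

# Python passes movie as self automatically; we do not supply it ourselves.
movie.print_title()  # prints: Digi Cafe: The Movie
```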
Class construction
The class that we created above is rather simple in structure. In reality, classes can be much more complex. Let's explore how we can make our class more general by adding parameters to it.
We may notice that the title is hard-coded as "Digi Cafe: The Movie". What if we want to catalogue any other movie? We can do so by adding a parameter to the class initializer.
Python looks for a special method named __init__ whenever we construct an object from the class (that is the word "init" with two underscores, "__", on each side). This method acts like the initial setup for our electronic notebook template: it lets us set some initial values when we create a notebook based on the template. We can pass parameters to this method to set the initial values of our attributes. The method is called an initializer, and it lets us move the assignment of the attributes inside it instead of leaving it at the top level of the class.
Accepting parameters allows for variability, facilitates code reuse, and simplifies our work.
In the next code block, we will update our class to take in two parameters, title and release_year, by adding them to the initializer method. We will then assign these parameters to the attributes title and release_year, respectively. Note that self must still be the first parameter of the initializer: Python passes in the object being created automatically, and we use self to attach the values to that object.
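A minimal sketch of the updated class is shown here. The values "Harry Potter" and 2001 are illustrative assumptions chosen for this example.

```python
class MovieData:
    def __init__(self, title, release_year):
        # Store the values passed in as attributes of this particular object.
        self.title = title
        self.release_year = release_year

    def print_title(self):
        print(self.title)


# The arguments after self are supplied when we instantiate the object.
movie = MovieData("Harry Potter", 2001)
movie.print_title()        # prints: Harry Potter
print(movie.release_year)  # prints: 2001
```

Each object now carries its own title and release_year, so the same class can describe any movie.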
We printed the title with the print_title method. Something else we could do is define a special method called __str__, which is called whenever we print the object. Just like __init__, this is a special method (also called a magic method), denoted by the double underscores. When added to our class, it is called whenever the object is converted to a string, such as with str or within a print statement. Within this method, we will cast the release_year attribute to a string (check out the Datatypes lesson for a review of casting) and then return a string containing the title and release year. The cast is needed when we join the values with +, because Python cannot concatenate a string and an integer directly.
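The __str__ method described above might be sketched like this. The output format "title (year)" and the example values are assumptions made for illustration.

```python
class MovieData:
    def __init__(self, title, release_year):
        self.title = title
        self.release_year = release_year

    def __str__(self):
        # release_year is an integer, so cast it before concatenating.
        return self.title + " (" + str(self.release_year) + ")"


movie = MovieData("Harry Potter", 2001)

# Both print and str call __str__ behind the scenes.
print(movie)       # prints: Harry Potter (2001)
text = str(movie)
```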
Oh no, the title is not quite right. Which movie in the Harry Potter series do we mean? If we want to update the title attribute, one way would be to instantiate a new object from the class. This would be akin to creating a completely new electronic notebook with the new title. However, that would be rather wasteful. A more efficient approach is to define a method that makes the update on the original notebook.
In the update_title method, we will take in a parameter new_title and assign it to the title attribute, updating the title of the notebook.
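A sketch of update_title, under the same assumptions as before (illustrative values and an assumed "title (year)" string format), could look like this:

```python
class MovieData:
    def __init__(self, title, release_year):
        self.title = title
        self.release_year = release_year

    def __str__(self):
        return self.title + " (" + str(self.release_year) + ")"

    def update_title(self, new_title):
        # Overwrite the attribute on the existing object.
        self.title = new_title


movie = MovieData("Harry Potter", 2001)
movie.update_title("Harry Potter and the Sorcerer's Stone")
print(movie)  # prints: Harry Potter and the Sorcerer's Stone (2001)
```

The object stays the same; only its title attribute changes, and release_year is left untouched.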
Great! We have created a way to collect information about our movies together. This covers most of the utilities we will use with classes in data science. There are many more things we could do with classes, such as multiple inheritance, method nesting, and composition. If you are interested in delving deeper into classes, please check out the Let's Learn Python for Software Development! course.