Packages
In our Libraries and Packages lesson, we first introduced the concept of the package. In this lesson, let's dive deeper into this topic.
Up until now, we have been using built-in functions like print, c, and
data.frame, which are available by default in R. However, there's a vast
universe of functions and tools created by other developers that are
available for public use. These tools and functions are bundled
together in what is aptly called a package.
Essentially, a package is a collection of code, data, and
accompanying documentation. For R, these are primarily hosted on the
Comprehensive R Archive Network (CRAN).
Every package on CRAN is free to use. For instance, when we accessed
the iris dataset, we used the datasets package. Using a package
generally involves two steps: installing it with install.packages
and then loading it into our environment with library.
install.packages('datasets')
library(datasets)
The command install.packages('datasets') fetches and installs the
datasets package from CRAN onto the computer, a step that is required
only once. Subsequently, library(datasets) loads the package into our R
environment, granting access to the package's data and functions. It's
important to note that every time we start a new R session, we'll need to
reload any package we wish to use, since R's environment resets at each start.
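The install-once, load-every-session pattern can be sketched as a small helper. This is only an illustration: load_or_install is a hypothetical name, not a function from R or any package, and the install branch assumes a CRAN mirror is reachable.

```r
# Hypothetical helper (not part of R): install a package only if it is
# missing, then attach it for the current session.
load_or_install <- function(pkg) {
  # requireNamespace() returns FALSE when the package is not installed.
  if (!requireNamespace(pkg, quietly = TRUE)) {
    install.packages(pkg)  # one-time step; needs access to a CRAN mirror
  }
  # library() must run in every new session; character.only = TRUE lets
  # us pass the package name as a string.
  library(pkg, character.only = TRUE)
  # Report whether the package is now attached.
  invisible(pkg %in% .packages())
}

# datasets ships with R, so the install branch is skipped here.
load_or_install("datasets")
```

Checking with requireNamespace first avoids re-downloading a package that is already on the computer.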
In fact, the datasets package ships with R and is automatically
loaded in every R session, so there's no need to install it
from CRAN. Other foundational packages, which provide functions
like data.frame and c, are also pre-loaded.
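To see which packages are pre-loaded, one option is to inspect the session's search path. The sketch below assumes a default R session in which the usual startup packages have not been disabled; the variable names are illustrative.

```r
# List everything attached in the current session.
print(search())

# utils::find() reports which attached package provides a given name.
providers <- c(
  data.frame = utils::find("data.frame"),
  c          = utils::find("c"),
  iris       = utils::find("iris")
)
print(providers)

# The pkg::name form reaches into a package without calling library().
first_rows <- datasets::iris[1:3, ]
print(first_rows)
```

In a default session, data.frame and c should be reported as coming from the base package, and iris from datasets.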
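CRAN hosts tens of thousands of packages. Before being made available, each package must pass a set of automated checks confirming that it installs cleanly, that its examples and tests run without error, and that it follows CRAN's policies. These checks give us a reasonable baseline of quality, though they do not by themselves guarantee that every function does exactly what it claims.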
Package Versions
When package maintainers want to update their code, they publish a new version of the package to a repository. Users can then update to this newer version at their convenience. Updating is how we pick up the latest bug fixes and new features.
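As a rough illustration, the installed version of a package can be inspected before deciding whether to update. The stats package is used here only because it ships with R, and the minimum version is an arbitrary value chosen for the example.

```r
# Look up the version of a package that is installed on this computer.
installed <- utils::packageVersion("stats")
print(installed)

# Compare against a minimum version (an arbitrary threshold for this sketch).
minimum <- "3.0.0"
meets_minimum <- installed >= minimum
print(meets_minimum)

# These calls contact a repository, so they are left commented out here:
# old.packages()     # lists installed packages that have newer versions
# update.packages()  # installs those newer versions
```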
As with most things in programming, a convention has been established for how
packages are versioned.
A package version is typically written as three numbers separated by
periods, such as 1.0.2.
Here is what each of these numbers means:
1._._: The first number is the major version number. A change here implies significant, potentially breaking alterations to the package.
_.0._: The middle number is the minor version number. An increment here introduces new features without breaking existing functions.
_._.2: The last number is the patch number. An increment typically means minor fixes or enhancements to the package's functions.
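The three-part scheme above can be made concrete with a short sketch. The helpers parse_version and change_type are hypothetical names written for this illustration, and they assume version strings with exactly three numeric parts.

```r
# Hypothetical helper: split a "major.minor.patch" string into named integers.
parse_version <- function(x) {
  parts <- as.integer(strsplit(x, ".", fixed = TRUE)[[1]])
  stopifnot(length(parts) == 3, !anyNA(parts))
  names(parts) <- c("major", "minor", "patch")
  parts
}

# Hypothetical helper: name the most significant part that differs.
change_type <- function(old, new) {
  o <- parse_version(old)
  n <- parse_version(new)
  differs <- which(o != n)
  if (length(differs) == 0) "none" else names(o)[differs[1]]
}

print(parse_version("1.0.2"))
print(change_type("1.0.2", "2.0.0"))  # first number changed
print(change_type("1.0.2", "1.1.0"))  # middle number changed
print(change_type("1.0.2", "1.0.3"))  # last number changed

# Base R's package_version() compares versions number by number, not as text.
print(package_version("1.10.0") > package_version("1.9.0"))
```

Comparing with package_version rather than plain strings matters once a part reaches two digits, since "1.10.0" sorts before "1.9.0" as text.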