Let's Learn Data Science with R!

Course Summary

Welcome to Let's Learn Data Science with R! We extend a warm welcome, especially if this is your first encounter with R, data science, or programming in general.

In this course, we will embark on a journey through R, assuming no prior programming experience. Each topic is designed to be beginner-friendly, allowing learners from all backgrounds to comfortably grasp the concepts.

While data science heavily relies on statistics, we will prioritize simplicity by keeping the statistics and math to a minimum throughout this course.

R is one of the most popular programming languages for data science, thanks to its user-friendly nature and its power in tackling complex data analysis problems. Its adoption is further helped by the Tidyverse, a comprehensive collection of data science packages that offers extensive functionality. In addition, R is increasingly being integrated into academic curricula and embraced across various industries.

Digi Cafe courses are built around the following three pillars:

  • User-friendly text: Courses are designed to be as user-friendly as possible, and are text-based so that it is easier for a student to find and review material.
  • Interactive code: Code editors that run in the browser are spread throughout the lessons so that you can get hands-on experience with the programming language.
  • Community: If you have any questions or just want to engage in discussion on the course material, you may join the Digi Cafe Discord community where we have a chat room for each programming language.

Learning Goals

Upon completing this course, you will have acquired the following knowledge and skills:

  • What is data: We will start by exploring the world of data science and learn about R's data frame, a data structure that organizes data in an intuitive way that is easy to work with.
  • What is programming: We will take a brief detour into the history of programming to gain an understanding of how a computer interprets code as well as how that code is managed.
  • Assignment and classes: We will finish the introduction section by learning some fundamental programming concepts such as assignment and classes.
  • What is the Tidyverse: We will be introduced to the Tidyverse, a widely used collection of R packages for data manipulation and visualization. Two of its key packages are dplyr and ggplot2, which provide the data manipulation functions and the visualization functions respectively.
  • Tidyverse: In the Tidyverse we will learn how to chain operations, summarise data, create new data from existing data, work with grouped data, and transform the shape of data.
  • ggplot2: In this visualization package we will learn how to make visualizations such as scatter plots, line plots, and bar plots, and how to add colour to them.
  • Loops and conditional statements: These two essential programming techniques control program flow and automate tasks. Loops repeat a piece of code, and conditional statements run a piece of code only when a condition is met.
  • A final project: We will conclude with a final project which utilizes everything we have learned in the course.
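
To give a first taste of a few of the topics above, here is a minimal sketch in base R showing assignment, classes, a data frame, a loop, and a conditional statement together. The names and values (course_name, lesson_count, cafe_orders, and the prices) are made up for illustration and are not taken from the course material, and the lessons themselves may introduce these ideas differently.

```r
# Assignment: the `<-` operator stores a value under a name.
course_name <- "Let's Learn Data Science with R"
lesson_count <- 12  # hypothetical number, for illustration only

# Classes: every value in R has a class describing what kind of data it is.
class(course_name)   # "character"
class(lesson_count)  # "numeric"

# A data frame: a table in which each column holds one kind of data
# and every column has the same number of rows.
cafe_orders <- data.frame(
  drink = c("latte", "espresso", "mocha"),
  price = c(4.5, 3.0, 5.0),
  stringsAsFactors = FALSE
)

# A loop repeats code once per row; the conditional inside it
# picks a label depending on the price in that row.
tiers <- character(nrow(cafe_orders))
for (i in seq_len(nrow(cafe_orders))) {
  if (cafe_orders$price[i] >= 4) {
    tiers[i] <- "premium"
  } else {
    tiers[i] <- "standard"
  }
}

# Create a new column from existing data.
cafe_orders$tier <- tiers
print(cafe_orders)
```

Running the sketch prints the small table with an extra tier column. The Tidyverse lessons cover ways to express this kind of step more compactly.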

In this course, our primary focus will be on learning R in the context of data science. Instead of simply memorizing individual commands and functions, we will adopt a comprehensive approach. Each lesson will build upon the previous ones, systematically breaking down R concepts to ensure a thorough understanding. So that you can gain first-hand experience with R, many of the lessons in this course include interactive code blocks with R code that can be run directly on the webpage.

By the end of the course, you will possess a strong foundation in both R programming and data science. This knowledge will serve as a solid base for further exploration and continued learning in these fields.