CSV Files

Thus far in our course, we have primarily utilized the iris dataset, which is readily available in every R environment through the simple command data(iris). As you continue to expand your R programming skills after this course, you will inevitably engage with a variety of other datasets. In this lesson, we look at how to write and read data stored in CSV files, one of the most common formats for tabular data.

CSV, short for comma-separated values, stores tabular data as plain text: each row is written on its own line, and the values within a row are separated by commas. The format is popular because of its straightforward structure and because nearly every modern programming language can read it.

Creating a CSV file

To see what a CSV file looks like, let's save a dataframe as a CSV file. In pandas this is done with the dataframe's to_csv method, which takes the name of the file to write to as its first argument.
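As a minimal sketch of this step, the snippet below writes a small dataframe with to_csv and then prints the raw text of the resulting file. The dataframe contents, the column names, and the file name example.csv are assumptions made for illustration; they are not the dataset used in the course.

```python
import os
import tempfile

import pandas as pd

# Hypothetical stand-in for the course dataframe (values are illustrative only).
df = pd.DataFrame(
    {
        "sepal_length": [5.1, 4.9, 4.7],
        "species": ["setosa", "setosa", "setosa"],
    }
)

# Write into a temporary directory so the example leaves no stray files behind.
csv_path = os.path.join(tempfile.mkdtemp(), "example.csv")
df.to_csv(csv_path)  # produces no visible output

# Read the raw text back to see exactly what was written.
with open(csv_path) as f:
    csv_text = f.read()

print(csv_text)
```

The first printed line is the header. It starts with an empty field, because the index column has no name, and the remaining lines each start with a row number.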

Writing the file produces no visible output. The file itself starts with a header row of column names, and each following row begins with the index (row number) followed by that row's values, with all values separated by commas. In the code block above, we wrote out what the text in this file looks like. Next, we will load this file back into memory.

Reading a CSV file

Reading a CSV file mirrors writing one. We call the pandas read_csv function and pass the filename as the first argument. To keep things clear in this demonstration, we store the loaded dataframe in a new object named df2, distinguishing it from the previously created df.
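A minimal sketch of the round trip might look like the following. As before, the dataframe contents and the file name are illustrative assumptions; only to_csv and read_csv come from pandas itself.

```python
import os
import tempfile

import pandas as pd

# Hypothetical stand-in for the course dataframe (values are illustrative only).
df = pd.DataFrame(
    {
        "sepal_length": [5.1, 4.9, 4.7],
        "species": ["setosa", "setosa", "setosa"],
    }
)

# Save the dataframe with default settings, which also writes the index.
csv_path = os.path.join(tempfile.mkdtemp(), "example.csv")
df.to_csv(csv_path)

# Load the file back into a new object so it is not confused with df.
df2 = pd.read_csv(csv_path)

# Inspect the column names of the loaded dataframe.
print(df2.columns.tolist())
```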

However, when we inspect df2 we notice an unexpected Unnamed: 0 column. It appears because pandas read the first column of row indices as an ordinary data column, and since that column has no header, pandas named it Unnamed: 0. To use this column as the row index instead, we pass index_col=0 to read_csv. Alternatively, we can pass index=False to to_csv so that the index is never written to the file in the first place.
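Both fixes can be sketched side by side. The dataframe and the two file names below are assumptions for illustration; index_col and index are the actual pandas parameter names.

```python
import os
import tempfile

import pandas as pd

# Hypothetical stand-in for the course dataframe (values are illustrative only).
df = pd.DataFrame(
    {
        "sepal_length": [5.1, 4.9, 4.7],
        "species": ["setosa", "setosa", "setosa"],
    }
)

tmp_dir = tempfile.mkdtemp()

# Option 1: write the index as usual, then tell read_csv that the
# first column (position 0) holds the row index.
with_index_path = os.path.join(tmp_dir, "with_index.csv")
df.to_csv(with_index_path)
df_indexed = pd.read_csv(with_index_path, index_col=0)

# Option 2: leave the index out of the file entirely, so a plain
# read_csv call has nothing extra to misinterpret.
no_index_path = os.path.join(tmp_dir, "no_index.csv")
df.to_csv(no_index_path, index=False)
df_no_index = pd.read_csv(no_index_path)

print(df_indexed.columns.tolist())
print(df_no_index.columns.tolist())
```

Either way, the loaded dataframe contains only the original data columns. The first option is the one to use when the index carries meaningful labels, while the second suits a default row-number index that does not need to be stored.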

Excellent, with this understanding, you are now equipped to both read and write CSV files efficiently!

As we approach the culmination of this course, the next lesson will guide you through a final project, putting into practice the skills acquired throughout our sessions.