Describe

A pandas DataFrame object comes with quite a few useful methods. Two that we have already seen are head and tail. In this lesson we will learn a few more that are useful for summarizing data. Here is the list we will look at:

  • describe: generate descriptive statistics
  • mean: calculate the mean
  • std: calculate the standard deviation
  • shape: return the number of rows and columns of the dataframe
  • max: calculate the maximum
  • min: calculate the minimum

In the previous code block we looked at describe. This method generates descriptive statistics, providing several summary statistics (count, mean, standard deviation, minimum, quartiles, and maximum) for each numeric column in the dataframe.
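As a reminder, here is a minimal sketch of describe. The small dataframe is a hypothetical stand-in, not the lesson's dataset, and the Sepal.Length column name is an assumption made for illustration.

```python
import pandas as pd

# Hypothetical stand-in data; the lesson's real dataset is not reproduced here.
df = pd.DataFrame({
    "Sepal.Length": [5.1, 4.9, 6.3, 5.8],
    "Petal.Length": [1.4, 1.5, 6.0, 5.1],
})

# describe returns a new dataframe: one row per statistic,
# one column per numeric column of the original.
summary = df.describe()
print(summary)
```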

If we simply want the means, then we can use the mean method.
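A minimal sketch of mean, again on hypothetical stand-in data. The Species text column is an assumption added to show one detail: in recent pandas versions, calling mean on a dataframe that contains text columns can raise an error, so here we pass numeric_only=True to skip them.

```python
import pandas as pd

# Hypothetical stand-in data with one text column.
df = pd.DataFrame({
    "Sepal.Length": [5.0, 6.0, 7.0],
    "Petal.Length": [1.0, 2.0, 3.0],
    "Species": ["a", "b", "c"],
})

# numeric_only=True restricts the calculation to numeric columns.
means = df.mean(numeric_only=True)
print(means)
```

The result is a Series with one mean per numeric column, indexed by column name.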

If we want the standard deviation, we can use std.
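A minimal sketch of std on hypothetical stand-in data. One thing worth knowing: by default pandas computes the sample standard deviation (ddof=1). If we want the population version instead, we can pass ddof=0.

```python
import pandas as pd

# Hypothetical stand-in data.
df = pd.DataFrame({
    "Sepal.Length": [5.0, 6.0, 7.0],
    "Petal.Length": [2.0, 4.0, 6.0],
})

# Sample standard deviation (divides by n - 1); this is the default.
sample_std = df.std()

# Population standard deviation (divides by n).
population_std = df.std(ddof=0)

print(sample_std)
print(population_std)
```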

If we want the number of rows and the number of columns, then we can use shape. Because shape is not a method but an instance attribute (just like self.value we saw in a previous lesson), accessing it doesn't require any parentheses. The first value is the number of rows and the second is the number of columns.
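A minimal sketch of shape on hypothetical stand-in data. Since shape gives back a tuple, we can also unpack it into two variables (the names n_rows and n_cols are just illustrative).

```python
import pandas as pd

# Hypothetical stand-in data: 3 rows, 2 columns.
df = pd.DataFrame({
    "Sepal.Length": [5.1, 4.9, 6.3],
    "Petal.Length": [1.4, 1.5, 6.0],
})

# No parentheses: shape is an attribute, not a method.
print(df.shape)  # (3, 2)

# Unpack the tuple into rows and columns.
n_rows, n_cols = df.shape
print(n_rows, n_cols)
```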

If we want the maximum or minimum, we can use max or min.
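A minimal sketch of max and min on hypothetical stand-in data. Both return one value per column.

```python
import pandas as pd

# Hypothetical stand-in data.
df = pd.DataFrame({
    "Sepal.Length": [5.1, 4.9, 6.3],
    "Petal.Length": [1.4, 1.5, 6.0],
})

# Largest and smallest value in each column.
maxima = df.max()
minima = df.min()

print(maxima)
print(minima)
```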

If we want to calculate a statistic for only one column, then we can use bracket notation along with the column name as a string, followed by the method of interest.
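A minimal sketch of the bracket notation on hypothetical stand-in data. The Sepal.Length column name is an assumption made for illustration. Selecting a single column gives a Series, and calling the method on it returns a single number instead of one value per column.

```python
import pandas as pd

# Hypothetical stand-in data.
df = pd.DataFrame({
    "Sepal.Length": [5.0, 6.0, 7.0],
    "Petal.Length": [1.0, 2.0, 3.0],
})

# Select one column by name, then call the method on it.
sepal_mean = df["Sepal.Length"].mean()
print(sepal_mean)  # 6.0
```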

Practice exercise

Please print out the maximum value in the Petal.Length column.