Describe
A pandas DataFrame object comes with quite a few useful methods.
Two that we have already seen are head and tail.
In this lesson we will learn a few more tools that are useful for
summarizing data. Here is the list we will look at:
describe: generate descriptive statistics
mean: calculate the mean
std: calculate the standard deviation
shape: return the number of rows and columns of the dataframe
max: calculate the maximum
min: calculate the minimum
In the previous code block we looked at describe. This
method generates several summary statistics for each numeric
column in the dataframe: the count, mean, standard deviation,
minimum, quartiles, and maximum.
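As a minimal sketch of describe in action, the example below builds a small dataframe by hand. The values and the Sepal.Length column name are made up for illustration and are not the lesson's own dataset.

```python
import pandas as pd

# Hypothetical stand-in data (not the lesson's dataset).
df = pd.DataFrame({
    "Sepal.Length": [5.1, 4.9, 6.3, 5.8],
    "Petal.Length": [1.4, 1.4, 6.0, 5.1],
})

# describe returns a new dataframe with one row per statistic
# and one column per numeric column of df.
summary = df.describe()
print(summary)
```

The rows of the result are labeled count, mean, std, min, 25%, 50%, 75%, and max.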
If we simply want the means, then we can use the mean method.
If we want the standard deviation, we can use std.
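A short sketch of mean and std, again on a made-up dataframe (the column names and values are assumptions chosen so the results are easy to check by hand):

```python
import pandas as pd

# Hypothetical data with easy-to-verify statistics.
df = pd.DataFrame({
    "Sepal.Length": [1.0, 2.0, 3.0],
    "Petal.Length": [10.0, 20.0, 30.0],
})

# Each call returns a Series with one value per column.
means = df.mean()
stds = df.std()

print(means)
print(stds)
```

Note that std computes the sample standard deviation by default (it divides by n - 1 rather than n).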
If we want the number of rows and the number of columns, then we can
use shape. Because shape is not a method but an instance
attribute (just like the self.value we saw in a previous
lesson), accessing it doesn't require any parentheses. It holds two
values: the first is the number of rows and the second is the number
of columns.
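For instance, with a small made-up dataframe (hypothetical values, not the lesson's data), shape can be read and unpacked like this:

```python
import pandas as pd

# Hypothetical data: 4 rows, 2 columns.
df = pd.DataFrame({
    "Sepal.Length": [5.1, 4.9, 6.3, 5.8],
    "Petal.Length": [1.4, 1.4, 6.0, 5.1],
})

# shape is an attribute, so there are no parentheses after it.
print(df.shape)  # (4, 2)

# The tuple can be unpacked into separate variables.
n_rows, n_cols = df.shape
```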
If we want the maximum or minimum, we can use max or min.
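A brief sketch of max and min on a made-up dataframe (the values are assumptions for illustration only):

```python
import pandas as pd

# Hypothetical stand-in data (not the lesson's dataset).
df = pd.DataFrame({
    "Sepal.Length": [5.1, 4.9, 6.3, 5.8],
    "Petal.Length": [1.4, 1.4, 6.0, 5.1],
})

# Like mean and std, these return one value per column.
maxima = df.max()
minima = df.min()

print(maxima)
print(minima)
```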
If we want to calculate a statistic for only one column, then we can use bracket notation along with the column name as a string, followed by the method of interest.
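As an illustration of the bracket notation, the sketch below calculates the mean of a single column. The Sepal.Width column and its values are hypothetical and chosen only for this example.

```python
import pandas as pd

# Hypothetical stand-in data (not the lesson's dataset).
df = pd.DataFrame({
    "Sepal.Length": [5.1, 4.9, 6.3, 5.8],
    "Sepal.Width": [3.5, 3.0, 3.3, 2.7],
})

# Select one column by name, then call the method on it.
# The result is a single number rather than one value per column.
width_mean = df["Sepal.Width"].mean()
print(width_mean)
```

The same pattern works with the other methods in this lesson, such as std, max, and min.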
Practice exercise
Please print out the maximum value in the Petal.Length column.