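Descriptive or summary statistics in Python pandas can be obtained with the describe() function. For numeric columns, describe() returns the count, mean, standard deviation (std), minimum, maximum, and the 25%, 50% and 75% percentiles, from which the IQR can be computed.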
- By default, for a DataFrame with both numeric and character columns, describe() excludes the character columns and gives summary statistics of the numeric columns only.
- Pass the argument include='all' to get the summary statistics or descriptive statistics of both numeric and character columns.
- Descriptive statistics or summary statistics of the entire DataFrame in pandas.
- Summary statistics or descriptive statistics of specific columns of a DataFrame in pandas.
Let's see with an example.
Example of Descriptive or Summary Statistics in Python
# creation of DataFrame
import pandas as pd
import numpy as np

# Create a dictionary of Series
d = {'Name': pd.Series(['Alisa', 'Bobby', 'Cathrine', 'Madonna', 'Rocky', 'Sebastian',
                        'Jaqluine', 'Rahul', 'David', 'Andrew', 'Ajay', 'Teresa']),
     'Age': pd.Series([26, 27, 25, 24, 31, 27, 25, 33, 42, 32, 51, 47]),
     'Score': pd.Series([89, 87, 67, 55, 47, 72, 76, 79, 44, 92, 99, 69])}

# Create a DataFrame
df = pd.DataFrame(d)
df
So the resultant DataFrame will be
Pandas – Descriptive or Summary Statistics of the numeric columns:
# summary statistics
df.describe()
- describe() gives the count, mean, std, min, max and the 25%, 50% and 75% percentiles. Here it excludes the character column and calculates summary statistics only for the numeric columns.
So the output will be
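The rows that describe() returns can be checked directly. The following is a minimal sketch that rebuilds the example DataFrame, lists the row labels of the summary, and derives the IQR from the 25% and 75% rows. The variable names summary and iqr are illustrative and not part of pandas.

```python
import pandas as pd

# Same data as the example DataFrame above
df = pd.DataFrame({
    'Name': ['Alisa', 'Bobby', 'Cathrine', 'Madonna', 'Rocky', 'Sebastian',
             'Jaqluine', 'Rahul', 'David', 'Andrew', 'Ajay', 'Teresa'],
    'Age': [26, 27, 25, 24, 31, 27, 25, 33, 42, 32, 51, 47],
    'Score': [89, 87, 67, 55, 47, 72, 76, 79, 44, 92, 99, 69],
})

# Default describe(): only the numeric columns (Age, Score) are summarised
summary = df.describe()

# Row labels: count, mean, std, min, 25%, 50%, 75%, max
print(list(summary.index))

# describe() has no IQR row; it can be derived from the quartile rows
iqr = summary.loc['75%'] - summary.loc['25%']
print(iqr)
```

Because the result is itself a DataFrame, single values can be read with .loc, for example summary.loc['mean', 'Age'].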
Pandas – Descriptive or Summary Statistics of all the columns:
# summary statistics of all the columns
df.describe(include='all')
describe() with include='all' gives the summary statistics of all the columns, both numeric and character. Statistics that do not apply to a column's type are shown as NaN.
So the output will be
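As a rough sketch of what include='all' produces, the code below rebuilds the example DataFrame and inspects the combined summary. The name summary_all is illustrative. The character column gets count, unique, top and freq, the numeric columns get the usual numeric rows, and the cells that do not apply are filled with NaN.

```python
import pandas as pd

# Same data as the example DataFrame above
df = pd.DataFrame({
    'Name': ['Alisa', 'Bobby', 'Cathrine', 'Madonna', 'Rocky', 'Sebastian',
             'Jaqluine', 'Rahul', 'David', 'Andrew', 'Ajay', 'Teresa'],
    'Age': [26, 27, 25, 24, 31, 27, 25, 33, 42, 32, 51, 47],
    'Score': [89, 87, 67, 55, 47, 72, 76, 79, 44, 92, 99, 69],
})

# Summarise every column, whatever its type
summary_all = df.describe(include='all')

# The row labels combine the character and the numeric statistics
print(list(summary_all.index))

# Numeric statistics are missing (NaN) for the character column ...
print(pd.isna(summary_all.loc['mean', 'Name']))

# ... and character statistics are missing (NaN) for the numeric columns
print(pd.isna(summary_all.loc['unique', 'Age']))
```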
Summary statistics of specific columns of a DataFrame in pandas:
# summary statistics of the Age column
df[["Age"]].describe()
- df[['col_name']].describe() gives the count, mean, std, min, max and percentiles of the specific column in pandas, so the output will be
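The column selection can be written in a few ways. This sketch, which again rebuilds the example DataFrame, contrasts double brackets (the result is a DataFrame), single brackets (the result is a Series), and a list of several columns. The variable names are illustrative only.

```python
import pandas as pd

# Same data as the example DataFrame above
df = pd.DataFrame({
    'Name': ['Alisa', 'Bobby', 'Cathrine', 'Madonna', 'Rocky', 'Sebastian',
             'Jaqluine', 'Rahul', 'David', 'Andrew', 'Ajay', 'Teresa'],
    'Age': [26, 27, 25, 24, 31, 27, 25, 33, 42, 32, 51, 47],
    'Score': [89, 87, 67, 55, 47, 72, 76, 79, 44, 92, 99, 69],
})

# Double brackets select a one-column DataFrame, so describe() returns a DataFrame
age_frame_summary = df[['Age']].describe()

# Single brackets select a Series, so describe() returns a Series
age_series_summary = df['Age'].describe()

# Several columns can be summarised at once by listing them
two_col_summary = df[['Age', 'Score']].describe()

print(type(age_frame_summary).__name__)   # DataFrame
print(type(age_series_summary).__name__)  # Series
print(age_series_summary['mean'])
```

Both forms hold the same numbers; the double-bracket form is convenient when the result should keep a column header.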
Pandas – Descriptive or Summary Statistics of the character columns:
# summary statistics of character column
df.describe(include=['object'])
- describe() with the include argument set to the object dtype, i.e. include=['object'] or include='object', gives the summary statistics of the character columns: count, unique, top and freq.
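A small sketch of the character-column summary follows. It assumes the Name column is stored with the object dtype, which is set explicitly here so that the selection does not depend on the default string dtype of the installed pandas version. The name char_summary is illustrative.

```python
import pandas as pd

# Same data as the example DataFrame above; Name is forced to dtype=object
# (an assumption of this sketch) so include=['object'] matches it
df = pd.DataFrame({
    'Name': pd.Series(['Alisa', 'Bobby', 'Cathrine', 'Madonna', 'Rocky', 'Sebastian',
                       'Jaqluine', 'Rahul', 'David', 'Andrew', 'Ajay', 'Teresa'],
                      dtype=object),
    'Age': [26, 27, 25, 24, 31, 27, 25, 33, 42, 32, 51, 47],
    'Score': [89, 87, 67, 55, 47, 72, 76, 79, 44, 92, 99, 69],
})

# Summarise only the object (character) columns
char_summary = df.describe(include=['object'])

# Rows: count (non-null values), unique (distinct values),
# top (most frequent value), freq (how often top occurs)
print(char_summary)
```

In this data every name occurs once, so unique equals count and freq is 1; top is then simply one of the names.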