Let’s see how to create hierarchical indexing (also called multiple indexing, or a MultiIndex) in a python pandas dataframe. We will convert a normal dataframe into a hierarchical dataframe. Let’s see it with an example.
Create Dataframe:
import pandas as pd
import numpy as np

# Create a DataFrame
d = {
    'Name': ['Alisa', 'Bobby', 'Cathrine', 'Alisa', 'Bobby', 'Cathrine',
             'Alisa', 'Bobby', 'Cathrine', 'Alisa', 'Bobby', 'Cathrine'],
    'Exam': ['Semester 1', 'Semester 1', 'Semester 1', 'Semester 1', 'Semester 1', 'Semester 1',
             'Semester 2', 'Semester 2', 'Semester 2', 'Semester 2', 'Semester 2', 'Semester 2'],
    'Subject': ['Mathematics', 'Mathematics', 'Mathematics', 'Science', 'Science', 'Science',
                'Mathematics', 'Mathematics', 'Mathematics', 'Science', 'Science', 'Science'],
    'Score': [62, 47, 55, 74, 31, 77, 85, 63, 42, 67, 89, 81]}

df = pd.DataFrame(d, columns=['Name', 'Exam', 'Subject', 'Score'])
df

The resultant dataframe has 12 rows and four columns: Name, Exam, Subject and Score.
Hierarchical indexing or multiple indexing in python pandas:
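# multiple indexing or hierarchical indexing
df1 = df.set_index(['Exam', 'Subject'])
df1

The set_index() function is used for indexing. Passing a list of column names creates one index level per column: the data is indexed first on the Exam column (the outer level) and then on the Subject column (the inner level).
So the resultant dataframe is a hierarchical dataframe, with Exam and Subject forming the index and Name and Score remaining as columns.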
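To make the effect of the two index levels concrete, here is a minimal sketch of how rows can be looked up once Exam and Subject form the index. It uses a shortened version of the data above (two students only), and the variable names semester1 and sem2_maths are chosen just for this illustration.

```python
import pandas as pd

# Shortened version of the example data (two students only)
df = pd.DataFrame({
    'Name': ['Alisa', 'Bobby'] * 4,
    'Exam': ['Semester 1'] * 4 + ['Semester 2'] * 4,
    'Subject': ['Mathematics', 'Mathematics', 'Science', 'Science'] * 2,
    'Score': [62, 47, 74, 31, 85, 63, 67, 89],
})
df1 = df.set_index(['Exam', 'Subject'])

# The indexed columns leave the column list and become index levels
print(list(df1.columns))
print(list(df1.index.names))

# A single label selects on the outer level (Exam)
semester1 = df1.loc['Semester 1']
print(semester1)

# A tuple gives one label per level (Exam, then Subject)
sem2_maths = df1.loc[('Semester 2', 'Mathematics'), :]
print(sem2_maths)
```

Selecting by the outer level alone returns the matching rows indexed by the remaining level, Subject.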
View Index:
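The details of the index can be viewed through the index attribute, as shown below
# View index
df1.index

So the result will be as follows. This is the output format of older pandas versions; from pandas 0.24 onwards the labels attribute is named codes, and recent versions print the index in a different layout.
MultiIndex(levels=[['Semester 1', 'Semester 2'], ['Mathematics', 'Science']],
           labels=[[0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1], [0, 0, 0, 1, 1, 1, 0, 0, 0, 1, 1, 1]],
           names=['Exam', 'Subject'])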
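Because the printed form of a MultiIndex differs between pandas versions, it can be easier to inspect the individual parts of the index directly. The following is a small sketch on a shortened version of the example data; names, nlevels, levels and get_level_values() are standard MultiIndex members, while the variable names are only illustrative.

```python
import pandas as pd

# Shortened version of the example data (two students only)
df = pd.DataFrame({
    'Name': ['Alisa', 'Bobby'] * 4,
    'Exam': ['Semester 1'] * 4 + ['Semester 2'] * 4,
    'Subject': ['Mathematics', 'Mathematics', 'Science', 'Science'] * 2,
    'Score': [62, 47, 74, 31, 85, 63, 67, 89],
})
df1 = df.set_index(['Exam', 'Subject'])
idx = df1.index

# Names and number of the index levels
print(list(idx.names))
print(idx.nlevels)

# Unique values stored for each level
exam_values = list(idx.levels[0])
subject_values = list(idx.levels[1])
print(exam_values, subject_values)

# Per-row value of one level, in row order
exam_per_row = list(idx.get_level_values('Exam'))
print(exam_per_row)
```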
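Swap the levels in the hierarchical index:
Now let’s swap the “Subject” and “Exam” levels of the index in the above hierarchical dataframe as shown below
# Swap the levels in multiple index
df1.swaplevel('Subject', 'Exam')

So the resultant swapped hierarchical dataframe has Subject as the outer level and Exam as the inner level. swaplevel() returns a new dataframe, so df1 itself is left unchanged.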
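As a rough illustration of the swap, the sketch below keeps the swapped result in a new variable and then sorts it. swaplevel() only exchanges the levels and does not reorder the rows, so sort_index() is used here to group the rows by the new outer level; this sorting step is an optional addition and not part of the original example. The data is a shortened version of the dataframe above.

```python
import pandas as pd

# Shortened version of the example data (two students only)
df = pd.DataFrame({
    'Name': ['Alisa', 'Bobby'] * 4,
    'Exam': ['Semester 1'] * 4 + ['Semester 2'] * 4,
    'Subject': ['Mathematics', 'Mathematics', 'Science', 'Science'] * 2,
    'Score': [62, 47, 74, 31, 85, 63, 67, 89],
})
df1 = df.set_index(['Exam', 'Subject'])

# swaplevel returns a new dataframe; df1 keeps its original level order
swapped = df1.swaplevel('Subject', 'Exam')
print(list(df1.index.names))
print(list(swapped.index.names))

# Rows keep their original order after the swap, so sort to group by Subject
swapped_sorted = swapped.sort_index()
print(swapped_sorted)
```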
Hierarchical indexing or multiple indexing in python pandas without dropping:
Now let’s create a hierarchical dataframe by multiple indexing without dropping those columns.
With drop=False, the Exam and Subject columns form the index and also remain in the dataframe as regular columns.
# multiple indexing or hierarchical indexing with drop=False
df1 = df.set_index(['Exam', 'Subject'], drop=False)
df1
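The difference between the default behaviour and drop=False can be checked by comparing the column lists of the two results. This is a minimal sketch on a shortened version of the example data; the names dropped and kept are used only for this comparison.

```python
import pandas as pd

# Shortened version of the example data (two students only)
df = pd.DataFrame({
    'Name': ['Alisa', 'Bobby'] * 4,
    'Exam': ['Semester 1'] * 4 + ['Semester 2'] * 4,
    'Subject': ['Mathematics', 'Mathematics', 'Science', 'Science'] * 2,
    'Score': [62, 47, 74, 31, 85, 63, 67, 89],
})

# Default (drop=True): the indexed columns are removed from the columns
dropped = df.set_index(['Exam', 'Subject'])

# drop=False: the indexed columns stay as columns as well
kept = df.set_index(['Exam', 'Subject'], drop=False)

print(list(dropped.columns))
print(list(kept.columns))
```

Both dataframes carry the same two-level index; only the set of remaining columns differs.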