Generate row number in pandas python

To generate row numbers in pandas python we can use the dataframe's index attribute or NumPy's arange() function. Row numbers can also start from a constant of our choice by adding that constant to the index. Row numbers within each group can be generated in a similar manner. Let’s see how to

  • Generate row number of the dataframe in pandas python using arange() function
  • Generate row number of the group, i.e. row number by group in pandas dataframe
  • Generate the column which contains row number and insert it at a column position of our choice
  • Generate the row number from a specific constant in pandas
  • Assign a value to each group in pandas python

First, let’s create a dataframe.

### Create dataframe
import pandas as pd
import numpy as np

data = {'Product':['Box','Bottles','Pen','Markers','Bottles','Pen','Markers','Bottles','Box','Markers','Markers','Pen'],
       'State':['Alaska','California','Texas','North Carolina','California','Texas','Alaska','Texas','North Carolina','Alaska','California','Texas'],
       'Sales':[14,24,31,12,13,7,9,31,18,16,18,14]}

df1=pd.DataFrame(data, columns=['Product','State','Sales'])

df1

df1 will be

[Figure: the df1 dataframe with 12 rows and the columns Product, State and Sales]

 

 

Generate row number of the dataframe in pandas python using arange() function:

To generate the row number of the dataframe in python pandas we will use NumPy's arange() function. arange() takes the number of rows, len(df1), as input (not the dataframe itself) and returns the integers 0 to len(df1) - 1, which become the row numbers.

 
#### row number using arange() in numpy

import numpy as np

df1['row_num'] = np.arange(len(df1))
print (df1)

So the resultant dataframe with row number will be

[Figure: df1 with a new row_num column numbered 0 to 11]
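
Since arange() works from a count rather than from the dataframe, the same call can start the numbering at 1 instead of 0. The following is a minimal sketch; the small `demo` dataframe and the `row_num_1based` column name are made up for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical three-row dataframe, used only for this illustration
demo = pd.DataFrame({'Product': ['Box', 'Bottles', 'Pen'],
                     'Sales': [14, 24, 31]})

# np.arange(n) yields 0 .. n-1
demo['row_num'] = np.arange(len(demo))

# Passing start=1 and stop=n+1 yields 1 .. n
demo['row_num_1based'] = np.arange(1, len(demo) + 1)
print(demo)
```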

 

 

Generate row number in pandas using the index attribute:

We can also generate the row number in pandas from the dataframe's index. Note that index is an attribute, not a function. After reset_index() the index holds the default values 0, 1, 2 and so on, which serve as the row numbers.

 
##### Generate row number using the index attribute

df1['row_num'] = df1.reset_index().index
df1

So the resultant dataframe with row number will be

[Figure: df1 with the row_num column numbered 0 to 11]
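
The reset_index() step matters once the index is no longer the default 0, 1, 2 sequence, for example after filtering rows. The sketch below uses a hypothetical `demo` dataframe and a made-up filter to show the difference.

```python
import pandas as pd

# Hypothetical dataframe, used only for this illustration
demo = pd.DataFrame({'Product': ['Box', 'Bottles', 'Pen', 'Markers'],
                     'Sales': [14, 24, 31, 12]})

# Filtering keeps the original index labels, so they no longer start at 0
subset = demo[demo['Sales'] > 20].copy()
print(list(subset.index))

# reset_index() builds a fresh 0 .. n-1 index; its values become the row numbers
subset['row_num'] = subset.reset_index().index
print(subset)
```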

 

 

Generate row number in pandas and insert the column at a position of our choice:

To generate the row number of the dataframe in python pandas we will again use the arange() function. The insert() function then places the new column at the position given by loc. In the example below we generate the row number and insert the column at location 0, i.e. as the first column.

 
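##### get row number of the dataframe and insert it as first column
##### drop the row_num column created earlier, because insert() raises a ValueError if the column already exists

df1 = df1.drop(columns='row_num', errors='ignore')
df1.insert(loc=0, column='row_num', value=np.arange(len(df1)))
df1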

So the resultant dataframe with row number generated and the column inserted at first position will be

[Figure: df1 with row_num as the first column]
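
The loc argument does not have to be a hard-coded number. As a sketch, the position can be looked up from an existing column name with columns.get_loc(); the `demo` dataframe and the choice of placing the new column right after Product are assumptions made for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical dataframe, used only for this illustration
demo = pd.DataFrame({'Product': ['Box', 'Bottles', 'Pen'],
                     'State': ['Alaska', 'California', 'Texas'],
                     'Sales': [14, 24, 31]})

# Position just after the Product column
pos = demo.columns.get_loc('Product') + 1

# insert() modifies the dataframe in place and returns None
demo.insert(loc=pos, column='row_num', value=np.arange(len(demo)))
print(demo.columns.tolist())
```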

 

 

Generate the row number from a specific constant in pandas:

We add a value (here 430) to the index to generate the row number, and the result is stored in a new column as shown below. As a result the row numbers start at 430 and continue with 431, 432 etc., since 430 is kept as the base. This relies on df1 having the default index 0, 1, 2 and so on.

### Generate row number from a constant of our choice
df1['New_ID'] = df1.index + 430
df1

So the resultant dataframe with row number generated from 430 will be

[Figure: df1 with a New_ID column numbered 430 to 441]
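
Adding a constant to the index only gives row numbers when the index is the default 0, 1, 2 sequence. A sketch of a variant that does not depend on the index, assuming a hypothetical dataframe whose index holds text labels:

```python
import numpy as np
import pandas as pd

# Hypothetical dataframe whose index holds labels rather than 0, 1, 2, ...
demo = pd.DataFrame({'Sales': [14, 24, 31]},
                    index=['Box', 'Bottles', 'Pen'])

start = 430  # same base value as in the example above

# Build the numbers from the row count, so the index labels do not matter
demo['New_ID'] = np.arange(start, start + len(demo))
print(demo)
```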

 

 

Generate row number of the dataframe by group in pandas:

To generate the row number of the dataframe by group in pandas we will use the groupby() and cumcount() functions. groupby() takes the dataframe column(s) to group by as input, and cumcount() numbers the rows within each group starting from 0, so we add 1 to start the numbering at 1.

 
##### Row number by group

df1['row_number_by_group']=df1.groupby(['Product'])['Sales'].cumcount()+1
df1

So the resultant dataframe with row number generated by group is

[Figure: df1 with a row_number_by_group column that restarts at 1 for each Product]
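
cumcount() numbers rows in the order they appear in the dataframe. If the numbering inside each group should follow a value instead, one possible approach is to sort first and let pandas align the result back by index. This is a sketch with a hypothetical `demo` dataframe and a made-up column name `sales_rank_in_group`.

```python
import pandas as pd

# Hypothetical dataframe, used only for this illustration
demo = pd.DataFrame({'Product': ['Box', 'Pen', 'Box', 'Pen', 'Box'],
                     'Sales': [14, 31, 18, 7, 9]})

# Sort by Sales (highest first), number the rows inside each Product group,
# then assign: pandas aligns the result to the original rows by index
demo['sales_rank_in_group'] = (
    demo.sort_values('Sales', ascending=False)
        .groupby('Product')
        .cumcount() + 1
)
print(demo)
```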

 

Assign the value for each group in pandas:

We can assign a number to each group in pandas using the groupby() and ngroup() functions. In our example each distinct product group gets a number, following the sorted order of the group names: Bottles is 0, Box is 1, Markers is 2 and Pen is 3.

 
### Assign a number to the group

df1['group_number']=df1.groupby(['Product'])['Sales'].ngroup()
df1

So the resultant dataframe will be

[Figure: df1 with a group_number column]
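
With the default groupby() settings the group numbers follow the sorted group names. If the numbers should instead follow the order in which each product first appears, pd.factorize() is one option. The sketch below uses a hypothetical `demo` dataframe, and `first_seen_number` is a made-up column name.

```python
import pandas as pd

# Hypothetical dataframe, used only for this illustration
demo = pd.DataFrame({'Product': ['Box', 'Bottles', 'Pen', 'Markers', 'Bottles', 'Box']})

# Default groupby sorts the group keys, so numbers follow alphabetical order
demo['group_number'] = demo.groupby('Product').ngroup()

# pd.factorize() numbers the values in order of first appearance instead
demo['first_seen_number'] = pd.factorize(demo['Product'])[0]
print(demo)
```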

 


Author

  • Sridhar Venkatachalam

    With close to 10 years of experience in data science and machine learning. Has worked extensively with programming languages like R, Python (Pandas), SAS and PySpark.
