A substring of a pandas column can be replaced with the replace() function. Let’s see how to:
- Replace a substring/string in an entire pandas dataframe
- Replace multiple strings/substrings in multiple columns at once in an entire pandas dataframe
- Replace a pattern in an entire pandas dataframe
- Replace a space with an underscore in a column of a pandas dataframe
- Replace a substring with another substring in a pandas column
- Replace a pattern of a substring with another substring using a regular expression in a pandas column
- Replace multiple strings/substrings at once in a column of a pandas dataframe
Each task is shown with an example.
First let’s create a dataframe
    import pandas as pd
    import numpy as np

    #Create a DataFrame
    df1 = {
        'State':['zona AZ','Georgia GG','Newyork NY','Indiana IN','Florida FL','Maharashtra','Delhi'],
        'Country':['US','US','US','US','US','India','India']}
    df1 = pd.DataFrame(df1,columns=['State','Country'])
    df1
df1 is a dataframe with seven rows and two string columns, State and Country.
Replace a substring with another substring in pandas
df1.replace(regex=['zona'], value='Arizona')
The substring zona is replaced with the string Arizona, so the first State value changes from zona AZ to Arizona AZ in the resultant dataframe.
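The regex form matters here: without it, replace() compares whole cell values rather than substrings. A minimal sketch of the difference, using a cut-down copy of the tutorial's dataframe (the names `exact` and `partial` are illustrative only):

```python
import pandas as pd

# Cut-down copy of the tutorial's dataframe
df = pd.DataFrame({'State': ['zona AZ', 'Georgia GG'],
                   'Country': ['US', 'US']})

# Exact-match mode: no cell is exactly 'zona', so nothing changes
exact = df.replace('zona', 'Arizona')

# Regex mode: 'zona' is matched inside the longer string
partial = df.replace(regex=['zona'], value='Arizona')

print(exact['State'].tolist())
print(partial['State'].tolist())
```

Both calls return a new dataframe; `df` itself is left unchanged.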
Replace a word/Text/substring in Entire pandas Dataframe:
The substring India is replaced with Bharat across the entire dataframe by calling the replace function with the regex=True argument, as shown below.
    ## Replace a word/Text in Entire Dataframe using Regex
    # Method 1
    df2 = df1.replace('India','Bharat', regex=True)
    df2
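Result: India becomes Bharat in the Country column. Because the match is on substrings, Indiana IN in the State column also changes, to Bharatna IN.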
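A plain substring pattern matches India wherever it appears, including inside Indiana. If only the whole word should change, one option is a word-boundary pattern. A minimal sketch, assuming a two-row extract of the dataframe (the names `loose` and `strict` are illustrative):

```python
import pandas as pd

# Two rows taken from the tutorial's data
df = pd.DataFrame({'State': ['Indiana IN', 'Delhi'],
                   'Country': ['US', 'India']})

# Plain substring pattern: also hits the 'India' inside 'Indiana'
loose = df.replace('India', 'Bharat', regex=True)

# Word-boundary pattern: only matches 'India' as a whole word
strict = df.replace(r'\bIndia\b', 'Bharat', regex=True)

print(loose)
print(strict)
```

The `\b` anchors come from Python's regular expression syntax, which pandas uses for regex replacement.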
Replace a pattern of substring using regular expression:
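Using a regular expression, we replace the first character of every string value in the dataframe with the substring ‘HE’. The pattern ^. matches any single character at the start of a string.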
df1.replace(regex=['^.'],value='HE')
In the resultant dataframe, zona AZ becomes HEona AZ, US becomes HES, and likewise for every other value in both columns.
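To limit the same pattern to a single column, one option is the Series-level str.replace() with regex=True. A minimal sketch on a two-row extract (the `out` copy is only there to keep the original frame intact):

```python
import pandas as pd

df = pd.DataFrame({'State': ['zona AZ', 'Delhi'],
                   'Country': ['US', 'India']})

# Work on a copy so the original frame stays untouched
out = df.copy()

# ^. matches the first character of each string, in the State column only
out['State'] = out['State'].str.replace(r'^.', 'HE', regex=True)

print(out)
```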
Replace multiple values in multiple columns at once in pandas dataframe:
In the example below, we replace multiple values across the dataframe at once by applying the replace function, with a dictionary of patterns and regex=True, to each column.
Method 1:
    #Replace Multiple values at once in entire pandas dataframe
    #Method 1
    df2 = df1.apply(lambda x: x.replace({'zona':'Arizona', 'US':'United States'}, regex=True))
    df2
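Result: zona AZ becomes Arizona AZ in the State column, and every US in the Country column becomes United States.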
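The same mapping can also be passed directly through the regex parameter of DataFrame.replace(), which avoids the apply() call. A minimal sketch comparing the two forms on a two-row extract (`via_apply` and `via_regex` are illustrative names):

```python
import pandas as pd

df = pd.DataFrame({'State': ['zona AZ', 'Delhi'],
                   'Country': ['US', 'India']})

# Pattern -> replacement pairs, applied to every column
mapping = {'zona': 'Arizona', 'US': 'United States'}

# Form used in Method 1: call Series.replace on each column
via_apply = df.apply(lambda col: col.replace(mapping, regex=True))

# Alternative: hand the mapping to DataFrame.replace as regex=
via_regex = df.replace(regex=mapping)

print(via_regex)
```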
Method 2:
In this example, we show how to replace part of a string by using the regex=True param. To update multiple string columns, pass dicts keyed by column name. The example below replaces zona with Arizona in the State column and US with United States in the Country column.
    # Method 2 Replace multiple values at multiple columns
    df2 = df1.replace({'State': 'zona', 'Country': 'US'}, {'State': 'Arizona', 'Country': 'United States'}, regex=True)
    df2
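Result: zona AZ becomes Arizona AZ and US becomes United States, with each pattern searched for only in the column it is keyed to.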
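Unlike Method 1, the column-keyed dicts restrict each pattern to its own column. A minimal sketch of that scoping; the 'US Virgin Islands' row is made up for this illustration and is not part of the tutorial's data:

```python
import pandas as pd

# 'US Virgin Islands' is a hypothetical row added only to show column scoping
df = pd.DataFrame({'State': ['zona AZ', 'US Virgin Islands'],
                   'Country': ['US', 'US']})

# 'zona' is searched for in State only, 'US' in Country only
out = df.replace({'State': 'zona', 'Country': 'US'},
                 {'State': 'Arizona', 'Country': 'United States'},
                 regex=True)

print(out)
```

The US inside the State value is left alone because that pattern is keyed to the Country column.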
Replace a pattern in the entire pandas dataframe:
In this example we replace a pattern across the entire dataframe: India is replaced with IND in all columns, as shown below.
    ### Replace a Pattern of entire dataframe pandas
    df2=df1.replace(regex=['India'],value='IND')
    df2
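Result: India becomes IND in the Country column, and Indiana IN in the State column becomes INDna IN because it also contains the pattern.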
Replace a pattern in a particular column of a pandas dataframe:
In this example we replace a pattern in one particular column of the dataframe: India is replaced with IND in the Country column using regex, as shown below.
Method 1:
    ### Replace a Pattern of the particular column in dataframe pandas
    ## Method 1
    df1['Country']=df1['Country'].replace(regex=['India'],value='IND')
    df1
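Result: India becomes IND in the Country column only; the State column, including Indiana IN, is unchanged.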
Method 2:
In this example we again replace a pattern in one particular column of the dataframe: India is replaced with IND in the Country column, this time using the str.replace() function, as shown below.
    ## Method 2: str.replace()
    df1['Country'] = df1['Country'].str.replace('India','IND')
    df1
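Result: the Country column reads IND wherever it read India. If Method 1 has already been run on df1, there is nothing left to replace.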
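str.replace() accepts a regex parameter whose default has differed between pandas versions, so it can be worth passing it explicitly. A minimal sketch of how the flag changes the outcome; the sample values are hypothetical and were chosen because parentheses are regex metacharacters:

```python
import pandas as pd

# Hypothetical values, not part of the tutorial's dataframe
s = pd.Series(['Texas (US)', 'US'])

# regex=False: '(US)' is matched literally, parentheses included
literal = s.str.replace('(US)', 'USA', regex=False)

# regex=True: '(US)' is a capture group that matches the text 'US'
pattern = s.str.replace('(US)', 'USA', regex=True)

print(literal.tolist())
print(pattern.tolist())
```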
Replace space with underscore:
df1.replace(regex=[' '], value='_')
Each space is replaced with an underscore (_) in every string value, so in the resultant dataframe zona AZ becomes zona_AZ.
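The pattern ' ' replaces each space separately, so a run of two spaces yields two underscores. If runs of whitespace should collapse into a single underscore, the pattern \s+ is one option. A minimal sketch; the double space in the first value is a hypothetical input added for illustration:

```python
import pandas as pd

# The double space in 'Newyork  NY' is a made-up input for this sketch
df = pd.DataFrame({'State': ['Newyork  NY', 'Florida FL']})

# One underscore per space
single = df.replace(regex=[' '], value='_')

# One underscore per run of whitespace
collapsed = df.replace(regex=[r'\s+'], value='_')

print(single['State'].tolist())
print(collapsed['State'].tolist())
```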