sub() and gsub() are replacement functions in R: each replaces occurrences of a substring with another substring. Both can be applied to a character vector or to a column of a dataframe, and both accept a regular expression as the pattern to search for. Let's see an example for each:
- sub() function in R replaces the first instance of a substring
- gsub() function in R replaces all the instances of a substring
- Replacing occurrences of a string in a column of an R dataframe using sub() and gsub()
- Replacing occurrences of a string in a vector using sub() and gsub()
Syntax for sub() and gsub() function in R:
- sub(old, new, string)
- gsub(old, new, string)
old – Existing pattern to be replaced.
new – New string to be used as the replacement.
string – String, character vector or dataframe column in which the replacement is made.
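In base R the three arguments are formally named pattern, replacement and x; old, new and string above are descriptive labels for them. A minimal sketch of both calls with named arguments follows. The sample string and variable names are made up for illustration.

```r
# Base R names the three arguments pattern, replacement and x.
# The sample string below is an illustrative assumption, not from the examples above.
greeting <- "hello world, hello R"

# sub() replaces only the first match
first_only <- sub(pattern = "hello", replacement = "goodbye", x = greeting)

# gsub() replaces every match
every_one <- gsub(pattern = "hello", replacement = "goodbye", x = greeting)

print(first_only)  # "goodbye world, hello R"
print(every_one)   # "goodbye world, goodbye R"
```

Naming the arguments is optional, but it makes the call easier to read when extra options are passed.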
Example of sub() function in R:
sub() function in R replaces only the first occurrence of a substring: it finds the first instance of the old substring and replaces it with the new substring. Let's see an example.
# sub function in R
mysentence <- "England is Beautiful. England is not the part of EU"
sub("England", "UK", mysentence)
Only the first occurrence of England is replaced with UK, so the output will be
[1] "UK is Beautiful. England is not the part of EU"
Example of gsub() function in R:
gsub() function in R is the global replace function: it replaces all instances of the substring, not just the first. Let's see the same example.
# gsub function in R
mysentence <- "England is Beautiful. England is not the part of EU"
gsub("England", "UK", mysentence)
All the occurrences of England are replaced with UK, so the output will be
[1] "UK is Beautiful. UK is not the part of EU"
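Both functions are vectorised, so the same calls work on a character vector with several elements. The sketch below uses a made-up vector (countries is a hypothetical name) to show that sub() replaces the first match within each element, while gsub() replaces every match within each element.

```r
# Hypothetical character vector used only for illustration
countries <- c("England, England", "Scotland", "New England")

# First match in each element is replaced
first_per_element <- sub("England", "UK", countries)

# Every match in each element is replaced
all_per_element <- gsub("England", "UK", countries)

print(first_per_element)  # "UK, England" "Scotland" "New UK"
print(all_per_element)    # "UK, UK" "Scotland" "New UK"
```

Elements that contain no match, such as "Scotland", are returned unchanged.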
Example of gsub() function with regular expression in R:
The old argument in the syntax can be a regular expression, which lets you match a pattern rather than a fixed substring. Let's see an example.
# gsub function in R with regular expression
mysentence <- "UK is Beautiful. UK is not the part of EU since 2016"
# [0-9]+ matches each run of one or more digits
gsub("[0-9]+", "", mysentence)
In the above example we have removed all the digits from the sentence with the help of a regular expression. The space before the number is kept, so the output will be
[1] "UK is Beautiful. UK is not the part of EU since "
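Because the pattern is treated as a regular expression by default, characters such as "." have a special meaning. The sketch below, using a made-up version string, compares the default behaviour with the fixed = TRUE argument of gsub(), which treats the pattern as literal text, and with an escaped pattern.

```r
# Hypothetical input used only for illustration
version <- "v1.2.3"

# "." is a regex metacharacter that matches any single character
as_regex <- gsub(".", "-", version)

# fixed = TRUE switches off regex interpretation
as_literal <- gsub(".", "-", version, fixed = TRUE)

# Escaping the dot gives the same result with a regex
escaped <- gsub("\\.", "-", version)

print(as_regex)    # "------"
print(as_literal)  # "v1-2-3"
print(escaped)     # "v1-2-3"
```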
Example of gsub() function in the column of a dataframe:
First, let's create the dataframe as shown below
df = data.frame(
  NAME = c('Alisa','Bobby','jodha','jack','raghu','Cathrine',
           'Alisa','Bobby','kumar','Alisa','jack','Cathrine'),
  Age = c(26,24,26,22,23,24,26,24,22,26,22,25),
  Score = c(85,63,55,74,31,77,85,63,42,85,74,78))
df
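So the resultant dataframe will be
       NAME Age Score
1     Alisa  26    85
2     Bobby  24    63
3     jodha  26    55
4      jack  22    74
5     raghu  23    31
6  Cathrine  24    77
7     Alisa  26    85
8     Bobby  24    63
9     kumar  22    42
10    Alisa  26    85
11     jack  22    74
12 Cathrine  25    78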
gsub() function in the column of R dataframe to replace a substring:
gsub() function can also be applied to a column of a dataframe in R. Let's see the example below.
## Replace substring of the column in R dataframe
df$NAME = gsub("A", "E", df$NAME)
df
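As mentioned, every occurrence of “A” is replaced with “E”. The match is case-sensitive, so only the uppercase “A” is affected: each “Alisa” becomes “Elisa” and all the other names in the resultant dataframe are unchanged.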
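Matching in gsub() is case-sensitive by default. If lowercase letters should be replaced as well, the ignore.case argument can be set to TRUE. A minimal sketch on a small made-up vector (names_vec is a hypothetical name, separate from the dataframe above):

```r
# Hypothetical vector used only for illustration
names_vec <- c("Alisa", "jack", "Cathrine")

# Default: only the uppercase "A" matches
case_sensitive <- gsub("A", "E", names_vec)

# ignore.case = TRUE matches both "A" and "a"
case_insensitive <- gsub("a", "E", names_vec, ignore.case = TRUE)

print(case_sensitive)    # "Elisa" "jack" "Cathrine"
print(case_insensitive)  # "ElisE" "jEck" "CEthrine"
```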
gsub() function with regular expression in the column of R dataframe:
gsub() function in R can be combined with a regular expression to match a pattern in every value of a dataframe column. Here the pattern anchors the match to the start of each name, so the replacement text is added as a prefix. Let's see the example below.
## Replace substring of the column in R dataframe using REGEX
## "^" matches the empty position at the start of each string
df$NAME = gsub("^", "MR/MRS.", df$NAME)
df
As mentioned, “MR/MRS.” is added in front of every value of the NAME column using the regular expression, so in the resultant dataframe “Elisa” becomes “MR/MRS.Elisa”, “Bobby” becomes “MR/MRS.Bobby”, and so on.
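Adding a prefix can be written in several equivalent ways. The sketch below, on a made-up vector (people is a hypothetical name), compares the "^" anchor, a capture group with the \\1 backreference, and paste0(), which avoids regular expressions entirely.

```r
# Hypothetical vector used only for illustration
people <- c("Alisa", "Bobby")

# Insert the prefix at the start-of-string anchor
with_anchor <- sub("^", "MR/MRS.", people)

# Capture the whole string and re-emit it after the prefix
with_backref <- sub("^(.*)$", "MR/MRS.\\1", people)

# Plain string concatenation, no regex involved
with_paste <- paste0("MR/MRS.", people)

print(with_anchor)  # "MR/MRS.Alisa" "MR/MRS.Bobby"
print(identical(with_anchor, with_backref))  # TRUE
print(identical(with_anchor, with_paste))    # TRUE
```

For a fixed prefix, paste0() is usually the simplest choice. The regex forms are more useful when the prefix should only be added to values that match some pattern.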