In this chapter, let's learn how to perform a union in R for vectors and data frames. The base union() function returns the union of two vectors; for data frames, the dplyr package provides a union() method. The union of two data frames can also be accomplished using other roundabout methods, which we discuss below.
union() function in R – union of vectors:
# union in R - union of two vectors in R
x <- c(1:4)
y <- c(2:7)
union(x, y)
On execution of the above code, the output is the union of the two vectors, with each value listed once:
[1] 1 2 3 4 5 6 7
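Base R's union() has the signature union(x, y), so it combines two vectors at a time. A minimal sketch of one way to take the union of more than two vectors is to fold union() over a list with Reduce(). The third vector z is a hypothetical addition for illustration and is not part of the original example.

```r
# x and y come from the example above; z is a hypothetical third vector
x <- c(1:4)
y <- c(2:7)
z <- c(7, 7, 9)

# Fold the two-argument union() over a list of vectors
all_values <- Reduce(union, list(x, y, z))
all_values

# Duplicates inside a single input are dropped as well
small_union <- union(c(1, 1, 2), c(2, 3))
small_union
```

Reduce() applies union() to the first two vectors, then to that result and the next vector, and so on, so the same pattern should work for a list of any length.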
union of data frames in R :
The union of two data frames can be easily achieved using the merge() function. Let's see this with an example. First, let's create two data frames:
# Create two data frames
df1 <- data.frame(CustomerId = c(1:6), Product = c(rep("Oven", 3), rep("Television", 3)))
df2 <- data.frame(CustomerId = c(4:7), Product = c(rep("Television", 2), rep("Air conditioner", 2)))
df1 will be
  CustomerId    Product
1          1       Oven
2          2       Oven
3          3       Oven
4          4 Television
5          5 Television
6          6 Television
df2 will be
  CustomerId         Product
1          4      Television
2          5      Television
3          6 Air conditioner
4          7 Air conditioner
Example 1 : union of two dataframes using merge()
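The merge() function takes the two data frames as arguments and, with the option all=TRUE, returns their union. Because no by argument is given, merge() matches rows on all the columns the two data frames share (here CustomerId and Product), as shown below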
# union in R - union of data frames in R
df_union1 <- merge(df1, df2, all = TRUE)
df_union1
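The resultant data frame contains eight rows: the rows common to df1 and df2 (customers 4 and 5 with Television) appear only once, alongside the rows found in only one of the two data frames.
Thus we have applied union in R for data frames.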
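The merge() call works as a union here because it joins on every shared column. As a hedged sketch, the key columns can be named explicitly through the by argument so that the behavior does not depend on merge()'s default. The row-count and duplicate checks are illustrative additions, not part of the original example.

```r
# Same data frames as in the example above
df1 <- data.frame(CustomerId = c(1:6), Product = c(rep("Oven", 3), rep("Television", 3)))
df2 <- data.frame(CustomerId = c(4:7), Product = c(rep("Television", 2), rep("Air conditioner", 2)))

# Name the join columns explicitly; all = TRUE keeps unmatched rows from both sides
df_union1 <- merge(df1, df2, by = c("CustomerId", "Product"), all = TRUE)

nrow(df_union1)            # number of rows in the merged result
any(duplicated(df_union1)) # TRUE would indicate repeated rows
```

Note that merge() performs a join, so rows that are duplicated within a single input are not collapsed the way unique() would collapse them. The approach gives a clean union only when each data frame has no repeated rows of its own, as is the case for df1 and df2.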
Example 2 : union of two dataframes using the union() function from dplyr
The union() function from the dplyr package combines all rows from both tables and removes duplicate records from the combined dataset, so the resultant dataframe will not have any duplicates.
library(dplyr)
# union two dataframes without duplicates
union(df1, df2)
The resultant dataframe contains the eight distinct rows found across df1 and df2, with the rows shared by both listed once.
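To make the duplicate removal visible, the sketch below compares dplyr's union() with union_all(), which keeps every row from both inputs. It assumes the dplyr package is installed. The variable names with_dups and no_dups are hypothetical and chosen only for this illustration.

```r
library(dplyr)

# Same data frames as in the example above
df1 <- data.frame(CustomerId = c(1:6), Product = c(rep("Oven", 3), rep("Television", 3)))
df2 <- data.frame(CustomerId = c(4:7), Product = c(rep("Television", 2), rep("Air conditioner", 2)))

# union_all() stacks the rows and keeps duplicates
with_dups <- union_all(df1, df2)

# union() stacks the rows and drops duplicates
no_dups <- union(df1, df2)

nrow(with_dups)
nrow(no_dups)
```

The difference between the two row counts equals the number of rows that occur in both data frames, which is a quick way to check how much the inputs overlap.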
Example 3 on union of data frames using rbind() function:
There is also an indirect way to take the union of two data frames in R. It can be done in two steps:
- First, row bind (rbind) all the data frames
- Then remove the duplicates
These two steps have to be done in sequence, as the example that follows explains.
Row bind these two data frames as shown below
# row bind the data frames
df_cat <- rbind(df1, df2)
df_cat
so the resultant df_cat data frame will be
   CustomerId         Product
1           1            Oven
2           2            Oven
3           3            Oven
4           4      Television
5           5      Television
6           6      Television
7           4      Television
8           5      Television
9           6 Air conditioner
10          7 Air conditioner
Retrieve only unique rows from the above df_cat data frame as shown below.
# unique function
df_union <- unique(df_cat)
df_union
So the final output will be
   CustomerId         Product
1           1            Oven
2           2            Oven
3           3            Oven
4           4      Television
5           5      Television
6           6      Television
9           6 Air conditioner
10          7 Air conditioner
Thus we have learned how to apply union of data frames indirectly.
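The two steps can also be wrapped into a single function. The following is a minimal sketch, assuming all inputs have identical column names. The helper name union_df is hypothetical and is not provided by base R or dplyr.

```r
# Hypothetical helper: union of any number of data frames with identical columns
union_df <- function(...) {
  combined <- do.call(rbind, list(...))  # step 1: row bind all inputs
  result <- unique(combined)             # step 2: drop duplicate rows
  rownames(result) <- NULL               # renumber rows from 1
  result
}

# Same data frames as in the example above
df1 <- data.frame(CustomerId = c(1:6), Product = c(rep("Oven", 3), rep("Television", 3)))
df2 <- data.frame(CustomerId = c(4:7), Product = c(rep("Television", 2), rep("Air conditioner", 2)))

df_union <- union_df(df1, df2)
df_union
```

Resetting the row names is a design choice. unique() keeps the row names of the rows it retains, which is why the output above jumps from 6 to 9. The helper renumbers them so that the result reads as a fresh data frame.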
For further details on the union of dataframes using the dplyr package in R, refer to the dplyr documentation.