The by() function in R applies a function to specified subsets of a data frame. Its first argument is the data, the second is the factor (or list of factors) that defines the subsets, and the third is the function to apply to each subset.
Syntax of by() function in R:
by(data, data$byvar, FUN)
Argument | Description |
data | an R object, normally a data frame, possibly a matrix. |
data$byvar | a factor or a list of factors by which the function is applied (the INDICES argument of by()). |
FUN | a function to be applied to the subsets of data. |
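As a rough illustration of how the three arguments fit together, here is a minimal sketch on a small made-up data frame. The scores data frame and its columns group and value are hypothetical and exist only for this example. INDICES is the name base R gives to the second argument.

```r
# Toy data frame (hypothetical, for illustration only)
scores <- data.frame(
  group = factor(c("a", "a", "b", "b")),
  value = c(1, 3, 10, 20)
)

# data    = the values to summarise
# INDICES = the factor that defines the subsets
# FUN     = the function applied to each subset
group_means <- by(data = scores$value, INDICES = scores$group, FUN = mean)
print(group_means)
```

Passing the arguments by name is optional. It is done here only to make the role of each argument explicit.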
Example of by() function in R:
Let's use the iris and mtcars data sets to demonstrate the by() function in R. To find the mean of Sepal.Length for each species, we can use by().
# by() function in R with mean
mydata <- iris
by(mydata$Sepal.Length, list(mydata$Species), mean)
In the above example, the mean of Sepal.Length is calculated for each distinct Species, so the output will be
: setosa
[1] 5.006
------------------------------------------------------------
: versicolor
[1] 5.936
------------------------------------------------------------
: virginica
[1] 6.588
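The example above passes a single column to by(). When a whole data frame is passed instead, the function receives all rows of each subset as a data frame, so it can combine several columns. The sketch below is one possible illustration of that. The choice of the correlation between Sepal.Length and Petal.Length, and the name sepal_petal_cor, are assumptions made for this example.

```r
# FUN receives the rows of iris for one species as a data frame,
# so it can use more than one column at a time
sepal_petal_cor <- by(iris, iris$Species, function(d) {
  cor(d$Sepal.Length, d$Petal.Length)
})
print(sepal_petal_cor)
```

The result has one value per species, printed in the same block layout as the output above.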
by() function in R with more than one grouping factor:
Let's use the mtcars data set for one more example.
# by() function in R with a list of more than one factor
mydata <- mtcars
by(mydata$hp, list(mydata$gear, mydata$cyl), mean)
In the above example, the mean of hp is calculated for each distinct combination of gear and cyl. In each block of the output the first label is gear and the second is cyl, so the output will be
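: 3
: 4
[1] 97
------------------------------------------------------------
: 4
: 4
[1] 76
------------------------------------------------------------
: 5
: 4
[1] 102
------------------------------------------------------------
: 3
: 6
[1] 107.5
------------------------------------------------------------
: 4
: 6
[1] 116.5
------------------------------------------------------------
: 5
: 6
[1] 175
------------------------------------------------------------
: 3
: 8
[1] 194.1667
------------------------------------------------------------
: 4
: 8
[1] NA
------------------------------------------------------------
: 5
: 8
[1] 299.5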
The mean of hp for gear = 5 and cyl = 8 is 299.5, and so on. The NA for gear = 4 and cyl = 8 means that mtcars contains no cars with that combination.
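The unlabeled blocks in this output can be hard to read. One way to make them clearer, sketched below, is to name the elements of the list so that each block is labeled with gear and cyl. The names hp_means and hp_matrix are chosen for this example only, and dropping the class with unclass() is just one possible way to look up a single combination.

```r
mydata <- mtcars

# Naming the list elements labels each block in the printed output
hp_means <- by(mydata$hp, list(gear = mydata$gear, cyl = mydata$cyl), mean)
print(hp_means)

# Drop the "by" class to index the underlying gear x cyl matrix by name
hp_matrix <- unclass(hp_means)
hp_matrix["5", "8"]          # mean hp for gear = 5 and cyl = 8
is.na(hp_matrix["4", "8"])   # TRUE when no cars have gear = 4 and cyl = 8
```

The row and column names of the matrix are character strings, which is why the gear and cyl values are quoted in the lookup.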