The tapply() function in R applies statistical operations such as mean, median, min, or max to ragged arrays, that is, data whose groups can differ in size. We can also apply our own user-written functions to every level of a factor.
The function splits the input vector into subsets according to a factor and then applies the operation to each subset. For example, suppose the marks of the boys and girls in an educational institute are stored as a ragged array and we want the mean for each group. We can use the
tapply() method with the factor variable gender, which splits the marks into a male subset and a female subset.
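The marks example above can be sketched as follows. The marks and gender values are made up purely for illustration, and the variable names are assumptions rather than anything from a real dataset.

```r
# Hypothetical marks for seven students (illustrative values only).
marks  <- c(72, 85, 90, 64, 78, 88, 70)
gender <- factor(c("male", "female", "female", "male", "male", "female", "male"))

# The groups differ in size (4 male, 3 female), which is what makes the data ragged.
mean_marks <- tapply(marks, gender, mean)
print(mean_marks)
```

The result has one entry per factor level, named after that level, so a single group can be looked up with mean_marks[["female"]].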
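tapply(X, INDEX, FUN = NULL, ..., default = NA, simplify = TRUE) # signature
- X: The input data, typically a vector.
- INDEX: The factor (or list of factors) used to split the data into groups. It must have the same length as X.
- FUN: The operation that you want to apply to each group of the input data.
- ...: Additional arguments that are passed on to FUN, much as *args and **kwargs do in Python.
Note that R is case-sensitive, so the argument names INDEX and FUN must be written in upper case when they are given by name.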
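As a minimal sketch of how the extra arguments and a user-written function fit together, the example below forwards na.rm = TRUE to FUN through the dots. The data is invented, and value_range is a hypothetical helper written only for this illustration.

```r
# Hypothetical scores with one missing value (illustrative data only).
scores <- c(10, NA, 30, 40, 50)
group  <- factor(c("a", "a", "b", "b", "b"))

# Without na.rm, a group that contains NA produces NA.
plain <- tapply(scores, group, mean)

# na.rm = TRUE is not an argument of tapply(); it is forwarded to mean() via `...`.
clean <- tapply(scores, group, mean, na.rm = TRUE)

# A user-written function is applied the same way (value_range is a made-up helper).
value_range <- function(x, na.rm = FALSE) {
  max(x, na.rm = na.rm) - min(x, na.rm = na.rm)
}
spread <- tapply(scores, group, value_range, na.rm = TRUE)

print(plain)
print(clean)
print(spread)
```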
So, now you know why we use tapply() in R: to apply functions to ragged arrays. Let's walk through a coding example to learn how
tapply() can be used on real data. For this we use the iris dataset, which ships with R and is one of the best-known datasets in machine learning and data analysis.
    # loading library tidyverse
    library(tidyverse)

    # printing the head of dataset iris
    print(" Head of the dataset:")
    head(iris)

    # applying tapply method to get the median of the width of the sepals of all the species regarding iris flower
    print("Median:")
    tapply(iris$Sepal.Width, iris$Species, median)
- Line#2: Here we load the tidyverse library, an open-source collection of packages for data science. Both tapply() and iris are part of base R, so the example would also run without it.
- Line#5: This line prints a label for the head of the iris dataset; head(iris) on the following line then displays the first six rows.
- Line#10: Here we use the tapply() method to find the median sepal width for each species of iris. Sepal.Width is the input, Species is the factor vector, and median is the function applied to each group.
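The value returned by the call on Line#10 can also be stored and inspected. The following is a small sketch that uses only base R; the variable name medians is an assumption introduced for illustration.

```r
# iris ships with base R (the datasets package), so no extra library is loaded here.
medians <- tapply(iris$Sepal.Width, iris$Species, median)

# The result is a named array with one entry per species.
print(names(medians))

# Individual groups can be looked up by species name.
print(medians[["setosa"]])

# which.max() picks out the species with the largest median sepal width.
print(names(which.max(medians)))
```

Storing the result this way makes it easy to reuse the per-group values later, for example to compare species or to pass them to a plotting function.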