When dealing with efficiency issues, a fast way to compare two alternative functions can be really useful.
This recipe shows you how to do this quickly and effectively, and how to display the results of your comparison in a ggplot diagram that is easy to understand.
This recipe is going to leverage the microbenchmark package to compute the function comparison and the ggplot2 package for comparison plotting:
install.packages(c("microbenchmark", "ggplot2"))
library(microbenchmark)
library(ggplot2)
The example that follows compares two alternative functions that determine, for a given numeric vector, which elements of the vector are even and which are odd.
Therefore, we first need to initialize the vector we are going to use, populating it with the sequence of numbers from 1 to 1000:
vector <- 1:1000
Define the two functions to compare, one built on the ifelse() vectorized function and one on the standard if()else{} statement:
vectorised_if <- function(vector) {
  result <- c()
  for (i in 1:length(vector)) {
    # ifelse() is called on one element at a time, inside the loop
    ifelse(vector[i] %% 2 == 0, result[i] <- "even", result[i] <- "odd")
  }
  return(result)
}
standard_if <- function(vector) {
  result <- c()
  for (i in 1:length(vector)) {
    if (vector[i] %% 2 == 0) {
      result[i] <- "even"
    } else {
      result[i] <- "odd"
    }
  }
  return(result)
}
Create a microbenchmark object by passing the two functions that we defined previously to the microbenchmark() function:
comparison <- microbenchmark(standard_if(vector), vectorised_if(vector), times = 100)
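Conceptually, a benchmark of this kind evaluates an expression many times and records how long each evaluation takes. The following is only a rough base-R sketch of that idea, not how the microbenchmark package is implemented: time_repeatedly() and label_parity() are hypothetical names introduced for illustration, and proc.time() has a much coarser resolution than the timer a dedicated benchmarking package relies on.

```r
# Hypothetical helper (not part of microbenchmark): call f(x) `times` times
# and return the elapsed time, in seconds, of every single run.
time_repeatedly <- function(f, x, times = 100) {
  vapply(seq_len(times), function(i) {
    start <- proc.time()[["elapsed"]]
    f(x)
    proc.time()[["elapsed"]] - start
  }, numeric(1))
}

# Toy function to time; any function of one argument would do.
label_parity <- function(v) ifelse(v %% 2 == 0, "even", "odd")

timings <- time_repeatedly(label_parity, 1:1000, times = 20)
length(timings)  # one timing per run
median(timings)  # a summary that is less sensitive to outliers than the mean
```

Keeping every individual timing, rather than a single total, is what makes it possible to look at the whole distribution of running times afterwards.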
The microbenchmark() function runs each of the two (or more) expressions n times (n is set with the times argument) and records the running time of every run. This lets you see which function requires less time to run.
Plot the microbenchmark object.
The microbenchmark objects can be easily plotted with the autoplot() method that the microbenchmark package provides for ggplot2, as follows:
autoplot(comparison)
Calling this function on the comparison object will display the following plot in the viewer pane:
For each of the two functions, the distribution of its running times is shown, in the typical violin shape. In our example, quite surprisingly, the standard_if() function shows consistently better performance. For more details on the comparison, a summary function is provided that gives greater insight into the runs performed and their results:
summary(comparison)
                   expr      min       lq     mean   median       uq      max neval
1   standard_if(vector) 3.354620 3.732035 4.540858 4.157452 4.524025 16.54318   100
2 vectorised_if(vector) 4.987396 5.497869 7.387133 6.009339 6.891614 19.61941   100
This summary shows the minimum, lower quartile (lq), mean, median, upper quartile (uq), and maximum running time for each evaluated function, together with the number of evaluations (neval).
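One plausible reason why vectorised_if() does not come out ahead is that it calls ifelse() once per element inside a loop, so the vectorization is never actually used. As a sketch of the alternative, the function below applies ifelse() once to the whole vector. The name whole_vector_ifelse() is hypothetical and not part of the original recipe, and no timings are claimed here: whether and by how much it is faster is something to measure on your own machine.

```r
# Hypothetical third candidate (not in the original recipe):
# ifelse() is applied once to the whole vector, with no explicit loop.
whole_vector_ifelse <- function(vector) {
  ifelse(vector %% 2 == 0, "even", "odd")
}

whole_vector_ifelse(1:6)
# [1] "odd"  "even" "odd"  "even" "odd"  "even"

# If the microbenchmark package is installed, time it in the same way
# as the two functions of the recipe.
if (requireNamespace("microbenchmark", quietly = TRUE)) {
  print(microbenchmark::microbenchmark(whole_vector_ifelse(1:1000), times = 100))
}
```

Adding whole_vector_ifelse(vector) as a third expression in the microbenchmark() call of the recipe would put all three candidates in the same summary and plot.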