How to do it...

Complete this recipe in the following three steps:

  1. Import the dataset.
  2. Write the customized function.
  3. Use the newly defined function in the dplyr framework.

The following code blocks implement the three preceding steps:

        # load dplyr for select(), group_by(), do(), and the %>% operator
        library(dplyr)

        # import the dataset, keeping character columns as strings
        USAairlineData2016 <- read.csv("USAairlineData2016.csv", as.is = TRUE)

        # the new customized function to calculate summary statistics
        fourNumSum <- function(x){
          MIN_DELAY = min(x, na.rm=TRUE)
          MEAN_DELAY = mean(x, na.rm=TRUE)
          MEDIAN_DELAY = median(x, na.rm=TRUE)
          MAX_DELAY = max(x, na.rm=TRUE)
          return(data.frame(MIN_DELAY=MIN_DELAY, MEAN_DELAY=MEAN_DELAY,
                            MEDIAN_DELAY=MEDIAN_DELAY, MAX_DELAY=MAX_DELAY))
        }
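Before running the function on the full dataset, it can help to try it on a small vector. The following is a minimal sketch: the `delays` vector holds made-up values for illustration only, not data from the airline file, and the function is repeated here so that the block runs on its own.

```r
# Hypothetical delay values (in minutes), for illustration only
delays <- c(-5, 0, 12, NA, 33)

# Same summary function as in the recipe, repeated so this block is self-contained
fourNumSum <- function(x) {
  data.frame(
    MIN_DELAY    = min(x, na.rm = TRUE),
    MEAN_DELAY   = mean(x, na.rm = TRUE),
    MEDIAN_DELAY = median(x, na.rm = TRUE),
    MAX_DELAY    = max(x, na.rm = TRUE)
  )
}

# The result is a one-row data frame with four columns;
# the NA is dropped because of na.rm = TRUE
res <- fourNumSum(delays)
print(res)
```

Returning a one-row data frame, rather than a plain vector, is what lets the function be used with `do()`, which expects each group's result to be a data frame.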
  1. Now, use the fourNumSum function within the dplyr framework as follows:
        desStat <- USAairlineData2016 %>%
          select(MONTH, ORIGIN, DEP_DELAY) %>%
          group_by(ORIGIN, MONTH) %>%
          do(fourNumSum(.$DEP_DELAY))
  1. The new desStat object contains the summary statistics of the DEP_DELAY variable, computed by the fourNumSum function for each combination of ORIGIN and MONTH that occurs in the data.
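To see the shape of the result without the full airline file, the same pipeline can be run on a tiny data frame. This is only a sketch under stated assumptions: `toy` and `toyStat` are hypothetical names, and the values are invented for illustration. The column names match those used in the recipe.

```r
library(dplyr)

# Same summary function as in the recipe, repeated so this block is self-contained
fourNumSum <- function(x) {
  data.frame(
    MIN_DELAY    = min(x, na.rm = TRUE),
    MEAN_DELAY   = mean(x, na.rm = TRUE),
    MEDIAN_DELAY = median(x, na.rm = TRUE),
    MAX_DELAY    = max(x, na.rm = TRUE)
  )
}

# Hypothetical toy data: three ORIGIN/MONTH combinations, one missing delay
toy <- data.frame(
  MONTH     = c(1, 1, 1, 1, 2, 2),
  ORIGIN    = c("JFK", "JFK", "LAX", "LAX", "JFK", "JFK"),
  DEP_DELAY = c(10, 20, 5, NA, -2, 4),
  stringsAsFactors = FALSE
)

# Apply the summary function once per ORIGIN/MONTH group
toyStat <- toy %>%
  select(MONTH, ORIGIN, DEP_DELAY) %>%
  group_by(ORIGIN, MONTH) %>%
  do(fourNumSum(.$DEP_DELAY))

print(toyStat)
```

The result has one row per ORIGIN/MONTH pair found in the data (three rows here), with the grouping columns followed by the four summary columns. A group whose delays are all NA would make `min()` and `max()` return `Inf` and `-Inf` with a warning, so it may be worth filtering such groups out first.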