How it works...

First, a new function, fourNumSum, is defined that returns multiple values as a data frame. This function is then called inside a dplyr pipeline. The code works as follows:

The dplyr pipeline takes the entire USAairlineData2016 dataset as input and passes it to the select() function, which keeps only the variables of interest: MONTH, ORIGIN, and DEP_DELAY. Next, group_by() creates one group for each combination of ORIGIN and MONTH that occurs in the data. Finally, the do() function calls the user-defined function fourNumSum once for each group.
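The steps above can be sketched on a small scale. In the following example, the body of fourNumSum is an assumption reconstructed from the output column names (including the use of na.rm = TRUE), and toyFlights is a made-up data frame standing in for USAairlineData2016; the recipe's actual code may differ in detail.

```r
library(dplyr)

# Assumed shape of the recipe's function: four summaries returned
# as a one-row data frame (na.rm = TRUE is an assumption)
fourNumSum <- function(x) {
  data.frame(
    MIN_DELAY    = min(x, na.rm = TRUE),
    MEAN_DELAY   = mean(x, na.rm = TRUE),
    MEDIAN_DELAY = median(x, na.rm = TRUE),
    MAX_DELAY    = max(x, na.rm = TRUE)
  )
}

# Made-up toy data; CARRIER is an extra column that select() drops
toyFlights <- data.frame(
  MONTH     = c(1L, 1L, 1L, 2L, 2L, 1L),
  ORIGIN    = c("ABE", "ABE", "ABE", "ABE", "ABE", "ATL"),
  DEP_DELAY = c(-5, 10, NA, 0, 30, 7),
  CARRIER   = c("AA", "AA", "DL", "DL", "AA", "DL"),
  stringsAsFactors = FALSE
)

desStatToy <- toyFlights %>%
  select(MONTH, ORIGIN, DEP_DELAY) %>%   # keep only the variables of interest
  group_by(ORIGIN, MONTH) %>%            # one group per observed combination
  do(fourNumSum(.$DEP_DELAY))            # one summary row per group

print(desStatToy)
```

The result has one row per ORIGIN and MONTH combination present in the toy data, with the two grouping columns followed by the four summary columns.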

Note that a dot (.) and a dollar sign ($) are used inside the do() function. The dot is a placeholder for the data frame of the current group, and the dollar sign extracts the variable of interest (here, DEP_DELAY) from it to pass to fourNumSum. The first few rows of the result, desStat, are as follows:

    > head(desStat)
    Source: local data frame [6 x 6]
    Groups: ORIGIN, MONTH [6]

      ORIGIN MONTH MIN_DELAY MEAN_DELAY MEDIAN_DELAY MAX_DELAY
       <chr> <int>     <int>      <dbl>        <dbl>     <int>
    1    ABE     1       -13   9.994186           -2       374
    2    ABE     2       -15  15.841772           -3       449
    3    ABE     4       -14   1.386905           -3       229
    4    ABE     5       -15   2.777778           -4       150
    5    ABE     6       -12   9.200837           -4       460
    6    ABE     7       -16  22.064103           -1      1015
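To see what the dot stands for inside do(), it can help to inspect it directly. The following is a minimal sketch on made-up data (toyDelays and the column names N_ROWS and FIRST_DELAY are hypothetical, chosen only for illustration): for each group, the dot is that group's subset of the data frame, and .$DEP_DELAY is the corresponding vector of delays.

```r
library(dplyr)

# Made-up data: two origins with different numbers of flights
toyDelays <- data.frame(
  ORIGIN    = c("ABE", "ABE", "ATL"),
  DEP_DELAY = c(-5, 10, 7),
  stringsAsFactors = FALSE
)

groupInfo <- toyDelays %>%
  group_by(ORIGIN) %>%
  do(data.frame(
    N_ROWS      = nrow(.),          # the dot is the current group's data frame
    FIRST_DELAY = .$DEP_DELAY[1]    # the dollar sign pulls one column out of it
  ))

print(groupInfo)
```

Here ABE contributes two rows and ATL one, so N_ROWS differs between the groups, which shows that the expression inside do() is evaluated separately for each group.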