How to do it…

The sequence of tasks will be as follows:

  1. Import the dataset.
  2. Select only the relevant variables (month and departure delay).
  3. Remove the rows with a negative departure delay.
  4. Calculate the mean delay for each month.
  5. Visualize the results.

Here is the code that performs these operations in one go:

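    library(dplyr)     # select(), filter(), group_by(), summarize(), %>%
    library(ggplot2)   # qplot(), xlab(), ylab(), ggtitle(), theme_bw()
    library(magrittr)  # add(), a pipe-friendly alias for `+`

    # Step 1: import the dataset, keeping strings as character vectors
    USAairlineData2016 <- read.csv("USAairlineData2016.csv", as.is = TRUE)

    USAairlineData2016 %>%
      select(MONTH, DEP_DELAY) %>%                 # Step 2
      filter(DEP_DELAY >= 0) %>%                   # Step 3 (also drops NA rows)
      group_by(MONTH) %>%
      summarize(avgDelay = mean(DEP_DELAY)) %>%    # Step 4
      qplot(factor(MONTH), avgDelay, data = ., group = 1,
            geom = c("line", "point")) %>%         # Step 5
      add(xlab("Month")) %>%
      add(ylab("Mean delay (in min)")) %>%
      add(ggtitle("Mean delay in departure over months of 2016")) %>%
      add(theme_bw()) %>%
      print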
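
To see what the data-manipulation steps of this pipeline produce before anything is plotted, the same verbs can be run on a small data frame. The following is only a minimal sketch: the `toyFlights` data frame and its values are made up for illustration and are not part of the airline dataset, and `monthlyMean` is a hypothetical name for the result.

```r
library(dplyr)

# Hypothetical toy data: two months, one negative delay and one missing value
toyFlights <- data.frame(
  MONTH     = c(1, 1, 1, 2, 2, 2),
  DEP_DELAY = c(10, -5, 20, 0, 40, NA)
)

monthlyMean <- toyFlights %>%
  select(MONTH, DEP_DELAY) %>%
  filter(DEP_DELAY >= 0) %>%          # drops the -5 row and the NA row
  group_by(MONTH) %>%
  summarize(avgDelay = mean(DEP_DELAY))

print(monthlyMean)
```

Under these assumptions, month 1 keeps the delays 10 and 20 and month 2 keeps 0 and 40, so `avgDelay` is 15 and 20 respectively. Note that `filter()` keeps only the rows where the condition is `TRUE`, so rows with a missing `DEP_DELAY` are removed along with the negative ones.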