How to do it…

The following steps measure the processing time needed to calculate the mean departure delay for each combination of origin and destination airports:

  1. Set up a working directory where the CSV and XDF files are stored.
  2. Read the CSV file.
  3. Calculate the mean departure delay for each combination of origin and destination airports from the CSV data, and report the processing time.
  4. Create an object connecting the XDF file with the R session.
  5. Load the RevoScaleR library.
  6. Calculate the mean departure delay for each combination of origin and destination airports from the XDF file, and report the processing time.

The following code implements the preceding steps:

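      # Step-1: set the working directory that holds the CSV and XDF files
      setwd("D:/AllSync/Drive/Book-3/codeBundle/ch9")

      # Step-2: read the CSV file and time the import
      system.time(
        usairlineCSV <- read.csv("csv_USAairlines2016.csv")
      )

      # Step-3: mean departure delay per origin-destination pair (CSV data)
      system.time(
        meanDelay <- with(usairlineCSV, aggregate(DEP_DELAY,
          by = list(ORIGIN, DEST), FUN = "mean", na.rm = TRUE))
      )

      # Step-4: create the object that points the R session at the XDF file
      system.time(
        xdfFile <- file.path(getwd(), "USAirlines2016.xdf")
      )

      # Step-5: load the RevoScaleR library, which provides rxSummary()
      library(RevoScaleR)

      # Step-6: mean departure delay per origin-destination pair (XDF file)
      system.time(
        sumstatxdf <- rxSummary(DEP_DELAY ~ ORIGIN:DEST,
          summaryStats = "Mean", data = xdfFile)
      )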
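
The timing pattern in these steps can be tried without the airline files. The following is a minimal sketch that uses only base R on a small synthetic data frame; the airport codes, row count, and delay values are made up for illustration and are not taken from the recipe's data set. It assumes the formula interface of aggregate(), which by default drops rows where DEP_DELAY is missing, and it reads the "elapsed" entry of the object returned by system.time().

```r
# Synthetic stand-in for the airline data (hypothetical values, not the real file)
set.seed(42)
airports <- c("ATL", "ORD", "DFW", "LAX")
n <- 10000
flights <- data.frame(
  ORIGIN    = sample(airports, n, replace = TRUE),
  DEST      = sample(airports, n, replace = TRUE),
  DEP_DELAY = round(rnorm(n, mean = 10, sd = 30))
)

# Blank out some delays to mimic missing values in real flight records
flights$DEP_DELAY[sample(n, 200)] <- NA

# Time the aggregation; the assignment inside system.time() is kept
# in the calling environment, as in the recipe
timing <- system.time(
  meanDelay <- aggregate(DEP_DELAY ~ ORIGIN + DEST, data = flights, FUN = mean)
)

# system.time() returns a named vector; "elapsed" is wall-clock seconds
elapsed <- timing[["elapsed"]]

# Independent cross-check of the group means with tapply()
check <- with(flights, tapply(DEP_DELAY, list(ORIGIN, DEST), mean, na.rm = TRUE))

print(head(meanDelay))
cat("Elapsed seconds:", elapsed, "\n")
```

Storing the elapsed value in a variable, rather than only printing the system.time() output, makes it easy to compare the two approaches side by side once both have been timed on the same data.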