Visualizing flights using D3

To get a powerful and fun visualization of the flight paths and connections in this dataset, we can use the Airports D3 visualization (https://mbostock.github.io/d3/talk/20111116/airports.html) within our Databricks notebook. By connecting our GraphFrames, DataFrames, and D3 visualizations, we can see the full scope of the flight connections for all on-time or early departing flights in this dataset.

The blue circles represent the vertices (that is, airports), and the size of each circle represents the number of edges (that is, flights) in and out of that airport. The black lines are the edges themselves, showing each airport's connections to the other airports. Edges that run offscreen connect to airports in the states of Hawaii and Alaska.
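As a minimal sketch of that encoding, the following plain-Python snippet sums the flight counts on every edge that touches an airport, which is one plausible reading of what drives the circle size. The helper name `airport_degree` and the sample edges are hypothetical and are not taken from the dataset or from the d3a package.

```python
from collections import Counter

def airport_degree(edges):
    """Sum flight counts in and out of each airport.

    edges: iterable of (src, dst, count) tuples.
    """
    degree = Counter()
    for src, dst, count in edges:
        degree[src] += count  # flights leaving src
        degree[dst] += count  # flights arriving at dst
    return degree

# Hypothetical sample edges, not real figures from the dataset
sample_edges = [("SEA", "LAX", 3), ("LAX", "SEA", 2), ("SEA", "SFO", 4)]
sizes = airport_degree(sample_edges)
```

In this toy input, SEA touches all three edges, so it would be drawn as the largest circle.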

For this to work, we first create a Scala package called d3a that is embedded in our notebook (you can download it from http://bit.ly/2kPkXkc). Because we're using Databricks notebooks, we can make Scala calls within our PySpark notebook:

%scala
// On-time and early departures (delay <= 0)
import d3a._
graphs.force(
  height = 800,
  width = 1200,
  clicks = sql("""select src, dst as dest, count(1) as count from departureDelays_geo where delay <= 0 group by src, dst""").as[Edge])
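To make the aggregation inside that SQL statement explicit, the following plain-Python sketch mirrors its logic on a few made-up rows: keep flights with delay <= 0, group them by (src, dst), and count each group. The `on_time_edges` helper and the sample rows are hypothetical; in the notebook, Spark SQL does this work against departureDelays_geo.

```python
from collections import Counter

def on_time_edges(flights):
    """Count flights per (src, dst) pair, keeping only delay <= 0."""
    counts = Counter(
        (f["src"], f["dst"]) for f in flights if f["delay"] <= 0
    )
    # Same column names as the query output: src, dest, count
    return [
        {"src": src, "dest": dst, "count": n}
        for (src, dst), n in sorted(counts.items())
    ]

# Hypothetical rows standing in for departureDelays_geo
sample_flights = [
    {"src": "SEA", "dst": "LAX", "delay": -3},
    {"src": "SEA", "dst": "LAX", "delay": 0},
    {"src": "SEA", "dst": "LAX", "delay": 12},
    {"src": "LAX", "dst": "SEA", "delay": -1},
]
edges = on_time_edges(sample_flights)
```

The delayed SEA to LAX flight is dropped by the filter, so that pair is counted twice rather than three times.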

The on-time and early departures returned by the query passed to graphs.force are visualized in the following screenshot:

[Screenshot: airports D3 visualization, hovering over Seattle (SEA)]

You can hover over an airport (blue circle, vertex) in the airports D3 visualization; the lines are the edges (flights). The preceding visualization is a snapshot taken while hovering over the Seattle (SEA) airport, while the following visualization is a snapshot taken while hovering over the Los Angeles (LAX) airport:

[Screenshot: airports D3 visualization, hovering over Los Angeles (LAX)]