Making 3D plots with lattice

Three-dimensional visualization, although undesirable for certain purposes (where precise interpretation is important, such as in the scientific literature), can nevertheless be particularly impressive and aesthetically appealing. In this section, we are going to use lattice to create three-dimensional plots of spatial and nonspatial data, which is not possible to do with ggplot2 since it only allows two-dimensional plotting. The lattice graphics framework and syntax are no less complex than those of ggplot2, and a single section is far too short to comprehensibly review the subject. Our purpose here is much more modest: to show some of the things that can be achieved and inspire interested readers to investigate further. For more information on lattice, readers are referred to the authoritative overview in the book by package author Deepayan Sarkar, Lattice: Multivariate Data Visualization with R, Springer, which was published in 2008.

As already mentioned in Chapter 2, Working with Vectors and Time Series, lattice is an R package defining a graphics system in R (in addition to base R and ggplot2). We have already been using lattice indirectly, in fact, since the levelplot function from the rasterVis package is an adaptation of a function with the same name from lattice.

Note

It is worth noting that lattice is not the only package that can be used for three-dimensional visualization in R. The most notable alternatives are the base R function called persp, which can be used to create plots analogous to the ones we are going to create with lattice, and the rgl package, which allows you to create dynamic, rotatable 3D plots.

There are, generally speaking, two main types of three-dimensional plots. We can plot points (also known as point clouds or 3D scatterplots) or grids/surfaces. In both cases, the input data comes in the form of (x,y,z) coordinate sets, but to create a grid, these points need to be equally spaced along the x and y axes, while in point clouds there is no such restriction. For example, in statistics, we may have a dataset with two independent variables and one dependent one, and we may wish to plot the observed data points (as points in 3D space), the predicted surface generated by a regression model (as a 3D grid), or both. In lattice, the cloud function can be used to create a three-dimensional scatterplot, while the wireframe function can be used to plot three-dimensional grids. In the following two examples, we will concentrate on wireframe, but keep in mind that points can be plotted essentially the same way with cloud.

Our first example will involve the already familiar Haifa DEM, and this time we are going to create a three-dimensional representation of this raster. The best way to pass data on to lattice is through a data.frame object (although, unlike with ggplot2, other options are also possible). To create a three-dimensional plot, the data.frame needs to contain three columns, holding the x, y, and z coordinates. The following code section creates a data.frame, named dem_df, after the DEM is five-fold aggregated (to make the grid sparser and visually simpler):

> dem_df = as.data.frame(aggregate(dem, 5), xy = TRUE)
> dem_df = dem_df[complete.cases(dem_df), ]
> colnames(dem_df)[3] = "z"
> head(dem_df)
           x       y         z
123 693841.3 3648773  4.000000
124 694291.3 3648773  5.647059
125 694741.3 3648773  7.250000
126 695191.3 3648773  6.500000
127 695641.3 3648773  9.550000
128 696091.3 3648773 11.300000

Before applying wireframe, we need to figure out the right aspect ratio for our 3D plot. The default aspect ratio for the dimensions of the 3D box encompassing the plot is 1:1:1, referring to the x, y, and z axis lengths, so that the box forms a cube. This ratio is specified with the aspect parameter of wireframe, which accepts a two-element vector specifying the y/x and z/x ratios (with the default being c(1,1)). In the case of dem_df, the data represents geographical distances on all three axes. Unless our DEM, by any chance, represents a cubical geographic extent, these aspect ratios are inappropriate. For example, similar to 2D plots produced with ggplot2, we are usually interested in equal cell dimensions on the x and y axes. To accomplish this, we need the y/x ratio to correspond to the ratio of the coordinates' range of y and x. For example, if our DEM represents a rectangle of 100 meters on the x axis and 200 meters on the y axis, to correctly represent it with wireframe, we would have to specify a y/x aspect ratio of 2 (200/100). In our case, the ranges of values on each axis are as follows:

> x_range = diff(range(dem_df$x, na.rm = TRUE))
> x_range
[1] 21600
> y_range = diff(range(dem_df$y, na.rm = TRUE))
> y_range
[1] 36000
> z_range = diff(range(dem_df$z, na.rm = TRUE))
> z_range
[1] 527.0933

The correct y/x ratio is thus equal to y_range/x_range or 1.666667. As for the z/x ratio, since the z axis range of 527.0933 (the difference, in meters, between the lowest and highest points of the DEM) is much smaller than either the x or y axes ranges, drawing the 3D plot with realistic proportions between elevation and x-y distances (z_range/x_range or 0.02440247) would result in a very flat image, with the relief hardly protruding. Unless we make an image of a small area with very steep topography, it is very common to use an exaggerated z to x-y distance ratio in 3D topographic plots, thus making the topography more tangible. Choosing a z/x aspect ratio is really a matter of taste. For this particular plot, a seven-fold exaggeration was chosen, but lower or higher values are also appropriate. It is obviously important to declare the fact that exaggeration has been used when such an image is used in a publication.

Now that we have decided which aspect ratios we are going to use, we are ready to apply wireframe on dem_df. Let's first see the code and the image produced and then review its components:

> library(lattice)
> wireframe(z ~ x * y, 
+ data = dem_df,
+ drape = TRUE, 
+ colorkey = TRUE,
+ col.regions = terrain.colors(100),
+ screen = list(z = 165, x = -60, y = 0),
+ aspect = c(y_range/x_range, 7*(z_range/x_range)),
+ zoom = 1.1)

The following screenshot shows the graphical output:

Making 3D plots with lattice

The first thing to note is that the lattice syntax is analogous to the base R plotting, in the sense that we have distinct plotting functions for each type of plot (such as plot or wireframe), and each function has a set of parameters covering all modifiable elements of the respective plot type. This is very different from the ggplot2 approach, where a plot is constructed layer by layer using numerous function calls rather than just one.

As for the set of parameters we used in this case (which is only a small fraction of all parameters that wireframe has), here is a short explanation for each one:

  • x: In case the data is a data.frame object, this argument should be a formula specifying the columns to use, as a formula object with the dependent variable to the left of the ~ operator and the independent variables to the right, separated by * (for example, z~x*y means we plot z as a function of x and y).
  • data: The data.frame where the values come from.
  • drape: Determines whether the surface will be covered with color (TRUE) or remain a simple skeleton of intersecting lines (FALSE).
  • colorkey: Determines whether a legend is drawn (TRUE) or not (FALSE) alongside the plot.
  • col.regions: The vector of colors used to draw the surface, analogous to the colours parameter in ggplot2.
  • screen: A set of parameters defining the viewing direction by specifying the x, y, and z rotation with respect to the origin (it is really a matter of trial and error to come up with a list of arguments giving the desired perspective).
  • aspect: The aspect ratios of the three-dimensional box encompassing the plot. The effect of this parameter was discussed earlier. In this case, a seven-fold exaggeration of the z axis is specified by multiplying the true z/x ratio by 7 in 7*(z_range/x_range).
  • zoom: A scale factor to magnify or shrink the plot.

It is important to keep in mind that three-dimensional plots are useful to display any kind of three-dimensional dataset and not just topographic surfaces. For example, returning to the spatio-temporal dataset s that we created from the MODIS images time series (see Chapter 6, Modifying Rasters and Analyzing Raster Time Series), we can produce a three-dimensional image with:

  • Time on the x axis
  • Space (y coordinate, in this case) on the y axis
  • NDVI on the z axis

Similarly to the previous example, we first have to create a data.frame object, which we will name s_df, out of the raster s. Using the combination of as.matrix and as.data.frame, we can convert s to a data.frame with columns corresponding to s columns and rows corresponding to s rows (remember that using as.data.frame directly on a raster results in data.frame with rows representing raster cells, which is not what we want in this case):

> s_df = as.data.frame(as.matrix(s))

Since we know that the columns of s represent dates of image acquisition, we can assign dates as the column names:

> colnames(s_df) = dates$date
> s_df[1:5, 1:5]
  2000-02-18 2000-03-05 2000-03-21 2000-04-06 2000-04-22
1   0.341684   0.397015   0.408640   0.416793   0.359633
2   0.341664   0.396391   0.428758   0.427817   0.352741
3   0.349044   0.405022   0.426911   0.429224   0.352297
4   0.351129   0.413696   0.434334   0.417303   0.344761
5   0.358012   0.408954   0.439244   0.411540   0.344320

Since we also know that the rows of s correspond to y axis spatial coordinates, we can assign these coordinates to an additional column (named coord). To obtain the vector of y coordinates of a raster, we can use the yFromRow function of the raster package. All that is left to do after that is to melt the data.frame in order to move the dates from separate columns into a single one:

> s_df$coord = yFromRow(r)
> s_df = melt(s_df, 
+ id.vars = "coord", 
+ variable.name = "date", 
+ value.name = "ndvi")
> head(s_df)
    coord       date     ndvi
1 3494750 2000-02-18 0.341684
2 3494250 2000-02-18 0.341664
3 3493750 2000-02-18 0.349044
4 3493250 2000-02-18 0.351129
5 3492750 2000-02-18 0.358012
6 3492250 2000-02-18 0.342920

What we have here is a data.frame representing a regular (in the x-y direction) three-dimensional grid, so in principle everything is ready to apply wireframe. Unfortunately, however, lattice does not have an automatic method to format date values and draw an appropriate set of tick marks and labels (such as what we have seen in ggplot2 with scale_x_date earlier). Therefore, we are compelled to give up the Date formatting of the date column in s_df. Since what we have is an annual-scale series, a simple way to make proper labels is to convert the dates to numeric values representing year fractions. For example, 2000-1-1 can become 2000, 2000-1-2 can become 2000+1/365=2000.003, and so on. Although it would not be difficult to write our own function to make such a calculation, there already is a function in the lubridate package that does exactly that, called decimal_date:

> library(lubridate)
> s_df$date = decimal_date(as.Date(s_df$date))
> head(s_df)
    coord     date     ndvi
1 3494750 2000.131 0.341684
2 3494250 2000.131 0.341664
3 3493750 2000.131 0.349044
4 3493250 2000.131 0.351129
5 3492750 2000.131 0.358012
6 3492250 2000.131 0.342920

Note

The lubridate package contains a very helpful set of convenience functions to work with dates and times more easily than through base R packages. For more information on this package, see the introductory paper by Grolemund and Wickham (2011).

The resulting data.frame can now be passed to wireframe:

> wireframe(ndvi ~ date * coord, 
+ data = s_df, 
+ drape = TRUE, 
+ arrows = FALSE,
+ col.regions = 
+ colorRampPalette(c("darkred","white","darkblue"))(100),
+ screen = list(z = 15, x = -55, y = 10),
+ aspect = c(0.3, 0.2),
+ panel.aspect = c(0.45),
+ lty = 0,
+ scales = list(arrows = FALSE, cex = 0.6),
+ xlab = list("Time"),
+ ylab = list("Y", cex = 0),
+ zlab = list("NDVI", cex = 0),
+ zoom = 0.95)

The graphical output is shown in the following screenshot:

Making 3D plots with lattice

This time, the colorRampPalette function was used within the wireframe function call to create a custom color scale (going from dark red, through white, and to dark blue). An interesting point to note in this respect is that the function call colorRampPalette(c("darkred","white","darkblue")) in fact returns a function. Indeed, there is no reason why the returned object of a function cannot be another function. The returned function is then used to create a vector of color codes (analogous to the way we did so with built-in functions such as terrain.colors). The other parameters used in the latter wireframe function call (arrows, panel.aspect, lty, scales, xlab, ylab, and zlab) refer to minor details regarding plot appearance and we will not discuss them here (see ?wireframe for more information).

The plot itself shows the periodic behavior of NDVI, which we already visualized in Chapter 6, Modifying Rasters and Analyzing Raster Time Series, where NDVI was mapped to color alone (rather than to both colors and the z axis position, as in the present visualization), and so the plot was two-dimensional. We will leave it up to the reader to decide which version is prettier or easier to interpret. The important point here is that spatio-temporal data is inherently three-dimensional and therefore it is only natural to consider three-dimensional visualization of such data.

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset