Three-dimensional visualization, although undesirable for certain purposes (where precise interpretation is important, such as in the scientific literature), can nevertheless be particularly impressive and aesthetically appealing. In this section, we are going to use lattice
to create three-dimensional plots of spatial and nonspatial data, which is not possible to do with ggplot2
since it only allows two-dimensional plotting. The lattice
graphics framework and syntax are no less complex than those of ggplot2
, and a single section is far too short to comprehensibly review the subject. Our purpose here is much more modest: to show some of the things that can be achieved and inspire interested readers to investigate further. For more information on lattice
, readers are referred to the authoritative overview in the book by package author Deepayan Sarkar, Lattice: Multivariate Data Visualization with R, Springer, which was published in 2008.
As already mentioned in Chapter 2, Working with Vectors and Time Series, lattice
is an R package defining a graphics system in R (in addition to base R and ggplot2
). We have already been using lattice
indirectly, in fact, since the levelplot
function from the rasterVis
package is an adaptation of a function with the same name from lattice
.
It is worth noting that lattice
is not the only package that can be used for three-dimensional visualization in R. The most notable alternatives are the base R function called persp
, which can be used to create plots analogous to the ones we are going to create with lattice
, and the rgl
package, which allows you to create dynamic, rotatable 3D plots.
There are, generally speaking, two main types of three-dimensional plots. We can plot points (also known as point clouds or 3D scatterplots) or grids/surfaces. In both cases, the input data comes in the form of (x,y,z) coordinate sets, but to create a grid, these points need to be equally spaced along the x and y axes, while in point clouds there is no such restriction. For example, in statistics, we may have a dataset with two independent variables and one dependent one, and we may wish to plot the observed data points (as points in 3D space), the predicted surface generated by a regression model (as a 3D grid), or both. In lattice
, the cloud
function can be used to create a three-dimensional scatterplot, while the wireframe
function can be used to plot three-dimensional grids. In the following two examples, we will concentrate on wireframe
, but keep in mind that points can be plotted essentially the same way with cloud
.
Our first example will involve the already familiar Haifa DEM, and this time we are going to create a three-dimensional representation of this raster. The best way to pass data on to lattice
is through a data.frame
object (although, unlike with ggplot2
, other options are also possible). To create a three-dimensional plot, the data.frame
needs to contain three columns, holding the x, y, and z coordinates. The following code section creates a data.frame
, named dem_df
, after the DEM is five-fold aggregated (to make the grid sparser and visually simpler):
> dem_df = as.data.frame(aggregate(dem, 5), xy = TRUE) > dem_df = dem_df[complete.cases(dem_df), ] > colnames(dem_df)[3] = "z" > head(dem_df) x y z 123 693841.3 3648773 4.000000 124 694291.3 3648773 5.647059 125 694741.3 3648773 7.250000 126 695191.3 3648773 6.500000 127 695641.3 3648773 9.550000 128 696091.3 3648773 11.300000
Before applying wireframe
, we need to figure out the right aspect ratio for our 3D plot. The default aspect ratio for the dimensions of the 3D box encompassing the plot is 1:1:1, referring to the x, y, and z axis lengths, so that the box forms a cube. This ratio is specified with the aspect
parameter of wireframe
, which accepts a two-element vector specifying the y/x and z/x ratios (with the default being c(1,1)
). In the case of dem_df
, the data represents geographical distances on all three axes. Unless our DEM, by any chance, represents a cubical geographic extent, these aspect ratios are inappropriate. For example, similar to 2D plots produced with ggplot2
, we are usually interested in equal cell dimensions on the x and y axes. To accomplish this, we need the y/x ratio to correspond to the ratio of the coordinates' range of y and x. For example, if our DEM represents a rectangle of 100 meters on the x axis and 200 meters on the y axis, to correctly represent it with wireframe
, we would have to specify a y/x aspect ratio of 2 (200/100). In our case, the ranges of values on each axis are as follows:
> x_range = diff(range(dem_df$x, na.rm = TRUE)) > x_range [1] 21600 > y_range = diff(range(dem_df$y, na.rm = TRUE)) > y_range [1] 36000 > z_range = diff(range(dem_df$z, na.rm = TRUE)) > z_range [1] 527.0933
The correct y/x ratio is thus equal to y_range/x_range
or 1.666667
. As for the z/x ratio, since the z axis range of 527.0933
(the difference, in meters, between the lowest and highest points of the DEM) is much smaller than either the x or y axes ranges, drawing the 3D plot with realistic proportions between elevation and x-y distances (z_range/x_range
or 0.02440247
) would result in a very flat image, with the relief hardly protruding. Unless we make an image of a small area with very steep topography, it is very common to use an exaggerated z to x-y distance ratio in 3D topographic plots, thus making the topography more tangible. Choosing a z/x aspect ratio is really a matter of taste. For this particular plot, a seven-fold exaggeration was chosen, but lower or higher values are also appropriate. It is obviously important to declare the fact that exaggeration has been used when such an image is used in a publication.
Now that we have decided which aspect ratios we are going to use, we are ready to apply wireframe
on dem_df
. Let's first see the code and the image produced and then review its components:
> library(lattice) > wireframe(z ~ x * y, + data = dem_df, + drape = TRUE, + colorkey = TRUE, + col.regions = terrain.colors(100), + screen = list(z = 165, x = -60, y = 0), + aspect = c(y_range/x_range, 7*(z_range/x_range)), + zoom = 1.1)
The following screenshot shows the graphical output:
The first thing to note is that the lattice
syntax is analogous to the base R plotting, in the sense that we have distinct plotting functions for each type of plot (such as plot
or wireframe
), and each function has a set of parameters covering all modifiable elements of the respective plot type. This is very different from the ggplot2
approach, where a plot is constructed layer by layer using numerous function calls rather than just one.
As for the set of parameters we used in this case (which is only a small fraction of all parameters that wireframe has), here is a short explanation for each one:
x
: In case the data
is a data.frame
object, this argument should be a formula specifying the columns to use, as a formula
object with the dependent variable to the left of the ~
operator and the independent variables to the right, separated by *
(for example, z~x*y
means we plot z
as a function of x
and y
).data
: The data.frame
where the values come from.drape
: Determines whether the surface will be covered with color (TRUE
) or remain a simple skeleton of intersecting lines (FALSE
).colorkey
: Determines whether a legend is drawn (TRUE
) or not (FALSE
) alongside the plot.col.regions
: The vector of colors used to draw the surface, analogous to the colours
parameter in ggplot2
.screen
: A set of parameters defining the viewing direction by specifying the x
, y
, and z
rotation with respect to the origin (it is really a matter of trial and error to come up with a list of arguments giving the desired perspective).aspect
: The aspect ratios of the three-dimensional box encompassing the plot. The effect of this parameter was discussed earlier. In this case, a seven-fold exaggeration of the z axis is specified by multiplying the true z/x ratio by 7 in 7*(z_range/x_range)
.zoom
: A scale factor to magnify or shrink the plot.It is important to keep in mind that three-dimensional plots are useful to display any kind of three-dimensional dataset and not just topographic surfaces. For example, returning to the spatio-temporal dataset s
that we created from the MODIS images time series (see Chapter 6, Modifying Rasters and Analyzing Raster Time Series), we can produce a three-dimensional image with:
Similarly to the previous example, we first have to create a data.frame
object, which we will name s_df
, out of the raster s
. Using the combination of as.matrix
and as.data.frame
, we can convert s
to a data.frame
with columns corresponding to s
columns and rows corresponding to s
rows (remember that using as.data.frame
directly on a raster results in data.frame
with rows representing raster cells, which is not what we want in this case):
> s_df = as.data.frame(as.matrix(s))
Since we know that the columns of s
represent dates of image acquisition, we can assign dates as the column names:
> colnames(s_df) = dates$date > s_df[1:5, 1:5] 2000-02-18 2000-03-05 2000-03-21 2000-04-06 2000-04-22 1 0.341684 0.397015 0.408640 0.416793 0.359633 2 0.341664 0.396391 0.428758 0.427817 0.352741 3 0.349044 0.405022 0.426911 0.429224 0.352297 4 0.351129 0.413696 0.434334 0.417303 0.344761 5 0.358012 0.408954 0.439244 0.411540 0.344320
Since we also know that the rows of s
correspond to y axis spatial coordinates, we can assign these coordinates to an additional column (named coord
). To obtain the vector of y coordinates of a raster, we can use the yFromRow
function of the raster
package. All that is left to do after that is to melt
the data.frame
in order to move the dates from separate columns into a single one:
> s_df$coord = yFromRow(r) > s_df = melt(s_df, + id.vars = "coord", + variable.name = "date", + value.name = "ndvi") > head(s_df) coord date ndvi 1 3494750 2000-02-18 0.341684 2 3494250 2000-02-18 0.341664 3 3493750 2000-02-18 0.349044 4 3493250 2000-02-18 0.351129 5 3492750 2000-02-18 0.358012 6 3492250 2000-02-18 0.342920
What we have here is a data.frame
representing a regular (in the x-y direction) three-dimensional grid, so in principle everything is ready to apply wireframe
. Unfortunately, however, lattice
does not have an automatic method to format date values and draw an appropriate set of tick marks and labels (such as what we have seen in ggplot2
with scale_x_date
earlier). Therefore, we are compelled to give up the Date
formatting of the date
column in s_df
. Since what we have is an annual-scale series, a simple way to make proper labels is to convert the dates to numeric values representing year fractions. For example, 2000-1-1 can become 2000
, 2000-1-2 can become 2000+1/365=2000.003
, and so on. Although it would not be difficult to write our own function to make such a calculation, there already is a function in the lubridate
package that does exactly that, called decimal_date
:
> library(lubridate) > s_df$date = decimal_date(as.Date(s_df$date)) > head(s_df) coord date ndvi 1 3494750 2000.131 0.341684 2 3494250 2000.131 0.341664 3 3493750 2000.131 0.349044 4 3493250 2000.131 0.351129 5 3492750 2000.131 0.358012 6 3492250 2000.131 0.342920
The resulting data.frame
can now be passed to wireframe
:
> wireframe(ndvi ~ date * coord, + data = s_df, + drape = TRUE, + arrows = FALSE, + col.regions = + colorRampPalette(c("darkred","white","darkblue"))(100), + screen = list(z = 15, x = -55, y = 10), + aspect = c(0.3, 0.2), + panel.aspect = c(0.45), + lty = 0, + scales = list(arrows = FALSE, cex = 0.6), + xlab = list("Time"), + ylab = list("Y", cex = 0), + zlab = list("NDVI", cex = 0), + zoom = 0.95)
The graphical output is shown in the following screenshot:
This time, the colorRampPalette
function was used within the wireframe
function call to create a custom color scale (going from dark red, through white, and to dark blue). An interesting point to note in this respect is that the function call colorRampPalette(c("darkred","white","darkblue"))
in fact returns a function
. Indeed, there is no reason why the returned object of a function cannot be another function. The returned function is then used to create a vector of color codes (analogous to the way we did so with built-in functions such as terrain.colors
). The other parameters used in the latter wireframe
function call (arrows
, panel.aspect
, lty
, scales
, xlab
, ylab
, and zlab
) refer to minor details regarding plot appearance and we will not discuss them here (see ?wireframe
for more information).
The plot itself shows the periodic behavior of NDVI, which we already visualized in Chapter 6, Modifying Rasters and Analyzing Raster Time Series, where NDVI was mapped to color alone (rather than to both colors and the z axis position, as in the present visualization), and so the plot was two-dimensional. We will leave it up to the reader to decide which version is prettier or easier to interpret. The important point here is that spatio-temporal data is inherently three-dimensional and therefore it is only natural to consider three-dimensional visualization of such data.