Chapter 7
Spatial data (also known as geospatial data) are directly or indirectly referenced to a location on the surface of the Earth. Their spatial reference is composed of coordinate values and a system of reference for these coordinates. Spatial data are often accessed, manipulated, or analyzed through Geographic Information Systems (GIS).
Real objects represented by GIS data can be divided into two abstractions: discrete objects (e.g., a road or a river), represented with vector data (points, lines, and polygons), and continuous fields (such as elevation or solar radiation), represented with raster data. The sp package is the preferred option for working with vector data in R, and the raster package is the choice for raster data.¹
This part presents several examples in which vector and raster data are displayed to show the geographic location of features and the physical landscape of a place (reference and physical maps, Chapter 9) or a specific variable in the context of a geographic reference (thematic maps, Chapter 8). These examples make use of several datasets (available at the book website) described in Chapter 10.
The CRAN Task View “Analysis of Spatial Data”² summarizes the packages for reading, visualizing, and analyzing spatial data. This section provides a brief introduction to sp, raster, rasterVis, maptools, rgdal, gstat, and maps. Most of the information has been extracted from their vignettes, webpages, and help pages. You should read them for detailed information.
The sp package (E. J. Pebesma and R. S. Bivand 2005) provides classes and methods for dealing with spatial data in R. The spatial data classes implemented are points (SpatialPoints), grids (SpatialPixels and SpatialGrid), lines (Line, Lines and SpatialLines), rings, and polygons (Polygon, Polygons, and SpatialPolygons), each of them without data or with data (for example, SpatialPointsDataFrame or SpatialLinesDataFrame).
Selecting, retrieving, or replacing certain attributes in spatial objects with data is done using standard methods:
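As a minimal sketch of these standard methods, the following example builds a small SpatialPointsDataFrame and uses the `[`, `[[`, and `$` operators on it much as one would on a data frame. The toy coordinates and the column names temp and warm are made up for illustration.

```r
library(sp)

# Hypothetical toy data: three locations with one attribute
obs <- data.frame(x = c(1, 2, 3), y = c(4, 5, 6), temp = c(10.5, 12.1, 9.8))
pts <- SpatialPointsDataFrame(coords = obs[, c("x", "y")], data = obs["temp"])

first_two <- pts[1:2, ]        # select features (rows of the attribute table)
temp <- pts[["temp"]]          # retrieve an attribute column
pts[["temp"]] <- temp + 1      # replace an attribute column
pts$warm <- pts$temp > 12      # add a new column with `$`
```

The result of `pts[1:2, ]` is again a SpatialPointsDataFrame, so the coordinates stay attached to the selected rows.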
A number of spatial methods are available for the classes in sp:
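A short sketch of a few such methods follows. The selection (bbox, coordinates, and over) and the toy geometries are assumptions chosen for illustration, not a complete list of what sp provides.

```r
library(sp)

# Three points; the last one lies outside the square defined below
pts <- SpatialPoints(cbind(x = c(0.5, 1.5, 3), y = c(0.5, 1.5, 3)))

# A 2 x 2 square: closed ring in clockwise order, explicitly not a hole
ring <- cbind(x = c(0, 0, 2, 2, 0), y = c(0, 2, 2, 0, 0))
square <- SpatialPolygons(list(Polygons(list(Polygon(ring, hole = FALSE)), ID = "sq")))

bb <- bbox(pts)            # bounding box: rows are dimensions, columns min/max
xy <- coordinates(pts)     # matrix of point coordinates
idx <- over(pts, square)   # index of the polygon containing each point, NA if none
```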
The raster package (R. J. Hijmans 2013) provides functions for creating, reading, manipulating, and writing raster data. It also implements raster algebra and most of the raster manipulation functions that are common in GIS software.
The raster package can work with raster datasets stored on disk when they are too large to be loaded into memory. This is possible because the objects it creates from these files contain only information about the structure of the data, such as the number of rows and columns, the spatial extent, and the filename; the package does not attempt to read all the cell values into memory. In computations with these objects, the data are processed in chunks.
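The difference between in-memory and on-disk objects can be sketched as follows. The example writes a small raster to a temporary file in the package's native format and reads it back; the dimensions and the file name are arbitrary choices for illustration.

```r
library(raster)

# A small raster whose values live in memory
r <- raster(nrows = 100, ncols = 100, xmn = 0, xmx = 100, ymn = 0, ymx = 100)
values(r) <- seq_len(ncell(r))
in_mem <- inMemory(r)

# Write it to disk (native raster format) and create an object from the file
f <- tempfile(fileext = ".grd")
writeRaster(r, filename = f)
r_disk <- raster(f)

on_disk <- fromDisk(r_disk)        # the object points to the file
loaded <- inMemory(r_disk)         # cell values have not been read

# Read only a block of rows instead of the whole dataset
block <- getValues(r_disk, row = 1, nrows = 10)
```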
The package defines a number of S4 classes. RasterLayer, RasterBrick, and RasterStack are the most important:
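A minimal sketch of how the three classes relate, using made-up values: a RasterLayer holds a single layer, while RasterStack and RasterBrick hold several layers with the same extent and resolution.

```r
library(raster)

# A single-layer object
r1 <- raster(nrows = 5, ncols = 5, xmn = 0, xmx = 5, ymn = 0, ymx = 5)
values(r1) <- 1:25
r2 <- r1 * 2

# Multi-layer objects built from the two layers
s <- stack(r1, r2)     # RasterStack: a collection of layers
b <- brick(s)          # RasterBrick: a multi-layer object stored as one block
names(s) <- c("base", "doubled")
```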
The raster package defines a number of methods for raster algebra with Raster* objects: arithmetic operators, logical operators, and functions such as abs, round, ceiling, floor, trunc, sqrt, log, log10, exp, cos, sin, max, min, range, prod, sum, any, and all. In these functions, Raster* objects can be mixed with numbers.
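For instance, with a toy raster whose cells hold the values 1 to 100 (an illustrative assumption), raster algebra might look like this:

```r
library(raster)

r <- raster(nrows = 10, ncols = 10, xmn = 0, xmx = 10, ymn = 0, ymx = 10)
values(r) <- 1:100

s <- sqrt(r) + 2 * r     # arithmetic mixing Raster* objects and numbers
high <- r > 50           # logical operator, evaluated cell by cell
both <- max(r, s)        # cell-wise maximum of two layers
```

Each result is itself a Raster* object with the same extent and resolution as the inputs.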
There are several functions to modify the content or the spatial extent of Raster* objects, or to combine Raster* objects:
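The sketch below shows three such functions, crop, aggregate, and merge, applied to a toy raster. This selection is an assumption for illustration; the package offers many more.

```r
library(raster)

r <- raster(nrows = 10, ncols = 10, xmn = 0, xmx = 10, ymn = 0, ymx = 10)
values(r) <- 1:100

# Reduce the spatial extent to the upper-left quarter
corner <- crop(r, extent(0, 5, 5, 10))

# Reduce the resolution: each new cell is the mean of a 2 x 2 block
coarse <- aggregate(r, fact = 2, fun = mean)

# Split the raster in two halves and combine them again
west <- crop(r, extent(0, 5, 0, 10))
east <- crop(r, extent(5, 10, 0, 10))
whole <- merge(west, east)
```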
The rasterVis package (Oscar Perpiñán and R. Hijmans 2013) complements the raster package, providing a set of methods for enhanced visualization and interaction. This package defines visualization methods (levelplot) for quantitative data and categorical data, both for univariate and multivariate rasters.
It also includes several methods within the framework of Exploratory Data Analysis: scatterplots with xyplot, histograms and density plots with histogram and densityplot, violin plots and boxplots with bwplot, and a matrix of scatterplots with splom.
The package can also display vector fields, either with arrows (vectorplot) or with streamlines (streamplot; Wegenkittl and Gröller 1997). In the latter method, for each point (droplet) of a jittered regular grid, a short streamline portion (streamlet) is calculated by integrating the underlying vector field at that point. The main color of each streamlet indicates the local vector magnitude (slope). Streamlets are composed of points whose sizes, positions, and color gradient encode the local vector direction (aspect).
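As a rough sketch of the visualization methods described above, the following example applies levelplot, histogram, and densityplot to a synthetic RasterLayer. The sine and cosine surface is made up for illustration; each call returns a lattice (trellis) object that is drawn when printed.

```r
library(raster)
library(rasterVis)

# Synthetic surface defined on the cell centres
r <- raster(nrows = 50, ncols = 50, xmn = 0, xmx = 2 * pi, ymn = 0, ymx = 2 * pi)
xy <- xyFromCell(r, seq_len(ncell(r)))
values(r) <- sin(xy[, 1]) * cos(xy[, 2])

p_level <- levelplot(r, margin = FALSE)   # map of the quantitative values
p_hist <- histogram(r)                    # distribution of the cell values
p_dens <- densityplot(r)                  # kernel density of the cell values

# print(p_level) draws the plot on the active graphics device
```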
The maptools package (R. Bivand and Lewin-Koh 2013) provides a set of tools for manipulating and reading geographic data, in particular ESRI (Environmental Systems Research Institute) shapefiles. The package also provides interface wrappers for exchanging spatial objects with packages such as PBSmapping, spatstat, maps, RArcInfo, Stata tmap, WinBUGS, Mondrian, and others. The main functions in the context of this book are
The topology operations on geometries performed by this package (for example, unionSpatialPolygons) use the package rgeos, an interface to the Geometry Engine Open Source (GEOS).³
The rgdal package (R. Bivand, Keitt, and Rowlingson 2013) provides bindings to the Geospatial Data Abstraction Library (GDAL).⁴ With readOGR and readGDAL, both GDAL raster and OGR vector map data can be imported into R, and GDAL raster data and OGR vector data can be exported with writeGDAL and writeOGR.
In addition, this package provides access to projection and transformation operations from the PROJ.4 library.⁵ It implements several spTransform methods that transform between datums and convert between projections using PROJ.4 projection arguments.
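A minimal sketch of a coordinate transformation with spTransform follows. It assumes that the backend spTransform relies on is installed: rgdal as described here, or, in recent versions of sp, the sf package. The location (approximate coordinates of Madrid) and the target UTM zone are illustrative choices.

```r
library(sp)

# A point in geographic coordinates (longitude, latitude) on the WGS84 datum
madrid <- SpatialPoints(cbind(lon = -3.70, lat = 40.42),
                        proj4string = CRS("+proj=longlat +datum=WGS84"))

# Convert to projected coordinates (UTM zone 30, in metres)
madrid_utm <- spTransform(madrid, CRS("+proj=utm +zone=30 +datum=WGS84"))
utm_xy <- coordinates(madrid_utm)
```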
The gstat package (E. J. Pebesma 2004) provides functions for geostatistical modeling, prediction, and simulation, including variogram modeling and simple, ordinary, universal, and external drift kriging.
Most of the functionality of this package is beyond the scope of this book. However, some functions must be mentioned:
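A typical gstat workflow can be sketched with the meuse dataset distributed with sp: compute a sample variogram, fit a model to it, and interpolate with ordinary kriging. The choice of functions, the variable (log of zinc concentration), and the initial variogram parameters are assumptions for illustration.

```r
library(sp)
library(gstat)

# Sample points and prediction grid shipped with sp
data(meuse)
coordinates(meuse) <- ~x + y
data(meuse.grid)
gridded(meuse.grid) <- ~x + y

# Sample variogram and fitted spherical model (initial values are rough guesses)
v <- variogram(log(zinc) ~ 1, meuse)
m <- fit.variogram(v, vgm(psill = 1, model = "Sph", range = 900, nugget = 1))

# Ordinary kriging on the grid; the result holds predictions and variances
k <- krige(log(zinc) ~ 1, meuse, meuse.grid, model = m)
```

The columns var1.pred and var1.var of the result contain the kriging prediction and its variance for each grid cell.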
The maps (Becker, Wilks, Brownrigg, and Minka 2013), mapdata (Becker, Wilks, and Brownrigg 2013), and mapproj (McIlroy et al. 2013) packages are useful to draw or create geographical maps. mapdata contains higher resolution databases, and mapproj converts latitude/longitude coordinates into projected coordinates.
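As a brief sketch, the following example extracts the boundary of a region from the world database of maps without drawing it, and then projects its longitude/latitude coordinates with mapproj. The region and the Mercator projection are arbitrary choices for illustration.

```r
library(maps)
library(mapproj)

# Boundary lines as a list with components x, y, range, and names
spain <- map("world", regions = "Spain", plot = FALSE)
lonlat_range <- spain$range        # xmin, xmax, ymin, ymax in degrees

# Convert longitude/latitude to projected coordinates
proj <- mapproject(spain$x, spain$y, projection = "mercator")
```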
1 Although sp and raster are the most important packages, there is an increasing number of packages designed to work with spatial data. They are summarized in the corresponding CRAN Task View. Read Section 7.2 for details.
2 http://CRAN.R-project.org/view=Spatial
5 https://trac.osgeo.org/proj/
6 http://www.pearsonhighered.com/slocum3e/ and http://highered.mcgraw-hill.com/sites/0072943823/
9 http://CRAN.R-project.org/view=Spatial
10 http://r-forge.r-project.org/softwaremap/trove_list.php?form_cat=353
11 https://stat.ethz.ch/mailman/listinfo/R-SIG-Geo/