Chapter 2

Displaying Time Series: Introduction

A time series is a sequence of observations registered at consecutive time instants. When these time instants are evenly spaced, the distance between them is called the sampling interval. The visualization of time series is intended to reveal changes of one or more quantitative variables through time, and to display the relationships between the variables and their evolution through time.

The standard time series graph displays time along the horizontal axis. Several variants of this approach can be found in Chapter 3. Alternatively, time can be treated as a grouping or conditioning variable (Chapter 4). This approach allows several variables to be displayed together in a scatterplot, using different panels for subsets of the data (time as a conditioning variable) or different attributes for groups of the data (time as a grouping variable). Finally, time can be used as a complementary variable that adds information to a graph in which several variables are plotted against each other (Chapter 5).

These chapters provide a variety of examples to illustrate a set of useful techniques. These examples make use of several datasets (available at the book website) described in Chapter 6.

2.1 Packages

The CRAN Task View “Time Series Analysis”¹ summarizes the packages for reading, visualizing, and analyzing time series. This section provides a brief introduction to the zoo and xts packages. Most of the information has been extracted from their vignettes, webpages, and help pages, which you should consult for detailed information.

Both packages make extensive use of the time classes defined in R. The interested reader will find an overview of the different time classes in R in Ripley and Hornik (2001) and Grothendieck and Petzoldt (2004).

2.1.1 zoo

The zoo package (Zeileis and Grothendieck 2005) provides an S3 class with methods for indexed totally ordered observations. Its key design goals are independence of a particular index class and consistency with base R and the ts class for regular time series.

Objects of class zoo are created by the function zoo from a numeric vector, a matrix, or a factor that is totally ordered by some index vector. This index is usually a measure of time, but any other numeric, character, or even more abstract vector that provides a total ordering of the observations is also suitable. Note that this package defines two new index classes, yearmon and yearqtr, for representing monthly and quarterly data, respectively.
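
As a minimal sketch of the constructor described above (the values and dates are made up for illustration and are not taken from the book's datasets), the following code builds zoo objects with a Date index, an unsorted numeric index, and the yearmon and yearqtr index classes:

```r
library(zoo)

# Five daily observations indexed by Date (illustrative values)
days <- as.Date("2020-01-01") + 0:4
z <- zoo(c(2.1, 2.4, 1.9, 2.8, 3.0), order.by = days)

# The index need not be supplied in order: zoo sorts the data by it
z_unsorted <- zoo(c(30, 10, 20), order.by = c(3, 1, 2))

# Monthly and quarterly series use the index classes defined by zoo
zm <- zoo(1:3, order.by = as.yearmon("2020-01") + 0:2 / 12)
zq <- zoo(1:3, order.by = as.yearqtr("2020 Q1") + 0:2 / 4)

print(z)
print(z_unsorted)
print(zm)
print(zq)
```

Printing z_unsorted shows the observations rearranged so that they follow the sorted index.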

The package defines several methods associated with standard generic functions such as print, summary, str, head, tail, and [ (subsetting). In addition, standard mathematical operations can be performed with zoo objects, although only for the intersection of the indexes of the objects.
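
The behavior of arithmetic on the intersection of the indexes can be checked with a small example. This is only a sketch with made-up values: the two series overlap on two dates, so their sum has two observations.

```r
library(zoo)

# a covers January 1 to 4; b covers January 3 to 5 (illustrative values)
a <- zoo(1:4, order.by = as.Date("2020-01-01") + 0:3)
b <- zoo(c(10, 20, 30), order.by = as.Date("2020-01-03") + 0:2)

# Only January 3 and 4 appear in both indexes
s <- a + b
print(s)

# Standard generics work as with other R objects
print(head(a, 2))
print(a[2:3])
```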

The data stored in a zoo object can be extracted with coredata, which drops the index information, and replaced with coredata<-. The index can be extracted with index or time, and modified with index<-. Finally, the window and window<- methods extract or replace time windows of zoo objects.
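
The following sketch, with illustrative values, exercises these accessors in turn: it extracts the data and the index, replaces both, and then extracts and overwrites a time window.

```r
library(zoo)

z <- zoo(c(5, 6, 7, 8, 9), order.by = as.Date("2020-01-01") + 0:4)

vals <- coredata(z)   # plain numeric vector, index dropped
idx <- index(z)       # Date vector; time(z) returns the same

coredata(z) <- vals * 10   # replace the data, keep the index
index(z) <- idx + 7        # shift every date one week forward

# Extract the observations between January 9 and January 10
w <- window(z, start = as.Date("2020-01-09"), end = as.Date("2020-01-10"))

# Replace that same window in a copy of the series
z2 <- z
window(z2, start = as.Date("2020-01-09"), end = as.Date("2020-01-10")) <- 0

print(w)
print(z2)
```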

Two zoo objects can be merged by common indexes with merge and cbind. The merge method combines the columns of several objects along the union or the intersection of the indexes. The rbind method combines the indexes (rows) of the objects.
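
A short sketch of both operations follows, using made-up series. The all argument of merge selects between the union (the default) and the intersection of the indexes. rbind is applied here to series whose indexes do not overlap.

```r
library(zoo)

a <- zoo(1:3, order.by = as.Date("2020-01-01") + 0:2)            # Jan 1 to 3
b <- zoo(c(10, 20, 30), order.by = as.Date("2020-01-02") + 0:2)  # Jan 2 to 4

# Union of the indexes: dates missing from one series are filled with NA
m_union <- merge(a, b)

# Intersection of the indexes: only dates present in both series
m_inner <- merge(a, b, all = FALSE)

# rbind appends observations; the indexes of the pieces do not overlap here
later <- zoo(4:5, order.by = as.Date("2020-01-04") + 0:1)
r <- rbind(a, later)

print(m_union)
print(m_inner)
print(r)
```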

The aggregate method splits a zoo object into subsets along a coarser index grid, computes a function (sum is the default) for each subset, and returns the aggregated zoo object.
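
As an illustration, the sketch below aggregates four daily observations (made-up values) to monthly values. It passes as.yearmon as the by argument, so that the function is applied to the index to build the coarser grid.

```r
library(zoo)

# Four days spanning the end of January and the start of February
days <- seq(as.Date("2020-01-30"), by = "day", length.out = 4)
z <- zoo(c(1, 2, 3, 4), order.by = days)

# Default function is sum
monthly_sum <- aggregate(z, by = as.yearmon)

# Any other summary function can be supplied through FUN
monthly_mean <- aggregate(z, by = as.yearmon, FUN = mean)

print(monthly_sum)
print(monthly_mean)
```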

This package provides four methods for dealing with missing observations:

  1. na.omit removes incomplete observations.
  2. na.contiguous extracts the longest consecutive stretch of non-missing values.
  3. na.approx replaces missing values by linear interpolation.
  4. na.locf replaces each missing observation with the most recent non-NA value prior to it.
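
The four methods can be compared on a single toy series. This is only a sketch: the values and the integer index are invented so that each result is easy to verify by eye.

```r
library(zoo)

# Eight observations with two gaps, indexed by the integers 1 to 8
z <- zoo(c(1, NA, 3, 4, NA, 6, 7, 8), order.by = 1:8)

z_omit <- na.omit(z)          # drops the two incomplete observations
z_cont <- na.contiguous(z)    # keeps the longest complete stretch (index 6 to 8)
z_approx <- na.approx(z)      # fills gaps by linear interpolation along the index
z_locf <- na.locf(z)          # carries the last observed value forward

print(z_omit)
print(z_cont)
print(z_approx)
print(z_locf)
```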

The package defines two interfaces for exchanging zoo series with text files: read.zoo, built on read.table, for reading, and write.zoo, built on write.table, for writing. The read.zoo function expects a text file, a connection, or a data.frame as input. The write.zoo function first coerces its argument to a data.frame, adds a column with the index, and then calls write.table.
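
A possible round trip is sketched below. The temporary file, the comma separator, and the column names of the data.frame are choices made for this illustration only. Column names are requested explicitly with col.names = TRUE so that the file can be read back with header = TRUE.

```r
library(zoo)

z <- zoo(c(1.5, 2.5, 3.5), order.by = as.Date("2020-01-01") + 0:2)

# Write the series, index included, to a temporary comma-separated file
tmp <- tempfile(fileext = ".txt")
write.zoo(z, file = tmp, sep = ",", col.names = TRUE)

# Read it back; format tells read.zoo how to parse the index as dates
z_back <- read.zoo(tmp, header = TRUE, sep = ",", format = "%Y-%m-%d")

# read.zoo also accepts a data.frame whose first column holds the index
df <- data.frame(date = as.Date("2020-01-01") + 0:2,
                 value = c(1.5, 2.5, 3.5))
z_df <- read.zoo(df)

unlink(tmp)
print(z_back)
print(z_df)
```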

2.1.2 xts

The xts package (Ryan and Ulrich 2013) extends the zoo class definition to provide a general time-series object. The index of an xts object must be of a time or date class: Date, POSIXct, chron, yearmon, yearqtr, or timeDate. With this restriction, the subset operator [ is able to extract data using the ISO 8601² time format notation CCYY-MM-DD HH:MM:SS. It is also possible to extract a range of times with a from/to notation, where both from and to are optional. If either side is missing, it is interpreted as a request to retrieve data from the beginning, or through the end, of the data object.
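
The following sketch illustrates this notation on an invented series with two observations per day over four days. The time zone is fixed to UTC, an assumption made here so that the results do not depend on the local settings.

```r
library(xts)

# Eight observations, one every 12 hours, starting 2020-01-01 00:00 UTC
idx <- seq(as.POSIXct("2020-01-01 00:00:00", tz = "UTC"),
           by = "12 hours", length.out = 8)
x <- xts(1:8, order.by = idx, tzone = "UTC")

one_day <- x["2020-01-02"]                # every observation on January 2
one_obs <- x["2020-01-02 12:00:00"]       # a single time instant
rng <- x["2020-01-02/2020-01-03"]         # from/to range
to_jan1 <- x["/2020-01-01"]               # from the beginning through January 1
from_jan4 <- x["2020-01-04/"]             # from January 4 through the end

print(one_day)
print(rng)
```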

Furthermore, this package provides several time-based tools:

  • endpoints locates the last observation of each time period (for example, each day, week, or month).
  • to.period changes the periodicity to a coarser time index.
  • The functions period.* and apply.* evaluate a function over a set of non-overlapping time periods.
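
These tools can be tried on a toy daily series, as in the sketch below. The data are invented: the value of each observation is simply its position in the series, which makes the monthly results easy to check.

```r
library(xts)

# Daily series for the first quarter of 2020 (91 days, leap year)
days <- seq(as.Date("2020-01-01"), as.Date("2020-03-31"), by = "day")
x <- xts(seq_along(days), order.by = days)

# Positions of the last observation of each month, preceded by 0
ep <- endpoints(x, on = "months")

# Apply a function over the non-overlapping periods defined by ep
monthly_max <- period.apply(x, INDEX = ep, FUN = max)

# apply.monthly is a shortcut that computes the monthly endpoints itself
monthly_mean <- apply.monthly(x, mean)

# to.period moves to a coarser index; by default it returns
# open, high, low, and close values for each period
monthly_ohlc <- to.period(x, period = "months")

print(ep)
print(monthly_max)
print(monthly_ohlc)
```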

2.2 Further Reading

  • Wills (2011) provides a systematic analysis of the visualization of time series, and a section of Heer, Bostock, and Ogievetsky (2010) summarizes the main techniques for displaying time series.
  • Cleveland (1994) includes a section on time series visualization with a detailed discussion of the banking to 45° technique and the cut-and-stack method. Heer and Agrawala (2006) propose multi-scale banking, a technique to identify trends at various frequency scales.
  • Few (2008) and Heer, Kong, and Agrawala (2009) explain in detail the foundations of the horizon graph (Section 3).
  • The small multiples concept (Sections 3.2 and 4.1) is illustrated in (Tufte 2001; Tufte 1990).
  • Stacked graphs are analyzed in (Byron and Wattenberg 2008), and the ThemeRiver technique is explained in (Havre et al. 2002).
  • Cleveland (1994) and Friendly and Denis (2005) study scatterplot matrices (Section 4.1), and Carr et al. (1987) provide information about hexagonal binning.
  • Harrower and Fabrikant (2008) discuss the use of animation for the visualization of data. Few (2007) presents a software tool resembling the Trendalyzer.
  • The D3 gallery³ shows several great examples of time-series visualizations using the JavaScript library D3.js.

1 http://CRAN.R-project.org/view=TimeSeries

2 http://en.wikipedia.org/wiki/ISO_8601

3 https://github.com/mbostock/d3/wiki/Gallery
