Time series

In this section, we will cover a special case of data you may find in R: the time series. Such data records observations indexed by time, such as hours, years, or dates in general. Some of the datasets that ship with R and its packages contain time series, such as UKgas (part of the base R installation) and economics (bundled with ggplot2). For our simple example, we will use the latter, which is a data frame containing population and employment information for the US from 1967 onward. You can find an overview of the dataset in the help page at ?economics.
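
The two datasets mentioned above store time differently, and a quick look at UKgas may make the distinction clearer. The following is a minimal sketch using only base R; the name gas_df is just an illustrative choice and is not part of the original example:

```r
# UKgas ships with base R (datasets package) and is a "ts" object
class(UKgas)        # "ts"
start(UKgas)        # 1960 1  (first quarter of 1960)
frequency(UKgas)    # 4 observations per year (quarterly data)

# A ts object stores time implicitly; time() recovers it as decimal years.
# gas_df is a hypothetical name used only for this illustration.
gas_df <- data.frame(
  time = as.numeric(time(UKgas)),
  gas  = as.numeric(UKgas)
)
head(gas_df, 3)
```

A data frame like gas_df has an explicit time column, which is the layout ggplot2 works with; the economics dataset already comes in that form.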

Let's first have a look at the dataset and see its structure:

head(economics)
        date   pce    pop psavert uempmed unemploy
1 1967-06-30 507.8 198712     9.8     4.5     2944
2 1967-07-31 510.9 198911     9.8     4.7     2945
3 1967-08-31 516.7 199113     9.0     4.6     2958
4 1967-09-30 513.3 199311     9.8     4.9     3143
5 1967-10-31 518.5 199498     9.7     4.7     3066
6 1967-11-30 526.2 199657     9.4     4.8     3018
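
The printed output does not show how the date column is stored, so it can be worth checking. The sketch below assumes ggplot2 is installed and attached; date_range is a name introduced here only for illustration:

```r
library(ggplot2)  # the economics dataset is bundled with ggplot2

# The date column is stored as R's Date class, not as plain text
class(economics$date)   # "Date"

# Because it is a Date, date arithmetic and comparisons work directly.
# date_range is a hypothetical name used only for this illustration.
date_range <- range(economics$date)
date_range
diff(date_range)        # time span covered by the dataset
```

The exact range depends on the ggplot2 version installed, since the dataset has been updated over time.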

As illustrated, this data has the usual data frame structure, with the first column containing dates. Data like this needs no special modification before plotting. For instance, let's plot unemployment versus time. Instead of drawing individual data points as dots, as in a typical scatterplot, we can use a continuous line, which gives a continuous description of how the data changes over time. We can simply use qplot as we normally would, selecting the two columns we want to plot and choosing the "line" geometry. This is shown in the following code:

qplot(date, unemploy, data=economics, geom="line")
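
The same plot can also be written with the full ggplot() interface, which may be preferable in recent ggplot2 releases where qplot is marked as deprecated. This is a sketch of an equivalent call, not the form used in the original example; the object name p is an assumption:

```r
library(ggplot2)

# Equivalent of the qplot() call: map date to x and unemploy to y,
# then draw the observations as a connected line
p <- ggplot(economics, aes(x = date, y = unemploy)) +
  geom_line()

print(p)  # draws the plot on the active graphics device
```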

As you can see in Figure 2.17, the plot looks exactly like a normal plot. The only difference is that ggplot2 recognized that the date column holds dates and labeled the x axis with dates rather than plain numbers:

Figure 2.17: This is an example of a time series plot from the economics dataset, representing unemployment versus time
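
Since the x axis is a date scale, its breaks and labels can be adjusted with scale_x_date. The following sketch assumes ggplot2 2.0 or later, where the date_breaks and date_labels arguments are available; the ten-year spacing and the axis titles are arbitrary choices for illustration:

```r
library(ggplot2)

# Same line plot, with explicit control over the date axis
p <- ggplot(economics, aes(x = date, y = unemploy)) +
  geom_line() +
  # one tick every ten years, labeled with the four-digit year
  scale_x_date(date_breaks = "10 years", date_labels = "%Y") +
  # unemploy is documented in ?economics as thousands of people
  labs(x = "Date", y = "Unemployed (thousands)")

print(p)
```

The date_labels string uses the same format codes as strftime, so "%b %Y" would give labels such as month abbreviations followed by the year.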
