Chapter 2. Getting Started

In this chapter, we will go through the main plot types that can be created with ggplot2. The examples use the basic qplot() function, so that you have a reference for producing each plot even if you are not interested in customizing the graph in more detail. We will see how to create the following plots:

  • Histograms and density plots
  • Bar charts
  • Boxplots
  • Scatterplots
  • Time series plots
  • Bubble charts and dot plots

In Chapter 3, The Layers and Grammar of Graphics, we will describe the ggplot function and show the equivalent code in qplot and ggplot, so that you can also create these plots with the more sophisticated function.

General aspects

The qplot (quick plot) function is a basic high-level function of ggplot2. The general syntax that you should use with this function is the following:

qplot(x, y, data, color, shape, size, facets, geom, stat)

The definitions of the various components of this function are as follows:

  • x and y: These represent the variables to plot (y is optional, with a default value of NULL).
  • data: This defines the dataset containing the variables.
  • color, shape, and size: These are the aesthetic arguments, which can be mapped to additional variables.
  • facets: This defines the optional faceting of the plot based on one or two categorical variables contained in the dataset.
  • geom: This selects how the data is visualized and therefore, in practice, the type of plot that is generated. Possible values include point, line, and boxplot; we will see several different examples in the following pages.
  • stat: This defines the statistical transformation to be applied to the data.

These arguments represent the most important options available in qplot(). You can find descriptions of the other arguments of this function on the help page of the function, accessible with ?qplot or on the ggplot2 website at http://docs.ggplot2.org/0.9.3/qplot.html.
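As a minimal sketch of how the arguments listed above fit together, the following calls use the Orange dataset that ships with R. The choice of variables and the object names (p_basic, p_color, p_box) are assumptions made only for illustration. Note also that recent ggplot2 releases mark qplot() as deprecated, so you may see a warning when running this code.

```r
library(ggplot2)

# x, y, and data only: with two numeric variables, qplot() draws points
p_basic <- qplot(age, circumference, data = Orange)

# color mapped to an additional variable: one color per tree
p_color <- qplot(age, circumference, data = Orange, color = Tree)

# geom chosen explicitly: one boxplot of circumference per tree
p_box <- qplot(Tree, circumference, data = Orange, geom = "boxplot")
```

Each call returns a plot object; typing its name at the console, or calling print() on it, draws the plot.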

Thanks to the way the grammar of graphics was conceived, most of the previously mentioned arguments can be applied to different types of plots. For instance, you can use the color argument to map an aesthetic to a variable, and you can do that in a scatterplot as well as in a histogram. The same holds for facets, which you can use to split data into subplots independently of the type of plot considered.

Before moving on to the different plots, we should clarify some details about the syntax of aesthetic mapping and faceting, so that you are able to adapt the coming examples to different situations.

Introduction to aesthetic attributes

In ggplot2, the color, shape, and size of graphical objects are aesthetic attributes that are usually mapped to the value of a variable contained in the data. For instance, if your dataset contains different series of measurements, you can associate the color attribute with a flag variable and have each series of data in a different color, exactly as we did in the section ggplot2 and the grammar of graphics of Chapter 1, Graphics in R, with the following code:

qplot(circumference, age, data=Orange, color=Tree, geom=c("point","smooth"), method="lm", se=FALSE)

This generates a plot (see Figure 1.7) in which each series of data from the same tree has the same color and its own regression line. We will go into more detail on this point in Chapter 3, The Layers and Grammar of Graphics. For now, what is important to know is that if you want to set an attribute to a fixed value, for instance, to have all the data in the same color, you need to use the I() function. So, in order to get the same plot with all the data in blue, you need to specify color=I("blue").

Exactly the same principle applies to the size and shape attributes, which you can either map to a variable or set to a fixed value using I(). For instance, size=I(3) would produce bigger symbols and shape=I(2) would produce triangles instead of dots.
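The difference between mapping an attribute to a variable and fixing it with I() can be sketched as follows. This is only an illustration: it reuses the Orange dataset with plain points (without the regression lines), and the object names p_mapped and p_fixed are hypothetical.

```r
library(ggplot2)

# Mapped: color follows the Tree variable, so each tree gets its own color
p_mapped <- qplot(circumference, age, data = Orange, color = Tree)

# Fixed: I() sets constant values, so all points are blue,
# larger than the default (size 3), and drawn as triangles (shape 2)
p_fixed <- qplot(circumference, age, data = Orange,
                 color = I("blue"), size = I(3), shape = I(2))
```

In the first plot a legend for Tree is generated, since the color carries information; in the second plot no legend is needed because the values are constants.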

Introduction to faceting

You can use faceting to create a subplot for each level of a categorical variable. The general code for faceting is facets=a~b, where a and b represent the two categorical variables by which the data is split. This code generates a grid containing a subplot for each combination of the values of a and b. Quite often, however, you may be interested in faceting by only one variable; in this case, you would use code such as facets=a~., where the period indicates that there is no second faceting variable.
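As a hedged sketch of the two forms just described, the following code uses the mpg dataset included in ggplot2, with drv (drive type) and cyl (number of cylinders) playing the roles of a and b. The dataset and the object names p_grid and p_single are assumptions chosen for illustration, not taken from the text.

```r
library(ggplot2)

# Two faceting variables: a grid with one subplot
# for each combination of drv and cyl
p_grid <- qplot(displ, hwy, data = mpg, facets = drv ~ cyl)

# One faceting variable: the period stands for
# "no second faceting variable"
p_single <- qplot(displ, hwy, data = mpg, facets = drv ~ .)
```

Combinations that do not occur in the data still get a panel in the grid, which is simply left empty.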

Tip

How to change the faceting orientation

When using faceting, you may need to change the panel orientation. This is done by changing the order of the variables in the formula: facets=a~b creates one row for each value of a and one column for each value of b, while facets=b~a does it the other way around. The same applies when you have only one variable: facets=a~. creates one row for each value of a, while facets=.~a arranges the plots in columns.
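The effect of the variable order can be sketched with a single faceting variable. The example assumes the mpg dataset included in ggplot2 and its drv variable; the object names p_rows and p_cols are hypothetical.

```r
library(ggplot2)

# drv on the left of the formula: panels stacked as rows
p_rows <- qplot(displ, hwy, data = mpg, facets = drv ~ .)

# drv on the right of the formula: panels side by side as columns
p_cols <- qplot(displ, hwy, data = mpg, facets = . ~ drv)
```

Both plots contain the same panels; only their arrangement changes. Rows are often more convenient when the panels share the x axis, columns when they share the y axis.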
