Equivalent coding between qplot and ggplot

In this section, we will look at how to create a few of the plots introduced in Chapter 2, Getting Started, using the ggplot() function introduced in this chapter. You can use this simple roadmap as guidance on how to generate several kinds of plots with ggplot(), building on the knowledge of qplot() that you already have. We will not go into too much detail about the individual plots, since many basic concepts were already introduced in the previous chapter and apply to both functions.

In the following examples, we will use a few of the geom and stat functions listed in the summary tables previously presented; just remember that for each of these functions, you can map different aesthetic attributes. You can find a list of such attributes in the summary tables or in the help page of the function.

Histograms and density plots

To obtain a histogram, we will use the ggplot() function to assign the aesthetics to the dataset and the geom_histogram() function to add the geometry that draws the actual histogram. This is the general framework for creating plots with the ggplot() function. The same process applies to the density plot, with the geom function being the only difference.

In this first example, you can also see the corresponding code with the qplot() function; in the later examples, we will show only the ggplot() code. The following code will produce the same plot as in Figure 2.4 of Chapter 2, Getting Started:

#### Example with the qplot() function
qplot(Petal.Length, data=iris, geom="histogram", color=Species, fill=Species, alpha=I(0.5))

qplot(Petal.Length, data=iris, geom="density", color=Species, fill=Species, alpha=I(0.5))

#### Example with the ggplot() function
ggplot(data=iris, aes(x=Petal.Length, color=Species, fill=Species)) + geom_histogram(alpha=0.5)

ggplot(data=iris, aes(x=Petal.Length, color=Species, fill=Species)) + geom_density(alpha=0.5)

Notice that the aesthetic assignments for position (only x in this case), color, and fill are provided in the ggplot() function, since they apply to the plot as a whole, while the alpha value, which applies only to the histogram, is provided in the geom function. Since we do not use the data for anything other than the histogram, the ggplot() function here serves only to initialize the plot object. In this case, we could therefore also provide all the arguments in the geom function, as shown in the following code:

ggplot() + geom_histogram(data=iris, aes(x=Petal.Length, color=Species, fill=Species), alpha=0.5)

Nevertheless, I would not recommend that you use this kind of coding since it can be more difficult to read.
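To check that the two placements really are equivalent, one option is to compare the data that each plot computes for its histogram layer. The following is a minimal sketch and not part of the original example: the object names p_global and p_local are illustrative, the color mapping is left out for brevity, bins=30 is set explicitly (the same number of bins used by default), and layer_data() is the ggplot2 helper that returns the computed data of a layer.

```r
library(ggplot2)

# Aesthetics declared once in ggplot() and inherited by the layer
p_global <- ggplot(data = iris, aes(x = Petal.Length, fill = Species)) +
  geom_histogram(alpha = 0.5, bins = 30)

# The same aesthetics declared inside the layer instead
p_local <- ggplot() +
  geom_histogram(data = iris, aes(x = Petal.Length, fill = Species),
                 alpha = 0.5, bins = 30)

# Extract the bin counts that each plot computes for its first layer
counts_global <- layer_data(p_global, 1)$count
counts_local  <- layer_data(p_local, 1)$count

# Compare the two sets of counts
print(identical(counts_global, counts_local))
```

If the comparison holds, both plots draw the same bars; the only difference is where the mapping is declared, which starts to matter once additional layers need to inherit it.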

Bar charts

We will now use the ggplot() function to create the plot represented in Figure 2.6 of Chapter 2, Getting Started. To do that, we need the myMovieData dataset that we used in Chapter 2, Getting Started.

ggplot(data=myMovieData, aes(x=Type,fill=factor(Short))) + geom_bar()

As this second example illustrates, when using ggplot() instead of qplot(), you simply need to remember that the aesthetics must be provided using the aes() function within the body of the ggplot() function, while the geometry of the plot must be provided using the dedicated geom function. In qplot(), as in traditional plotting functions, you start by specifying the x and y attributes as the first arguments of the function; in ggplot(), those assignments are aesthetic assignments, so they are made inside aes() together with the other aesthetic attributes.
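Because myMovieData is built in Chapter 2, the sketch below uses the mpg dataset shipped with ggplot2 as a stand-in. This is an assumption made only so that the code runs on its own; class and year play the roles of Type and Short, and the object names are illustrative. It shows that the position and the fill are stored together as aesthetic mappings of the plot object.

```r
library(ggplot2)

# Stand-in for the bar chart: mpg replaces myMovieData (assumption)
p_bar <- ggplot(data = mpg, aes(x = class, fill = factor(year))) +
  geom_bar()

# The mappings declared in aes() are stored on the plot object
mapped <- names(p_bar$mapping)
print(mapped)

# geom_bar() counts the rows in each class/year combination,
# so the bar heights add up to the number of rows in the data
bar_counts <- layer_data(p_bar, 1)$count
print(sum(bar_counts) == nrow(mpg))
```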

Boxplots

To show you an example of a boxplot, we will reproduce Figure 2.12 of Chapter 2, Getting Started. In this case, you will also see how to combine different geometries, since we need to combine the boxplot with the jitter geometry. The following code shows this:

ggplot(data=myMovieData, aes(x=Type, y=Budget)) +
  geom_jitter() +
  geom_boxplot(alpha=0.6) +
  scale_y_log10()

As illustrated, the geom_jitter() function needs no arguments, since the jitter geometry uses the data and the aesthetic mappings already specified in the ggplot() function. For the boxplot geometry, on the other hand, we set the transparency with the alpha argument. As already described for the qplot() function in the corresponding example, the order of the geom functions determines the order in which the plot components are drawn, so if you drew the boxplot first and the jittered observations second, the points would be drawn on top of the boxes and cover them. Finally, this example also contains the first scale function. As mentioned, scales control the mapping of the data to the aesthetic attributes, and the x-y position is among those attributes, so in order to change the axis to a log scale, we need to change the scale used to draw the plot. This is done by replacing the default scale of the plot with the dedicated scale function. We will go into more detail about the different scales and their related functions in Chapter 5, Controlling Plot Details.
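The drawing order and the effect of the scale can both be inspected on the plot object. The following is a minimal sketch under stated assumptions: the mpg dataset from ggplot2 stands in for myMovieData, with class and hwy in place of Type and Budget, and the object names are illustrative.

```r
library(ggplot2)

# Stand-in data: mpg replaces myMovieData (assumption)
p_box <- ggplot(data = mpg, aes(x = class, y = hwy)) +
  geom_jitter() +               # added first, so drawn first (underneath)
  geom_boxplot(alpha = 0.6) +   # added second, so drawn on top
  scale_y_log10()               # replaces the default continuous y scale

# Layers are kept in the order in which they were added;
# geom_jitter() is a point geometry with a jitter position adjustment
layer_geoms <- unname(vapply(p_box$layers,
                             function(l) class(l$geom)[1],
                             character(1)))
print(layer_geoms)

# The boxplot statistics are computed on the transformed (log10) values,
# so the medians are reported on the log10 scale
box_stats <- layer_data(p_box, 2)
print(range(box_stats$middle))
```

Swapping the geom_jitter() and geom_boxplot() lines reverses the stored layer order and therefore which component ends up on top.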

Scatterplots

In this example, we will recreate Figure 2.16 of Chapter 2, Getting Started using the new functions that you learned about in this chapter. Here, we will represent the data of our ToothGrowth dataset as points, but we will split them into different facets depending on the supplement used to administer vitamin C, and we will also add a smooth line. You have already seen in the Faceting section how to split the data by faceting, but in this example, you will see how to add statistics to the plot, which, in this case, is the smooth line, and also how to combine the different components: geometry of points, statistics, and faceting. The following code shows this:

ggplot(data=ToothGrowth, aes(x=dose, y=len)) +
  geom_point() +
  stat_smooth() +
  facet_grid(.~supp)

As illustrated, we combined the different components in a way similar to the previous examples. You simply need to add the different functions to the plot created by ggplot(). In the geom or stat function, you can then provide additional arguments, which were not needed in this example.
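As an illustration of passing additional arguments to a stat function, the sketch below swaps the default smoother for a straight-line fit. This is a variation on the example, not the code used for Figure 2.16: method="lm", formula=y~x, and se=FALSE are arguments of stat_smooth() chosen here for illustration, and the object names are hypothetical.

```r
library(ggplot2)

# ToothGrowth ships with base R, so no extra data preparation is needed
p_smooth <- ggplot(data = ToothGrowth, aes(x = dose, y = len)) +
  geom_point() +
  # Extra arguments: linear fit instead of the default smoother,
  # and no confidence band around the line
  stat_smooth(method = "lm", formula = y ~ x, se = FALSE) +
  facet_grid(. ~ supp)

# The statistic is computed separately for each facet (one per supplement)
fit_lines <- layer_data(p_smooth, 2)
print(table(fit_lines$PANEL))
```

Because the statistic is added after the faceting variables are known at build time, each panel gets its own fitted line without any further code.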
