When displaying the results of your analysis, even at an exploratory stage, it is crucial to be able to customize your data visualization.
One thing I always find particularly useful is using text annotations on a plot to highlight the findings in the most effective way.
In ggplot2, you can do this using the geom_text() function, moving your string around the plot by adjusting its position. So, you will have to try and try again until you find the correct position for your handful of words. But what if you could select a location for your text just by clicking on it with your cursor?
That is exactly what this recipe is for; you will be able to add custom text to your plot and place it at a chosen location with a simple click on the plot itself.
We first need to install and load the ggplot2 and ggmap packages:
install.packages(c("ggplot2","ggmap"))
library(ggplot2)
library(ggmap)
Define a basic ggplot2 plot, where you want to add your text, and print it (refer to the How it works… section for more information on ggplot2 plots):
plot <- ggplot(data = iris, aes(x = Sepal.Length, y = Sepal.Width)) + geom_point()
plot
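Define the text you want to add to the plot:
text <- "there are some cool correlations here"
Select the location for the text by clicking on the plot:
location <- gglocator(1, xexpand = c(0,0), yexpand = c(0,0))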
Running this code will result in a small crosshair pointer appearing on the plot area within the RStudio Plots pane.
After selecting a point within the plot area, the location object will have two attributes, holding the x and y coordinates of the selected point. You can easily see them by running location within your R console.
Add the text to the plot at the selected location:
plot +
  scale_x_continuous(expand = c(0,0)) +
  scale_y_continuous(expand = c(0,0)) +
  geom_text(aes(x = location[[1]], y = location[[2]], label = text))
Change the color of the text:
plot +
  scale_x_continuous(expand = c(0,0)) +
  scale_y_continuous(expand = c(0,0)) +
  geom_text(aes(x = location[[1]], y = location[[2]], label = text), colour = "red")
Change the size of the text:
plot +
  scale_x_continuous(expand = c(0,0)) +
  scale_y_continuous(expand = c(0,0)) +
  geom_text(aes(x = location[[1]], y = location[[2]], label = text), colour = "red", size = 7)
Step 1 requires you to create a ggplot, which is a plot based on the grammar of graphics.
An extensive introduction to the grammar of graphics and its implementation in the ggplot2 package is outside the scope of this book. For the grammar of graphics, you can refer to The Grammar of Graphics by Leland Wilkinson (http://www.springer.com/us/book/9780387245447), which is the foundation of this data visualization theory.
As for ggplot2, even though the Web is full of articles and tutorials about the usage of this package, I would rather suggest reading the original paper, A Layered Grammar of Graphics by our dear Hadley Wickham (http://byrneslab.net/classes/biol607/readings/wickham_layered-grammar.pdf); this is where everything started.
For our purposes, we can just say that every ggplot must be composed of at least these three layers: the data, the aesthetic mapping, and a geometric layer.
Let's look at an example based on the following data frame:
dataset <- data.frame(independent = 1:3, dependent = 4:6)
Now, we can define the first layer using the ggplot() function, as follows:
ggplot(data = dataset, aes(x = independent,y = dependent))+
Finally, as the + symbol suggests, we can add the geometric layer:
geom_point()
Putting it all together:
ggplot(data = dataset, aes(x = independent,y = dependent))+ geom_point()
This is the base of every ggplot2 plot. Starting from this point, it is possible to build an endless variety of predefined and even customized plots, adding layer upon layer.
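As a minimal sketch of what adding layer upon layer can look like, the example below extends the small scatter plot with a line and a set of value labels. The object names base and layered are only illustrative, and the data frame is redefined here so that the snippet runs on its own.

```r
library(ggplot2)

# Small illustrative data frame (three points on a straight line)
dataset <- data.frame(independent = 1:3, dependent = 4:6)

# Data, aesthetic mapping, and one geometric layer
base <- ggplot(data = dataset, aes(x = independent, y = dependent)) +
  geom_point()

# Each "+" appends one more layer on top of the previous ones
layered <- base +
  geom_line() +
  geom_text(aes(label = dependent), vjust = -1)

# The layers are stored in the plot object in the order they were added
length(base$layers)
length(layered$layers)

# Printing the object draws the plot
layered
```

Every geometric layer here inherits the x and y mappings declared in ggplot(), which is why geom_line() needs no arguments of its own.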
In our example, we define a basic ggplot mapping to the Sepal.Length and Sepal.Width variables from the iris dataset.
In step 2, we define the text we want to add to the ggplot plot and assign it to a text object, which is a character vector.
Step 3 leverages gglocator(), a function provided by the ggmap package, which is intended to work as an equivalent of the locator() function for the plot() function. After clicking with the crosshair cursor on a point on the plot, the location object will be populated with x and y coordinates, which will then be used as a reference for custom text placement.
In step 4, we reproduce our base plot, passing expand = c(0,0) to scale_x_continuous() and scale_y_continuous() so that the axes start and end exactly at the limits of the data, without the padding that ggplot2 adds by default. These two functions actually define a new layer within our ggplot, namely a scale layer specifying the scale for this plot.
This is needed to ensure comparability between the plot coordinates and the coordinates stored within the location object, which gglocator() computed with xexpand and yexpand also set to c(0,0).
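To see the effect of the expand argument without clicking on anything, the following sketch compares the horizontal extent of the panel with and without expand = c(0,0). It reads the ranges from the object returned by ggplot_build(); the panel_params element is an internal structure, so treat its exact layout as an assumption that may vary between ggplot2 versions. The names base_plot, tight_plot, padded_range, and tight_range are illustrative.

```r
library(ggplot2)

base_plot <- ggplot(iris, aes(x = Sepal.Length, y = Sepal.Width)) +
  geom_point()

# Same plot, with the default padding around the data removed
tight_plot <- base_plot +
  scale_x_continuous(expand = c(0, 0)) +
  scale_y_continuous(expand = c(0, 0))

# Horizontal extent of the panel as computed when each plot is built
# (assumes panel_params exposes x.range, as in recent ggplot2 releases)
padded_range <- ggplot_build(base_plot)$layout$panel_params[[1]]$x.range
tight_range  <- ggplot_build(tight_plot)$layout$panel_params[[1]]$x.range

range(iris$Sepal.Length)  # limits of the data
padded_range              # slightly wider than the data
tight_range               # matches the data limits
```

With the padding removed, a position expressed as a fraction of the panel maps directly onto the data range, which is what makes the clicked coordinates and the plotted coordinates agree.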
Finally, we add the text using the geom_text() function.
This function, in case you are wondering what it does, adds another layer to the plot containing the specified text.
The geom_text() function requires you to specify the text to be placed as an annotation through the label argument, and the x and y coordinates to be passed within the aes argument, as seen in step 1. While the text is passed through the text object, the last two parameters are set by extracting the x and y attributes from the location object.
Be aware that not passing x and y, and generally the aes argument, will result in geom_text() falling back on the aesthetics passed within the ggplot() call.
This will result in the text being printed on every point of your plot, as if it were a label of your data.
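The difference can be checked without any clicking. In the sketch below, location is a hypothetical stand-in for the result of gglocator(), with coordinates picked by hand, and layer_data() is used to inspect where each copy of the string ends up; the names labelled_everywhere and labelled_once are illustrative.

```r
library(ggplot2)

plot <- ggplot(iris, aes(x = Sepal.Length, y = Sepal.Width)) + geom_point()
text <- "there are some cool correlations here"

# Hypothetical stand-in for gglocator(): coordinates chosen by hand
location <- data.frame(x = 6.5, y = 4.2)

# No aes() in geom_text(): x and y are inherited from ggplot(),
# so the string is drawn at every point of the scatter plot
labelled_everywhere <- plot + geom_text(label = text)

# Explicit x and y: every copy of the string lands on the chosen location
labelled_once <- plot +
  geom_text(aes(x = location[[1]], y = location[[2]], label = text))

# Distinct x positions used by the text layer (the second layer) in each plot
length(unique(layer_data(labelled_everywhere, 2)$x))
length(unique(layer_data(labelled_once, 2)$x))
```

Note that in the second plot the string is still drawn once per row of iris, only always at the same coordinates, so it reads as a single label. If that overdrawing matters, annotate("text", ...) is one alternative that draws the string exactly once.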
In step 5, the text color can be changed through the colour parameter within the geom_text() function.
If you want to get an idea of the colors you can use, you should take a look at the minimalist but rather great document Colors in R (http://www.stat.columbia.edu/~tzheng/files/Rcolor.pdf), which shows all the tonalities available in base R.
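The same names can also be listed straight from the console with the base R colors() function. This is only a quick sketch; all_colours and red_shades are illustrative names.

```r
# Names of all the colours built into base R
all_colours <- colors()
length(all_colours)

# Every named shade containing "red"
red_shades <- grep("red", all_colours, value = TRUE)
head(red_shades)
```

Any of these names can be passed as a string to the colour parameter of geom_text().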
For more advanced color settings, I suggest the RColorBrewer package (https://cran.r-project.org/web/packages/RColorBrewer/index.html). This is based on the Color Brewer project, a really interesting initiative about colors in cartography developed by Cynthia Brewer. You can get to know more about the project at http://colorbrewer2.org/.
You can also change the size of the text using the size argument.