googleVis
is an R package that mainly provides an interface between R and the Google Charts API. This means that you can create Google charts within R via high-level functions, which has the great advantage of not needing to make service calls and parse the returned objects to generate the charts. Unlike traditional plotting in R, Google charts are displayed in a browser. In fact, the plot creation functions do not display a plot directly but generate HTML code.
When working under R but not in a Shiny application, a plot()
call with the HTML object as argument automatically opens a browser with the corresponding plot. The following is an example of this:
library(googleVis)
data(iris)
iris.table <- aggregate(Petal.Length ~ Species, data = iris, FUN = "mean")
column.chart <- gvisColumnChart(iris.table, "Species", "Petal.Length")
plot(column.chart)
As mentioned previously, gvisColumnChart()
does not generate a plot by itself; it generates a list containing the HTML code that will produce the corresponding plot afterwards. This can be seen, if required, by typing the name of the object where the output was stored (in this case, column.chart
) or by calling the function directly without storing its result. Either way, the generated list that contains the HTML code is displayed in the console.
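To make the structure of the returned object more concrete, the following minimal sketch rebuilds the column chart and inspects it without opening a browser. The html element and the tag argument of the print method reflect how googleVis objects are organized in the versions we are aware of; treat both as assumptions and check str(column.chart) on your own installation.

```r
library(googleVis)

# Same aggregated table as in the example above
iris.table <- aggregate(Petal.Length ~ Species, data = iris, FUN = mean)
column.chart <- gvisColumnChart(iris.table, "Species", "Petal.Length")

# The result is a list with an additional "gvis" class, not a drawn plot
class(column.chart)
names(column.chart)

# Print only the chart part of the generated HTML
# (assumes the 'tag' argument of the print method for gvis objects)
print(column.chart, tag = "chart")

# Open the chart in a browser only when working interactively
if (interactive()) plot(column.chart)
```

Guarding plot() with interactive() keeps the script usable in batch runs, where no browser should be opened.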
From this small example, the advantages and disadvantages of this package can already be appreciated. The main advantage is that it enables us to create very attractive graphics (with tooltips and other types of in-graphic interactions) with very little effort. The drawback is that the possibilities of customization are limited and, in comparison to native R plotting options, the customization that is possible is harder to code. Nevertheless, if what you need to display can be done within Google charts, the easiest and nicest way to do it from R will probably be with this package.
The googleVis
functions tend to have unusual argument syntaxes. Although most of the arguments and functionalities of Google charts have been adapted to R's coding style, there are still some parameters that need to be defined in a way that is unusual for R, especially when a layout needs to differ from the default.
Another unusual fact of the functions in googleVis
is that they always receive a data frame. This data frame must already contain the data ready to plot, as Google charts will only pick the variables needed for the plot from the data frame passed. This is because all these functions only transform the data frame into a JSON data object and perform no calculations, unlike, for example, boxplots in the graphics package (where the whole vector can be passed directly and R calculates the different values needed to draw the plot).
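The difference can be sketched in base R. In the following illustration, boxplot() receives the raw observations and computes the statistics itself, whereas a data frame intended for a googleVis function would need those values laid out as columns beforehand. The column names of the prepared data frame are purely illustrative.

```r
# boxplot() accepts the raw observations and computes the statistics itself;
# plot = FALSE returns them without drawing anything
bp <- boxplot(Petal.Length ~ Species, data = iris, plot = FALSE)

# For a googleVis function, the values must already be arranged as columns,
# with one row per element to draw (column names here are illustrative)
prepared <- data.frame(
  Species       = bp$names,
  lower.whisker = bp$stats[1, ],
  lower.hinge   = bp$stats[2, ],
  median        = bp$stats[3, ],
  upper.hinge   = bp$stats[4, ],
  upper.whisker = bp$stats[5, ]
)
prepared
```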
The options parameter of the googleVis functions
Almost every function in googleVis
provides the possibility of changing some of the default layouts. In R, they can be changed by passing the desired value in the corresponding argument inside options
. Unfortunately, the full documentation of the different options that can be edited is not available in R's package documentation.
However, they can be found at https://developers.google.com/chart/interactive/docs/gallery. All the options that appear in each of the functions can be edited in the way described. Some of the following examples illustrate how to do this.
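As a minimal sketch of the mechanism, the column chart from the beginning of this section can be restyled through options. Simple options are passed as ordinary R values, while nested Google Charts options (vAxis in this case) are written as a JavaScript-style string, which is the unusual syntax mentioned earlier. The particular titles and sizes are arbitrary choices for illustration.

```r
library(googleVis)

iris.table <- aggregate(Petal.Length ~ Species, data = iris, FUN = mean)

styled.chart <- gvisColumnChart(
  iris.table, xvar = "Species", yvar = "Petal.Length",
  options = list(
    # Simple options are plain R values
    title  = "Mean petal length by species",
    legend = "none",
    width  = 600,
    height = 400,
    # Nested Google Charts options are written as a JavaScript-style string
    vAxis  = "{title: 'Petal length', minValue: 0}"
  )
)

# Open the chart in a browser only when working interactively
if (interactive()) plot(styled.chart)
```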
As with the graphics package covered previously, only some of the graphical possibilities will be covered in depth in this section. However, the package covered here provides a very good demo that can be run by typing the following:
demo(googleVis)
This statement triggers commented examples where the user can clearly appreciate what each function does and how it must be constructed. Anyway, for further questions, the documentation can be found at http://cran.r-project.org/web/packages/googleVis/googleVis.pdf.
These are the topics that will be covered within this section. They were chosen mainly due to their novelty. As can be seen in the demo, googleVis
provides all kinds of graphics, but the ones covered here are probably either unusual or different from traditional plotting options: candlestick charts, geolocalized visualizations, treemaps, and motion charts.
Candlestick charts are graphics designed for financial analysis to describe the behavior of a variable within a period of time. Originally, they were created to describe the behavior of stocks per day. Although they look very similar to boxplots, candlesticks should not be confused with them, as they represent completely different things.
In candlesticks, only four values from the series that they represent are displayed: the first one, the last one, the highest one, and the lowest one. The graph mainly consists of a rectangle whose height is the difference between the first and the last value. The lower of these two values defines the lower side of the rectangle, while the higher one defines the upper side.
The fill color of the rectangle differs according to whether the first value is greater or smaller than the last value. Lastly, two lines are drawn from the rectangle: one from its top up to the highest value and one from its bottom down to the lowest value.
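The four values can be derived from a raw series with a few lines of base R. The following sketch uses a hypothetical series of five ordered price observations per day; it is only meant to illustrate which values define the rectangle and the lines, not how any charting library computes them.

```r
# Hypothetical series: five price observations per day, in time order
set.seed(42)
prices <- data.frame(day   = rep(1:3, each = 5),
                     price = round(runif(15, 10, 20), 2))

by.day <- split(prices$price, prices$day)

candles <- data.frame(
  day   = as.integer(names(by.day)),
  open  = sapply(by.day, function(p) p[1]),          # first value
  close = sapply(by.day, function(p) p[length(p)]),  # last value
  high  = sapply(by.day, max),                       # highest value
  low   = sapply(by.day, min)                        # lowest value
)

# The rectangle spans open and close; the lines reach out to low and high
candles$body.bottom <- pmin(candles$open, candles$close)
candles$body.top    <- pmax(candles$open, candles$close)

# The fill color depends on whether the series rose or fell
candles$rising <- candles$close > candles$open

candles
```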
The googleVis
function that creates the HTML to do the candlestick charts receives a data frame and a string representing the name of the variable for each of the values needed to draw a candlestick chart (that is, low, high, open, and close). The categorical variable (that could be, for example, a date) goes under xvar
. gvisCandlestickChart()
will plot one candlestick per row of the data frame, that is, one per value of the categorical variable.
The following example shows an artificially created dataset that matches some conditions, for example, the low value is in fact the lowest one for each series, and then each variable is passed to its corresponding argument in gvisCandlestickChart()
:
library(googleVis)
#Artificial dataset generation
example.data <- data.frame(year = 2005:2014,
                           open = runif(10, 0, 100),
                           close = runif(10, 0, 100))
example.data$low <- apply(example.data[, 2:3], 1, function(x) min(x) - runif(1, 0, 10))
example.data$high <- apply(example.data[, 2:3], 1, function(x) max(x) + runif(1, 0, 10))
#Plotting
candlestick.chart <- gvisCandlestickChart(example.data, xvar = "year",
                                          low = "low", open = "open",
                                          close = "close", high = "high")
plot(candlestick.chart)
Although there are other possibilities to display geolocalized visualizations, using googleVis
is probably the most convenient one, as Google Charts is well integrated with Google Maps and so provides very simple ways to display visualizations with maps and georeferenced data.
There are several different possibilities to plot geolocalized data in R, but they can be divided into two big groups: the ones that use latlong
values, and the ones that refer to a geographical space by name (for example, a country name). Most of the functions that create visualizations based on geolocalized data accept both alternatives as locationvar
. An example of each is given in the following.
In the first one, an artificial data frame with approximate latlong values inside the USA is plotted. Here, region is set to US
inside the options argument. The default for this argument is world
(that is, display of the whole world):
library(googleVis)
#Artificial Dataset generation
latitudes <- runif(10, 27, 49)
longitudes <- runif(10, -125, -72)
values <- runif(10, 0, 100)
us.dataset <- data.frame(lat = latitudes, long = longitudes, val = values)
#Generate a latlong variable as expected in 'locationvar'
us.dataset$latlong <- paste(us.dataset$lat, us.dataset$long, sep = ":")
#Map HTML creation
us.map <- gvisGeoChart(us.dataset, locationvar = "latlong", sizevar = "val",
                       options = list(region = "US"))
#Plotting
plot(us.map)
Alternatively, different codes or names representing geographical regions can be used. In the following example, an artificial dataset for Brazil, Argentina, Peru, and Paraguay is built. After this, the same function as before is called, but with some differences. Apart from changing the region, displayMode
was set to regions
. This causes the whole surface of each country, instead of dots, to be painted with the corresponding color:
#Artificial Dataset Generation
countries <- c("BR", "AR", "PE", "PY")
value1 <- runif(4, 0, 10)
value2 <- round(runif(4, 0, 100))
sa.dataset <- data.frame(countries = countries, val1 = value1, val2 = value2)
#Plot of the Map. '005' is the region code for South America
southamerica.map <- gvisGeoChart(sa.dataset, locationvar = "countries",
                                 sizevar = "val1", hovervar = "val2",
                                 options = list(region = "005", displayMode = "regions"))
#Plotting
plot(southamerica.map)
Treemaps are very useful visualizations for hierarchies, that is, subelements that belong to a greater element. They display the relationship between three dimensions: the hierarchy, the colors, and the size.
They are used in many different areas, such as computer science (for instance, to display directories and subdirectories), economics (a very good example of this is available at the MIT's Observatory of Economic Complexity, http://atlas.media.mit.edu/explore/tree_map/), and news (http://newsmap.jp/), among others.
gvisTreeMap()
is the function to create treemaps in googleVis
. In the following code, the structure of this can be clearly seen. Firstly, idvar
, the variable which indicates the name of the elements, is expected. In this case, this variable will be the regions variable. Each row must also name the element on which it depends or to which it belongs. This must be specified in another column and passed to the function in the parentvar
argument.
As you can see from the following code, the root node, that is, the node that does not belong to another node, has an NA
value under this column. gvisTreeMap()
only accepts one root node. The size variable determines the size of each of the squares. This is done, however, by comparing only the elements of the same node, for example, Asia
, America
, and Europe
, or South America
and North America
. Neither the values of Asia
and South America
nor the values of Japan
and Brazil
are compared with each other. The sum of the child nodes does not need to be equal to the value of the parent node:
library(googleVis)
#Generate random data with dependencies
regions <- c("World", "America", "Europe", "Asia", "South America",
             "North America", "Western Europe", "Eastern Europe",
             "Middle East", "Far East", "Argentina", "Brazil", "USA", "Canada",
             "Germany", "France", "Hungary", "Russia", "Israel", "Saudi Arabia",
             "China", "Japan")
dependency <- c(NA, "World", "World", "World", "America", "America", "Europe", "Europe",
                "Asia", "Asia", "South America", "South America", "North America",
                "North America", "Western Europe", "Western Europe",
                "Eastern Europe", "Eastern Europe", "Middle East", "Middle East",
                "Far East", "Far East")
size <- runif(22, 1, 100)
color <- runif(22, 1, 100)
frame <- data.frame(regions = regions, dependency = dependency, size = size, color = color)
#Plot treemap
treemap <- gvisTreeMap(frame, "regions", "dependency", "size", "color")
plot(treemap)
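The two structural requirements described above (a single root node and a parent for every other element) can be checked before the data frame is passed to the chart function. The following is a minimal sketch in base R; check.hierarchy() is a hypothetical helper written for this illustration and is not part of googleVis.

```r
# Hypothetical hierarchy with the same column layout as the example above
tree <- data.frame(
  regions    = c("World", "America", "Europe", "Brazil", "Canada", "France"),
  dependency = c(NA, "World", "World", "America", "America", "Europe"),
  stringsAsFactors = FALSE
)

# Illustrative helper (not part of googleVis): checks that there is exactly
# one root node and that every parent is itself listed as an element
check.hierarchy <- function(data, idvar, parentvar) {
  ids     <- data[[idvar]]
  parents <- data[[parentvar]]
  c(single.root   = sum(is.na(parents)) == 1,
    parents.exist = all(parents[!is.na(parents)] %in% ids))
}

check.hierarchy(tree, "regions", "dependency")
```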
Originally developed by Hans Rosling in GapMinder and now offered by Google under the name of motion chart, this is a visualization whose main advantage lies in the number of variables it can display at the same time without compromising visual clarity.
In a very general way, this describes the evolution of a series of variables over time. It consists mainly of bubbles whose positions depend on their values for the variables represented on the X and Y axes and whose color and size depict the value of the other two variables. These last two parameters are optional; in case they are not used, the bubbles will be of the same size/color.
A very impressive example of a problem described with motion charts is given by Rosling himself in this video: https://www.youtube.com/watch?v=jbkSRLYSojo.
For the following example, an additional WDI
package was installed. WDI
is a package that retrieves data from the World Bank API. As this type of visualization requires a temporal variable, WDI data is very easy to display in this kind of graph. For this example, some arbitrary indicators and countries were taken. In this case, a variable was assigned to every option, even to size and color:
#Install WDI to obtain data from the World Bank API and load both libraries
install.packages("WDI")
library(WDI)
library(googleVis)
#Load some data
indicators <- c("BM.KLT.DINV.GD.ZS", "BG.GSR.NFSV.GD.ZS", "EN.ATM.CO2E.PP.GD", "NY.GDP.MKTP.CD")
countries <- c("AR", "BR", "DE", "US", "CA", "FR", "GB", "CN", "RU", "JP")
frame <- WDI(country = countries, indicator = indicators, start = 2005, end = 2013)
#Change indicator names just to make it easier to understand
#(this assumes that the indicators are in columns 4 to 7; check names(frame))
names(frame)[4:7] <- paste0("indicator", 1:4)
#Graph HTML Creation
motionchart <- gvisMotionChart(data = frame, idvar = "iso2c", timevar = "year",
                               xvar = "indicator1", yvar = "indicator2",
                               sizevar = "indicator3", colorvar = "indicator4")
#Plotting
plot(motionchart)
This visualization is similar to a small dashboard, as it provides us with the possibility of changing the variables of the different indicators (each of them has a small drop-down menu with all the available variables in the dataset) or even changing the type of visualization shown by clicking on one of the icons in the top-right corner.
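Because a motion chart follows each element over time, the data is expected in long format, with one row per combination of identifier and time value; to our knowledge, gvisMotionChart() rejects data in which such a combination is repeated. The following sketch shows how this could be checked in base R on a small hypothetical dataset. The helper has.unique.keys() is written for this illustration only.

```r
# Hypothetical long-format data: one row per country and year
panel <- expand.grid(iso2c = c("AR", "BR", "DE"), year = 2005:2007,
                     stringsAsFactors = FALSE)
set.seed(1)
panel$indicator1 <- runif(nrow(panel), 0, 10)
panel$indicator2 <- runif(nrow(panel), 0, 100)

# Illustrative helper: TRUE when every idvar/timevar combination
# appears only once in the data
has.unique.keys <- function(data, idvar, timevar) {
  !any(duplicated(data[, c(idvar, timevar)]))
}

has.unique.keys(panel, "iso2c", "year")

# Repeating a row breaks the condition
has.unique.keys(rbind(panel, panel[1, ]), "iso2c", "year")
```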
googleVis
in Shiny has two particular characteristics that are worth mentioning: firstly, it has its own reactive function, which only works for googleVis
visualizations. This function is renderGvis()
. In the next example of a Shiny web application done entirely with googleVis
, it is shown clearly how this works.
Another particular thing about googleVis
is that, instead of plotOutput()
, it uses htmlOutput()
in UI.R
. This makes absolute sense if we consider that the output of all the googleVis
functions is mainly HTML code.
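Before moving to the full application, the pairing of the two functions can be sketched in a few lines. The following minimal example is an assumption-laden illustration rather than the application discussed next: it uses the built-in iris data and, for compactness, defines the interface and the server in a single script with shinyApp() instead of separate files.

```r
library(shiny)
library(googleVis)

# UI: htmlOutput() reserves the place where the chart's HTML is inserted
ui <- fluidPage(
  titlePanel("Minimal googleVis example"),
  selectInput("measure", "Variable to plot:",
              c("Petal.Length", "Petal.Width", "Sepal.Length", "Sepal.Width")),
  htmlOutput("chart")
)

# Server: renderGvis() wraps the code that returns the googleVis object
server <- function(input, output) {
  output$chart <- renderGvis({
    # Aggregate the selected variable by species before charting
    iris.table <- aggregate(iris[input$measure],
                            by = list(Species = iris$Species), FUN = mean)
    gvisColumnChart(iris.table, "Species", input$measure)
  })
}

app <- shinyApp(ui = ui, server = server)

# Launch the application only when working interactively
if (interactive()) runApp(app)
```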
Taking the World Bank example in the motion chart, in the following, you will find a Shiny application done entirely with the googleVis
visualizations that you can reproduce as any other example, simply by creating the same files that appear here.
For reasons that exceed the scope of this book, the following example works properly only in a separate browser. After running it, select Run External in newer versions of RStudio, or click on Open in Browser in older ones, and test the application from the browser window.
In global.R
, the WDI library is used. It is mainly an interface to the World Bank API, from which data for different indicators can be retrieved by year and country. In this script, firstly, all indicators are retrieved with WDIsearch()
and some indicators are chosen (the choice was arbitrary). After this, the data for these indicators for an arbitrary list of countries between 2005 and 2013 is retrieved.
Finally, an indicator vector and a country vector are created. These vectors are named just to illustrate how a named vector works in UI.R
. However, this is not necessary. Have a look at the following code snippet for global.R
:
#Call WDI library
library(WDI)
library(reshape2)
library(googleVis)
#Load all indicators
all.indicators <- as.data.frame(WDIsearch())
#Take 6 indicators
used.indicators <- all.indicators[c(1:3, 12, 14, 15), ]
#Retrieve Data from indicators
countries <- c("AR", "BR", "DE", "US", "CA", "FR", "GB", "CN", "RU", "JP")
frame <- WDI(country = countries, indicator = as.character(used.indicators[, 1]),
             start = 2005, end = 2013)
#Create indicator's vector
indicators.vector <- as.character(used.indicators[, 1])
names(indicators.vector) <- as.character(used.indicators[, 2])
#Create countries' vector
countries.vector <- unique(frame$iso2c)
names(countries.vector) <- unique(frame$country)
In UI.R
, the input options are defined by the data retrieved in global.R
. As it was explained, UI.R
uses the named character vectors in checkboxGroupInput()
and selectInput()
. If a named vector is passed, the names are displayed in the application's frontend while the variable adopts the value of the selected element. With respect to sliderInput()
, the minimum and maximum values are directly taken from the dataset created in global.R
.
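The behavior of named vectors can be seen in isolation with a small sketch. The vector below is a hypothetical stand-in for the one built in global.R, with two World Bank indicator codes as values and their descriptions as names. Printing the generated HTML shows the names as the visible option text and the codes as the option values.

```r
library(shiny)

# Hypothetical named vector: names are the labels, values are the codes
indicators.example <- c("GDP (current US$)" = "NY.GDP.MKTP.CD",
                        "Population, total" = "SP.POP.TOTL")

widget <- selectInput("map.var", "Select the variable to plot in the map",
                      choices = indicators.example)

# The generated HTML uses the names as option text and the codes as values
cat(as.character(widget))
```

On the server side, input$map.var would then hold the code of the selected element, not its label.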
In the output section, a tabset with two tabs is displayed: one for the intensity map, and the second one for the motion chart. The following code is for UI.R
:
library(shiny)
# Starting line
shinyUI(fluidPage(
  # Application title
  titlePanel("World Bank Dashboard with GoogleVis"),
  # Sidebar
  sidebarLayout(
    sidebarPanel(
      #Country selection
      checkboxGroupInput("countries", "Select the countries:",
                         countries.vector, selected = countries.vector),
      #Years selection
      sliderInput("years", "Select the year range",
                  min(frame$year), max(frame$year),
                  value = c(min(frame$year), max(frame$year))),
      #Map variable selection
      selectInput("map.var", "Select the variable to plot in the map",
                  indicators.vector)),
    #The plots created in server.R are displayed
    mainPanel(
      tabsetPanel(
        tabPanel("Map Chart", htmlOutput("Map")),
        tabPanel("Motion Chart", htmlOutput("MotionChart"))
      )
    )
  )
))
In server.R
, subsets of the data are first created according to the filters applied. After this, each of the functions that create a visualization works differently according to its needs. In order to create the map, a sum aggregation by country code is performed for every variable. This is needed because the original dataset is split by years, whereas in this case one value per item (that is, per country) is needed.
After this, the selected variable in the drop-down menu (selectInput()
) is used as the intensity variable (passed in the sizevar
argument). This piece of code can be optimized as some variables are needlessly aggregated in the aggregation phase (basically, all the variables that will not be used).
Unfortunately, there is no reference to point to here, as the optimization depends mainly on the way the aggregation is coded. Basically, the aggregation expression can be written by dynamically taking only the variable needed, but this would have required an explanation of expression objects, which definitely belongs to a more advanced stage of R.
The dataset for the motion chart, on the contrary, does not need any modification in order to make it work. This is the reason why the chart creation function is called directly with the corresponding variables passed to it. The following is the code for server.R
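One possible way of aggregating only the selected variable is sketched below. It takes a different route from the one alluded to above: instead of building the formula dynamically, it uses the data frame interface of aggregate(), where the column to aggregate can be chosen by name. The data frame and the selected variable are hypothetical stand-ins for the World Bank frame and input$map.var.

```r
# Hypothetical data with the same layout as the World Bank frame
frame.example <- data.frame(
  iso2c   = rep(c("AR", "BR"), each = 2),
  country = rep(c("Argentina", "Brazil"), each = 2),
  year    = rep(2005:2006, times = 2),
  ind.a   = c(1, 2, 3, 4),
  ind.b   = c(10, 20, 30, 40),
  stringsAsFactors = FALSE
)

# Stand-in for input$map.var in server.R
selected <- "ind.b"

# Aggregate only the selected column; frame.example[selected] keeps it as a
# one-column data frame, so its name is preserved in the result
aggregated <- aggregate(frame.example[selected],
                        by = frame.example[c("iso2c", "country")],
                        FUN = sum)
aggregated
```

Unlike the formula interface, this form does not drop rows with missing values, so with real data it may be necessary to pass na.rm = TRUE through to sum().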
:
library(shiny)
#initialization of server.R
shinyServer(function(input, output) {
  #Subset of the data according to the selected countries and years
  frame.sset <- reactive({
    subset(frame, iso2c %in% input$countries &
             year >= input$years[1] & year <= input$years[2])
  })
  #Map generation: one aggregated value per country
  output$Map <- renderGvis({
    aggregated.frame <- aggregate(. ~ iso2c + country, frame.sset()[, -3], sum)
    map <- gvisGeoChart(aggregated.frame, locationvar = "iso2c",
                        sizevar = input$map.var, hovervar = "country",
                        options = list(region = "world", displayMode = "regions"))
    return(map)
  })
  #Motion chart generation
  output$MotionChart <- renderGvis({
    mchart <- gvisMotionChart(frame.sset(), "country", "year")
    return(mchart)
  })
})