In this recipe, we will focus on how to create a bar chart to compare the occurrences of different crimes in Germany, Italy, and Spain in the year 2012. In particular, we will create a bar chart where we have three bars for each country, one with the number of burglaries, another with the number of robberies, and a third with the number of motor vehicle thefts.
For this recipe, we need the crim_gen.tsv file, which comes with this book. This file contains the number of crimes reported to the police by year and by country. The data was downloaded from the Eurostat website (http://ec.europa.eu/eurostat).
We assume that this file is in the same directory as the code using it.
The following code example demonstrates how to create a bar chart. We will use pandas to load and filter the data, and plotly to make the chart.

    # bar charts
    import pandas as pd

    # Split fields on commas or tabs; ': ' marks missing values
    crimes = pd.read_csv('crim_gen.tsv', sep=',|\t', na_values=': ', engine='python')
    # Keep only the three countries we want to compare
    crimes = crimes[crimes.country.isin(['IT', 'ES', 'DE'])]

    # One (country, count) matrix per crime type, sorted by country code
    burglary = crimes.query('iccs == "burglary"')[['country', '2012 ']].sort_values(by='country').values
    robbery = crimes.query('iccs == "robbery"')[['country', '2012 ']].sort_values(by='country').values
    motor_theft = crimes.query('iccs == "theft_motor_vehicle"')[['country', '2012 ']].sort_values(by='country').values

    # In plotly 4 and later, this module is provided by the chart_studio package
    import plotly.plotly as py
    from plotly.graph_objs import Bar, Figure, Layout

    trace1 = Bar(
        x=burglary[:,0].tolist(),
        y=burglary[:,1].tolist(),
        name='burglary'
    )
    trace2 = Bar(
        x=motor_theft[:,0].tolist(),
        y=motor_theft[:,1].tolist(),
        name='motor_theft'
    )
    trace3 = Bar(
        x=robbery[:,0].tolist(),
        y=robbery[:,1].tolist(),
        name='robbery'
    )
    data = [trace1, trace2, trace3]
    layout = Layout(
        barmode='group'
    )
    fig = Figure(data=data, layout=layout)
    plot_url = py.plot(fig, filename='bars-crimes')
In this recipe, we used pandas (which was introduced in the first chapters of this book) to import and query the data. First, we restricted the data to the countries we were interested in using the isin method, and then to the types of crimes we were interested in. In particular, we have the three matrices burglary, robbery, and motor_theft, where the first column is the country code and the second is the number of times that crime was reported in that country. Here's what the matrix motor_theft looks like:
[['DE', 70511.0], ['ES', 55197.0], ['IT', 196589.0]]
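The filtering steps can be tried without the data file. The following is a minimal sketch that assumes a small in-memory table with the same column names as the recipe (country, iccs, and '2012 '). Apart from the motor vehicle theft figures quoted above, the rows are made-up placeholders, not Eurostat figures.

```python
import pandas as pd

# Toy stand-in for crim_gen.tsv. Only the theft_motor_vehicle values for
# DE, ES and IT come from the recipe; every other number is a placeholder.
crimes = pd.DataFrame({
    'country': ['IT', 'DE', 'ES', 'FR', 'IT', 'DE', 'ES'],
    'iccs': ['theft_motor_vehicle'] * 4 + ['robbery'] * 3,
    '2012 ': [196589.0, 70511.0, 55197.0, 1.0, 2.0, 3.0, 4.0],
})

# Keep only the three countries of interest (drops the FR row).
crimes = crimes[crimes.country.isin(['IT', 'ES', 'DE'])]

# Select one crime type, keep two columns, and sort by country code.
motor_theft = (
    crimes.query('iccs == "theft_motor_vehicle"')[['country', '2012 ']]
    .sort_values(by='country')
    .values
)

print(motor_theft.tolist())
```

Sorting by country code matters here: it guarantees that all three matrices list the countries in the same order, so the bars of each group line up.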
For each of the matrices, we instantiated a Bar object, just as we did for the Scatter object, but this time the parameter x is the first column of the matrix and y is the second. The data was again organized in a list and passed to the plot method. The result should be as follows:
As we can see, we have three groups of bars, and each group contains three bars.
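The way each matrix is split into the x and y arguments of a bar trace can be checked in isolation. This is a minimal sketch that rebuilds the motor_theft matrix shown earlier as a NumPy object array (an assumption about its type, based on what pandas returns for mixed columns) and slices out its two columns.

```python
import numpy as np

# Matrix copied from the recipe; dtype=object keeps strings and floats together.
motor_theft = np.array(
    [['DE', 70511.0], ['ES', 55197.0], ['IT', 196589.0]], dtype=object
)

x = motor_theft[:, 0].tolist()  # country codes: categories on the x axis
y = motor_theft[:, 1].tolist()  # crime counts: bar heights

print(x)
print(y)
```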
In this snippet, we also used a new object: Layout. This object enables us to specify the layout properties of the chart. By setting its barmode parameter to group, we specified that the bars should be grouped. If we set this attribute to stack, we get something like this:
This means that now we also know how to stack the bars instead of just grouping them.
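To make the difference between the two modes concrete, here is a minimal sketch that describes the figure as a plain Python dictionary, a form that plotly accepts in place of its graph objects. The helper name make_figure is hypothetical, and only the motor vehicle theft numbers come from this recipe; the burglary values are placeholders.

```python
def make_figure(traces, barmode):
    """Return a plotly-style figure description as a plain dict.

    make_figure is a hypothetical helper, not part of plotly.
    """
    return {'data': traces, 'layout': {'barmode': barmode}}


countries = ['DE', 'ES', 'IT']
traces = [
    # Values taken from the motor_theft matrix in the recipe.
    {'type': 'bar', 'name': 'motor_theft', 'x': countries,
     'y': [70511.0, 55197.0, 196589.0]},
    # Placeholder values, not real data.
    {'type': 'bar', 'name': 'burglary', 'x': countries,
     'y': [1000.0, 2000.0, 3000.0]},
]

grouped = make_figure(traces, 'group')  # bars side by side per country
stacked = make_figure(traces, 'stack')  # bars piled on top of each other

# In stacked mode, the height of each column is the sum over the traces.
totals = [sum(t['y'][i] for t in traces) for i in range(len(countries))]
print(totals)
```

Only the barmode value differs between the two dictionaries; the traces are identical. With plotly installed, either dictionary should be usable wherever the recipe passes fig.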