Now it's time to implement the heatmap application. We will start by creating a query to get sample data for the application and then move on to coding the visualization using JavaScript and Python.
Let us start by getting acquainted with the data we have. We are going to explore sample data to make the process faster.
We are going to use the following query during development. We will display a heatmap for the center of Milano. Another simplification is a hardcoded time interval: we removed all other intervals from the sample dataset earlier using the Pig script. The general idea is to reduce the amount of data and shorten the development cycle:
(index="milano_cdr_sample" time_interval=1385884800000 AND ( (square_id >5540 AND square_id < 5560) OR (square_id >5640 AND square_id < 5660) OR (square_id >5740 AND square_id < 5760) ) ) | fields square_id, sms_in, time_interval | stats sum(sms_in) as cdrActivityValue by square_id, time_interval | join square_id [search (index ="geojson" (square >5540 AND square < 5560) OR (square >5640 AND square < 5660) OR (square >5740 AND square < 5760) ) | fields square, lon1, lat1, lon3, lat3 | rename square as square_id] | fields cdrActivityValue, lon1, lat1, lon3, lat3 |head 10
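To make the shape of the result concrete, the following sketch mimics the filter, stats, and join stages of this query in plain JavaScript over two small in-memory arrays. The sample rows, the coordinates, and the names inCenter and runSampleQuery are made up for illustration; the real data lives in the indexes, and the real work is done by the search itself.

```javascript
// Hypothetical sample events; real ones come from the milano_cdr_sample index.
const cdrRows = [
  { square_id: 5541, time_interval: 1385884800000, sms_in: 1.5 },
  { square_id: 5541, time_interval: 1385884800000, sms_in: 2.5 },
  { square_id: 5642, time_interval: 1385884800000, sms_in: 4.0 },
  { square_id: 9999, time_interval: 1385884800000, sms_in: 7.0 } // outside the center
];

// Hypothetical corner coordinates; real ones come from the geojson index.
const geoRows = [
  { square: 5541, lon1: 9.10, lat1: 45.47, lon3: 9.11, lat3: 45.46 },
  { square: 5642, lon1: 9.12, lat1: 45.48, lon3: 9.13, lat3: 45.47 }
];

// Same square ranges as the search filter (exclusive bounds).
function inCenter(id) {
  return (id > 5540 && id < 5560) ||
         (id > 5640 && id < 5660) ||
         (id > 5740 && id < 5760);
}

function runSampleQuery(cdr, geo, timeInterval) {
  // stats sum(sms_in) as cdrActivityValue by square_id, time_interval
  // (only one time_interval survives the filter, so square_id is enough as a key)
  const sums = new Map();
  for (const row of cdr) {
    if (row.time_interval !== timeInterval || !inCenter(row.square_id)) continue;
    sums.set(row.square_id, (sums.get(row.square_id) || 0) + row.sms_in);
  }

  // Subsearch: squares with coordinates, with square renamed to square_id.
  const geoById = new Map();
  for (const g of geo) {
    if (inCenter(g.square)) geoById.set(g.square, g);
  }

  // join square_id: keep only squares present on both sides.
  const result = [];
  for (const [squareId, total] of sums) {
    const g = geoById.get(squareId);
    if (!g) continue;
    result.push({ cdrActivityValue: total, lon1: g.lon1, lat1: g.lat1, lon3: g.lon3, lat3: g.lat3 });
  }
  return result;
}

const sampleResult = runSampleQuery(cdrRows, geoRows, 1385884800000);
```

Each element of sampleResult has the same five fields that the final fields clause of the query keeps: the summed activity and the two corners of the square.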
We read data from the milano_cdr_sample index, and filter by time_interval (December 1, 2013, early morning) and square_id while reading the data.
We sum the sms_in value while grouping by square_id and time_interval. The country code field is omitted for simplification purposes. After the group operation completes, we have the sms_in activity for each Milano city square.
The results are keyed by square_id, stored in Milano_cdr_example_index. We use a join operation on square_id to join the index named geojson. This second index contains the longitudes and latitudes of the squares. We need the top-left and bottom-right corners to display a rectangle on the map using the Google Maps API. That's why we select lon1, lat1, lon3, and lat3.
| head 10 returns the first 10 results. We are going to remove this pipe operation later.
Visit http://www.bigdatapath.com/2015/09/customizing-hunk/ and download the application. Untar it locally. Don't be alarmed if you see multiple folders and files. They are generated by the SDK framework utility, which created the Django application skeleton. A description of the Django framework is omitted for simplicity. We only have to change a few files to make the application solve our problem. We will work with one file located here:
milanocdr/django/googlemaps/templates/home.html
See the chapter on Mongo integration for a detailed description of the application installation process.
Let's have a look at the code. Search for the highlighted code snippets to find the part under discussion.
Find the <!-- Page layout --> comment. You can see the layout for the page. The layout is simple: a selector for the activity type and a map where the data will be displayed.
You can see small squares near Rosate. These squares are coded by color. Green denotes the lowest activity and red the highest.
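Each square on the map corresponds to one result row with two opposite corners. The sketch below shows one way such a row could be turned into a bounds object and drawn with google.maps.Rectangle. The function names rowToBounds and drawSquare are hypothetical, and the styling values are assumptions; the code in home.html may do this differently.

```javascript
// Convert one result row into a bounds literal (north/south/east/west).
// Using min/max means we do not rely on which corner is lon1/lat1.
// This is fine for Milano; it would not handle squares crossing the antimeridian.
function rowToBounds(row) {
  return {
    north: Math.max(row.lat1, row.lat3),
    south: Math.min(row.lat1, row.lat3),
    east: Math.max(row.lon1, row.lon3),
    west: Math.min(row.lon1, row.lon3)
  };
}

// Draw one colored square on an existing google.maps.Map instance.
// Only works in a page where the Google Maps JavaScript API is loaded.
function drawSquare(map, row, color) {
  return new google.maps.Rectangle({
    map: map,
    bounds: rowToBounds(row),
    fillColor: color,
    fillOpacity: 0.5, // assumed styling
    strokeWeight: 0   // assumed styling
  });
}
```

Keeping the coordinate conversion in its own function makes it possible to check the bounds without loading the map.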
We will use linear gradients for simplicity. In practice, you would probably switch to a logarithmic gradient or a combination of linear and logarithmic ones. We don't consider how many values fall into each bin; the idea is just to split the difference between the current min and max values into five equal ranges.
Find //colors for the values to see the list of colors for the bins. The first is green, the last is dark red.
Find function assignBinColors(rectangles). This function is responsible for assigning color bins to the squares. It uses a simple formula to calculate bin size:
var binSize = (max-min) / (binColors.length-1);
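As a rough illustration of how this bin size can be used, the sketch below maps a list of activity values to colors. The palette, the names binIndex and colorsFor, and the use of Math.floor to pick the bin are assumptions made for this example; the actual assignBinColors function in home.html may differ.

```javascript
// Hypothetical palette from green to dark red; the real list sits under
// the "//colors for the values" comment in home.html.
const binColors = ["#00ff00", "#ffff00", "#ffa500", "#ff0000", "#8b0000"];

// Return the index of the bin that a value falls into.
function binIndex(value, min, max, binCount) {
  if (max === min) return 0; // all values equal: avoid dividing by zero
  const binSize = (max - min) / (binCount - 1); // same formula as above
  const index = Math.floor((value - min) / binSize);
  // Clamp in case a value lies outside [min, max].
  return Math.min(Math.max(index, 0), binCount - 1);
}

// Map every activity value to the color of its bin.
function colorsFor(values) {
  const min = Math.min(...values);
  const max = Math.max(...values);
  return values.map(v => binColors[binIndex(v, min, max, binColors.length)]);
}
```

With floor-based indexing and this bin size, only the maximum value reaches the last color, so the darkest shade marks the busiest square.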