Chapter 6. Creating Complex Visualizations

Most business requirements can be fulfilled with the traditional charts that we reviewed in the previous chapters. Nevertheless, certain scenarios demand more complex visualizations for a complete understanding of the story hidden behind the data. In the next pages, we will embark on a journey that will take us to a land beyond "the classics" in the pursuit of insightful and stunning dashboards by discussing:

  • Histograms
  • Scatter plots
  • Gauges and infographics
  • Geographic representations
  • Overlapping objects

Histograms

One of the main disadvantages of using aggregators such as avg() is that they can hide interesting behaviors in the data. Picture the following scenario. You have access to the scores of five thousand students in four subjects (Science, Mathematics, English, and Literature), and each subject is graded with two exams. In order to condense all these records and show only high-level figures, you decide to calculate an average. However, after analyzing the results, you start wondering: are there some extremely good students raising the global average, or are there alarmingly bad ones bringing it down? Are they all consistent? Can we separate them into groups (good, normal, or bad)? How did most of them fare?
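To see how an average can hide the underlying behavior, consider this minimal Python sketch (outside QlikView). The two score lists are invented for illustration only. Both groups share the same mean, yet their spread is completely different:

```python
from statistics import mean, pstdev

# Hypothetical scores, invented for illustration only.
consistent_group = [495, 500, 500, 505]
polarized_group = [300, 400, 600, 700]

for name, scores in [("consistent", consistent_group),
                     ("polarized", polarized_group)]:
    # The mean is identical for both groups; min, max, and the
    # standard deviation reveal how differently they are distributed.
    print(f"{name}: mean={mean(scores)}, min={min(scores)}, "
          f"max={max(scores)}, stdev={pstdev(scores):.1f}")
```

An average alone cannot tell these two groups apart, which is exactly the gap a distribution-oriented chart fills.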

One of the best ways to deal with this kind of question is to create a histogram, a visualization focused on distributions instead of magnitudes. In this chart, the x-axis represents the exam grade, while the y-axis counts the number of students who obtained it. As you can see, the data adopts a shape that gives us a better perspective of the situation:

[Figure: histogram of student scores]

Far to the right, we can find the top students (there aren't a lot of them, but their scores are pretty high). At the other end, we find those who might need a little help (grades below 400 points), and in the middle lies the majority of pupils.

We can also appreciate that the curve is skewed to the right as most of the students have scored between 425 and 585 points. Remember, a higher bar represents more occurrences. For example, the orange bar (the tallest in the whole histogram) represents the 296 students who got grades between 500 and 505 points.
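Reading the tallest bar amounts to finding the bin with the highest count. A minimal Python sketch, assuming a hypothetical dictionary of bin counts (only the 500 bin and its 296 students come from the example above; the other figures are made up):

```python
# Hypothetical bin counts: lower limit of each 5-point bin -> students.
# Only the entry 500 -> 296 comes from the text; the rest is invented.
students_per_bin = {490: 210, 495: 254, 500: 296, 505: 241, 510: 198}
bin_size = 5

# The tallest bar is the bin whose count is the largest.
tallest = max(students_per_bin, key=students_per_bin.get)
print(f"{students_per_bin[tallest]} students scored between "
      f"{tallest} and {tallest + bin_size} points")
```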

Now, let's take a look at how to create this chart in QlikView:

  1. Let's start by creating a histogram for Science. Create a new bar chart. This time, however, we don't have an existing dimension to rely on. Instead, we need to create dynamic clusters that group the students depending on their scores, so click on the Add Calculated Dimension… button.
  2. In the Edit Expression dialog, type the following calculation:
    =class(
    aggr(avg(Science), Student)
    , 5) 

    Note

    The final grade for each student is calculated by averaging the scores of two exams (midterm and final exams). Therefore, it is necessary to create a virtual table that uses Student as the dimension:

    aggr(XXX, Student)

    For each student, this table must also calculate the average of both tests:

    aggr(avg(Science), Student)

    Based on these scores, we must create a set of clusters in order to group the students. In this example, we used the class() function with a bin size of 5. Therefore, if a student scored an average of 456 points in both exams, she would be located in the 455 <= x < 460 cluster.

  3. Rename this dimension Score.
  4. Now that we have created the clusters, we need to define how many students reside in each one of them. So, create an expression called Students based on this formula:
    count(DISTINCT Student)
  5. In the Sort tab, select Numeric Value: Ascending in order to arrange the clusters from the lowest to highest scores.
  6. The current format of the clusters isn't exactly friendly, so let's change it for readability's sake. Instead of showing 455 <= x < 460, we will only display the bin's lower limit (455). In order to do this, we keep only the first three characters of the string, which works as long as every lower limit has exactly three digits. Therefore, edit the Calculated Dimension formula by adding a left() function to it:
    =left(class(
    aggr(avg(Science), Student)
    , 5), 3) 
  7. Although the labels are now shorter, there are still some readability issues. Go to the Axes tab and disable the Stagger Labels option. Even though we don't show every single value, the user can accurately interpret the visualization. Besides, if a particular bar catches her attention, the chart's popup can give her further details.
  8. Although our histogram is now complete, we can increase its functionality by making its parameters dynamic. Let's start by changing the hardcoded bin size (previously set to 5) to a variable that the users can edit. Create a new variable in the script called BinSize:
    LET BinSize = 5; 
  9. In the Calculated Dimension formula, replace the hardcoded parameter with our newly created variable:
    =left(class(
    aggr(avg(Science), Student)
    , BinSize), 3) 
  10. We can make this variable available to the user through various methods such as a slider object or an input box. In this example, we will choose the latter, so create a new input box and add the BinSize variable.
  11. Best practices indicate that whenever we use an input box, we should apply some constraints to ensure that the user does not get too creative with its contents. For this, go to the Constraints tab and edit the features as in the following image:
    [Screenshot: Constraints tab of the input box]
  12. Change the value in the input box to see how the histogram reacts. Now, the user can change the size of the bin in order to get a broader or narrower view of the data depending on her needs.
  13. Let's move on to the metric. Wouldn't it be great if we could also change the subject displayed in the histogram in a dynamic manner? Let's add a bit of spice by creating a menu of the subjects and associating it with our chart. Go to the script and add the following code:
    LET HidePrefix = '_';
    
    Menu:
    LOAD * INLINE [
      _Menu
      Science
      Math
      English
      Literature
    ];
  14. All the fields that start with the character defined in the HidePrefix will be treated as System Fields. Therefore, they will not appear in the current selections box (a desirable behavior for a navigation field like this one).
  15. Create a multi box that displays the _Menu field.
  16. Select any value (but only one item). Go to the Presentation tab and select the Always One Selected Value box.

    Note

    By nature, our histogram will display any subject we want, but only one subject at a time. Therefore, it is better to ensure that this condition is always met.

  17. The only thing left for us to do is to link this field to our chart; so, open the histogram's properties and modify the calculated dimension so that it references the _Menu field:
    =left(class(
      aggr(avg($(=_Menu)), Student)
      , BinSize), 3)

    Note

    We added the dollar sign expansion syntax because it is necessary to evaluate the _Menu field in order to extract its contents.

  18. As our dimension is now dynamic, it is a good idea to adjust its legend as well. Therefore, go to the Dimensions tab and substitute the Label text with:
    =_Menu
  19. Speaking of legends, let's create a label for the y-axis as well. Go to the Presentation tab and add a new Text in Chart that displays Number of Students. Before leaving this window, adjust the Angle to 270 degrees so that the text appears vertically.
  20. Position this new label along the y-axis by holding Ctrl + Shift while dragging it.
  21. Adjust the colors, axes, and titles to finalize this visualization.
  22. Congratulations, you have created a histogram!
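The logic behind the calculated dimension and the expression can also be sketched outside QlikView. The following Python snippet is an illustration only, not QlikView code: the records, the student names, and the histogram() helper are hypothetical, and the subject and bin_size parameters play the roles of the _Menu selection and the BinSize variable.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical exam records: (student, subject, score), two exams each.
records = [
    ("Ana", "Science", 452), ("Ana", "Science", 460),
    ("Ben", "Science", 458), ("Ben", "Science", 459),
    ("Cleo", "Science", 501), ("Cleo", "Science", 503),
    ("Ana", "Math", 610), ("Ana", "Math", 590),
]

def histogram(records, subject, bin_size):
    """Count distinct students per score bin for one subject."""
    # Comparable to aggr(avg(<subject>), Student): collect the exam
    # scores of each student so they can be averaged per student.
    scores_by_student = defaultdict(list)
    for student, subj, score in records:
        if subj == subject:
            scores_by_student[student].append(score)

    # Comparable to class(value, bin_size): find the lower limit of
    # the half-open bin lower <= x < lower + bin_size.
    students_per_bin = defaultdict(set)
    for student, scores in scores_by_student.items():
        lower = int(mean(scores) // bin_size) * bin_size
        students_per_bin[lower].add(student)

    # Comparable to count(DISTINCT Student), with the bins sorted by
    # ascending numeric value.
    return {lower: len(names)
            for lower, names in sorted(students_per_bin.items())}

print(histogram(records, "Science", 5))   # {455: 2, 500: 1}
print(histogram(records, "Science", 50))  # {450: 2, 500: 1}
```

Ana averages 456 points in Science, so she lands in the bin that starts at 455, just like the student in the earlier note. Widening the bin to 50 merges neighboring bars, which is the effect the user obtains by editing BinSize in the input box.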