Chapter 12. Set Analysis and Point In Time Reporting

Comparing performance metrics over a period of time is one of the most fundamental tasks expected from any BI solution. There are a few ways to deliver these sort of comparisons in QlikView, but the most flexible and dynamic ones involve the use of Set Analysis. Set Analysis, by itself, is a powerful tool that can be used for not only for Point In Time Reporting, but for many other complex calculations.

In this chapter, we will expand on what we've learned from all of the previous chapters, and introduce the following new concepts:

  • Set Analysis and modified record sets
  • Point In Time Reporting
  • Comparative analysis with alternate states

We will approach these topics with the use of some practical examples. Given the harshness of the syntax we are going to use, and the complexity of the expressions that can be built with Set Analysis, we recommend you to have a lot of patience and dedication to the subject at hand. Even with the best reference material, Set Analysis can take a while to master—so don't lose heart!

The magic of Set Analysis

We will now introduce one of the most powerful tools you, as a developer, have at your disposal when creating QlikView documents. We must say upfront that, as with anything, the excessive use of Set Analysis in chart expressions can yield poor performance or low response times. However, we should also know that, when used effectively, it can have a positive impact in both performance and user experience.

In this section, we will cover topics such as when to use Set Analysis, why you should use it, what the correct syntax is, and we will provide common examples and several tips and tricks for maximizing performance as well.

What is it for?

Set Analysis is a great feature in QlikView that lets you, as a developer, take control over what your charts display and allows calculations that wouldn't be possible otherwise, at least not as dynamically. To understand its inner workings, we can compare it to how selections that are made using listboxes work.

With UI selections, we can say that whatever is selected affects the entire document, and all of the charts only display information associated with the set of data corresponding to those selections; plain and simple. In a way, that is basically what Set Analysis does. It restricts, predefines, or extends the set of data that charts base their calculations on. Using a set expression, we can, for instance, specify that a certain chart should perform an aggregation only based on records that meet a set of criteria in certain fields (for example, Region A and Region B from the Region field), even if the non-matching values are part of the user's selected record set. We can also use Set Analysis to expand the selections made by the user to show, for example, results of the previous year even when it has not been actually selected. For example, when the user selects the year 2012 and a table displays data for 2012 in one column and 2011 in the other.

Note

The modified record set specified in a set expression affects only the expression in which it is being used, not the entire document.

That is the whole concept. However, it is sometimes not as simple to tell our chart to "show the numbers of the previous year/month/period" as it would be using selections. In Set Analysis, we need to specify the modified record set using an expression with the correct syntax.

These are some situations in which the use of Set Analysis is practical:

  • To compare results for two different time periods in one single view based on the same selection state
  • To restrict or exclude certain value(s) in a field from the calculation
  • To create a cumulative sum or YTD (Year-To-Date) result, even if the user selects only one month
  • To disregard selections in a certain field that may not be applicable to calculate a particular metric
  • To essentially ignore all selections
  • To use the set of data returned by a Bookmark, even when the Bookmark is not activated
  • To replace If expressions that make use of lots of hardware resources
  • A mix and match of all of the above

We will first work our way through the Set Analysis syntax, after that we will apply the concepts to achieve something that is a must in almost every QlikView document: Point In Time Reporting.

Syntax and examples

The details for creating a set expression are outlined in the following step-by-step procedure:

  1. We start off by defining the base expression. Say we want to sum the total number of flights performed. So, we will begin with the following:
    Sum([# Departures Performed])
  2. Then, we need to construct the set part of the expression. It is the set expression in which we specify the record set we want to use in our calculation. The set definition is placed just after the first parenthesis in the base expression, before the field name, and will be enclosed in curly brackets:
    Sum({set expression} [# Departures Performed])
  3. After the first curly bracket, we define a set identifier by adding either a dollar sign (which means the alternative record set will be initially based on the current selections), the number 1 (meaning we will use the full record set of all the data contained in the document, ignoring all user selections) or the ID of a bookmark (which uses the selection stored in the bookmark). To illustrate our example, we will use the dollar sign identifier, since it is the most common. We will now have:
    Sum({$} [# Departures Performed])

    Note

    Note that the dollar sign can be omitted, since it is the default identifier, and the set expression will not be affected. However, it is good practice to use it to maintain consistency.

  4. Right after the dollar sign, we define the fields that will play the role of set modifiers. This is the part in which we specify what to add or exclude from the initial record set. The entire set of field-value definitions will be enclosed in angle brackets (< >) and the syntax is FieldName = {NewValue}. Different variations to this syntax are described as follows:
    • FieldName = {value}: when the NewValue parameter is a numeric value, it is specified as such with no additional modifications
    • FieldName = {'TextValue'}: when the NewValue parameter is text, we should enclose it in straight single quotes
    • FieldName = {"SearchString"}: if we want to use a search string as the value definition, we should enclose it in straight double quotes

The value definition can also be a set of different values, in which case each element is separated by a comma.

Note

We can also refer to the NewValue parameter as the Element List parameter.

Once we add the set modifiers, our set expression will be complete and have the following structure (shown as pseudocode for illustration purposes):

Sum({$<Field1 = {NewValue1}, Field2 = {NewValue2}>} [# Departures Performed])

Taking our base expression as a starting point, here are some examples:

  • Sum({$<[Carrier Group] = {'Foreign Carriers'}>} [# Departures Performed])

    This will result in the total number of flights performed, but only taking into account the record set defined by the current selections ($), and where the Carrier Group field has a value of Foreign Carriers. All other Carrier Groups are excluded.

    In this example, if the user has specifically selected a different value in the Carrier Group field, that selection will be overridden and the calculation will be made based on the modified record set. It's important to convey this fact to the end users of the QlikView document and add pointers in the user interface as to what each calculation is being based on so that the use of Set Analysis doesn't negatively affect the user experience.

  • Sum({$<Year = {2010}>} [# Departures Performed])

    This will use a record set based on current selections ($) where the Year is 2010 even if the user selects something else in the Year field.

  • Sum({$<Year = {"20*"}>} [# Departures Performed])

    This will use a record set based on current selections ($) where the Year matches the search string "20*", meaning all years that begin with 20 will be taken into account.

  • Sum({$<Year = {">=2010"}>} [# Departures Performed])

    This will use a record set where the Year is greater than or equal to 2010 to calculate the number of flights. A search string is used in the field value definition.

  • Sum({$<[Carrier's Operating Region] = {'Domestic', 'Latin America'}, Year = {2010}>} [# Departures Performed])

    This will use a record set based on the current selections where the Carrier's Operating Region field is either Domestic or Latin America, and only those records corresponding to the year 2010.

    Notice how two elements in the field value definition have been separated by a comma.

  • Sum({1<[Carrier's Operating Region] = {'Domestic', 'Latin America'}, Year = {2011}>} [# Departures Performed])

    In this expression, the only difference from the preceding example is the use of the number 1 as the set identifier. The calculation will use the entire document record set as a starting point, disregarding all user selections, but take into account only those records where the Carrier's Operating Region is either Domestic or Latin America, and will only look at those records corresponding to the year 2011.

  • Sum({$<[Carrier's Operating Region] = {'Domestic', 'Latin America'}, Year = >} [# Departures Performed])

    In this expression, there is no modifier value assigned to the field Year. The calculation will use a record set initially based on the current selections, but disregard the selections made in the Year field, and where the Carrier's Operating Region is either Domestic or Latin America.

Internally, QlikView evaluates the field value definitions in Set Analysis in the same manner as conditional expressions are evaluated (but usually faster), to determine if a specific record should be part of the calculation or not. Therefore, the same rule discussed previously about numeric versus text-based comparisons applies for Set Analysis. That rule: using a set modifier based on a text-based field (Month = {'Jan'}, for instance) is slower than using its numeric equivalent (MonthNum= {1}).

Similarly, it's also important to consider that, when defining a numeric comparison value, it shouldn't be enclosed in single-quotes (MonthNum = {'1'}) as it will cause QlikView to treat it as a text-based comparison.

Using variables in set expressions

It's sometimes convenient to make use of variables in set expressions to make them even more dynamic. We've previously discussed how variables alone are used. We will now go through some examples of Set Analysis expressions that make use of variables instead of hard-coded field values.

As before, we use the Dollar Sign Expansion syntax inside the field value definition or element list. If, for instance, our variable contains a number, we would type the set analysis expression as follows:

Sum({$<Year = {$(vLastYear)}>} [# Departures Performed])

On the other hand, if our variable contains text, our expression would be:

Sum({$<[Carrier Group] = {'$(vInterestGroup)'}>} [# Departures Performed])

If we want to use a variable's value as a search string, the expression would be:

Sum({$<[Aircraft Group] = {"$(vSearchAircraftGroups)"}>} [# Departures Performed])

We have just gone through a basic introduction on the topic, but there is a lot more to know about Set Analysis. Let's move to the next part and discover some more of it.

Dynamic record sets

In the previous section, we reviewed some basic examples using set modifiers with explicit (hardcoded) field value definitions. Our next step will be about making our modified record set dynamic and based on the user's current selections, that is, using a calculated field value definition. By doing so, the alternative record set will dynamically change depending on what the user selects.

To embed actual calculations into the field value definition in a set expression, we use the Dollar Sign Expansion (DSE) syntax. The final expression would be as follows:

Sum({$<Year = {$(=Max(Year))}>} [# Departures Performed])

As you can see, it is just as if we were using a variable, just with an additional equal sign and the expression itself. Here, the DSE function is to evaluate the enclosed calculation and the set expression will only use the output value in the set modifier.

Note

We should never forget the equal sign when embedding calculations in set expressions. Otherwise, the embedded calculation will not be evaluated.

If we want to get the record set corresponding to the previous year to whatever year the user selects, we would use:

Sum({$<Year = {$(=Max(Year)-1)}>} [# Departures Performed])

In this case, the DSE is first calculating the last possible year from the current selections record set. It goes back one year and the output is then passed to the set modifier as the field value definition. Anytime the user changes his selections, the set modifier is changed as well.

Note

At this point, we must reinforce our recommendation that it is very important to let the user know exactly what records the calculation is taking into account. By doing so, we will avoid confusion as it might result counter-intuitive for the user to see values being calculated that are not associated to his selections.

The same concept also works with Quarters, Months, Days, and so on. However, additional considerations need to be made for some fields. For example, if we were to use the previous calculation for the Month field, Max(Month) – 1 would not work if the user selects January. The expression would return zero, which is not a valid month. Although we can easily build an expression that returns the number 12 instead of zero (with the use of the If function, for instance), we will provide an even simpler and straightforward solution for this scenario in an upcoming section of this chapter (Point In Time Reporting).

More assignment operators

All of the preceding examples use set expressions which have predefined field values in the element list definitions, overriding the user selections on the specified fields. However, in some cases, we will need to first take the actual record set that the user has selected and, from there, modify it by adding or removing certain values. To do that, we need to use a different assignment operator in the field-value definition, instead of the equal sign. The available assignment operators are:

  • =: This is what we have been using, and it simply redefines the selection for a certain field.
  • +=: This operator implicitly defines a union between the selected field values and the ones we specify in the element list.
  • -=: This operator implicitly defines an exclusion of the values we specify from the values the user has selected.
  • *=: This operator is used to define the corresponding field values based on the intersection between what the user has selected and the values we specify. That is, the resulting record set will be the values that "intersect" or are present in both the user's selection and the values we explicitly define in our element list.
  • /=: This one is used to define a symmetric difference (XOR), and the resulting record set will contain the values that are present in either one set (the user's selections or the explicitly defined values), but not in both.

Let's walk through some examples to better understand the assignment operators:

  • Sum({$<Year += {2007, 2008}>} Sales)

    This expression will return the sales for the years the user has selected and also for the years 2007 and 2008 whether they are selected or not.

  • Sum({$<Product -= {'Product X'}>} Sales)

    This expression will return the sales for the products the user has selected, but exclude records corresponding to Product X.

  • Sum({$<Product *= {'Product X', 'Product Y'}>} Sales)

    This expression returns the sales for the current selection, but only for the intersection of currently selected products, and products X and Y.

  • Sum({$<ProductNumber *= {"48*"}>} Sales)

    This expression returns the sales corresponding to the current selections, but only for the intersection of the currently selected products and all of the products whose number begins with 48.

Set operators

Set modifiers, the part of the set expression that is enclosed in angle brackets, can also be constructed by combining several different element lists in the field value definition. Furthermore, the entire set expression can be composed using several different set modifiers. This is accomplished using set operators.

The different set operators that can be used are:

  • +: Union
  • -: Exclusion
  • *: Intersection
  • /: Symmetric difference

Note

The exclusion operator (-) can also be used as a unary operator to retrieve the complement set.

The set operators work in a manner similar to the assignment operators described in the previous section. Let's review some basic examples:

  • Sum({$<Year = {2007, 2008} + {"<=2000"}>} [# Departures Performed])

    This expression will return the total flights performed in the years 2007, 2008, plus all of the years that are less than or equal to 2000.

  • Sum({$<[Carrier's Operating Region] = {'Latin America'},Year = {2011}> + <[Carrier's Operating Region] = {'Domestic'},Year = {2010}> - <[Carrier Group] = {'Foreign Carrier'}>} [# Departures Performed])

    This expression will result with the total number of flights performed during 2011 by carriers operating in Latin America plus flights performed during 2010 by carriers operating as Domestic, but exclude Foreign Carriers from both sets.

    This is one of those calculations that wouldn't be possible with simple selections.

Just as in arithmetic operations, parentheses can be used to enclose different set operations and ensure they are evaluated in the correct order.

Using element functions

There are two special functions that can be used in set expressions to implicitly specify an element list. The functions are:

  • P(): To use all possible values in a field as the element list
  • E(): To use all excluded values in a field as the element list

A quick example:

  • Sum({1<Year = p(Year)>} [# Departures Performed])

    This expression will use the full set of data disregarding all user selections (because the specified set identifier is the number 1), but take into account those records corresponding to the years that the user has selected. In other words, only selections made on the Year field are considered.

Note

For more examples, head to the Help menu in QlikView, select Contents, and activate the Index tab from the left pane to search for Set Analysis. There is also a document on QlikCommunity that explores additional examples: http://community.qlikview.com/docs/DOC-1867

The possibilities are endless. Take a moment to try it out in the Airline Operations document and see for yourself what you can do with Set Analysis. In the solution file corresponding to this chapter, we've included an additional sheet, named Set Analysis, with different examples of Set Analysis uses.

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset