Basic Set Analysis should be in even the most junior QlikView developer's arsenal of tools. The ability to add modifiers, most frequently to the $
set, allows us to perform some very useful calculations that we either couldn't perform at all without Set Analysis, or that would have required us to do a lot more work.
We should already know about at least two of the identifiers that we can use in a Set Analysis expression: 1
and $
. We also should know that the $
set is the default so that if there is no set identifier specified, then QlikView will use the $
set, which is just for current selections.
The following table shows a list of all the identifiers that you may come across:
We know that the true power of Set Analysis comes not just with the ability to specify different identifiers in an expression (although having just that could be quite powerful) but with the ability to modify those sets with our own set of selections.
At this stage, we should be familiar with using a Set Analysis expression with modifiers as shown:
Sum({$<Year={2012,2013}>} SalesValue)
Here, we appear to have a field called Year
compared to an element value list of {2012,2013}
.
It makes some kind of sense that Year
is equal to either 2012
or 2013
but actually the =
sign here does not actually mean "equals". It can't really because Year
can't be "equal" to both values.
What we have to understand is that the values on both sides of the =
sign are both sets. Year
is a data field but that is actually a set of values. The {2012,2013}
list is also a set. Therefore, the =
sign becomes not a direct comparison, but like a union operation between the set of all Year
values and the set of values in the braces.
We have to be careful about this because I have seen confusion around it. For example, it is valid to have another field instead of an element list (the list of values enclosed in {}
) as shown:
Sum({$<OrderDate=DeliveryDate>} OrderValue)
I have seen this described as being where OrderDate
is equal to DeliveryDate
. This is incorrect! This set will give you all values where the OrderDate
values are in the range of the DeliveryDate
values. For example, suppose that we have the following dataset:
Orders: Load * Inline [ OrderID, OrderDate, DeliveryDate, OrderValue 1, 2014-08-10, 2014-08-10, 100 2, 2014-08-10, 2014-08-11, 100 3, 2014-08-11, 2014-08-12, 100 4, 2014-08-13, 2014-08-14, 100 ];
We might expect that the preceding expression would only match for the first order. However, it could match for the first three orders! The union of the values in the OrderDate
field with the values in DeliveryDate
will actually only exclude the last order. The order dates in orders 1
and 2
match to the delivery date from order 1
, while the order date from order 3
matches to the delivery date from order 2
.
We can use mathematical set arithmetic with any set such as identifier, field, or element list. The operators only work on sets and return a set result. The operators are listed in the following table:
As stated previously, these sets can be applied to identifiers, fields, and element lists. So we can create a set in the following manner:
Sum({$*BM01} SalesValue)
This will give us the intersection of current selections and the bookmark BM01
.
We can also have a set as follows:
Sum({$<OrderDate=DeliveryDate-{'2014-08-13','2014-08-11'}>} OrderValue)
We can get quite sophisticated with this set arithmetic. If we do need to have more than one set operators, we should remember to use parentheses because ($*BM01)-BM02
is different from $*(BM01-BM02)
.
Where there is a set comparison that includes the field that we are modifying, we can make use of some shorthand; this will be familiar to C/C#/Java programmers. For example, if we want every year except for one particular year, we can perform the following:
Sum({$<Year=Year-{2013}>} SalesValue)
We can shorten the expression in the following manner:
Sum({$<Year-={2013}>} SalesValue)
We can equally perform similar shorthand with the other operators:
+= *= /=
When we first learned to use Set Analysis, we might have learned that we can use wildcard search within a modifier. This is quite a powerful feature. However, we can really enhance what we can do with Set Analysis when we learn that we can also use advanced search within our modifiers.
Essentially, any exact match, wildcard, or advanced search that we can use in a search dialog in a listbox can be used in a modifier.
For example, if we want to see the sales for Germany, we can use an exact match:
Sum({<Country={'Germany'}>} LineValue)
If we are looking for sales for years in the 2010s, we might do this:
Sum({<Year={"201*"}>} LineValue)
If we want sales since 2011, we can do this:
Sum({<Year={">=2011"}>} LineValue)
How about we check for all sales for those countries that sold more than 5, 000, 000 in 2013:
Sum({<Country={"=Sum({<Year={2013}>} LineValue)>5000000"}>} LineValue)
We know that if we want to get the sales for a list of countries, we can simply list them in an element value list like this:
Sum({<Country={'Germany','China'}>} LineValue)
However, we also have the option to use search syntax like this:
Sum({<Country={"(Germany|China)"}>} LineValue)
And, as we saw in the earlier part of this chapter, that syntax allows us to include multiple search options:
Sum({<Year={"(2010|2013|200*)"}>} LineValue)
This can also be expressed as follows:
Sum({<Year={2010,2013,"200*"}>} LineValue)
So far, we have used mostly static values in our example modifiers. However, the most power will come when we combine modifiers and Dollar-sign Expansion.
There is no great magic here. Wherever we might use a static value, we just replace it with a Dollar-sign Expansion. For example, we can use the following:
Sum({<Year={$(vMaxYear)}>} LineValue)
We can also use the following:
Sum({<Year={$(=Max(Year))}>} LineValue)
The only thing that we need to really consider here is that when performing an exact match with dates, we need to make sure that the value returned from the Dollar-sign Expansion matches the text of the date's dual value. It isn't such an issue if you are doing a greater-than or less-than comparison, because then you can use either the dual text or numeric format. For example, if we have a Month
field that is Dual('Jan',1)
, Dual('Feb',2)
, and so on, then we can't do the following:
Sum({<Month={3}>} LineValue)
Sum({<Month={'Mar'}>} LineValue)
Although the following is also fine:
Sum({<Month={">=3<5"}>} LineValue)
Quite often, in a set modifier, we will want to compare the field to be modified to the values in a different field. There are a number of different options, and they are discussed in the upcoming sections.
We have seen this already, but it is acceptable for the set comparison to be directly against another field. For example:
Sum({<Year=DI.Year>} LineValue)
We can use set arithmetic on these like this:
Sum({<Year=Year+DI.Year>} LineValue)
One way that we can get over the limitation of only seeing selected values in the other field is to use the Concat
function along with Dollar-sign Expansion to derive an element value list, for example:
Sum({<Year={$(=Concat(Distinct DI.Year,','))}>} LineValue)
This might expand to something like this:
Sum({<Year={2011,2012,2013}>} LineValue)
With text values, we might need to make use of the Chr(39)
function, which returns a single quote, to derive the correct list:
Sum({<Country={'$(=Concat(Distinct DI.Cntr,Chr(39)&','&Chr(39)))'}>} LineValue)
This might expand to something like the following:
Sum({<Country={'France','Germany','Ireland','USA'}>} LineValue)
The P
and E
functions, which can only be used in a set modifier expression, will return a set of either the possible or excluded values. As they are functions, they can themselves accept a set identifier and modifier. We can also specify which field we want to return the set of values for. If left out, the field that we are modifying will be returned.
Let's look at some examples. First, if we perhaps want to modify the Year
field with the years that are selected in a particular bookmark, use:
Sum({<Year=P({BM01} Year)>} LineValue)
What if we want to modify the Year
field with all of the values in the DI.Year
field:
Sum({<Year=P({$} DI.Year)>} LineValue)
Otherwise, to get all of the values in the DI.Year
field that are not selected:
Sum({<Year=E({$} DI.Year)>} LineValue)
When using Alternate States in a QlikView document, we can now add additional complexity to calculations. The syntax is quite straightforward though.