Calculating vertically

One of the most powerful features in QlikView is the ability to create vertical calculations in charts. We normally calculate values horizontally, where all values are in reference to the dimensions in the chart. It is a very important feature for us to also make vertical calculations across those horizontal numbers. For example, we might want to know what the total of all our calculations is so that we can calculate a ratio.

We might want to know the average, or the standard deviation, to draw a line in a chart. We might want to accumulate just the last four results to calculate a rolling average.

Using inter-record and range functions

There are several functions that allow us to compare between different records in a chart. Some work in all charts, but others are specific to a particular chart type, such as a pivot table. In the graphical charts (Bar, Pie, and so on), we should imagine their Straight Table equivalent to understand how these functions will work.

The main functions that we can use here are listed in the following table:

Function

Description

Above

This allows us to access the values in the chart above the current row

Below

Like Above, we get only access to the values below the current row

Before

This is used in a pivot table to access the values before the current column

After

Again, this is used in a pivot table to get access to the values after the current column

Top

This gives us access to the value in the first row of the chart

Bottom

This gives us access to the value in the last row of the chart

First

In a pivot table, this gives us access to the value in the first column of the chart

Last

This gives us the value in the last column in a pivot table

RowNo

This tells us the number of the current row in the chart

ColumnNo

In a pivot table, this tells us the number of the current column

NoOfRows

This tells us how many rows there are in the chart

NoOfColumns

In a pivot table, this tells us the number of columns

The default for the Above, Below, Before, After, Top, Bottom, First, and Last functions are to just return one value that will be, as you would expect, the value in the direction stated in the name of the function. Consider the following example:

Above(Sum(LineValue))

This will give us the value of Sum(LineValue) in the row directly above it.

These functions also take additional, optional parameters. The second parameter will accept an offset value, defaulting to 1, indicating how many rows above we want to take the value. So, consider the following example:

Above(Sum(LineValue),2)

This will give us the value two rows above. In fact, we can actually specify 0 as the offset and this will just give us the current row.

The third optional parameter, which defaults to 1, will specify how many row values we want to return. Consider the following example:

Above(Sum(LineValue),0,4)

This will give us four values, starting with the current row. Now, QlikView cannot handle multiple values like this, and if we try to use this in a chart, it will return null. This is where we need to use the range functions, which will handle a range of values like this. There are several range functions, such as RangeSum, RangeCount, and RangeAvg, that are designed for this purpose. So, if we want the average of the four values above, we would do the following:

RangeAvg(Above(Sum(LineValue),0,4))

This will give us, if the dimension in this chart were in months, a four-month moving average:

Using inter-record and range functions

If we include the RowNo function to tell us what row we are on, we can calculate a cumulative value:

RangeAvg(Above(Sum(LineValue),0,RowNo()))

This might be used in, say, a Pareto analysis:

Using inter-record and range functions

Applying the Total qualifier

By default, of course, expressions in charts will be calculated with respect to the dimensions in the chart. The Sum calculation on the USA row will only calculate for values that are associated with the USA. This is exactly what we will expect.

Sometimes, we will want to override this behavior so that we can create a calculation that ignores the dimensions in the chart. For example, we might want to calculate the percentage of the current value versus the total. When we add the Total qualifier into our expression, then the dimensions will be ignored and the expression will be calculated for the whole chart.

For example, suppose that we have a chart that has the following expression:

Sum(LineValue)

Now, we add a second expression with the Total qualifier:

Sum(Total LineValue)

We can see the effect of the Total qualifier:

Applying the Total qualifier

We can see that the dimensions in the chart have been ignored and the same total value has been calculated on each row. We can then change this to divide one by the other:

Sum(LineValue)/Sum(Total LineValue)

We will get a percentage calculated:

Applying the Total qualifier

Now, if we were to add a second dimension to this chart, we would get the percentage of each row in reference to the total as before. However, what if we wanted to see the percentage of the second dimension in reference to the first dimension's total? In this case, we can add a modifier to the Total qualifier to indicate that it should not ignore some dimensions:

Sum(LineValue)/Sum(Total<Region>LineValue)

Now, only the second dimension is ignored:

Applying the Total qualifier

Creating advanced aggregations with Aggr

QlikView has a fantastic chart engine. It is no surprise that we can get additional access to this chart engine, inside or outside of a chart, so as to create more advanced calculations. The Aggr function allows us to create a virtual chart—we can imagine it like a Straight Table—and then, we can do something with the set of values that are returned, that is, the expression column in our imaginary Straight Table.

Like a chart, the Aggr function takes an expression as a parameter. It also takes one or more dimensions. It then calculates the expression against the dimensions and returns a set of the results that we generally use in another aggregation function such as Sum, Avg, Max, Stdev, and so on. For example, suppose we were to perform the following Aggr function:

Aggr(Sum(OrderCounter), Country)

Then, we can imagine the virtual Straight Table that this will create:

Creating advanced aggregations with Aggr

We might then want to calculate the average of these values:

Avg(Aggr(Sum(OrderCounter), Country))

When an Aggr function is used within a chart, its context is set by the dimensions of the chart. This means that on each row of the chart, the Aggr function will only have access to the values that are related to the dimension values for that row. For example, if we want to use the Aggr above in a chart that contains the Country dimension, we will need to add the Total qualifier:

Avg(Total Aggr(Sum(OrderCounter), Country))

If we didn't, the calculation will respect the dimensionality of the chart and just give us the sum of the OrderCounter field on each row:

Creating advanced aggregations with Aggr

Note

There is an interesting issue present in this table. The average of 1,198.2 does not appear to be correct! The average should be 1,331.3. However, if we turn off the Supress Zero option for the chart, we will find that there is another blank value, Country! If we include this in the average calculation, then we will get 1,198.2 and this is what Aggr is doing. We can exclude the blank value by using a bit of Set Analysis:

Avg(Total Aggr(Sum({<Country={*}>} OrderCounter), Country))

Using Aggr to calculate a control chart

Statistical control charts were first proposed by Walter A. Shewhart, a statistician working for Bell Laboratories in the 1920s. They take into consideration that variation is normal in a process and that we should only be concerned with variation outside control limits.

A control chart will often use a combination and a mean of a range of values from a particular period to compare to another period. For example, we might say that we will treat 2012 as a sample for the deviation of our sales figures and then we want to see the trend of our sales figures in 2014.

To do this, we will have to have a calculation of the mean value that includes a Set Analysis statement to limit to the correct period:

Avg({$<Year={2012}>} Total Aggr(Sum({$<Year={2012}>} LineValue), YearMonth))

Note

Note that the Set Analysis expression needs to be contained in both the Aggr expressions and in the aggregation function that we use Aggr with.

We can add the upper control value for the control chart:

Avg({$<Year={2012}>} Total Aggr(Sum({$<Year={2012}>} LineValue), YearMonth))
+
2*Stdev({$<Year={2012}>} Total Aggr(Sum({$<Year={2012}>} LineValue), YearMonth))

We can add the lower control as well:

Avg({$<Year={2012}>} Total Aggr(Sum({$<Year={2012}>} LineValue), YearMonth))
-
2*Stdev({$<Year={2012}>} Total Aggr(Sum({$<Year={2012}>} LineValue), YearMonth))

Now we should have a chart that allows us to look at different year's performance versus the 2012 controls:

Using Aggr to calculate a control chart

Calculated dimensions

Another use that we can put the Aggr function to is to create calculated dimension. Before we had dimension limitations in charts, this was the only way that we can limit some charts to, say, the top 5. In fact, we still need to turn to this to calculate the top x dimension values in pivot tables. For example, if we want to have the top 5 customers, we need to add a calculated dimension of the following:

=If(Aggr(Rank(Sum(LineValue)), Customer)<=5, Customer, Null())

We should also set the Supress When Value is Null option for this dimension. We can also add a second dimension such as Year to the chart:

Calculated dimensions

This is a really interesting thing because it is using the calculated virtual chart to provide the values for the dimension, but knows how the dimension values are associated to the data so that it can correctly calculate the totals.

No to nodistinct

The Aggr function has as an optional clause, that is, the possibility of stating that the aggregation will be either distinct or nodistinct.

The default option is distinct, and as such, is rarely ever stated. In this default operation, the aggregation will only produce distinct results for every combination of dimensions—just as you would expect from a normal chart or straight table.

The nodistinct option only makes sense within a chart, one that has more dimensions than are in the Aggr statement. In this case, the granularity of the chart is lower than the granularity of Aggr, and therefore, QlikView will only calculate that Aggr for the first occurrence of lower granularity dimensions and will return null for the other rows. If we specify nodistinct, the same result will be calculated across all of the lower granularity dimensions.

This can be difficult to understand without seeing an example, so let's look at a common use case for this option. We will start with a dataset:

ProductSales:
Load * Inline [
Product, Territory, Year, Sales
Product A, Territory A, 2013, 100
Product B, Territory A, 2013, 110
Product A, Territory B, 2013, 120
Product B, Territory B, 2013, 130
Product A, Territory A, 2014, 140
Product B, Territory A, 2014, 150
Product A, Territory B, 2014, 160
Product B, Territory B, 2014, 170
];

We will build a report from this data using a pivot table:

No to nodistinct

Now, we want to bring the value in the Total column into a new column under each year, perhaps to calculate a percentage for each year. We might think that, because the total is the sum for each Product and Territory, we might use an Aggr in the following manner:

Sum(Aggr(Sum(Sales), Product, Territory))

However, as stated previously, because the chart includes an additional dimension (Year) than Aggr, the expression will only be calculated for the first occurrence of each of the lower granularity dimensions (in this case, for Year = 2013):

No to nodistinct

The commonly suggested fix for this is to use Aggr without Sum and with nodistinct as shown:

Aggr(NoDistinct Sum(Sales), Product, Territory)

This will allow the Aggr expression to be calculated across all the Year dimension values, and at first, it will appear to solve the problem:

No to nodistinct

The problem occurs when we decide to have a total row on this chart:

No to nodistinct

As there is no aggregation function surrounding Aggr, it does not total correctly at the Product or Territory dimensions. We can't add an aggregation function, such as Sum, because it will break one of the other totals.

However, there is something different that we can do; something that doesn't involve Aggr at all! We can use our old friend Total:

Sum(Total<Product, Territory> Sales)

This will calculate correctly at all the levels:

No to nodistinct

There might be other use cases for using a nodistinct clause in Aggr, but they should be reviewed to see whether a simpler Total function will work instead.

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset