QlikView is really good at storing data. It operates on data in memory, so being able to store a lot of data in a relatively small amount of memory gives the product a great advantage—especially as Moore's Law continues to give us bigger and bigger servers.
Understanding how QlikView stores its data is fundamental in mastering QlikView development. Writing load script with this understanding will allow you to load data in the most efficient way so that you can create the best performing applications. Your users will love you.
A great primer on how QlikView stores its data is available on Qlik Design Blog, written by Henric Cronström (http://community.qlik.com/blogs/qlikviewdesignblog/2012/11/20/symbol-tables-and-bit-stuffed-pointers).
From a simple level, consider the following small table:
First name |
Surname |
Country |
---|---|---|
John |
Smith |
USA |
Jane |
Smith |
USA |
John |
Doe |
Canada |
For the preceding table, QlikView will create three symbol tables like the following:
Index |
Value |
---|---|
1010 |
John |
1011 |
Jane |
Index |
Value |
---|---|
1110 |
Smith |
1111 |
Doe |
Index |
Value |
---|---|
110 |
USA |
111 |
Canada |
And the data table will look like the following:
First name |
Surname |
Country |
---|---|---|
1010 |
1110 |
110 |
1011 |
1110 |
110 |
1010 |
1111 |
111 |
This set of tables will take up less space than the original data table for the following three reasons:
So, to summarize, each field in the data model will be stored in a symbol table (unless, as we will see later, it is a sequential integer value) that contains the unique values and an index value. Every table that you create in the script—including any synthetic key tables—will be represented as a data table containing just the index pointers.
To help us understand what is going on in a particular QlikView document, we can export details about where all the memory is being used. This export file will tell us how much memory is being used by each field in the symbol tables, the data tables, chart objects, and so on.
Perform the following steps to export the memory statistics for a document:
We can now see exactly how much space our data is taking up in the symbol tables and in the data tables. We can also look at chart calculation performance to see whether there are long running calculations that we need to tune. Analyzing this data will allow us to make valuable decisions about where we can improve performance in our QlikView document.
One thing that we need to be cognizant of is that the memory usage and calculation time of charts will only be available if that chart has actually been opened. The calculation time of the charts may also not be accurate as it will usually only be correct if the chart has just been opened for the first time—subsequent openings and changes of selection will most probably be calculated from the cache, and a cache execution should execute a lot quicker than a non-cached execution. Other objects may also use similar expressions, and these will therefore already be cached. We can turn the cache off—although only for testing purposes, as it can really kill performance.