Understanding why you should use an ETL approach

Hopefully, from the preceding section, you might start to see why the majority of expert QlikView developers use some kind of an ETL approach to data loading using QVDs.

There are several advantages to using an ETL approach to just load all the data directly from data sources, such as:

  • Speeding up overall data loading and reducing of load on database servers by archiving data to QVD
  • Reusing extracted data in multiple documents
  • Applying common business rules across multiple documents: one version of the truth
  • Creating conformed dimensions across multiple business processes, supporting a dimensional modeling approach
  • Provisioning a data layer that allows QlikView users to self-serve, without it being necessary to have database skills

Speeding up overall data loading

As mentioned in the previous section, it doesn't make sense to constantly load data from a database that doesn't change. It makes much more sense for the data that doesn't change to be stored locally in QVD files, and then we only need to go to the database server for the data that has changed since the last time that we queried for it.

This approach makes your network engineers and DBAs very happy because the database isn't over-taxed and the amount of network traffic is reduced.

As data volumes increase, it often becomes critical to make sure that reloads are as short as possible so as to fit inside a reload window. By having as much of the data as possible stored locally on the QlikView server in QVD files, we can make sure that we have the shortest reload times possible.

Reusing extracted data in multiple documents

It is not uncommon for the same data table to be used in many places. For example, you may have a staff list that is extracted from an HR system but is used right across all areas of the business. You may also have a global calendar table, which will be used by almost every application, which can be loaded from the finance system.

By extracting the data once into QVD, you are, again, reducing network traffic and database load. If this data is not updated on a very frequent basis, it is also not necessary to re-extract that from the database frequently during the day to feed into a more real-time application. A table like the calendar might only be refreshed monthly.

Note

Real time means different things to different people, but I would define it as the periodicity of the reload that gives the business the information that it needs to make decisions now. For some businesses that demand a refresh every minute, for others, once a week will do.

Applying common business rules across multiple documents

From a one version of the truth point of view, it is critical that measures are calculated the same way across all documents that use them. If two people use different calculations for, say, margin, then they will get different answers and drive, potentially, different actions.

By using an ETL approach, the same calculation can be used to feed the same measure to multiple fact tables, which helps ensure that the same result is obtained across the business.

Creating conformed dimensions

Conformed dimensions are a fundamental of dimensional modeling. What this means is that we create one dimensional table to represent the same entity across the entire business.

For example, we sell a product to customers, our sales people visit prospects, and we order from suppliers. All of these are examples of organizations. By creating a single organization dimension that can be shared across multiple dimensional models, we can gain insight that would otherwise be difficult to achieve.

In the Kimball dimensional modeling approach, there is a technique called the Enterprise Data Warehouse Bus Architecture that helps you identify dimensions that will be shared across multiple models. For more information, see The Data Warehouse Toolkit by Ralph Kimball and Margy Ross or their website:

http://www.kimballgroup.com/data-warehouse-business-intelligence-resources/kimball-techniques/kimball-data-warehouse-bus-architecture/

Provisioning a self-service data layer

By adopting an ETL approach, we can make our fact table and dimension table QVDs available for users to load into QlikView to create their own analyses, without having to have any database expertise or database connection credentials.

In fact, you can create a QlikView application that will create a script that will read the appropriate QVDs for a process into a new QlikView application, which means your power users do not even have to have any QlikView scripting knowledge.

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset