Chapter 4. Data Governance

 

"The most valuable commodity I know of is information."

 
 --Gordon Gekko, Wall Street

Do you know how many QlikView applications are being used in your organization? Do you know who is using these applications? Do you know where all the data comes from for those applications? How are people calculating different metrics?

The answer to all of these questions might actually be, "Yes, yes I do." If so, it may be that you already have a data governance strategy in place or you just might not need one. If everything is tightly controlled within a small group of QlikView developers, perhaps a QlikView Center of Excellence, then you probably have a good grip on this. However, if your organization is less structured than that and you have many QlikView developers spread around, all creating their own applications, then you will need to think about answering these questions.

Data governance starts at the top of the organization. Without serious management buy-in to the process, most efforts at data governance will inevitably fail. Whatever team is assigned to the task of creating a data governance plan will have to take many facets into consideration, and the implementation of Qlik is just one of these.

The first part of establishing a good data governance plan for Qlik is to develop a good ETL process (see Chapter 3, Best Practices for Loading Data) to ensure that developers have a set of well-formed dimensional models (see Chapter 2, QlikView Data Modeling) to use. You should ensure that you understand these concepts. Of course, another part of establishing good data governance is to ensure that developers are using such data sources, and this is something that we will look at in this chapter.

After reviewing some basic concepts that you should be aware of, we will look at how developers can establish metadata in their QlikView applications that can help users to know which fields they are using when they create their own charts.

We will go on to discuss the concept of data lineage and how this applies in QlikView, especially in the Governance Dashboard. This is QlikView's free tool that utilizes the Expressor technology to scan your applications and source files to tell you where the information comes from. The Dashboard also scans the QlikView server logs to tell us exactly what users are doing with our applications.

Note

We should be aware that in some countries, the monitoring of employee behavior might be subject to legal restrictions or to industrial relations agreements.

The following are the topics we'll cover in this chapter:

  • Reviewing basic concepts of data governance
  • Establishing descriptive metadata
  • Understanding lineage information in QlikView
  • Deploying the QlikView Governance Dashboard

Reviewing basic concepts of data governance

We should already know enough, technically, to handle all of the QlikView elements in this chapter. Therefore, the only element that I want to review is the concept of metadata.

Understanding what metadata is

So, what exactly is metadata and how does it apply to QlikView?

The prefix "meta" means several things, depending on how it is used. In epistemology, the study of knowledge, "meta" simply means "about". So, metadata is information about data: where the data has come from, who owns the data, who produced the data, when the data was produced, what format the data is in, and so forth.

One piece of data can have quite a lot of metadata. Traditionally, metadata has been broken down into two types: structural and descriptive. A third type, administrative, is critical for correct data governance.

Each of these types can be broken down into many more subtypes, but we need to be careful about how far we take this process. We want to create some metadata, but we don't want to spend two years creating it. QlikView does a lot to help us here, but we will still have to do some of the work ourselves.

Structural metadata

Structural metadata gives us information about how the data hangs together. At a simple level, the table viewer in QlikView is structural metadata. We can get additional information from the Tables tab in Document Properties, and we can export this information to tab-delimited text files:

Figure: Structural metadata

As QlikView developers and designers, this information is very important to us. We need to know how the data model is built and how everything hangs together to be able to build the most effective QlikView applications.
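Once the structure has been exported to tab-delimited text, it can also be inspected outside QlikView. The following is only a minimal Python sketch of that idea. The sample text and the column names Table and Field are assumptions made for illustration, not the documented layout of QlikView's export, so adjust them to match the files you actually get. It lists the fields that appear in more than one table, which are the fields that associate tables in the data model.

```python
import csv
import io
from collections import defaultdict

# Hypothetical sample of an exported structure file. The column names
# "Table" and "Field" are assumptions, not QlikView's documented layout.
SAMPLE_EXPORT = (
    "Table\tField\n"
    "Orders\tOrderID\n"
    "Orders\tCustomerID\n"
    "Customers\tCustomerID\n"
    "Customers\tCountry\n"
)


def fields_by_table(text):
    """Map each field name to the set of tables that contain it."""
    reader = csv.DictReader(io.StringIO(text), delimiter="\t")
    tables = defaultdict(set)
    for row in reader:
        tables[row["Field"]].add(row["Table"])
    return dict(tables)


def key_fields(field_map):
    """Return the fields that are present in more than one table."""
    return sorted(name for name, tables in field_map.items() if len(tables) > 1)


field_map = fields_by_table(SAMPLE_EXPORT)
print(key_fields(field_map))
```

With a real export, we would read the text from the exported file instead of using the inline sample.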

Other important structural information is where the data comes from: the data sources, files, and so forth that make up the data. Knowing this data lineage information allows us to assess the impact that any changes to these sources might have. We can also see where data sources are shared among multiple applications, so that we can make decisions about the reuse of data via QVD files. Knowing which files are in use also allows us to work out which files are not in use, and then either clean up the file structure or ask why these files aren't in use.
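The two lineage questions above, which sources are shared and which files are unused, come down to simple set operations once we have a list of sources per application. The following Python sketch illustrates this under stated assumptions: the application names, file names, and the APP_SOURCES and ALL_FILES structures are all hypothetical, and in practice this information would have to be gathered from your own environment.

```python
from collections import defaultdict

# Hypothetical lineage records: application -> source files it loads.
APP_SOURCES = {
    "Sales.qvw": {"orders.csv", "customers.qvd"},
    "Finance.qvw": {"ledger.xlsx", "customers.qvd"},
}

# Hypothetical listing of every file found in the shared data folder.
ALL_FILES = {"orders.csv", "customers.qvd", "ledger.xlsx", "old_budget.xls"}


def shared_sources(app_sources):
    """Return sources used by more than one application, with those apps."""
    users = defaultdict(set)
    for app, sources in app_sources.items():
        for source in sources:
            users[source].add(app)
    return {s: sorted(apps) for s, apps in users.items() if len(apps) > 1}


def unused_files(app_sources, all_files):
    """Return files that no application loads."""
    used = set().union(*app_sources.values()) if app_sources else set()
    return sorted(all_files - used)


print(shared_sources(APP_SOURCES))
print(unused_files(APP_SOURCES, ALL_FILES))
```

A shared source is a candidate for a common QVD, and an unused file is a candidate for cleanup or for a question to its owner.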

Descriptive metadata

Descriptive metadata is any additional data that we add to our applications to give more information and context about the application and the individual elements of the data.

This information is very useful to application designers and business users who need to know more about what they are using. It is also useful to add commentary about what we are doing so that we can review it at a later date and recall why we did it.

This information can be added in multiple places in QlikView, and we will review this in the Establishing descriptive metadata section.

Administrative metadata

Administrative metadata is, as it sounds, information of interest to system administrators and managers: where applications reside, who can access them, who is actually using them, and what they are being used for. All of this data is available from QlikView logs and system information, but it is not always easy to collate. A QlikView application that can collate this information for us will therefore be very useful.
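To make the idea of collating usage information concrete, here is a minimal Python sketch. It does not parse real QlikView log files. The SESSIONS list is a hypothetical, simplified stand-in for session records, reduced to a user name and a document name, and the function names are our own. It shows only the kind of aggregation involved: how often each application is opened and by whom.

```python
from collections import Counter

# Hypothetical, simplified session records as (user, document) pairs.
# Real server logs contain many more columns and need parsing first.
SESSIONS = [
    ("alice", "Sales.qvw"),
    ("bob", "Sales.qvw"),
    ("alice", "Finance.qvw"),
    ("alice", "Sales.qvw"),
]


def sessions_per_document(sessions):
    """Count how many sessions were opened against each document."""
    return Counter(doc for _, doc in sessions)


def users_per_document(sessions):
    """List the distinct users seen for each document."""
    users = {}
    for user, doc in sessions:
        users.setdefault(doc, set()).add(user)
    return {doc: sorted(names) for doc, names in users.items()}


print(sessions_per_document(SESSIONS))
print(users_per_document(SESSIONS))
```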
