The Table Files Wizard is used by many to load a file and generate the load script for it. However, there is a not-so-secret secret button with the word Next written on it that is often ignored:
There are some great things in here that are worth looking at.
When we first hit that Next button on the wizard, we are presented with the interesting option, Enable Transformation Step:
If we click on this button, it brings us to a new dialog with several tabs:
The Garbage tab allows us to clean out records that are not useful to us. In the preceding screenshot, there are a couple of rows that we can select and click on the Delete Marked button to remove. We can also click on the Conditional Delete… button and set up a rule to delete particular rows, for example, if they begin with the word Total:
The Fill tab allows us to fill in missing values, or overwrite other values, based on a condition. We can fill data from any direction:
The Column feature allows us to create new columns or replace columns, by copying the content of another column:
This is quite often used with the Context tab because it allows us have two columns: one with the original value and one with the value extracted from the context.
The Context tab is only available when working with HTML data. It allows us to extract information from tags in the data. For example, if we go to www.xe.com, we can get a table that lists currencies. In one column, we will have the currency name but this is also a hyperlink. The Context
function allows us to interpret the value as HTML and extract the hyperlink href
value:
The Unwrap
function allows us to take data that might be wrapped across multiple columns and unwrap it. For example, if we have the Year and Value columns followed by another pair of Year and Value columns, the Unwrap
function will allow us to wrap the second pair of columns back under the first pair where the data should be.
The last couple of pages of Table Files Wizard contain further manipulation options. Probably the most frequently used of these is the Crosstable option. This is used, often with Excel files but can also be from any data source, to correct the data where you have what appear to be field names in a two-dimensional matrix and you want to transform them into the field values that they should be. For example, in a budget file, we might have the budget month running across the top of the page:
However, the budget month is actually not a field in itself; it should be a field value. This is where the Crosstable wizard comes in:
In the wizard, we need to tell it how many fields are Qualifier Fields. This means fields that are already correct as field values and don't need to be unraveled. Next, we specify a name that we want to call the new field, for example, BudgetMonth
. Finally, we specify a name for the new field that will hold the values that are currently in the matrix, for example, BudgetValue
. Luckily, these days, QlikView provides a color coding to show you where each value applies.
When we click on OK in the Crosstable wizard, we will see a preview of the how the data will look:
When we now click on OK, the script to load the table, along with the Crosstable prefix, will be inserted into the script editor:
CrossTable(BudgetMonth, BudgetValue, 2) LOAD RepCode, Country, [41640], [41671], [41699], [41730], [41760], ... [42309], [42339] FROM [..SourcesBudget.xls] (biff, embedded labels, table is Sheet1$, filters( Remove(Row, Pos(Top, 1)), Remove(Row, RowCnd(CellValue, 1, StrCnd(start, 'Total'))), Remove(Row, Pos(Top, 5)), Replace(1, top, StrCnd(null)) ));
We can see that the values that we passed in from the wizard have gone in as parameters. Actually, once we understand what the parameters of the CrossTable
statement are, we might never use the wizard again!
It is interesting to note that we don't need to actually list all the fields in the file. We can, instead, have a piece of script like this:
CrossTable(BudgetMonth, BudgetValue, 2) LOAD * FROM [..SourcesBudget.xls] (biff, embedded labels, table is Sheet1$);
This will allow different months to be added or removed from the file as time goes on. However, we will need to ensure that the structure of the file doesn't change and no additional columns are added that are not month values (for example, totals
)
Another thing to note is that if the column names are numeric (as in this example), they will actually be loaded as text. This is correct because QlikView will otherwise just load the column name into all the values! In that case, we might need to add an additional step:
Budget_Temp: CrossTable(BudgetMonth, BudgetValue, 2) LOAD * FROM [..SourcesBudget.xls] (biff, embedded labels, table is Sheet1$); Budget: NoConcatenate Load RepCode, Country, Date(Num#(BudgetMonth,'#####')) As BudgetMonth, BudgetValue Resident Budget_Temp; Drop Table Budget_Temp;