Avoiding missteps – NULL values in facts

We learned that with MicroStrategy, it is very easy to create reports that please the Sales Managers who may say "show me the products we have sold on the Internet". But what about the products we did not sell, those we need to remove from the catalog or try to bundle with something else? To find them we first need to learn how to manage the NULL values. In SQL it would be just an OUTER JOIN between the DimProduct and fact table; how to do this in a report?

Getting ready

Create a report with only the Product attribute and run it. How many products do we have? Now add the Sum SalesAmount from FactInternetSales metric and run it again. How many rows are displayed?

So, we know that we sold 158 out of 606 products in our selection. We need to display the 448 rows of the DimProduct table where the value of the metric is NULL (actually there is no row in the fact table for those products).

How to do it...

This is done by changing a setting in Report Data Options:

  1. Open the Data menu and select Report Data Options.
  2. In the window that appears, select Attribute Join Type in the Categories list and uncheck the Use defaults box.
  3. A series of radio buttons is now enabled, select the last one which says Preserve lookup table elements joined to final pass result table based on template attributes with filter..
  4. Click on OK to confirm and answer Yes when prompted to re-execute the report.
  5. Now the report shows the full 606-product list.
  6. Save and close the report, give it the name 19 Products Sales on Internet.

How it works...

Look at the SQL view to spot the difference. In the first case, with the default setting, the SELECT query is:

from  FactInternetSales  a11
  join  DimProduct  a12
    on   (a11.ProductKey = a12.ProductKey)
group by  a11.ProductKey

In the second case, with the "Preserve lookup table…" setting the query is done in two steps; first step selects the sum of sales amount into a temporary table and the second looks like:

from  DimProduct  a11
  left outer join  ##TVP3OQJ22OL000  pa12
    on   (a11.ProductKey = pa12.ProductKey)

The OUTER JOIN with the dimension table is performed only in the second SQL statement.

There's more...

If you try to add the Product Subcategory to this report, you will have a disappointing surprise. Some products disappear from the radar. This is because in the database those products do not have a subcategory, in other words: the FK column ProductSubcategoryKey is set to NULL. This is something we should be aware of when dealing with normalized dimension tables. There is the possibility that if the integrity constraints are not enforced, we may lose some elements.

Different data warehouse designers may have different solutions to this problem; my personal philosophy is zero tolerance with NULL in dimension tables. During the ETL phase, I always set the empty FK values to -1 and then add a dummy row to the parent dimension table with -1 as the key and "not available" as description.

Let me stress that this is my personal way of dealing with it, and I found this solution acceptable in most of the projects involving MicroStrategy; nonetheless, another BI professional may have a different approach.

Try this:

  1. Modify the 19 Products Sales on Internet report to include Product Subcategory and run it, note that the number of rows is 397.
  2. In a command prompt window, open the sqlcmd utility and type the following instructions while using the AdventureWorksDW2008R2 database (find the command in the companion code file):
    SET IDENTITY_INSERT DimProductCategory ON
    
    insert into DimProductCategory (ProductCategoryKey, ProductCategoryAlternateKey, EnglishProductCategoryName, SpanishProductCategoryName, FrenchProductCategoryName) values (-1, -1, 'not available', 'no disponible', 'sans objet')
    go
    
    SET IDENTITY_INSERT DimProductCategory OFF
    
    SET IDENTITY_INSERT DimProductSubcategory ON
    
    insert into DimProductSubcategory (ProductSubcategoryKey, ProductSubcategoryAlternateKey, EnglishProductSubcategoryName, SpanishProductSubcategoryName, FrenchProductSubcategoryName, ProductCategoryKey) values (-1, -1, 'not available', 'no disponible', 'sans objet', -1)
    go
    
    update DimProduct set ProductSubcategoryKey = -1 where ProductSubcategoryKey is null
    go
    
  3. Run the report again with the Data | Reexecute Report menu, this time the row count is 606 and the products without subcategory appear next to the not available element.

With the previous INSERT and UPDATE statements, we created dummy -1 rows in the category and subcategory tables, and set all the NULL foreign keys in the product table to -1.

Note

You can watch screencasts of this operation at:

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset