Discovering data discrepancy issues

One of the main purposes of BI is creating reporting for the decision-making process. But what if the report produces the wrong data? We should find the reason and fix it. In order to do this, the best way is to copy the SQL statement in the SQL client and start to troubleshoot it.

Tip

Using VLDB settings, we can change the behavior of SQL generation in order to better understand the flow of data and easily troubleshoot it. For example, we can use sub-queries instead of temporary tables, and so on.

In addition, it is important to have the right numbers in order to know what to expect from numbers.

For example, we have the following report:

Discovering data discrepancy issues

It is obvious that Order Quantity isn't aggregated. In order to understand why, we should look at SQL:

select      a12.SALESREASONNAME  SALESREASONNAME, 
      a13.COLOR  COLOR, 
      sum(a11.ORDERQUANTITY)  WJXBFS1 
from  EDW.FACTRESELLERSALES   a11 
      cross join  EDW.DIMSALESREASON      a12 
      join  EDW.DIMPRODUCT    a13 
        on  (a11.PRODUCTKEY = a13.PRODUCTKEY) 
group by    a12.SALESREASONNAME, 
      a13.COLOR

Now it is clear. Report produce cross join. We can ask the DWH architect or look at the data model in order to fix the MicroStrategy schema.

It was a simple case, but we can get a number that will be only 1-2% different from the real value. In this case, it isn't enough just look at SQL; we should go deeper and research all transformations in the DWH and ETL. Sometimes, the problem can even be in the source files.

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset