Loading a dimension with a combination lookup/update step

The BODYPARTS dimension is meant to hold the name of the body part where the injury occurred. This table has no corresponding table in the source database. Because of that, there is no business key. The content is just a list of the different body parts and a surrogate key. The following is the DDL for the dimension table:

CREATE TABLE DIM_BODYPARTS (
id integer NOT NULL,
bodypart CHAR(50) NOT NULL,
PRIMARY KEY (id)
);

 Loading a dimension like this one is simple:

  1. Create a transformation.
  2. Drag and drop a Table input step. We will use it to get the data to load into the table.
  3. Double-click on the step. As the Connection, select Sports. In the SQL box, type the following query:
      SELECT DISTINCT injury_type FROM injury_phases
  1. Run a preview to make sure that you have the expected data.
  2. Close the window.
  3. From the Data Warehouse folder, drag and drop a Combination lookup/update step.
  4. Create a hop from the Table input step to this new step.
  5. Double-click on the Combination lookup/update step, and configure it as follows:

Configuring a Combination L/U step
  1. Close the window and save the transformation.
  2. Run the transformation. This will cause the dimension table to be loaded.
  3. In the View tab, right-click on the sample_dw connection, open the Database Explorer, and run the following code:
      SELECT * FROM dim_bodyparts
  1. You should see the following result:

Previewing dimension data

In this simple example, we loaded a dimension with a Combination Lookup/Update step (Combination L/U, for short). Let's look at how it works.

For each row in the dataset, the Combination L/U step looks in the dimension table for a record that matches the key fields specified in the grid. In this case, it looks for a record where the column bodypart is equal to the value of the injury_type field. If the row doesn't exist, the step generates a new surrogate key, the Technical key field, and inserts a row with the key fields and the generated surrogate key.

Whether the row exists or not, the step returns the surrogate key and adds it to the main dataset. You can verify this by inspecting the output metadata of the step.

The Combination Lookup/Update step can be used to load dimensions like the one in this example, and also to load Junk dimensions, mini dimensions, and any Type I SCD; that is, dimensions for which you are not interested in keeping historical values.
..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset