Denormalization is the process of selectively violating normalization rules and reintroducing redundancy into the model (and therefore, the database). This extra redundancy can reduce data retrieval time, which is the primary reason for denormalizing. We can also denormalize to create a more user-friendly model. For example, we might decide to denormalize company information into an entity containing employee information, because usually when employee information is retrieved, company information is also retrieved.

There are a number of different ways to denormalize. In this section, well discuss the two most common:

·         Rolldown denormalization

·         Rollup denormalization.

We will apply these two techniques to the Offering/Category subset of our publisher logical data model from the previous chapter. See Figure 10.15 for the starting point logical for Offering and Category, and their physical counterparts using each of these two techniques.

Figure 10.15 Offering and Category logical with two different physical designs

Our Normalized Starting Point:

 

Rolldown Denormalization:


Rollup Denormalization:

Rolldown is the most common of the denormalization techniques. The parent entity in the relationship disappears, and all of the parents columns and relationships are moved down to the child entity. Youll recall that the child entity is on the many side of the relationship and contains a foreign key back to the parent entity, which appears on the one side of the relationship.

In addition to choosing denormalization because of the need for faster retrieval time or for more user friendly structures, rolldown can be chosen in the following situations:

·         When you need to maintain the flexibility of the normalized model. Folding the columns and relationships together using rolldown still allows one-to-one and one-to-many relationships to exist. In Figure 10.15 for example, we did not lose the flexibility that an Offering can belong to many Categories.

·         When you want to reduce development time and complexity. Often there is a direct relationship between the number of tables and relationships on a model, and the amount of effort it will take to develop the application. A developer will need to write code that jumps from one table to another to collect certain columns, and this can take time and add complexity. Denormalizing into fewer tables using rolldown means the columns and relationships from different entities now exist in the same entity. In Figure 10.15, for example, if the developer needs to retrieve both offering information and the category name, they can easily do so from the same entity, Assignment.

In rollup, the same column or group of columns is repeated two or more times in the same entity. Also known as an array, rollup requires making the number of times something can occur static. Recall that in 1NF we removed repeating groups, and rollup means we are adding back in repeating groups. We had to decide in Figure 10.15 for example, that the maximum number of categories an offering can be assigned is three.

In addition to choosing denormalization because of the need for faster retrieval time or for more user friendly structures, repeating groups may be chosen in the following situations:

·         When it makes more sense to keep the parent entity instead of the child entity. When the parent entity is going to be used more frequently than the child entity, or if there are rules and columns to preserve in the parent entity format, it makes more sense to keep the parent entity.

·         When an entity instance will never exceed the fixed number of columns added. In Figure 10.15, we are only allowing up to three categories for an offering. If we had a fourth category for a title for example, how would we handle this?

Denormalization is a term that is applied exclusively to relational physical data models, because you cant denormalize something unless it has already been normalized. However, denormalization techniques can be applied to dimensional models, as wellyou just cant use the term denormalization. So the relational term rolldown for example, can still be applied to a dimensional model, just in terms of terminology it should be called something else, like flattening.

A star schema is the most common dimensional physical data model structure. The term meter from the dimensional logical data model, is replaced with the term fact table on the dimensional physical data model. A star schema results when each set of tables that make up a dimension is flattened into a single table. The fact table is in the center of the model, and each of the dimensions relate to the fact table at the lowest level of detail. A star schema is relatively easy to create and implement, and visually appears elegant and simplistic to both IT and the business.

Recall the dimensional logical data model example from the last chapter (Figure 9.10), now converted to a star schema as shown in Figure 10.16.

Figure 10.16 Financial dimensional PDM

A star schema is when each hierarchy is flattened into a single table. So on this star schema, Customer is flattened into Account, Country is flattened into Region, and Year is flattened into Month.

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset