Now we are going to remove redundancies and assign each data element to its proper entity, following the first three levels of normalization. Recall our initial definition of normalization: Every data element is single-valued and must provide a fact completely and only about its primary key. To formalize this slightly, and to more accurately fit the description for the first three levels of normalization, we can rephrase this definition of normalization into: Every data element must provide a fact about the key, the whole key, and nothing but the key. First Normal Form (1NF) is the Every data element must provide a fact about the key part. Second Normal Form (2NF) is the whole key part, and Third Normal Form (3NF) is the nothing but the key part.

1NF ensures that Every data element must provide a fact about the key. ‘Key’ refers to the primary key of the entity. ‘Provide a fact about’ is the most important phrase, which we defined earlier and will elaborate on here. It means that for a given primary-key value, we can identify, at most, one of every data element that depends on that primary key. For example, assume Department Number is the primary key to the Department entity. Table 16.4 shows some sample values.

Table 16.4 Sample values for Department

Department Number

Department Name

A

Accounting

A

Marketing

B

Information Technology

C

Manufacturing

In this example, Department Number A identifies two values for Department Name: Accounting and Marketing. Therefore, Department Name does not provide a fact about Department Number, and this example violates 1NF.

Ensuring each data element provides a fact about its primary key includes correcting the more blatant issue shown in Table 16.4, as well as addressing repeating groups and multi-valued data elements. Specifically, the modeler needs to:

·         Move repeating data elements to a new entity

·         Separate multi-valued data elements.

When there are two or more of the same data element in the same entity, they are called repeating data elements. The reason repeating data elements violate 1NF is that for a given primary key value, we are getting more than one value back for the same data element. Repeating data elements often take a sequence number as part of their name. For example, recall Figure 15.5, a subset of which is reproduced here in Figure 16.8.

Figure 16.8 Ice Cream Theme can group many Ice Cream Flavors

Recall these definitions:

Ice Cream FlavorThe distinctive taste of one ice cream product over another. Flavors are recognized by consumers. Examples include Chocolate, Vanilla, and Strawberry.

Ice Cream Theme – A collection of ice cream flavors that have been grouped together because they share at least one common property. For example, the Brownie Ice Cream and Cake Batter Ice Cream flavors are grouped into the Cake Ice Cream Theme.

Imagine the initial logical model for this ice cream example is the model shown in Figure 16.9. Table 16.5 contains the sample data for this model.


Figure 16.9 Un-normalized ice cream example

Table 16.5 Sample values for Ice Cream Theme

Ice Cream Theme Code

Ice Cream Theme Name

Ice Cream Flavor One Code

Ice Cream Flavor One Name

Ice Cream Flavor Two Code

Ice Cream Flavor Two Name

Ice Cream Flavor Three Code

Ice Cream Flavor Three Name

C

Cake - Cake Ice Cream Theme

Br

Brownie

Ca

Cake Batter

 

 

T

Trad – Traditional Ice Cream Theme

Ch

Chocolate

Va

Vanilla

St

Strawberry

W

Winter – Winter Ice Cream Theme

Sn

Snowball Slide

 

 

 

 

 

Note that to resolve this repeating group, we need to ask important business questions such as:

·         Is there any significance to the ‘One’, ‘Two’, or ‘Three’ in the name of Ice Cream Flavor? That is, could Ice Cream Flavor One Name be the most popular flavor, Ice Cream Flavor Two Name the second most popular, and Ice Cream Flavor Three Name the third most popular?

·         Can you ever have more than three flavors?

·         Can you ever have no flavors?

Repeating data elements hide business rules and limit the ‘many’ in a one-to-many or many-to-many relationship. If a data element is repeated three times in an entity, we can have at most three occurrences of this data element for a given entity instance. We cannot have four, and we waste space if we have only one or two.

Assume there is no significance to the ‘One’, ‘Two’, or ‘Three’ in the flavor name. Figure 16.10 contains the ice cream data model with repeating data elements resolved.

Figure 16.10 Ice Cream example with repeating data elements resolved

Now we can have any number of Ice Cream Flavors for a given Ice Cream Theme, such as zero, three, or one hundred.

In our contact example, by examining the sample data in Table 16.3 and asking lots of questions of business experts, we learn that a Contact can have more than one phone number and specialty description. We see that Jon Smith has three phone numbers: (908)333-1212, (908)555-1212, and 554-1212. We see that The Amazing Rolando has three specialty descriptions: Magic for all occasions, Walk around magic, and Children’s parties.

We can find ourselves asking many questions just to determine if there are any repeating data elements we need to address. We can have a question template, such as:

Can a [[insert entity name here]] have more than one [[insert data element name here]] ?

So these are all valid questions:             

·         Can a Contact have more than one Email Address?

·         Can a Contact have more than one Person Name?

·         Can a Contact have more than one Person Title?

·         Can a Contact have more than one Company Name?

·         Can a Contact have more than one Web Address?

·         Can a Contact have more than one Mailing Address?

·         Can a Contact have more than one Phone Number?

·         Can a Contact have more than one Logo Image?

·         Can a Contact have more than one Specialty Description?

Note that you might find yourself rephrasing some of these techie-sounding questions to make them more understandable to a business person. Can a Contact have more than one Email Address? might be better phrased as Do you ever have the need to reach a Contact by more than one Email Address?

Assuming that the four business cards we are examining contain a good representative set of data (and that’s a big assumption), and that the business people provide answers to our questions that are consistent with this data, a more accurate version of our Contact model would reflect that a Contact can have at most three phone numbers and at most three specialties, as shown in Figure 16.11.

Figure 16.11 More accurate view of Contact

If a Contact can have three phone numbers, there is a good chance that one day there will be a Contact that has four phone numbers, or maybe ten. So in general terms, a Contact can have one or many phone numbers, and also one or many specialties. So we need to create separate entities for phone number and specialty, as shown in Figure 16.12.

 

Figure 16.12 Repeating groups moved to new entities

To resolve a repeating data element, you can see that we need to create a one-to-many relationship, or a many-to-many relationship with a new entity that contains the repeating data element.

We are not yet done with completely resolving repeating groups, though. What other questions do we need to ask?

We already asked whether a contact can have more than one of each of its data elements. We also need to ask the other side of the equation, which is – can the values in these repeating data elements belong to more than one contact?

We need to make sure that one-to-many relationships from Contact to Contact Phone, and from Contact to Contact Specialty, are correct, and that they are not many-to-many relationships. So these two business questions need to be answered, as well:

·         Can the same Phone Number belong to more than one Contact?

·         Can the same Specialty Description belong to more than one Contact?

By looking at the sample data in Table 16.3 and confirming with business people, we learn that the same phone number and specialty can belong to more than one contact, as Jenn and Steve from findsonline.com both share the phone number (973)555-1212 and specialty Internet auction experts. Therefore, a more accurate model with repeating groups resolved is shown in Figure 16.13.

Figure 16.13 Repeating groups resolved

Each of the original relationships has have been converted into two relationships, but the roles they play remain the same, so the verbs (‘Reach’ and ‘Market’) are also used in the new relationship names.

Note that what currently makes the phone number and specialty description unique at this point, is the Phone Number and Specialty Description themselves, respectively. These are not ideal primary keys. but they serve to illustrate normalization in this example.

Multi-valued means that within the same data element we are storing at least two distinct values. There are at least two different business concepts hiding in one data element. For example, Name may contain both a first name and last name. First Name and Last Name can be considered distinct data elements, and therefore, John Smith stored in Name, is multi-valued, because it contains both John and Smith.

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset