Chapter 12. MVDs and 4NF

Who’s on first, What’s on second, I Don’t Know’s on third

Bud Abbott and Lou Costello: Naughty Nineties

In Chapter 10, I said that 4NF, like 2NF and 3NF, is mostly of historical interest. However, that characterization is perhaps a little unfair, because:

  • First of all, 4NF is the normal form with respect to what are called multivalued dependencies or MVDs. Now, MVDs are really just a special kind of JD; so if you know about JDs in general, you know about MVDs already, in a sense. Nevertheless, MVDs are still worth studying in their own right (for one thing, they’re probably more common in practice than JDs that aren’t MVDs are).

  • Second, MVDs have a more intuitive real world interpretation than JDs in general do, and therefore tend to be a little easier to understand.

  • Third, MVDs, unlike JDs in general, do have an axiomatization, as we’ll see.

So let’s take a closer look.

AN INTRODUCTORY EXAMPLE

In this section and the next, I’ll examine MVDs from a comparatively informal point of view; in the section after that I’ll consider them again, but more formally, and use that more formal understanding to lead up to 4NF. I’ll begin with a definition.

  • Definition: A multivalued dependency (MVD) is a join dependency with exactly two components.

It follows from this definition that a nonloss decomposition on the basis of an MVD always yields exactly two projections (recall that JDs in general can be n-way for some n > 2; by contrast, MVDs are always exactly 2-way). It follows further that the following JD (for example) is in fact an MVD:

      { { SNO , SNAME , CITY } , { CITY , STATUS } }

Now, we’ve seen this particular JD repeatedly in this book; it holds in relvar S. But didn’t I say in Chapter 9 that this JD was implied by a functional dependency: viz., the FD {CITY} → {STATUS}? Indeed I did; what the example shows, therefore, is that some MVDs are implied by FDs. But not all are, and as you’d probably expect it’s the ones that aren’t that are the interesting ones, in a sense. So let’s take a look at one of those “interesting ones.” Consider Figure 12-1, which shows a sample value for a relvar called CTX.[113] The predicate is as follows: Course CNO can be taught by teacher TNO and uses textbook XNO.

Relvar CTX—sample value

Figure 12-1. Relvar CTX—sample value

Now, relvar CTX is “all key” and is therefore certainly in BCNF. Yet it suffers from redundancy, as you can see; for example, the fact that teacher T1 can teach course C1 appears twice, and so does the fact that course C1 uses textbook X1. (It therefore suffers from certain update anomalies also. See Exercise 12.3.) And the reason for these redundancies is that I’m assuming—perhaps not very realistically—that teachers and textbooks are quite independent of one another; that is, no matter who actually teaches any particular offering of some particular course, the same textbooks are used. I also assume a given teacher or given textbook can be associated with any number of courses. Thus:

  • Each course c has a set T of teachers who can teach it and a set X of textbooks that it uses.

  • And, for each such course c, there’s a tuple in CTX for every possible combination of a teacher t from T and a textbook x from X. (Loosely speaking, in other words, each CNO value appears together with the cartesian product of all of the TNO and XNO values that correspond to that CNO value.)

To state the matter more precisely, the following constraint holds in relvar CTX (recall that from Chapter 9 that the symbol “∈” means “appears in”):

     IF   ( c , t1 , x1 )  ∈  CTX
     AND  ( c , t2 , x2 )  ∈  CTX
     THEN ( c , t1 , x2 )  ∈  CTX
     AND  ( c , t2 , x1 )  ∈  CTX

But to say this constraint holds is equivalent to saying the following JD holds:

      { { CNO , TNO } , { CNO , XNO } }

It follows that CTX is subject to this JD, and it further follows that the relvar can, and probably should, be decomposed into its projections on {CNO,TNO} and {CNO,XNO}. Exercise: Show the values of these projections corresponding to the sample value of relvar CTX in Figure 12-1, and check that the redundancies disappear. (But what multirelvar constraint now needs to be enforced?)

As an aside, I remark that the constraint shown above could be reduced from four lines to three without loss, by simply dropping the last line. What I mean is, if tuples (c,t1,x1) and (c,t2,x2) both appear, then the tuple (c,t1,x2) must appear (that’s what the third line says); so, switching the first two tuples around, it follows that if (c,t2,x2) and (c,t1,x1) appear, then (c,t2,x1) must appear as well. But the four-line version of the constraint is more symmetric and aesthetically satisfying, as well as perhaps being easier to understand.

By the way, you might be thinking the redundancies in CTX are unnecessary; more specifically, you might be thinking the relvar doesn’t need to show all possible TNO / XNO combinations for a given CNO. For example, two tuples would clearly be enough to represent the information that course C1 has two teachers and two textbooks. The problem is, which two tuples? Any specific choice leads to a relvar having a very unobvious interpretation and very strange update behavior. (Try stating the predicate for such a relvar!—i.e., try stating the criteria for deciding whether or not some given update is logically acceptable on that relvar. If you try this exercise, I think you’ll see why the redundancies in CTX are necessary after all.)



[113] The example is a modified version of the CTXD example from Chapter 9.

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset