CHAPTER 12

12.1 (a) Relvar CTX in the body of the chapter is an example, of course, but it would be better if you could come up with an example from your own work environment. (b) Let C be a certain club, and let relvar R{A,B} be such that the tuple (a,b) appears in R if and only if a and b are both members of C. Then R is equal to the cartesian product of its projections R{A} and R{B}; thus, it’s subject to the JD {A,B} and, equivalently, to the following MVDs:

     { } →→ A | B

These MVDs aren’t trivial, since they certainly don’t hold in all binary relvars, and they’re not implied by a superkey either (the only key in R is the entire heading). It follows that R isn’t in 4NF. However, it’s certainly in BCNF, because it’s “all key.”

12.2 Possible formulations:

  1. CONSTRAINT ... CTX = JOIN { CTX { CNO , TNO } ,
                               CTX { CNO , XNO } } ;
  2. CONSTRAINT ... CTXD { CNO , TNO , XNO } =
                        JOIN { CTXD { CNO , TNO } , CTXD { CNO , XNO } } ;

12.3 (a) Suppose the current value of CTX is as given in Figure 12-1. Then none of the four tuples shown can be deleted in isolation: a deletion anomaly. (b) Suppose the current value of CTX contains just “the first two” of the tuples shown in Figure 12-1. Then neither “the third” nor “the fourth” tuple shown can be inserted in isolation: an insertion anomaly.

12.4 Relvar SPJ from Chapter 9 is an example (no MVDs hold in that relvar at all, apart from trivial ones, and so the relvar is certainly in 4NF).

12.5 The following proof might be thought to make very heavy weather of such an obvious point: Let the projection in question be R′. The FD XY holds in R′ if and only if, whenever tuples t1′ and t2′ of R′ have the same X value, they also have the same Y value. Let T1 and T2 be, respectively, the set of tuples in R from which t1′ is derived and the set of tuples in R from which t2′ is derived. By the definition of projection, every tuple t1 in T1 has the same X and Y values as t1′; likewise, every tuple t2 in T2 has the same X and Y values as t2′. It follows that whenever tuples t1 and t2 of R have the same X value, they also have the same Y value; thus the FD XY holds in R. And it further follows that XY holds in R′ if and only if it holds in R.

12.6 This result is immediate from Heath’s Theorem: If R is subject to the FD XY, it’s also subject to the JD {XY,XZ}, where Z is “the other” attributes of R, and therefore it’s subject to the MVDs X →→ Y | Z.

12.7 The JD {XY,XZ} is trivial if and only if XY = H or XZ = H. If XY = H, we have Case (b). If XZ = H, then Z = H - X; but Z = HX - Y by definition, so Y is a subset of X, and we have Case (a).

12.8 The rule amounts to saying: If we start with a relvar with two or more independent relation valued attributes (RVAs) and we want to eliminate them—which we usually but not invariably do want to do (see the answer to Exercise 4.11)—then the first thing we should do is separate those RVAs. Using the notation of the exercise, this step will give us relvars with headings XY and XZ, respectively. The next thing we should do is ungroup the RVA in each of those relvars. Suppose the relations in Y and Z have headings A and B, respectively; then the relvars that result from those ungroupings will have headings XA and XB, respectively.[197] Now normalize those relvars in the usual way, replacing them by BCNF projections. Then those BCNF projections will “automatically” be in 4NF. In other words, MVDs that cause a relvar not to be in 4NF shouldn’t arise in practice, if the foregoing procedure is followed.

It’s interesting to note, incidentally, that in his famous 1970 paper (see Appendix C), Codd gave an example in which he actually followed the foregoing procedure, and he touched on it again, briefly, in another paper the following year (“Normalized Data Base Structure: A Brief Tutorial,” Proc. 1971 ACM SIGFIDET Workshop on Data Description, Access, and Control, San Diego, Calif., November 11th-12th, 1971; again, see Appendix C). But I don’t think he ever mentioned it again, at least not in writing (because it was so intuitively obvious, perhaps).

Note: In case you find the foregoing discussion too abstract, take R to be a relvar with heading {CNO,T,X}, where T and X are relation valued and contain relations with headings {TNO} and {XNO}, respectively. Separating the RVAs gives us relvars with headings {CNO,T} and {CNO,X}, respectively. Ungrouping then gives us relvars with headings {CNO,TNO} and {CNO,XNO}, respectively—which is precisely what we want, of course, in the CTX example.

12.9 First of all, we’ll presumably need three relvars for representatives, areas, and products, respectively:

     R { RNO , ... } KEY { RNO }
     A { ANO , ... } KEY { ANO }
     P { PNO , ... } KEY { PNO }

Next, we can represent the relationships (a) between sales representatives and sales areas and (b) between sales representatives and products by relvars like this:

     RA { RNO , ANO } KEY { RNO , ANO }
     RP { RNO , PNO } KEY { RNO , PNO }

Every product is sold in every area. So if we introduce a relvar

     AP { ANO , PNO } KEY { ANO , PNO }

to represent the relationship between areas and products, then we have the following constraint:

     CONSTRAINT C1 AP = JOIN { A { ANO } , P { PNO } } ;

(The join here is actually a cartesian product.) Note that this constraint implies that AP isn’t in 4NF. In fact, AP doesn’t give us any information we can’t obtain from the other relvars. To be precise, the following EQDs hold:

     AP { ANO } = A { ANO }
     AP { PNO } = P { PNO }

But let’s assume for the moment that relvar AP is included in our design anyway.

No two representatives sell the same product in the same area. In other words, given an {ANO,PNO} combination, there’s exactly one responsible sales representative, RNO, and so we can introduce a relvar

     APR { ANO , PNO , RNO } KEY { ANO , PNO }

in which (to state the FD explicitly)

     { ANO , PNO } → { RNO }

(Specification of {ANO,PNO} as a key is sufficient to express this FD.) However, relvars RA, RP, and AP are now all redundant, since they’re all projections of APR; they can therefore all be dropped. In place of constraint C1 we now need constraint C2:

     CONSTRAINT C2 APR { ANO , PNO } = JOIN { A { ANO } , P { PNO } } ;

This constraint must be separately and explicitly stated, since it isn’t “implied by keys.”

Also, since every representative sells all of that representative’s products in all of that representative’s areas, we have the additional constraint C3 on relvar APR:

     { RNO } →→ { ANO } | { PNO }

(These MVDs are nontrivial and not implied by keys, and relvar APR is thus not in 4NF.)[198] Again the constraint must be separately and explicitly stated.

Thus the final design consists of the relvars R, A, P, and APR, together with the constraints C2 and C3 (both of which are in fact equality dependencies once again):

     CONSTRAINT C2 APR { ANO , PNO } = JOIN { A { ANO } , P { PNO } } ;
     CONSTRAINT C3 APR = JOIN { APR { RNO , ANO } , APR { RNO , PNO } } ;

(There are also some foreign key constraints from APR to the other three relvars, but the details are straightforward and I omit them here.)

This exercise illustrates very nicely the point that, in general, normalization might be adequate to represent some of the semantic aspects of a given problem (basically, FDs, MVDs, and JDs that are implied by keys), but explicit statement of additional constraints might be needed for other aspects. It also illustrates the point that it might not always be desirable to normalize “all the way” (relvar APR is in BCNF but not in 4NF).

As a subsidiary exercise, you might like to consider whether a design involving RVAs might be appropriate for the problem under consideration. Might such a design mean that some of the comments in the previous paragraph no longer apply?

12.10 The first point to note here is that the MVDs A →→ B | C and A →→ C | D make no mention of attributes D and B, respectively. But didn’t I say the union of X, Y, and Z, given the generic pair of MVDs X →→ Y | Z, had to be equal to the heading? Well, yes, I did—but I must now explain that we allow a certain shorthand notation as well, illustrated in this exercise. For definiteness, let’s focus on the expression A →→ B | C. By definition, this expression means A →→ B and A →→ C; and A →→ B implies A →→ CD, and A →→ C implies A →→ BD. Moreover, since A, B, C, and D are single attributes and hence mutually disjoint, the decomposition rule for MVDs allows us to infer A →→ D from either A →→ CD or A →→ BD. Putting this all together, we see that A →→ B | C is shorthand for either or both of A →→ B | CD and A →→ BD | C. Given this state of affairs, moreover, we adopt a shorthand according to which A →→ B | CD and A →→ BD | C can both be written thus: A →→ B | C | D—and this latter expression in turn can also be thought of as shorthand for the following three MVDs in combination: A →→ B, A →→ C, and A →→ D.

Now let’s try the chase. Here are premise tuples for A →→ C | D, which as we’ve just seen is equivalent to A →→ BC | D:

      x1   x2   x3  y14
      x1  y22  y23   x4

Applying A →→ B | CD generates:

      x1   x2  y23   x4
      x1  y22   x3  y14

Applying BD gives y14 = x4. Replacing:

      x1   x2   x3   x4
      x1  y22  y23   x4
      x1   x2  y23   x4
      x1  y22   x3   x4

And now we have a tuple of all x’s, so the given dependencies do imply the target JD.



[197] We might have to do some attribute renaming first, if any attribute in either A or B has the same name as some attribute in X.

[198] Note, therefore, that relvar APR gives the lie to another popular misconception: viz., that a relvar consisting of a single key and a single nonkey attribute is necessarily in 4NF. See also the answer to Exercise 13.10, later in this appendix.

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset