EXAMPLE 6

In a paper he wrote in 1979 (“Extending the Database Relational Model to Capture More Meaning,” ACM TODS 4, No. 4, December 1979), Codd proposed a certain design discipline, which (simplifying slightly) can be described as follows:

  • Let E be an “entity type,” and let ID be a data type such that every entity of type E has exactly one primary identifier (my term, not Codd’s), of type ID. For example, E and ID might be the entity type “suppliers” and the data type “character string,” respectively.

  • Let P1, ..., Pn be a set of “property types” such that every entity of type E has at most one property of each of the types P1, ..., Pn. For example, in the case of suppliers, P1, P2, and P3 might be the property types “name,” “status,” and “city” (so n = 3 in this example). Note: I’m assuming for the sake of the present discussion (only) that a given supplier can have any subset of the three properties, including the empty set in particular.

  • Then the database should contain:

    1. Exactly one E-relvar, containing ID values for those entities of type E that exist at any given time, and

    2. Exactly one P-relvar for each Pi (i = 1, ..., n), containing (ID value, Pi value) pairs for each entity of type E that exists at any given time and has a property of type Pi at that time.

I’ll refer to this discipline as “the RM/T discipline,” since it’s part of what Codd referred to, in that 1979 paper, as “the extended relational model RM/T” (T for Tasmania, where Codd first presented his ideas for that extended model). Applying the discipline to the case of suppliers specifically, we obtain a design that looks like this (here I ignore for simplicity the fact that there’s supposed to an FD from {CITY} to {STATUS}):

     S  { SNO }
        KEY { SNO } ;

     SN { SNO , SNAME }
        KEY { SNO }
        FOREIGN KEY { SNO } REFERENCES S

     ST { SNO , STATUS }
        KEY { SNO }
        FOREIGN KEY { SNO } REFERENCES S

     SC { SNO , CITY }
        KEY { SNO }
        FOREIGN KEY { SNO } REFERENCES S

Each of these relvars is irreducible; equivalently, each is in 6NF.[159] Figure 15-3 shows a set of sample values. Note: The values in question aren’t meant to be the same as our usual sample values, though they’re close. Observe in particular that (a) supplier S3 has no status, (b) supplier S4 has no status and no city, and (c) supplier S5 has no name, no status, and no city.

An “RM/T design” for suppliers—sample values

Figure 15-3. An “RM/T design” for suppliers—sample values

As I indicated in Chapter 13, this kind of design actually has quite a lot to recommend it (at least, it would do so given a well architected DBMS). For present purposes, however, all I want to do is call your attention to the following: So long as every entity of type E has at least one of the n properties, then the design certainly involves some redundancy—arguably, in fact, strong redundancy as defined by Codd himself in his 1970 paper—because, at any given time, the value of the E-relvar will be equal to the union of the projections of the P-relvars over the identifier attribute:

     CONSTRAINT ... S { SNO } =
                    UNION { SN { SNO } , ST { SNO } , SC { SNO } } ;

This kind of redundancy would apply to Figure 15-3, for example, if we deleted supplier S5 (that is, if every supplier had at least one of the three properties name, status, and city). Exercise for the reader: How does the redundancy here differ from that discussed under Example 4? Does it differ? Would it make any difference if the employees in Example 4 had additional properties (for example, salaries)?

Observe further that the design becomes “even more redundant,” as it were, in the (common?) special case in which every entity of type E in fact has all of the n properties. Figure 15-4 is a revised version of Figure 15-3 that illustrates this situation. Note in that figure that—speaking a trifle loosely—{SNO} is now a foreign key in each of the relvars that references the sole key {SNO} in each of the others; equivalently, the projection on {SNO} of any of the relvars is equal to the projection on {SNO} of any of the others. Well ... to be more precise about the matter, there’s actually an equality dependency interrelating every pair of the four relvars:

     CONSTRAINT ...
       IDENTICAL { S { SNO } , SN { SNO } , ST { SNO } , SC { SNO } } ;

Note: IDENTICAL is an operator proposed by Hugh Darwen and myself[160] as an addition to Tutorial D, with semantics as follows: The expression

     IDENTICAL { r1 , ... , rn }

returns TRUE if the relations r1, ..., rn are all equal and FALSE otherwise (you can think of it as a kind of n-adic “=” operator).

A revised version of

Figure 15-4. A revised version of Figure 15-3

Note carefully, however, that even in this extreme case, the design doesn’t violate orthogonality. What’s more, I say again that this kind of design would have quite a lot to recommend it given a well architected DBMS. In particular, the equality dependencies, and therefore the redundancies, would be “automatically” managed and maintained in such a system (see the section MANAGING REDUNDANCY, later).



[159] In the interest of historical accuracy, I note that P-relvars as described by Codd in his RM/T paper weren’t necessarily in 6NF, because he didn’t insist that each P-relvar involve just a single “property.”

[160] In Database Explorations: Essays on The Third Manifesto and Related Topics (Trafford, 2010) and elsewhere.

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset