THE APPLICANTS AND EMPLOYEES EXAMPLE

This example (which, like the invoices and shipments example, is based on a real world application) concerns applicants for jobs in a certain enterprise. Relvar APPLICANT is used to keep a record of such applicants:

     APPLICANT { ANO , NAME , ADDR , ... }
               PRIMARY KEY { ANO }

The applicant number (ANO) is assigned at the time the applicant applies for the job; it’s unique to the applicant, and {ANO} thus constitutes the obvious primary key (in fact, it’s the only key).

Next, several further relvars are used to keep subsidiary applicant information (previous jobs held, list of references, list of dependants, etc.). I consider just one of these here, the “previous jobs held” relvar (APPLICANT_JOBS):

     APPLICANT_JOBS { ANO , EMPLOYER , JOB , START , END , ... }
                    PRIMARY KEY { ANO , START }
                    ALTERNATE KEY { ANO , END }
                    FOREIGN KEY { ANO } REFERENCES APPLICANT

Observe, incidentally, that once again we seem to be faced with an arbitrary choice of primary key, but that’s not the point I want to examine here.

Now, when a job applicant is successful, he or she is assigned an employee number (ENO, unique to the employee), and information regarding the new employee—job title, department number, phone number, etc.—is recorded in an EMP relvar:

     EMP { ENO , JOB , DNO , PHONENO , ... }
         PRIMARY KEY { ENO }

Now we have two distinct anchor relvars, APPLICANT and EMP, such that the very same entity (i.e., a successful applicant) is identified by an ANO value in one of the two and by an ENO value in the other. Of course, it’s true that the two relvars represent different roles—a tuple in APPLICANT represents a person in an applicant role and the corresponding tuple in EMP (if there is one) represents the same person in an employee role—but the fact remains that there’s just a single entity involved.

The foregoing isn’t the end of the story. Clearly, relvar EMP needs to refer back to relvar APPLICANT somehow (I’m assuming for simplicity, though the assumption might be a little unrealistic, that every employee was once an applicant). Thus, we need to add an ANO attribute to the EMP relvar and define a foreign key accordingly:

     EMP { ENO , ANO , JOB , DNO , PHONENO , ... }
         PRIMARY KEY { ENO }
         ALTERNATE KEY { ANO }
         FOREIGN KEY { ANO } REFERENCES APPLICANT

Now we have two candidate keys once again!—namely, {ENO} and {ANO}. This point will be relevant in a few moments; for now, however, I’ll ignore it.

Next, of course, we’ll need additional relvars to carry subsidiary information for employees (salary history, benefit details, etc.). Here’s the salary history relvar:

     SAL_HIST { ENO , DATE , SALARY , ... }
              PRIMARY KEY { ENO , DATE }
              FOREIGN KEY { ENO } REFERENCES EMP

Now we have the very same entity being not only identified, but also referenced, by an ENO value in one relvar (SAL_HIST) and by an ANO value in others (APPLICANT_JOBS, EMP). In other words, the database structure is as shown in Figure A-3.

The applicants-and-employees database

Figure A-3. The applicants-and-employees database

Now, we might avoid the apparent need for two different identifiers (ANO and ENO) for the same entity type by regarding EMP as a subtype of APPLICANT; after all, every employee is an applicant (loosely speaking), while the converse isn’t true. In this way we could use {ANO} as the primary key for EMP, treating {ENO} as an alternate key (or even dropping it altogether), and replace ENO by ANO in the SAL_HIST relvar. The database structure is now as shown in Figure A-4:

Using {ANO} as the primary key for EMP

Figure A-4. Using {ANO} as the primary key for EMP

However, note the implications of this state of affairs: It’s not just the database design that’s changed, it’s the way the enterprise has to operate. (For a start, it now has to identify employees by applicant number instead of employee number.) Why should the enterprise change its way of doing business, just because of a piece of relational dogma (“one primary key per entity type”)? To be specific, why shouldn’t it be allowed to identify applicants by applicant number and employees by employee number—even though applicants and employees are all persons, and indeed every employee is (or once was) also an applicant?

Aside: Another possibility would be to introduce a PERSON relvar and then regard both APPLICANT and EMPLOYEE as subtypes of PERSON. I leave the details as an exercise for the reader; I simply remark that this approach basically doesn’t solve anything, even if we invent a “person number” (PNO) and make {PNO} the primary key of PERSON. On the other hand, I definitely would recommend the supertype/subtype approach when “the same” primary key is involved everywhere (e.g., if we were dealing with employees and programmers and system programmers and application programmers, etc., etc., all identified by employee number). See Example 4 in Chapter 15. End of aside.

To summarize: The foregoing example strongly suggests there might be occasions on which it’s indeed desirable (a) to have several different anchor relvars for the same entity type; (b) to have a different primary key in each of those anchor relvars; and (c) to have different foreign keys referring to those different primary keys in different subsidiary relvars. Again, please note that I’m not saying the apparent need here to violate the rule “one primary key per entity type” can’t be avoided; what I’m saying is I don’t see a good way to avoid it, nor do I see a good reason for adopting a bad way. Again, therefore, I would like to suggest that the “one primary key for one entity type” precept be treated as a strong (?) guideline, but not as an inviolable rule.

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset