Chapter 14. The Principle of Orthogonal Design

Orthogonal At right angles to; independent

David Darling: The Universal Book of Mathematics

Note: Portions of this chapter originally appeared, in considerably different form, in my book Date on Database: Writings 2000-2006 (Apress, 2006).

I’ll begin this chapter with a quick review of the principles of normalization and an analysis of how well normalization meets its objectives. Here’s a summary of those principles:

  1. A relvar not in RFNF should be “fully normalized”—i.e., decomposed into a set of (at least) RFNF projections.

  2. The original relvar should be reconstructable by joining those projections back together again—i.e., the decomposition should be nonloss.

  3. The decomposition process should preserve dependencies (FDs and JDs)—at least if it can do so without violating Principle No. 1.

  4. Every projection should be needed in the reconstruction process.

TWO CHEERS FOR NORMALIZATION

As I’ve repeatedly said, normalization is the science (or a large part of the science, at any rate) underlying database design. But it’s far from being a panacea, as we can easily see by considering what its goals are and how well it measures up against them. Here are those goals:

  • To achieve a design that’s a “good” representation of the real world (i.e., one that’s intuitively easy to understand and is a good basis for future growth)

  • To reduce redundancy

  • Thereby to avoid certain update anomalies that might otherwise occur

  • To simplify the statement and enforcement of certain integrity constraints

I’ll consider each in turn.

  • Good representation of the real world: Normalization does well on this one. I have no criticisms here.

  • Reduce redundancy: Normalization is a good start on this problem too, but it’s only a start. For one thing, it’s a process of taking projections, and we’ve seen that not all redundancies can be removed by taking projections; indeed, there are many kinds of redundancy that normalization simply doesn’t address at all. (Chapter 15 discusses this issue in detail.) For another thing, taking projections, even when the decomposition is nonloss, can cause dependencies to be lost, as we saw in Chapter 6 and elsewhere.

  • Avoid update anomalies: This point is, at least in part, just the previous one by another name. It’s well known that designs that aren’t properly normalized can be subject to certain update anomalies, precisely because of the redundancies they entail. In relvar STP, for example (see Figure 1-2 in Chapter 1), supplier S1 might be shown as having status 20 in one tuple and status 25 in another. Of course, this particular anomaly can arise only if a less than perfect job is being done on integrity constraint enforcement ... Perhaps a better way to think about the update anomaly issue is this: The constraints needed to prevent such anomalies will be easier to state, and might be easier to enforce, if the design is properly normalized than they would be if it isn’t (see the next paragraph). Yet another way to think about it is: More single tuple updates[140] will be logically acceptable if the design is properly normalized than would be the case if it isn’t (because unnormalized designs imply redundancy—i.e., several tuples saying the same thing—and redundancy implies that sometimes we have to update several things at the same time).

  • Simplify statement and enforcement of constraints: As we know from earlier chapters, some dependencies imply others. (More generally, in fact, constraints of any kind can imply others. As a trivial example, if shipment quantities must be less than or equal to 5000, they must certainly be less than or equal to 6000.) Now, if constraint A implies constraint B, then stating and enforcing A will effectively state and enforce B “automatically” (indeed, B won’t need to be separately stated at all, except perhaps by way of documentation). And normalization to 5NF gives a very simple way of stating and enforcing certain important constraints; basically, all we have to do is define keys and enforce their uniqueness—which we’re going to do anyway—and then all applicable JDs (and therefore all MVDs and FDs as well) will effectively be stated and enforced automatically, because they’ll all be implied by those keys. So normalization does a pretty good job in this area too. (Of course, I’m ignoring here the various multirelvar constraints that the normalization process is likely to give rise to.)

Here on the other hand are several more reasons, over and above those already given, why normalization is no panacea:

  • First, JDs and MVDs and FDs aren’t the only kind of constraint, and normalization doesn’t help with any others.

  • Second, given a particular set of relvars, there’ll often be several distinct nonloss decompositions into 5NF projections, and there’s little or no formal guidance available to tell us which one to choose in such cases. (To be honest, though, I doubt whether this lack is likely to cause major problems in practice.)

  • Third, there are many design issues that normalization simply doesn’t address. For example, what is it that tells us there should be just one suppliers relvar, instead of one for London suppliers, one for Paris suppliers, and so on? It certainly isn’t normalization as classically understood.

All of that being said, I must make it clear that I don’t want the foregoing comments to be seen as any kind of attack. As I said in Chapter 8, I believe anything less than a fully normalized design is strongly contraindicated. But the fact remains that normalization (“the scientific part of design”) as such really doesn’t do as much of the job as we’d like—and so it’s good to be able to say that now there’s a tiny piece of additional science available to us. That’s what the topic of orthogonality is all about.

Note: The concept of orthogonality has evolved over time. As a result, portions of this chapter are at odds, somewhat, with previous writings—mostly by myself—on this same subject. What’s more, I very much doubt whether this chapter is the last word, either. I do believe the chapter is accurate as far as it goes; however, further refinements to the material might well be possible, and desirable. Caveat lector.



[140] Perhaps better, more singleton set updates.

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset