THE CHASE ALGORITHM

From everything we’ve seen in this chapter so far, the obvious question presents itself:

Given some set D of dependencies (FDs or JDs or a mixture), what dependencies d are implied by those in that set?

A partial answer to this question is provided by the chase algorithm, which is, precisely, an algorithm for testing whether some dependency d is implied by a set of dependencies D. More specifically, given the set D and some dependency d, the chase will either:

  1. Show d is implied by D, or

  2. Show it isn’t, by providing an explicit counterexample—that is, a relation that satisfies all of the dependencies in D and yet violates d.

As a matter of fact we’ve already seen some examples of the chase in action, as it were. In the previous section, I showed how a given FD and JD together implied a certain FD and not another (the latter was actually a key constraint, which is a special case of an FD, of course). And in the section before that, I gave two examples in which a given FD and JD together implied a certain JD (thereby showing the given JD was reducible, incidentally). All of these examples were in fact applications of the chase. But now let’s get more specific. In order to do that, I first need to introduce a little more terminology:

  • Consider FDs. Abstractly (though of course very loosely), an FD takes the form “If certain tuples t1, ..., tn appear, then certain attributes within those tuples must have equal values.” For this reason, FDs are sometimes said to be equality generating dependencies.

  • Now consider JDs. Abstractly, but again very loosely, a JD takes the form “If certain tuples t1, ..., tn appear, then a certain tuple t must appear.” JDs are therefore sometimes said to be tuple generating dependencies.

Before going any further, I must caution you not to confuse tuple generating and tuple forcing dependencies.[110] A tuple forcing dependency is a JD with the property that if tuples t1, ..., tn appear, then some tuple t is forced to appear that’s distinct from each of t1, ..., tn. By contrast, a tuple generating dependency (a) doesn’t require the “generated” tuple to be distinct from the given tuples and (b) doesn’t in fact have to be a JD, as such, at all. (However, the only tuple generating dependencies discussed in this book are indeed JDs specifically. For present purposes, therefore, you can take “tuple generating dependency” to mean a JD; thus, we can say that all tuple forcing dependencies are tuple generating, but some tuple generating dependencies aren’t tuple forcing.)

Equality generating and tuple generating dependencies both involve a set of premises—viz., the tuples t1, ..., tn—and a conclusion. For a tuple generating dependency, the conclusion is the generated tuple t; for an equality generating dependency, it’s the fact that a certain equality holds.

Now I can explain the chase algorithm as such. Perhaps I should say first that it’s essentially common sense; in fact, it tends to be easier to illustrate than to describe. In outline, however, it works like this. We’re trying to determine whether the dependency d follows from the dependencies in the set D. We proceed as follows:

  1. We write down tuples representing the premises of d.

  2. We apply the dependencies in D to those tuples (possibly generating additional tuples), and keep repeating this process until no further change occurs.

This procedure overall will eventually yield either:

  1. A representation of the conclusion of d, in which case d does follow from D, or

  2. A relation that satisfies D but not d, in which case d doesn’t follow from D.

Let’s do an example. Let the given set of dependencies be as follows (Heath notation once again):

     { AC , BC , CD , CEA , DEC }

(Actually they’re all FDs, as you can see.) Consider also the following JD (call it J):

      { AB , AD , AE , BE , CDE }

I’ll now show that the given FDs do in fact imply J (a state of affairs that, I’ll think you agree, isn’t immediately obvious).

The first step is to write down tuples representing the premises of the JD J. Now, let me spell out exactly what that JD says:

If all of the following are the case—

  • a tuple appears with A = a and B = b

  • a tuple appears with A = a and D = d

  • a tuple appears with A = a and E = e

  • a tuple appears with B = b and E = e

  • a tuple appears with C = c and D = d and E = e

—then

  • a tuple with A = a and B = b and C = c and D = d and E = e must appear.

However, it turns out to be more convenient to use, not a, b, c, d, and e as such, but rather suffixed x’s and y’s to denote attribute values. To be specific, I’ll use x1-x5 in place of a-e, respectively, and I’ll use y’s in all other positions; e.g., I’ll use y23 to denote the “third” or C value in the “second” premise tuple. So the premise tuples look like this:[111]

      x1   x2  y13  y14  y15
      x1  y22  y23   x4  y25
      x1  y32  y33  y34   x5
     y41   x2  y43  y44   x5
     y51  y52   x3   x4   x5

If (and only if) the JD is implied by the five FDs, then, these tuples will “generate” the following tuple:

     x1   x2   x3   x4   x5

So let’s see if it does; i.e., let’s apply the given dependencies.

  • From AC, we have y13 = y23 = y33; likewise, from BC, we have y13 = y43. So we can replace each of y23, y33, and y43 by y13. The premise tuples become (replacements shown in bold):

          x1   x2  y13  y14  y15
          x1  y22  y13   x4  y25
          x1  y32  y13  y34   x5
         y41   x2  y13  y44   x5
         y51  y52   x3   x4   x5
  • From CD, we have y14 = y34 = y44 = x4. Make the replacements:

          x1   x2  y13   x4  y15
          x1  y22  y13   x4  y25
          x1  y32  y13   x4   x5
         y41   x2  y13   x4   x5
         y51  y52   x3   x4   x5
  • From CEA, we have y41 = x1. Make the replacements:

          x1   x2  y13   x4  y15
          x1  y22  y13   x4  y25
          x1  y32  y13   x4   x5
          x1   x2  y13   x4   x5
         y51  y52   x3   x4   x5
  • From DEC, we have y13 = x3. Make the replacements:

          x1   x2   x3   x4  y15
          x1  y22   x3   x4  y25
          x1  y32   x3   x4   x5
          x1   x2   x3   x4   x5  ◂=  Success: all x's !!!
         y51  y52   x3   x4   x5
  • The “fourth” tuple here is all x’s, and so the JD J does indeed follow from the given FDs.

Let’s look at another example. Let the given set of dependencies consist of just the JD {AB,AC}. Does this set imply the FD AB? Note: We already know the answer is no, because what we’re talking about here is the converse of Heath’s Theorem, and we know from Exercise 5.4 that the converse of Heath’s Theorem is false. But let’s see what the chase tells us:

  • Premise tuples:

          x1  y12  y13
          x1  y22  y23

    If and only if the FD is implied by the JD, then applying the JD to these tuples will have to make y12 and y22 equal. Does it do so? Well:

  • The given JD “generates” tuples as follows:

          x1  y12  y23
          x1  y22  y13
  • The four tuples taken together satisfy the JD but not the FD; in particular, they don’t require that y12 = y22. So the FD doesn’t follow from the JD.



[110] By the same token, don't confuse equality generating dependencies and equality dependencies, mentioned in an earlier footnote in this chapter and elsewhere and discussed in detail in Chapter 13.

[111] Strictly speaking, the premise tuples aren't really tuples at all, because they contain variables instead of values. Likewise, the premise tuples taken together don't really constitute a relation, either. I propose to overlook these points from here on; however, I should at least mention in passing that—partly for such reasons—the research literature typically refers to those premise tuples as constituting not a relation but a tableau.

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset