From everything we’ve seen in this chapter so far, the obvious question presents itself:
Given some set D of dependencies (FDs or JDs or a mixture), what dependencies d are implied by those in that set?
A partial answer to this question is provided by the chase algorithm, which is, precisely, an algorithm for testing whether some dependency d is implied by a set of dependencies D. More specifically, given the set D and some dependency d, the chase will either:
Show d is implied by D, or
Show it isn’t, by providing an explicit counterexample—that is, a relation that satisfies all of the dependencies in D and yet violates d.
As a matter of fact we’ve already seen some examples of the chase in action, as it were. In the previous section, I showed how a given FD and JD together implied a certain FD and not another (the latter was actually a key constraint, which is a special case of an FD, of course). And in the section before that, I gave two examples in which a given FD and JD together implied a certain JD (thereby showing the given JD was reducible, incidentally). All of these examples were in fact applications of the chase. But now let’s get more specific. In order to do that, I first need to introduce a little more terminology:
Consider FDs. Abstractly (though of course very loosely), an FD takes the form “If certain tuples t1, ..., tn appear, then certain attributes within those tuples must have equal values.” For this reason, FDs are sometimes said to be equality generating dependencies.
Now consider JDs. Abstractly, but again very loosely, a JD takes the form “If certain tuples t1, ..., tn appear, then a certain tuple t must appear.” JDs are therefore sometimes said to be tuple generating dependencies.
Before going any further, I must caution you not to confuse tuple generating and tuple forcing dependencies.[110] A tuple forcing dependency is a JD with the property that if tuples t1, ..., tn appear, then some tuple t is forced to appear that’s distinct from each of t1, ..., tn. By contrast, a tuple generating dependency (a) doesn’t require the “generated” tuple to be distinct from the given tuples and (b) doesn’t in fact have to be a JD, as such, at all. (However, the only tuple generating dependencies discussed in this book are indeed JDs specifically. For present purposes, therefore, you can take “tuple generating dependency” to mean a JD; thus, we can say that all tuple forcing dependencies are tuple generating, but some tuple generating dependencies aren’t tuple forcing.)
Equality generating and tuple generating dependencies both involve a set of premises—viz., the tuples t1, ..., tn—and a conclusion. For a tuple generating dependency, the conclusion is the generated tuple t; for an equality generating dependency, it’s the fact that a certain equality holds.
Now I can explain the chase algorithm as such. Perhaps I should say first that it’s essentially common sense; in fact, it tends to be easier to illustrate than to describe. In outline, however, it works like this. We’re trying to determine whether the dependency d follows from the dependencies in the set D. We proceed as follows:
We write down tuples representing the premises of d.
We apply the dependencies in D to those tuples (possibly generating additional tuples), and keep repeating this process until no further change occurs.
This procedure overall will eventually yield either:
A representation of the conclusion of d, in which case d does follow from D, or
A relation that satisfies D but not d, in which case d doesn’t follow from D.
Let’s do an example. Let the given set of dependencies be as follows (Heath notation once again):
{A
→C
,B
→C
,C
→D
,CE
→A
,DE
→C
}
(Actually they’re all FDs, as you can see.) Consider also the following JD (call it J):
{AB
,AD
,AE
,BE
,CDE
}
I’ll now show that the given FDs do in fact imply J (a state of affairs that, I’ll think you agree, isn’t immediately obvious).
The first step is to write down tuples representing the premises of the JD J. Now, let me spell out exactly what that JD says:
If all of the following are the case—
a tuple appears with A = a and B = b
a tuple appears with A = a and D = d
a tuple appears with A = a and E = e
a tuple appears with B = b and E = e
a tuple appears with C = c and D = d and E = e
—then
a tuple with A = a and B = b and C = c and D = d and E = e must appear.
However, it turns out to be more convenient to use, not a, b, c, d, and e as such, but rather suffixed x’s and y’s to denote attribute values. To be specific, I’ll use x1-x5 in place of a-e, respectively, and I’ll use y’s in all other positions; e.g., I’ll use y23 to denote the “third” or C value in the “second” premise tuple. So the premise tuples look like this:[111]
x1 x2 y13 y14 y15
x1 y22 y23 x4 y25
x1 y32 y33 y34 x5
y41 x2 y43 y44 x5
y51 y52 x3 x4 x5
If (and only if) the JD is implied by the five FDs, then, these tuples will “generate” the following tuple:
x1 x2 x3 x4 x5
So let’s see if it does; i.e., let’s apply the given dependencies.
From A → C, we have y13 = y23 = y33; likewise, from B → C, we have y13 = y43. So we can replace each of y23, y33, and y43 by y13. The premise tuples become (replacements shown in bold):
x1 x2 y13 y14 y15
x1 y22
y13
x4 y25
x1 y32
y13
y34 x5
y41 x2
y13
y44 x5
y51 y52 x3 x4 x5
From C → D, we have y14 = y34 = y44 = x4. Make the replacements:
x1 x2 y13
x4
y15
x1 y22 y13 x4 y25
x1 y32 y13
x4
x5
y41 x2 y13
x4
x5
y51 y52 x3 x4 x5
From CE → A, we have y41 = x1. Make the replacements:
x1 x2 y13 x4 y15
x1 y22 y13 x4 y25
x1 y32 y13 x4 x5
x1
x2 y13 x4 x5
y51 y52 x3 x4 x5
From DE → C, we have y13 = x3. Make the replacements:
x1 x2
x3
x4 y15
x1 y22
x3
x4 y25
x1 y32
x3
x4 x5
x1 x2
x3
x4 x5
◂=Success: all x's !!!
y51 y52 x3 x4 x5
The “fourth” tuple here is all x’s, and so the JD J does indeed follow from the given FDs.
Let’s look at another example. Let the given set of dependencies consist of just the JD {AB,AC}. Does this set imply the FD A → B? Note: We already know the answer is no, because what we’re talking about here is the converse of Heath’s Theorem, and we know from Exercise 5.4 that the converse of Heath’s Theorem is false. But let’s see what the chase tells us:
Premise tuples:
x1 y12 y13
x1 y22 y23
If and only if the FD is implied by the JD, then applying the JD to these tuples will have to make y12 and y22 equal. Does it do so? Well:
The given JD “generates” tuples as follows:
x1 y12 y23
x1 y22 y13
The four tuples taken together satisfy the JD but not the FD; in particular, they don’t require that y12 = y22. So the FD doesn’t follow from the JD.
[110] By the same token, don't confuse equality generating dependencies and equality dependencies, mentioned in an earlier footnote in this chapter and elsewhere and discussed in detail in Chapter 13.
[111] Strictly speaking, the premise tuples aren't really tuples at all, because they contain variables instead of values. Likewise, the premise tuples taken together don't really constitute a relation, either. I propose to overlook these points from here on; however, I should at least mention in passing that—partly for such reasons—the research literature typically refers to those premise tuples as constituting not a relation but a tableau.