12. Deferred Assertions and Other Pipeline Datasets
Contents
The Semantics of Deferred Assertion Time
Assertions, Statements and Time
The Internalization of Pipeline Datasets
Deferred Assertions
A Deferred Update to a Current Episode
A Deferred Update to a Deferred Assertion
Reflections on Empty Assertion Time
Completing the Deferred Update to a Deferred Assertion
The Near Future and the Far Future
Approving a Deferred Assertion
Deferred Assertions and Temporal Referential Integrity
Glossary References
We normally think of inserting a row into a table as the same thing as claiming, or asserting, that the statement which that row makes is true. From that point of view, a distinction between the physical act of creating a row in a table, and the semantic act of claiming that what the row says is true, is a distinction without a difference.
This is why, we surmise, the computer science community calls the second of their two bi-temporal dimensions “transaction time”, an expression with obvious physical connotations. Yet while a transaction is a physical act, an assertion is not. It is a semantic act. And while the semantic act can't happen before the physical one, we see no reason why it can't happen after it, and a number of advantages that result if it can.
With the standard temporal model, the rows inserted into bi-temporal tables begin to be asserted on the date they are physically inserted into the database. With Asserted Versioning, this is the default for those rows; but Asserted Versioning permits this default to be overridden. Temporal transactions may be submitted, and physical rows created in response to them, prior to the date on which those rows will begin to be asserted.
To put it the other way around, an Asserted Versioning temporal transaction may be submitted with an assertion begin date in the future, so that the row the transaction creates will have a row creation date earlier than its assertion begin date. The row will be physically part of the table, but it won't be asserted. It won't be anything we show to the world, anything we are yet willing to claim makes a true statement. It will be a row which is physically in the same table as the rows which make up the currently asserted production data in that table. But semantically, it will be distinct from those rows.
We will say that transactions like these are deferred transactions, and that what they place in the database are deferred assertions. Unlike rows in conventional tables, deferred assertions do not represent true statements. They do not have a truth value at all, because we do not yet attribute a truth value to them. By the same token, as described in earlier chapters, Asserted Versioning rows which are withdrawn into past assertion time also do not represent true statements. They do not have a truth value at all, because we no longer attribute a truth value to them. They are a record of what we once claimed was true, just as deferred assertions are a record of what we may eventually claim is true.
For the most part, we need not concern ourselves with these logical subtleties. But neither should we ignore them completely, because they will help us understand this important functionality of Asserted Versioning which distinguishes it from the standard temporal model and from all other computer science work on bi-temporal data that we are aware of. So before we get on with the task of understanding what deferred assertions are and how to manage them, we should look a little more closely at the logical and semantic foundation on which the distinction between assertions and statements is based.
The Semantics of Deferred Assertion Time
Data describes objects. Conventional tables represent types of objects. Rows in those tables represent instances of those types, and describe those instances.
We create and maintain the data in these tables. Those who access this data assume that we believe that the data is correct and that each row makes a true statement about the object it represents. They understand, of course, that we may sometimes be wrong; but they assume that our intention is to be truthful, and that we take reasonable care to be accurate. Without those assumptions, the creation and maintenance of data would be a pointless activity.
So underlying the activity of creating, maintaining and consuming data lies the matter of what we claim or assert to be true. For purposes of this discussion, we will take the following ways of describing our relationship to the data we create, maintain and retrieve as equivalent. A row in a conventional table, we may say, indicates:
(i) What we accept as a true statement of what the object it represents is like.
(ii) What we agree is a true statement of what that object is like.
(iii) What we assent to as a true statement of what that object is like.
(iv) What we assert is a true statement of what that object is like.
(v) What we believe is a true statement of what that object is like.
(vi) What we claim is a true statement of what that object is like.
(vii) What we know is a true statement of what that object is like.
(viii) What we say is a true statement of what that object is like. And
(ix) What we think is a true statement of what that object is like.
Whatever semantic differences there may be between accepting, agreeing, assenting, asserting, believing, claiming, knowing, saying and thinking—and such differences are of great importance in such fields as epistemology, linguistics and the foundations of logic—these differences make no difference as far as bi-temporal data management is concerned. The fundamental difference for our purposes is between ontology and epistemology, between talk about what the world is like, and talk about what we think it is like.
A more thorough discussion of the semantics of statements and assertions is outside the scope of this book, but the reader should be aware that there is more here than meets the eye. For one thing, assertions are not statements. They are what philosophers call speech acts, ones made by means of statements. A statement is true or false. That is a relationship between the statement and the object it represents and describes. 1 An assertion is either made or not made. But that is not a relationship between a statement and an object. It is a relationship between a statement and the person who does or does not assert it.
1 Assuming, that is, a pre-critical correspondence theory of truth which, for purposes of clarifying the semantics of bi-temporal data, seems to us perfectly adequate.
We sometimes say, in rough equivalence, that we believe or do not believe that a statement is true. But just as assertions are not statements, beliefs are neither statements nor assertions. Beliefs are what philosophers call propositional attitudes. In fact, assent, assert, claim and say are all speech acts; they are things we do with words. But believe, know and think are propositional attitudes; they are cognitive stances we take with respect to those words. (Accept and agree could be one or the other, depending on whether they refer to behavior or to a behavioral disposition.)
Assertions, Statements and Time
Conventional tables are the bread and butter of IT. The data in those tables represent both what things are currently like and also what we currently believe those things are like. They represent both what things are like now and what we now believe they are like.
There is a timeline along which persistent objects are located, and a timeline along which we hold various beliefs. Data in conventional tables is “pinned”, along both timelines, to the moving point in time we call “the present” and which, in this book, we designate as Now(). The maintenance of conventional data is an ongoing effort to keep up with the changes that follow in the trail of that moving point.
But as well as the present, there are the past and the future. So if we “unpin” data along both these timelines, we end up with nine possible ways that data and time may be related.
In this section, we will use the terminology of beliefs even though, as we said previously, the nine different terms we listed there are equivalent, as far as our discussions in this book are concerned. This chapter is about assertions, and so we initially tried to write this section using that terminology. But it seems to us that the argument is easier to follow using the language of beliefs. Nonetheless, we are speaking about assertions, albeit in the more colloquial language of beliefs. Not all assertions, of course; and not all beliefs. Rather, as we said earlier, assertions that statements made by rows in database tables are true statements, and beliefs that those statements are true statements.
Using the terminology of beliefs, we may say that the rows in tables in relational databases may relate data to time in any of nine ways. So where “thing” means, more precisely, “persistent object”, we can organize these nine relationships of rows to time as shown in Figure 12.1.
Figure 12.1
Facts, Beliefs and Time.
In Asserted Versioning, beliefs are what we assert by means of rows in our tables, and facts are what those rows describe about the objects they represent. Columns, in Figure 12.1, from left to right, represent past, present and future beliefs. Rows, in that same illustration, from top to bottom, represent past, present and future facts. Temporalized beliefs are represented by rows with assertion time periods. Temporalized facts are represented by rows with effective time periods, i.e. by versions. 2
2 Of course, since we cannot know the future, we cannot state with certainty either what the facts will be, or what we will believe. Instead, “what things will be like” should be taken as shorthand for “what things may turn out to be like”, and “what we will believe” should be taken as shorthand for “what we may come to believe”.
But temporal transactions cannot insert, update or delete all nine types of rows. Specifically, temporal transactions cannot insert, update or delete rows making statements about what we used to believe, statements of type (i), (ii) or (iii).
It's important to understand why this is so. Temporal transactions create new rows in temporal tables. But these rows represent beliefs, and we can't now make a statement about what we used to believe. On the other hand we can, of course, now make a statement about what used to be true. To understand what the two temporal dimensions of bi-temporal data really mean, we need to understand why distinctions like these are valid: why, in this case, we can make statements about how things used to be, but cannot make statements about what we used to think about them.
So why can't we? Surely we make statements about what we used to believe all the time. For example, we can now state that we used to believe that Bernie Madoff was an honest man. If we can make such statements in ordinary conversation, why can't we make them as transactions that will update a database?
The reason is that in a database, as we said, a belief is expressed by the presence of a row in a table. No row, no belief. So if we write a transaction today that creates a row stating that we believed something yesterday, we are creating a row that states that we believed something at a time when there was no row to represent that belief. Given that the beliefs we are talking about are beliefs that certain statements about persistent objects are true, and given that those statements are the statements made by rows in tables, it would be a logical contradiction to state that we had such a belief at a point or period in time during which there was no row to represent that belief. 3
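The rule just stated can be sketched as a simple validation check. This is a minimal illustration in Python, not the AVF's actual logic; the function name validate_assertion_begin and the sample dates are hypothetical.

```python
from datetime import date


def validate_assertion_begin(row_created: date, assertion_begin: date) -> None:
    """Reject a row that claims to have been asserted before it physically existed."""
    if assertion_begin < row_created:
        raise ValueError(
            f"assertion begin {assertion_begin} precedes row creation {row_created}"
        )


today = date(2013, 1, 1)  # hypothetical current date

# Default case: the row begins to be asserted when it is created.
validate_assertion_begin(today, today)

# Deferred case: the row exists now but is asserted later. This is allowed.
validate_assertion_begin(today, date(2090, 1, 1))

# Retroactive case: a belief dated before any row existed to express it.
try:
    validate_assertion_begin(today, date(2012, 12, 1))
    retroactive_allowed = True
except ValueError:
    retroactive_allowed = False
```

Only the third call fails, which mirrors the asymmetry in the text: an assertion may begin at or after the moment its row is created, but never before.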
3 In fact, we offer this as a statement of what we will call the temporalized extension of the Closed World Assumption (CWA). All too briefly: the CWA is about the relationship of a collection of statements to the world. Its temporalized extension is about the relationship of beliefs (assertions, claims, etc.) to each of those statements.
This leaves us six combinations of beliefs and what they are about that we can, without logical contradiction, modify by means of a temporal transaction. Asserted Versioning recognizes all six combinations. But the standard temporal model does not permit data to be located in future belief time, and so it does not recognize combinations (vii), (viii) or (ix) as meaningful. It does not attempt to develop a data management framework within which we can make statements about what we may in the future believe.
Future beliefs, and their representation in temporal tables as not yet asserted rows, are precisely what make the difference between the assertion time dimension of Asserted Versioning and the transaction time dimension of the standard temporal model. Without it, the two temporal dimensions of Asserted Versioning are semantically equivalent to the two temporal dimensions of the standard temporal model. Without it, assertion time is equivalent to transaction time.
But is it valid to locate data in future belief time? After all, as we noted in a footnote a short while ago, we can be certain about what we once believed and about what we currently believe, but we cannot be certain about what we will believe. On the other hand, a lack of certainty is not the same thing as a logical contradiction. There is nothing logically invalid about making statements about what we think was, is or may come to be true. By the same token, there is nothing logically invalid about making statements about what we currently believe or may come to believe was, is or may turn out to be true. The only logical contradiction is the one already noted: because of the temporalized extension of the CWA, it is contradictory to create a row representing a statement about what, prior to the time the row was created, we then believed/asserted to be true.
We should now have a clear idea of what deferred transactions and deferred assertions are. They are the data in categories (vii), (viii) and (ix) of Figure 12.1. We understand that neither the standard temporal model nor, for that matter, any more recent computer science work on bi-temporality that we are aware of, recognizes data which represents what we are not yet willing to assert is true about what things were like, are like or may turn out to be like.
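The nine combinations, and the three groups the discussion divides them into, can be enumerated in a few lines. The sketch below is illustrative Python. It assumes that the numerals (i) through (ix) run down each belief column in turn, which is consistent with the text's statements that (i) to (iii) are past beliefs and (vii) to (ix) are future beliefs, but the exact cell numbering of Figure 12.1 is an assumption here. The function names are hypothetical.

```python
from itertools import product

TIMES = ("past", "present", "future")
NUMERALS = ("i", "ii", "iii", "iv", "v", "vi", "vii", "viii", "ix")

# Assumed numbering: belief columns left to right, fact rows top to bottom
# within each column. product() varies its second element fastest.
combinations = {
    numeral: {"belief": belief, "fact": fact}
    for numeral, (belief, fact) in zip(NUMERALS, product(TIMES, TIMES))
}


def writable(combo):
    """A temporal transaction cannot create a row in past belief time."""
    return combo["belief"] != "past"


def in_standard_model(combo):
    """The standard temporal model places new rows in present belief time only."""
    return combo["belief"] == "present"


def is_deferred(combo):
    """Deferred assertions are rows located in future belief time."""
    return combo["belief"] == "future"


writable_types = [n for n, c in combinations.items() if writable(c)]
standard_types = [n for n, c in combinations.items() if in_standard_model(c)]
deferred_types = [n for n, c in combinations.items() if is_deferred(c)]
```

Under that numbering, writable_types holds the six combinations (iv) through (ix), and deferred_types holds the three that only Asserted Versioning recognizes.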
Before discussing deferred transactions and deferred assertions, we want to explain how they are one subtype of a more generalized concept, of something we call pipeline datasets. Once we have done that, the remainder of this chapter will focus on deferred transactions and deferred assertions, and the business value of internalizing them. Then, in the next chapter, we will look at several other kinds of pipeline datasets, and the business value of internalizing them as well.
The Internalization of Pipeline Datasets
We begin by introducing some new terminology. Dataset is an older technical term, and up to this point in the book, we have used it to refer to any physical collection of data. Going forward, we would like to narrow that definition a bit. From now on, when we talk about datasets, we will mean physical files, tables, views or other managed objects in which the managed object itself represents a type and contains multiple managed objects each of which represents an instance of that type. Thus, comma-delimited files are datasets, as are flat files, indexed files and relational tables themselves. A graphic image is not a dataset, in this narrower sense of the term, nor is a CLOB (a character large object).
Production datasets are datasets that contain production data. Production data is data that describes the objects and events of interest to the business. It is a semantic concept. Production databases are the collections of production datasets which the business recognizes as the official repositories of that data. Production databases consist of production tables, which are production datasets whose data is designated as always reliable and always available for use.
When production data is being worked on, it may reside in any number of production datasets, for example in those datasets we call batch transaction files, or transaction tables, or data staging areas. Once we've got the data just right, we use it to transform the production tables that are its targets. The transformation may be carried out by applying insert, update and delete transactions to the production tables. At other times, the transformation may be a merge of data we've been working on into those tables, or a replacement of some of the data in those tables with the data we've been working on.
When data is extracted from production tables, it has an intended destination. That destination may be another database or a business user, either of which may be internal to the business or external to it. Sometimes that data is delivered directly to its destination. At other times, it must go through one or more intermediate stages in which various additional transformations are applied to it. When first extracted from production tables, this data is usually said to be contained in query result sets. As that data moves farther away from its point of origin, and through additional transformations, the resulting production datasets tend to be called things like extracts. At its ultimate destinations, it is manifested as the content displayed on screens or in reports, or as data that has just been acquired by downstream organizations, perhaps to supply their own databases as datasets which tend to be called feeds.
Let's make the metaphor underlying this description a little more explicit by using the concept of pipelines. Pipeline production datasets (pipeline datasets, for short) are points at which data comes to rest along the inflow pipelines whose termination points are production tables, or along the outflow pipelines whose points of origin are those same tables. The points of origin of inflow pipelines may be external to the organization or internal to it; and the data that flows along these pipelines are the acquired or generated transactions that are going to update production tables. The termination points of outflow pipelines may also be either internal to the organization, or external to it; and we may think of the data that flows along these pipelines as the result sets of queries applied to those production tables.
There may be many points at which incoming production data comes to rest, for some period of time, prior to resuming its journey towards its target tables. Similarly, there may be many points at which outgoing data comes to rest, for some period of time, prior to continuing on to its ultimate destinations. These points at which production data comes to rest are the pipeline datasets.
But these points of rest, and the movement of data from one to another, exist in an environment in which that data is also at risk. The robust mechanisms with which DBMSs maintain the security and integrity of their production tables are not available to those pipeline datasets which exist outside the production database itself.
All in all, pipeline data flowing towards production tables would cost much less to manage, and would be managed to a higher standard of security and integrity, if that data could be moved immediately from its points of origin directly into the production tables which are its points of destination. Let's see now if this is as far-fetched a notion as it may appear to be to many IT professionals. We will look at deferred transactions and deferred assertions in this chapter, and consider other pipeline datasets in the next chapter.
Deferred Assertions
We will discuss deferred transactions and deferred assertions, and how they work, by means of a series of scenarios in which deferred transactions are applied to sample data.
A Deferred Update to a Current Episode
We begin with an open episode of policy P861. As shown in Figure 12.2, the current version in this episode—P861(r4)—has an [Aug 2012 – 12/31/9999] effective time period. 4 It also has an [Aug 2012 – 12/31/9999] assertion time period. From this, we know that there is no representation of this object anywhere else in the production table, in either temporal dimension, from August 2012 until further notice.
4 The notation “P861(r4)” indicates row #4 in the referenced figure, in this case Figure 12.2. The policy identifier is not strictly necessary, and is included just to remind us which object we are talking about.
Figure 12.2
A Current Episode: Before the Deferred Assertion.
By now we should know how to read an asserted version table like this. The episode extends from an effective begin date of November 2011 to an effective end date of 12/31/9999. Every version in this episode is currently asserted.
We will now submit a deferred temporal update. Again, we assume that it is now January 2013. That transaction looks like this:
UPDATE Policy [P861,,, $55] May 2012, Jul 2012, Jan 2090
The three temporal parameters following the bracketed data are the effective begin date, effective end date and assertion begin date. All temporal updates discussed so far have accepted the default value for the assertion begin date, that value being Now(). Here, with our first deferred transaction, we override that default with a future date.
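One way to picture the three temporal parameters, and the Now() default for the third, is as fields on a transaction object. The following Python sketch is an illustration only. The class TemporalUpdate, the helper make_update, and the choice to represent each month by its first day are all assumptions made for the example, not part of the AVF's interface.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class TemporalUpdate:
    oid: str
    copay: Optional[int]  # None would stand for a field left blank in the transaction
    eff_begin: date       # effective begin date (inclusive)
    eff_end: date         # effective end date (exclusive)
    asr_begin: date       # assertion begin date


def make_update(oid, copay, eff_begin, eff_end, now, asr_begin=None):
    """Build an update; the assertion begin date defaults to `now`."""
    return TemporalUpdate(oid, copay, eff_begin, eff_end, asr_begin or now)


def is_deferred(txn: TemporalUpdate, now: date) -> bool:
    """A transaction is deferred when its assertion begin date is in the future."""
    return txn.asr_begin > now


now = date(2013, 1, 1)  # "it is now January 2013"

# UPDATE Policy [P861,,, $55] May 2012, Jul 2012, Jan 2090
deferred = make_update("P861", 55, date(2012, 5, 1), date(2012, 7, 1), now,
                       asr_begin=date(2090, 1, 1))

# The same update with the assertion begin date left to default.
immediate = make_update("P861", 55, date(2012, 5, 1), date(2012, 7, 1), now)
```

Here is_deferred(deferred, now) is true and is_deferred(immediate, now) is false; the only difference between the two transactions is the overridden third parameter.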
There are several things to note about this transaction. First of all, the object specified in this transaction is policy P861, and the transaction's effective timespan is May 2012 to July 2012, i.e. the two months of May and June 2012. The assertion begin date is January 2090, a date which is several decades in the future.
The first thing the AVF does is to split one or more rows in the Policy table into multiple rows such that one or a contiguous set of those rows has the oid and the effective timespan specified on the transaction. When a set of one or more contiguous asserted version rows, and a temporal transaction, have the same oid and also the same effective time period, we will say that they match.
Since the transaction specifies an effective timespan of [May 2012 – July 2012], the AVF modifies the current assertions for P861 so that one version matches the transaction. That is P861(r6), as shown in Figure 12.3.
Figure 12.3
A Current Episode: Effective Time Alignment.
This results in a set of rows that are semantically equivalent to the original row, those rows being P861(r5, r6 & r7). They cover the same effective time period as the original row; and they contain the same business data as the original row. Note that, in Figure 12.3, we have not yet created the deferred assertion. We have just realigned version boundaries, within current assertion time, as a preliminary step to carrying out the update. Prior to this realignment, the effective timespan of the transaction was located [during] the effective time period of P861(r3). Now the effective timespan of the transaction [equals] the effective time period of P861(r6), and so the transaction matches that asserted version.
The result of this alignment is shown in Figure 12.3. P861(r3) has been withdrawn into past assertion time, into an assertion time period that ends on January 2013. P861(r5, r6 & r7) have replaced it in current assertion time, in assertion time periods that begin on January 2013 (and not, let it be noted, on January 2090). Again, we use angle brackets on row numbers to indicate rows that are part of an atomic and isolated unit of work, a series of physical modifications to the database that must together all succeed or all fail, and a set of rows that are not visible in the database until the unit of work completes.
Note that P861(r5, r6 & r7) have the same episode begin date and the same business data as row 3. In addition, their three effective time periods cover exactly the same clock ticks as the withdrawn P861(r3). These three rows, together, are semantically equivalent to P861(r3). They represent the same object in exactly the same effective time clock ticks; and in every such clock tick, they attribute the same business data to that object.
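The alignment step can be sketched as a function that splits one effective time period around a transaction's timespan. This is a minimal Python illustration under stated assumptions: periods are half-open pairs of dates, months are represented by their first day, and the effective begin date of April 2012 used for P861(r3) is assumed for the example rather than read from the figure. The function name align is hypothetical.

```python
from datetime import date


def align(version, span):
    """Split a half-open (begin, end) period so that one piece equals `span`.

    `span` must lie within `version`. The returned pieces are contiguous
    and together cover exactly the clock ticks of `version`.
    """
    v_begin, v_end = version
    s_begin, s_end = span
    if not (v_begin <= s_begin < s_end <= v_end):
        raise ValueError("span must be a non-empty period within the version")
    pieces = []
    if v_begin < s_begin:
        pieces.append((v_begin, s_begin))  # the part before the span
    pieces.append((s_begin, s_end))        # the part that matches the transaction
    if s_end < v_end:
        pieces.append((s_end, v_end))      # the part after the span
    return pieces


r3 = (date(2012, 4, 1), date(2012, 8, 1))    # assumed effective period of P861(r3)
span = (date(2012, 5, 1), date(2012, 7, 1))  # the transaction: May and June 2012

r5, r6, r7 = align(r3, span)
```

The middle piece, r6, equals the transaction's timespan, while r5 and r7 carry the remaining clock ticks. Business data is simply copied to all three pieces, which is why the set is semantically equivalent to the original row.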
Nor has the assertion time in the table been altered. Prior to this transaction, the statement made by P861(r3) was asserted from April 2012 to 12/31/9999. Midway into the transaction, at the point shown in Figure 12.3, the table still asserts that from April 2012 to 12/31/9999, P861 was owned by client C882, was an HMO policy, and had a copay of $30. It asserts this because the statement made by the logical conjunction of P861(r5, r6 & r7) is truth-functionally equivalent to the statement made by P861(r3), and the assertion times of [Apr 2012 – Jan 2013] and [January 2013 – 12/31/9999] both [meet] and, together, [equal] the original assertion time of P861(r3), before it was withdrawn. At this point in the transaction, we have performed syntactic surgery on the target table, but have in no way altered its semantic content.
There is now one and only one row in the target table that matches the transaction. It is P861(r6). The AVF next withdraws P861(r6), moving it into closed assertion time, i.e. giving it an assertion time period with a non-12/31/9999 assertion end date. It does so by giving P861(r6) an assertion end date that matches the assertion begin date on the transaction, thus preserving the assertion time continuity of this effective time history of P861.
The next thing the AVF does is to make a copy of P861(r6), apply the copay update to that copy, and give it an assertion time period of [Jan 2090 – 12/31/9999]. This becomes P861(r8), the row that supercedes row 6. This row is the deferred assertion.
The result is shown in Figure 12.4.
Figure 12.4
Withdrawing a Current Assertion into Closed Assertion Time, and Superceding It.
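The two steps that produce Figure 12.4 can be sketched as a single operation on a row. The Python below is an illustration only, with a hypothetical Row class that keeps just the columns needed for the example. Periods are treated as half-open and months are represented by their first day.

```python
from dataclasses import dataclass, replace
from datetime import date

END_OF_TIME = date(9999, 12, 31)


@dataclass(frozen=True)
class Row:
    oid: str
    eff_begin: date
    eff_end: date
    asr_begin: date
    asr_end: date
    copay: int


def withdraw_and_defer(target: Row, new_copay: int, asr_begin: date):
    """Close the matching row's assertion period and create its deferred successor."""
    # Step 1: the matching row's assertion time ends where the transaction's begins.
    withdrawn = replace(target, asr_end=asr_begin)
    # Step 2: a copy carries the update and is asserted from that date onward.
    deferred = replace(target, copay=new_copay,
                       asr_begin=asr_begin, asr_end=END_OF_TIME)
    return withdrawn, deferred


# P861(r6) after alignment: May and June 2012, asserted since January 2013.
r6 = Row("P861", date(2012, 5, 1), date(2012, 7, 1),
         date(2013, 1, 1), END_OF_TIME, 30)

r6_closed, r8 = withdraw_and_defer(r6, 55, date(2090, 1, 1))
```

Because the same date closes one assertion period and opens the other, the two rows never assert conflicting copay amounts for the same effective months at the same moment in assertion time.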
Note that this closed assertion is still current. It is currently January 2013, and so Now() still falls between the assertion begin and end dates of P861(r6), and will continue to do so until January 2090. So a closed assertion time period is one with a non-12/31/9999 end date. Some closed assertion time periods are past; they are no longer asserted. But others are current, like this one. And yet others may be assertion time periods that lie entirely in the future.
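The distinction between closed and open assertion periods is independent of whether a period is past, current or future, and a short sketch makes the two tests explicit. This is illustrative Python; the function names are hypothetical, and periods are treated as half-open with months represented by their first day.

```python
from datetime import date

END_OF_TIME = date(9999, 12, 31)


def is_closed(asr_end: date) -> bool:
    """A closed assertion period has an end date other than 12/31/9999."""
    return asr_end != END_OF_TIME


def tense(asr_begin: date, asr_end: date, now: date) -> str:
    """Locate a half-open assertion period relative to `now`."""
    if asr_end <= now:
        return "past"
    if asr_begin > now:
        return "future"
    return "current"


now = date(2013, 1, 1)

r3 = (date(2012, 4, 1), date(2013, 1, 1))  # withdrawn during alignment
r6 = (date(2013, 1, 1), date(2090, 1, 1))  # closed by the deferred update
r8 = (date(2090, 1, 1), END_OF_TIME)       # the deferred assertion

summary = {
    name: (is_closed(period[1]), tense(*period, now))
    for name, period in {"r3": r3, "r6": r6, "r8": r8}.items()
}
```

In this sketch r3 is closed and past, r6 is closed but current, and r8 is open and future. A closed period that lies entirely in the future would be reported as closed and future.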
Note that this process is almost identical to the familiar process of withdrawing a version into past assertion time and superceding it with a row in current assertion time. The only difference is that the withdrawn assertion is moved into closed but still current assertion time, and the superceding assertion is placed into future assertion time.
At this point, both P861(r3 & r6) are locked. The AVF will never modify P861(r3) because it is already located in past assertion time. But P861(r6) is also locked, even though it is still currently asserted. The AVF treats any row with a non-12/31/9999 assertion end date as locked. The reason all such rows are locked, including those whose assertion time periods are not yet past, is that the database contains a later assertion which otherwise matches the locked assertion.
In this case, P861(r6) is locked because the Policy table now contains a later assertion that was created from it. That later assertion was presumably written and submitted based on then-current knowledge of the contents of the database, specifically of what the database then asserted about what P861 was like in May and June of 2012. If that description were allowed to change before the later assertion became current, then all bets would be off.
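The locking rule reduces to a one-line test on the assertion end date. The sketch below applies it, in illustrative Python, to the assertion time periods the text gives for the rows in play after the first deferred update. The function name is_locked is hypothetical.

```python
from datetime import date

END_OF_TIME = date(9999, 12, 31)

# Assertion time periods as described in the text, months as first-of-month dates.
assertion_periods = {
    "r3": (date(2012, 4, 1), date(2013, 1, 1)),  # withdrawn into past assertion time
    "r5": (date(2013, 1, 1), END_OF_TIME),
    "r6": (date(2013, 1, 1), date(2090, 1, 1)),  # closed, but still current
    "r7": (date(2013, 1, 1), END_OF_TIME),
    "r8": (date(2090, 1, 1), END_OF_TIME),       # the deferred assertion
}


def is_locked(period) -> bool:
    """Any row with a non-12/31/9999 assertion end date is treated as locked."""
    return period[1] != END_OF_TIME


locked = sorted(name for name, p in assertion_periods.items() if is_locked(p))
```

Only r3 and r6 appear in locked. Note that the test never consults the current date: a row is locked as soon as its assertion period is closed, whether or not that period has passed.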
Another way to think about the locking associated with deferred transactions and deferred assertions is that it serializes those transactions. If a process about to update a row in a database does not first lock that row from other updates, then another update process could read the row before the first process is complete. Then the changes made by whichever process physically updates that row first will be lost, overwritten by the changes made by the process which updates it last. This could happen with deferred assertions if they were not serialized.
The mechanics of deferred assertion locking are simple. Every temporal transaction has an assertion begin date, either the default date of Now() or an explicitly supplied future date. Temporal updates and temporal deletes begin their work by withdrawing the one or more versions which represent an object in any clock ticks included in the transaction's effective timespan. The versions they withdraw are those versions located in the most recent period of assertion time. That may be current assertion time, and usually is. But when a deferred transaction has been applied to versions in current assertion time, it closes their assertion periods with the same date that begins the assertion period of the deferred assertion it creates, just as the deferred update we are discussing closed P861(r6) and superceded it with P861(r8). And it creates a version that exists in future assertion time. Deferred transactions may then be applied to that deferred assertion, and we will explain how to do that in the next section.
Note what is not locked. The episode itself is not locked. Out of the entire currently asserted effective time period from November 2011 to 12/31/9999, for P861, only two months have been locked. Inserts, updates and deletes can continue to take place against any of the other clock ticks in the episode occupied by P861—or, for that matter, against any clock ticks not occupied by P861.
We have now completed the deferred transaction. As directed by the transaction, the AVF has created a version of P861, for the effective time months of May and June 2012, that will not be asserted until January 2090. If nothing happens between now and January 2090, then at that time, the database will stop asserting that P861 had a copay amount of $30 in May and June of 2012, and begin asserting, instead, that it had a copay amount of $55 during those two long-ago months.
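The effect of the completed transaction can be checked with a query that fixes both an effective month and a point in assertion time. The Python below is a sketch only; the dictionary layout and the function name copay_as_asserted are assumptions for illustration, and only the two rows covering May and June 2012 are included.

```python
from datetime import date

END_OF_TIME = date(9999, 12, 31)

MAY_2012, JUL_2012 = date(2012, 5, 1), date(2012, 7, 1)
JAN_2013, JAN_2090 = date(2013, 1, 1), date(2090, 1, 1)

rows = [
    {"row": "r6", "eff": (MAY_2012, JUL_2012), "asr": (JAN_2013, JAN_2090), "copay": 30},
    {"row": "r8", "eff": (MAY_2012, JUL_2012), "asr": (JAN_2090, END_OF_TIME), "copay": 55},
]


def copay_as_asserted(rows, eff_month, asserted_on):
    """Return the copay in effect for `eff_month`, as asserted on `asserted_on`."""
    for r in rows:
        in_effect = r["eff"][0] <= eff_month < r["eff"][1]
        asserted = r["asr"][0] <= asserted_on < r["asr"][1]
        if in_effect and asserted:
            return r["copay"]
    return None  # nothing was asserted about that month at that time


june_2012 = date(2012, 6, 1)
seen_today = copay_as_asserted(rows, june_2012, JAN_2013)
seen_just_before = copay_as_asserted(rows, june_2012, date(2089, 12, 1))
seen_in_2090 = copay_as_asserted(rows, june_2012, JAN_2090)
```

Asked today, or at any time up to December 2089, the query returns $30 for June 2012. Asked in January 2090 or later, it returns $55, without any further physical change to the table.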
A Deferred Update to a Deferred Assertion
Now we have a deferred assertion. Next, let's consider an update which will apply to that deferred assertion. This transaction takes place in February 2013.
UPDATE Policy [P861,,, $50] May 2012, Jun 2012, Jan 2090
Apparently, sometime in the month after the first deferred update, we decided that the copay should have been increased to $50, not to $55, for the month of May 2012. To process this second deferred update, the AVF begins its work by looking for versions already in the target table, with the same oid, whose effective time periods [intersect] the effective timespan specified on the transaction. It ignores past assertions, because database modifications neither affect past assertions nor are affected by them.
The effective timespan for P861 that the AVF is looking for is [May 2012 – Jun 2012]. The AVF finds two rows—P861(r6 & r8) (as shown in Figure 12.4)—whose effective time includes that of the timespan on the transaction. Both rows have the same oid as the transaction, and both include the effective-time clock tick of May 2012.
P861(r6), however, is locked because there is a later assertion about the same object that includes all its effective time clock ticks. It is P861(r8) that is the latest assertion which has an effective time period that [intersects] that of the transaction. 5 That row's time period, to be more precise, [starts⁻¹] the effective time period on the transaction.
5 As we said in Chapter 3, we will refer to Allen relationships by using the relationship name enclosed in brackets. And as we said in Chapter 9, we will refer to temporal extent state transformations by using the transformation name enclosed in braces. In both cases, when we refer to non-leaf nodes in either taxonomy, we will underline the name. Thus we can say that one time period [meets] another, or that one time period [intersects] another. We italicize the Allen relationship name equals, as we explained in Chapter 3, to mark the fact that, unlike all other Allen relationships, it has no distinct inverse.
So the target of the deferred update must be P861(r8). It is the latest, i.e. future-most, assertion about the month of May 2012, in the life of P861.
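The target selection just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the AVF's implementation: the `Row` class, the function names, and the use of first-of-month dates as clock ticks are ours, and the row values are reconstructed from the prose because the figures are not available (in particular, the January 2013 assertion begin date on P861(r6) is an assumption).

```python
from dataclasses import dataclass
from datetime import date

END_OF_TIME = date(9999, 12, 31)  # stands in for the 12/31/9999 end date


@dataclass
class Row:
    row_id: str
    oid: str
    eff_begin: date   # effective period is closed-open: eff_end is excluded
    eff_end: date
    asr_begin: date   # assertion period is closed-open as well
    asr_end: date


def intersects(b1: date, e1: date, b2: date, e2: date) -> bool:
    """True when two closed-open periods share at least one clock tick."""
    return b1 < e2 and b2 < e1


def candidates(rows, oid, eff_begin, eff_end, now):
    """Non-past assertions for oid whose effective time intersects the span."""
    return [r for r in rows
            if r.oid == oid
            and r.asr_end > now  # past assertions are ignored
            and intersects(r.eff_begin, r.eff_end, eff_begin, eff_end)]


def targets(rows, oid, eff_begin, eff_end, now):
    """Of the candidates, only unlocked rows (12/31/9999 assertion end) qualify."""
    return [r for r in candidates(rows, oid, eff_begin, eff_end, now)
            if r.asr_end == END_OF_TIME]


# Illustrative values reconstructed from the prose (assumptions, see above).
r6 = Row("r6", "P861", date(2012, 5, 1), date(2012, 7, 1),
         date(2013, 1, 1), date(2090, 1, 1))
r8 = Row("r8", "P861", date(2012, 5, 1), date(2012, 7, 1),
         date(2090, 1, 1), END_OF_TIME)

now = date(2013, 2, 1)
found = candidates([r6, r8], "P861", date(2012, 5, 1), date(2012, 6, 1), now)
chosen = targets([r6, r8], "P861", date(2012, 5, 1), date(2012, 6, 1), now)
print([r.row_id for r in found], [r.row_id for r in chosen])
```

Both rows intersect the transaction's effective timespan, but only P861(r8) survives the lock test, which is the conclusion reached above.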
Next, because P861(r8) includes June as well as May, the first thing the AVF does is to split that row to create a semantically equivalent pair of rows, one of which matches the transaction. This is shown in Figure 12.5. P861(r8) has been withdrawn. In its place, the AVF has created the two rows P861(r9 & r10).
Figure 12.5 A Deferred Assertion: Effective Time Alignment. (Image missing.)
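The split can be sketched as follows. This is a minimal illustration, assuming that the transaction's effective timespan intersects the row being split; the `Row` class, the `split_for_alignment` function, and the row values (reconstructed from the prose, since the figure is not available) are our own hypothetical names and data, not part of the AVF.

```python
from dataclasses import dataclass, replace
from datetime import date

END_OF_TIME = date(9999, 12, 31)  # stands in for 12/31/9999


@dataclass(frozen=True)
class Row:
    row_id: str
    eff_begin: date   # closed-open effective period
    eff_end: date
    asr_begin: date   # closed-open assertion period
    asr_end: date
    copay: int


def split_for_alignment(row, txn_eff_begin, txn_eff_end, asr_date, new_ids):
    """Withdraw `row` and return (withdrawn_row, pieces).

    The pieces jointly cover the same effective time as `row`, carry the same
    business data, and one of them matches the transaction's effective span.
    Precondition (assumed): the span intersects the row's effective period.
    """
    cuts = sorted({row.eff_begin, row.eff_end,
                   max(row.eff_begin, txn_eff_begin),
                   min(row.eff_end, txn_eff_end)})
    pieces = [replace(row, row_id=rid, eff_begin=b, eff_end=e,
                      asr_begin=asr_date, asr_end=END_OF_TIME)
              for rid, (b, e) in zip(new_ids, zip(cuts, cuts[1:]))]
    # The original row stops being asserted on the tick its replacements begin.
    withdrawn = replace(row, asr_end=asr_date)
    return withdrawn, pieces


r8 = Row("r8", date(2012, 5, 1), date(2012, 7, 1),
         date(2090, 1, 1), END_OF_TIME, 55)
r8_withdrawn, (r9, r10) = split_for_alignment(
    r8, date(2012, 5, 1), date(2012, 6, 1), date(2090, 1, 1), ["r9", "r10"])
print(r8_withdrawn.asr_begin == r8_withdrawn.asr_end)  # the period holds no clock ticks
```

Because the assertion date on the transaction equals the assertion begin date already on P861(r8), the withdrawn row ends up with identical assertion begin and end dates.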
P861(r8) has been withdrawn into closed assertion time, but that assertion time is neither past nor present assertion time. It is empty assertion time, because the time period [Jan 2090 – Jan 2090] includes no clock ticks, not a single one.
Reflections on Empty Assertion Time
In all our dealings with temporal transactions, the assertion date specified on the transaction (or accepted as a default) is used both as the assertion end date of the withdrawn row and also as the assertion begin date of the row or rows that replace and/or supercede it. In this way, our transactions build an unbroken succession of assertions about what the object in question is like during the unbroken extent of the episode's effective time.
P861(r8) cannot be withdrawn into past assertion time because it hasn't been asserted yet. But it also can't be allowed to remain in future assertion time because if P861(r9 and/or r10) are ever updated, they and P861(r8) would make different statements about what P861 was like at the same point in time, i.e. in either May or June 2012. In other words, P861(r8) can't be allowed to remain in future assertion time because it would then be a TEI conflict waiting to happen.
This is why the AVF moved it into empty assertion time. This is the semantically correct thing to do. With P861(r9 & r10) now in the database, which together match P861(r8), and with both being in yet-to-come assertion time, one of them had to go. Creating P861(r9 & r10) is a preparatory move made by the AVF, to isolate a single deferred assertion that will match the update transaction. So P861(r8) was the correct one to go. Having nowhere in past assertion time to go, and obviously not belonging in current assertion time, it went to the only place it could go—into non-asserted time, i.e. into empty assertion time.
A row in empty assertion time, however, is a row that never was asserted and never will be asserted. So there is an argument for simply physically deleting the row rather than moving it into empty assertion time. For one thing, Asserted Versioning cannot keep track of when it was moved into empty assertion time. The only physical date on an asserted version table is the row creation date, and the movement of a row into empty assertion time is a physical update, for which there is no corresponding date.
For another thing, since a row in empty assertion time never was asserted, and never will be asserted, what information does it contain that would justify retaining it in the database? Well, in fact, a row in empty assertion time is informative. The information it contains is information about an intention. At one point in time, we apparently intended that the business data on that row would one day be asserted. Perhaps we intended to deceive someone with that business data. In that case, that row is a record of an intent to deceive. By retaining the row, we retain a record of that intent.
Non-deferred transactions are always against currently asserted versions which have a 12/31/9999 assertion end date. They withdraw those target versions by ending their assertions on the same clock tick that their replacement and/or superceding versions begin to be asserted. The result is to withdraw those target versions into past assertion time, but leave no assertion time gap between them and the results of the transaction.
Deferred transactions against those same currently asserted versions do the same thing. They withdraw them by ending their assertions on the same clock tick that their replacement and/or superceding versions will begin to be asserted. But being deferred, those replacement and/or superceding versions begin on some future date. Using that future date as the assertion end date of the target versions, those target versions are withdrawn, but into current assertion time. This current assertion time, however, has a definite, non-12/31/9999, end date, and so we say that their assertion periods are current but closed. If nothing happens in the meantime, then when that date comes to pass, the current closed assertions will fall into past assertion time, and the deferred assertions which replaced and/or superceded them will fall into current assertion time. The mechanics of withdrawal supports these different semantics correctly, just as it supported the semantics of non-deferrals correctly.
Deferred update and delete transactions may also have deferred assertions as their target. However, for any oid and any effective-time clock tick, the target of a deferred update or delete transaction must be the latest assertion of that effective-time clock tick for that object because, if it were not, it would violate the serialization property of deferred assertions (as described earlier, in the section A Deferred Update to a Current Episode). And the AVF guarantees that this will be so because any but the latest assertion will be locked; it will be on a row with a non-12/31/9999 assertion end date.
The mechanics of the AVF does its job, as in the first two cases, by ending the withdrawn assertions on the same clock tick that their replacement and/or superceding versions begin to be asserted.
For example, P861(r8) has an assertion begin date of January 2090. If a deferred update transaction targeting P861(r8) specified any assertion date later than that, then it would leave P861(r8) to become currently asserted on January 2090, and to remain currently asserted until whatever assertion end date the transaction assigned to it. That's an ordinary enough case, and perhaps we should not be surprised that the machinery of deferral works correctly for it. But in fact, the deferred update we are discussing here specifies an assertion date of January 2090, the same date as the begin date on the target deferred assertion. And this is not so ordinary a case.
But in this case, too, what the mechanics achieves is precisely what the semantics demands. In this case, P861(r8)'s assertion end date is set to January 2090, with the result that its assertion time period is [Jan 2090 – Jan 2090]. With a closed-open convention for representing periods of time, this is an empty time period, one including not a single clock tick. It makes it as though P861(r8) had never been. It makes that row one which never was asserted and never will be asserted. For such rows, we will say, the transaction overrides them. So, to override a row is to withdraw it into empty assertion time prior to its ever being asserted in the first place.
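The different regions of assertion time discussed in this section can be told apart by a simple comparison of dates. The following sketch is only an illustration: the `classify` function and its labels are our own, clock ticks are represented as first-of-month dates, and the January 2013 assertion begin date used for P861(r6) and the entire "past" example are assumptions made for the sake of the demonstration.

```python
from datetime import date

END_OF_TIME = date(9999, 12, 31)  # stands in for 12/31/9999


def classify(asr_begin: date, asr_end: date, now: date) -> str:
    """Place a closed-open assertion period relative to the current clock tick."""
    if asr_begin >= asr_end:
        return "empty"            # contains not a single clock tick
    if asr_end <= now:
        return "past"
    if asr_begin > now:
        return "deferred"
    # The period contains `now`: open if it ends at 12/31/9999, closed otherwise.
    return "current, open" if asr_end == END_OF_TIME else "current, closed"


now = date(2013, 2, 1)
labels = {
    "r6": classify(date(2013, 1, 1), date(2090, 1, 1), now),    # locked by a deferred update
    "r8": classify(date(2090, 1, 1), date(2090, 1, 1), now),    # overridden
    "r9": classify(date(2090, 1, 1), END_OF_TIME, now),         # deferred assertion
    "hypothetical": classify(date(2010, 1, 1), date(2013, 1, 1), now),
}
print(labels)
```

Note that the test for an empty period comes first: an overridden row is neither past, present, nor future, whatever the current clock tick happens to be.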
What the semantics demands is a replacement row and a superceding row to cover the months of May and June 2012 in the life of P861, and for both those rows to begin to be asserted on January 2090. With P861(r9 & r10), that's exactly what it gets. There is now a target row which exactly matches the update transaction, and the transaction can now proceed on to completion.
Completing the Deferred Update to a Deferred Assertion
The remaining analysis is straightforward. P861(r9) matches the deferred update transaction. P861(r10) is of no interest to the transaction because its effective time period does not share even a single clock tick with the effective timespan of the transaction.
Having created a target row which matches the transaction, the AVF now updates that row with the new copay amount. Note that it does not withdraw P861(r9) and supercede it with a new row. It could do that, but there is no need to do so because we are still in the midst of an atomic and isolated unit of work. At this point, the change to the copay amount is recorded. At this point, the update is complete. The result is shown in Figure 12.6.
Figure 12.6 Completing the Deferred Update. (Image missing.)
As directed by the transaction, the AVF has created a version of P861, for the effective time period of May 2012, that will not be asserted until January 2090. The first deferred update changed the copay amount for P861, for the month of May 2012, from $30 to $55. This second deferred update corrected the copay amount which the first one set to $55. It changed it to $50.
Once again, we retain the angle brackets in the illustration to make it easy to identify the rows involved in the transaction. But the transaction, at this point, is complete. All DBMS locks are released, and all the rows in Figure 12.6 are now visible in the database. P861(r9 & r10) are not locked. P861(r8) has been overridden by those next two rows, and moved into empty assertion time. But note that P861(r6) is still both currently asserted and locked.
Does the business really intend to leave the database in this state? Does it really intend to continue saying until 2090 that in May and June 2012, P861 has a copay of $30, even though it apparently knows that the correct amount is $50 in May and $55 in June? Well, it certainly doesn't seem very likely.
The Near Future and the Far Future
Deferred assertions may be located in the near future or the far future. Deferred assertions located in the near future will become current assertions as soon as enough time has passed. In a real-time update situation, a near future deferred assertion might be one with an assertion begin date just a few seconds from now. In a batch update situation, a near future deferred assertion might be one that does not become currently asserted until midnight, or perhaps even for another several days. What near future deferred assertions have in common is that, in all cases, the business is willing to wait for these assertions to fall into currency, i.e. to become current not because of some explicit action, but rather when the passage of time reaches their begin dates.
Deferred assertions may be created in near future assertion time, or moved to it from far future assertion time when the business approves of those assertions becoming production data. Deferred assertions may also be placed in or moved to far future assertion time. Such are our two deferred assertions shown above, which will not become current until nearly eight decades from now. It is unlikely, of course, that the business intends to wait that long. So once the business reviews those assertions and approves them, it will want them to become current assertions as soon as possible. It will do that by moving them into near future assertion time.
Assertions located in the far future are, for one reason or another, not ready to be applied to the production database. For example, they may be transactions that are created by assembling data from multiple sources. One of those sources arrives before the others, and so can create only incomplete transactions. Rather than managing those incomplete transactions as an inflow pipeline dataset, the user can submit them to the AVF using a far future assertion begin date, such as one several decades from now, or perhaps several hundred or several thousand years from now. As the other data sources begin to provide their contributions to those transactions, deferred update transactions override the deferred assertions placed there by earlier data sources. Eventually, the transactions are completed. Once approved, they can be moved into near future assertion time, ready to fall into currency in the near future, on the same clock tick that the assertions they replace and/or supercede fall out of currency and into assertion time history.
And there are any number of other reasons for assembling updates in far future assertion time. One is that a group of updates may be so important that the business wants a careful review-and-approval process before they are applied to production tables. Another is to create a group of assertions that the business can use for simulations or forecasts.
Once far future deferred assertions are ready to become production data, they must be moved into near future assertion time. Located close to Now(), those deferred assertions will then quickly fall into currency. They will quickly become currently asserted production data.
What we need now is a transaction that will move assertions from the far future to the near future. We will call it the approval transaction.
Approving a Deferred Assertion
When a deferred transaction is applied to the database, it locks all prior but not yet past assertions for that object and that effective time period by setting the assertion end date to a non-12/31/9999 date. It withdraws matching current assertions, and either withdraws matching deferred assertions, or overrides them, or withdraws an earlier portion of them and overrides the remaining portion. The deferred transaction then creates a deferred assertion for the specified object in the specified effective time period, whose assertion begin date is set to the assertion begin date specified on the transaction.
For example, the first of the deferred transactions we looked at locked the effective time months of May and June 2012 for policy P861, and then created a deferred assertion for that policy in those two months. The second deferred transaction focused in on the month of May 2012, isolating it by splitting the deferred assertion P861(r8) into the two semantically equivalent deferred assertions P861(r9 & r10), and overriding P861(r8) with those two deferred assertions. Next, with P861(r9) representing the policy during May 2012, the deferred transaction applied the new copay amount to that row, completing the transaction and the atomic unit of work, ending the isolation of those rows and making them visible in the database, accessible to queries that specify assertions deferred until 2090.
As shown in Figure 12.6, the Policy table now contains only three deferred assertions that have not been overridden. One is P861(r6), whose withdrawal has been deferred until January 2090. The other two are P861(r9 & r10). They constitute a single deferred assertion group, that group being defined by the future assertion date that they share.
A deferred assertion group is another managed object introduced by Asserted Versioning but not supported by relational theory, relational technology, other temporal models, or ongoing research in the field. It is a designated collection of one or more rows which consist of assertions in the same future assertion period of time, and, transitively, any earlier non-past assertions that are locked because of them. These deferred assertion groups can contain assertions for different episodes of the same object, and for different objects in the same or in different tables.
Besides its own currently asserted production data, a production table may contain any number of deferred assertion groups. These deferred assertion groups are the internalization of inflow pipeline datasets. They are the internalization of collections of transactions which are not currently production data.
Usually, these collections are called batch transaction datasets. Typically, there may be any number of batch transaction datasets in which pending transactions are accumulated as they are acquired or created. One by one, on a scheduled or as-needed basis, these batch datasets are processed against their target databases, and production tables are updated. But with asserted version tables as the target production tables, these batch datasets aren't necessary. Transactions scheduled to be processed on a later date can be submitted immediately, with that later date as the assertion begin date.
Let us assume that the business has now reviewed the deferred assertion group and approved the assertions in it to become current as soon as possible. It is now March 2013, and so the next opportunity to update the database is April 2013.
The AVF moves deferred assertions backwards in time with a special temporal update transaction. This transaction takes deferred assertions in the far future and moves them into the near future.
But before we move P861(r9 & r10) backwards in assertion time, consider P861(r6). P861(r9 & r10) were created as assertion-time contiguous with P861(r8), which itself was created as assertion-time contiguous with P861(r6). The idea was that, on January 2090, when P861(r6) ceased being asserted, it would hand-off to P861(r8) on precisely that clock tick. But then a second deferred update was applied, which overrode P861(r8) with P861(r9 & r10), and then updated P861(r9).
When we originally created P861(r9 & r10), that future clock tick was January 2090. We are now about to change the assertion begin date on those two assertions to April 2013.
But if we do so, and do nothing about P861(r6), we will create a TEI violation. If we do nothing about P861(r6), then from January 2013 to January 2090, P861(r6) will assert that P861's copay amount in May 2012 was $30, but P861(r9) will assert that it was $50. So even though P861(r6) exists in a closed period of assertion time, it can, and indeed in this case must, be overridden. So rather than thinking of the approval transaction as changing the assertion begin date on one or more deferred assertions, we should think of it as changing the hand-over clock tick between locked assertions and the deferred assertions that are being moved backwards in assertion time.
The approval transaction looks like this (see footnote 6):
UPDATE Policy [ ],, Jan 2090, Apr 2013
Footnote 6: As we have noted before, these examples do not use the syntax that will be used in release 1 of the AVF. The temporal data in these transactions is shown in a refinement of a comma-delimited positional notation.
This transaction is unlike the standard temporal update transaction in that its temporal parameters are both assertion dates. As indicated by the commas, there are no effective time dates on this transaction. And although a standard transaction can have one assertion date, this transaction has two assertion dates.
The first assertion date on the approval transaction is the assertion group date. The second is the assertion approval date.
The transaction proceeds as an atomic (all-or-nothing) and isolated unit of work. For all assertions whose assertion begin date matches the assertion group date, it changes their assertion begin dates to the approval date. This is shown in Figure 12.7. P861(r9 & r10) have been moved from far future (2090) into near future (2013) assertion time. As soon as April 2013 occurs, those two rows will fall into currency.
Figure 12.7 Approving a Deferred Assertion Group. (Image missing.)
The approval transaction is almost complete, but it has one thing left to do. As shown in Figure 12.6, P861(r6) has a January 2090 assertion end date prior to the approval transaction. If nothing is done, then less than a month after the approval transaction is applied, P861(r9 & r10) will be in TEI conflict with P861(r6), and will remain so for several decades.
This is because the override work of the approval transaction is incomplete. P861(r9 & r10) match P861(r6), which exists in current but closed assertion time. But in order to make room in near future assertion time, the AVF must withdraw any earlier assertions that would conflict with the assertions being moved backwards in time by the approval transaction. So, using the same withdraw/override mechanics it has always used, the AVF sets the assertion end date on P861(r6) to the assertion begin date of the two rows it has moved into near future assertion time, that date being April 2013.
The approval transaction is now complete. The deferred assertions have been moved into near future time, and are waiting to fall into currency. The database is in the state shown in Figure 12.7.
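The two steps of the approval transaction, moving the group and then moving the hand-over clock tick on the locked rows, can be sketched as follows. This is a simplified illustration under several assumptions of our own: the `Row` class and function names are hypothetical, rows already in empty assertion time are left untouched, the January 2013 assertion begin date on P861(r6) is inferred from the prose, and neither atomicity nor the TRI checks discussed in the next section are modeled.

```python
from dataclasses import dataclass
from datetime import date

END_OF_TIME = date(9999, 12, 31)  # stands in for 12/31/9999


@dataclass
class Row:
    row_id: str
    asr_begin: date   # closed-open assertion period
    asr_end: date


def approve(rows, group_date, approval_date):
    """Move the hand-over clock tick from group_date to approval_date, in place.

    A real implementation would do this inside one database transaction.
    """
    for r in rows:
        if r.asr_begin == r.asr_end:
            continue  # assumption: overridden rows stay in empty assertion time
        if r.asr_begin == group_date:
            r.asr_begin = approval_date           # deferred assertion moves earlier
        elif r.asr_end == group_date:
            # Locked row: withdraw it at the new hand-over tick, or override it
            # (empty period) if it would otherwise begin after that tick.
            r.asr_end = max(r.asr_begin, approval_date)


def currently_asserted(rows, now):
    """Row ids whose assertion period contains the clock tick `now`."""
    return [r.row_id for r in rows if r.asr_begin <= now < r.asr_end]


rows = [
    Row("r6", date(2013, 1, 1), date(2090, 1, 1)),
    Row("r8", date(2090, 1, 1), date(2090, 1, 1)),
    Row("r9", date(2090, 1, 1), END_OF_TIME),
    Row("r10", date(2090, 1, 1), END_OF_TIME),
]
approve(rows, group_date=date(2090, 1, 1), approval_date=date(2013, 4, 1))
before = currently_asserted(rows, date(2013, 3, 1))
after = currently_asserted(rows, date(2013, 4, 1))
print(before, after)
```

With no further transactions, the mere passage of time from March to April 2013 changes which rows are currently asserted, which is what it means for the approved assertions to fall into currency.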
And, once again, we find that our mechanics, applied to a situation never anticipated for it, produces results that accurately express the correct semantics. For with its approval transaction, the business told us that we could update the copay amount for P861 in May of 2012 as soon as possible. As soon as possible is April 2013. So our database now shows the incorrect claim about P861 in May of 2012 continuing until that as soon as possible correction, and that correction, as the two rows P861(r9 & r10), taking over on that same clock tick.
In this way, multiple deferred assertions can be managed as a single group. For example, if we are adding 1000 clients to our database, then if all 1000 clients are assigned the same future assertion date, a single approval transaction can be used to assert all of them at once.
Deferred Assertions and Temporal Referential Integrity
Deferred update and delete transactions, like their non-deferred cousins, lock matching assertions that were already in the database at the time those transactions were carried out. They lock them by giving them a non-12/31/9999 assertion end date. In the case of a non-deferred update or delete, these locked assertions exist in past assertion time. But in the case of a deferred transaction, the locked assertions remain in current assertion time, and their assertion time periods [meet] the assertion time periods of the deferred assertions that replace or supercede them.
When an approval transaction is applied to a group of deferred assertions, those assertions are moved backwards in assertion time, usually to just a few clock ticks later than the current moment in time. Then, with the passage of those few clock ticks, those deferred assertions become current assertions.
In moving backwards in assertion time, those approved assertions override any locked matching assertions. In overriding them, the approval transaction “sets them to naught” almost literally, by setting their assertion end dates to match their assertion begin dates, thus moving them into empty assertion time.
But there is one last issue to deal with. We have emphasized that semantic constraints do not exist across assertion time periods. But if a TRI child managed object is moved backwards into an earlier period of assertion time, one which begins before the assertion time period containing its parent managed object, then the TRI relationship between them will be broken. The assertion time movement will make the child managed object a referential “orphan” until the passage of time reaches the beginning of the assertion time period of the parent managed object.
So the AVF must block any such movement, or else ensure that as part of the same atomic and isolated unit of work, parent and child managed objects are moved together so as to preserve the referential relationships.
It turns out that this isn't always easy to do, especially when the related managed objects exist in different deferred assertion groups. The problem is that, as long as an approval transaction is not applied, the assertion time of any TRI deferred parent is guaranteed to include the assertion time of all of its deferred children. But by applying an approval transaction, we may break the inclusion relationship by moving the start of the assertion time of the approved children to a date prior to the beginning assertion time of the not-yet-approved parent. We are working on the problem as this book goes to press. We know that the problem is not insoluble. But we also know that it is difficult.
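The inclusion condition itself, as opposed to the harder problem of coordinating approvals across deferred assertion groups, is easy to state. The following sketch shows only the blocking check; it is not a solution to the problem just described. The function names and all of the dates are hypothetical, chosen purely for illustration.

```python
from datetime import date

END_OF_TIME = date(9999, 12, 31)  # stands in for 12/31/9999


def includes(parent, child):
    """True when the parent's closed-open period includes the child's."""
    (parent_begin, parent_end), (child_begin, child_end) = parent, child
    return parent_begin <= child_begin and child_end <= parent_end


def approval_breaks_tri(parent_asr, child_asr, approval_date):
    """True when moving the child's assertion begin date to approval_date would
    leave part of the child's assertion time outside the parent's."""
    _, child_end = child_asr
    return not includes(parent_asr, (approval_date, child_end))


# Hypothetical case 1: parent and child are both deferred to January 2090,
# and only the child is approved for April 2013.
deferred_parent = (date(2090, 1, 1), END_OF_TIME)
child = (date(2090, 1, 1), END_OF_TIME)
orphaned = approval_breaks_tri(deferred_parent, child, date(2013, 4, 1))

# Hypothetical case 2: the parent is already currently asserted.
current_parent = (date(2010, 1, 1), END_OF_TIME)
safe = not approval_breaks_tri(current_parent, child, date(2013, 4, 1))
print(orphaned, safe)
```

In the first case the AVF would have to block the approval or move the parent in the same unit of work; in the second, the approval can proceed.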
Glossary References
Glossary entries whose definitions form strong interdependencies are grouped together in the following list. The same glossary entries may be grouped together in different ways at the end of different chapters, each grouping reflecting the semantic perspective of each chapter. There will usually be several other, and often many other, glossary entries that are not included in the list, and we recommend that the Glossary be consulted whenever an unfamiliar term is encountered.
We note, in particular, that the nine terms used to refer to the act of giving a truth value to a statement, listed in the section The Semantics of Deferred Assertion Time, are not included in this list. Nor are nodes in our Allen Relationship taxonomy or our State Transformation taxonomy included in this list.
12/31/9999
clock tick
closed-open
Now()
Allen relationships
approval transaction
assertion group date
deferred assertion group
deferred assertion
deferred transaction
empty assertion time
fall into currency
fall out of currency
far future assertion time
near future assertion time
override
lock
retrograde movement
Asserted Versioning Framework (AVF)
assertion begin date
assertion end date
assertion time period
assertion time
assertion
closed assertion
conventional table
dataset
episode
open episode
statement
hand-over clock tick
instance
type
managed object
object
oid
persistent object
thing
occupied
represented
match
replace
supercede
withdraw
pipeline dataset
inflow pipeline dataset
inflow pipeline
outflow pipeline dataset
outflow pipeline
production data
production database
production dataset
production table
row creation date
temporal dimension
temporal entity integrity (TEI)
temporal foreign key (TFK)
temporal referential integrity (TRI)
the standard temporal model
transaction table
transaction time
version
effective begin date
effective end date
effective time period