System-versioned tables in SQL Server 2016

SQL Server 2016 introduces support for system-versioned temporal tables. Unfortunately, application-time tables are not implemented in this version. System-versioned temporal tables bring built-in support for providing information about data stored in the table at any point in time rather than only the data that is correct at the current moment in time. They are implemented according to the specification in the ANSI SQL 2011 standard with a few extensions.

How temporal tables work in SQL Server 2016

A system-versioned temporal table is implemented in SQL Server 2016 as a pair of tables: the current table containing the actual data, and the history table where only historical entries are stored. There are many limitations of both current and history tables. Here are limitations and considerations that you must take into account for the current table of a system-versioned temporal table:

  • It must have a primary key defined
  • It must have one PERIOD FOR SYSTEM_TIME defined with two DATETIME2 columns
  • Cannot be FILETABLE and cannot contain FILESTREAM data type
  • INSERT, UPDATE, and MERGE statements cannot reference and modify period columns: start column is always set to system time, end column to max date value
  • INSTEAD OF triggers are not allowed
  • TRUNCATE TABLE is not supported

The list of limitations for a history table is significantly longer and brings many additional restrictions. The history table of a system-versioned temporal table:

  • Cannot have constraints defined (primary or foreign keys, check, table, or column constraints). Only default column constraints are allowed
  • You cannot modify data in the history table
  • You can neither ALTER nor DROP a history table
  • It must have the same schema as the current table (column names, data types, ordinal position)
  • Cannot be defined as the current table
  • Cannot be FILETABLE and cannot contain FILESTREAM data type
  • No triggers are allowed (neither INSTEAD OF nor AFTER)
  • Change Data Capture and Change Data Tracking are not supported

You can read more about considerations and limitations when working with temporal tables at https://msdn.microsoft.com/en-us/library/mt604468.aspx.

The list might look long and discouraging, but all these limitations are there to protect data consistency and accuracy in history tables. However, although you cannot change logical attributes, you can still perform actions related to the physical implementations of the history table: you can switch between rowstore and columnstore table storage, you can choose columns for clustered and create additional non-clustered indexes.

Creating temporal tables

To support the creation of temporal tables, the CREATE TABLE and ALTER TABLE Transact-SQL statements have been extended. To create a temporal table, you need to:

  • Define a column of DATETIME2 data type for holding the info since when a row is valid from a system point of view
  • Define a column of DATETIME2 data type for holding the info until when a row is valid from the same point of view
  • Define a period for system time by using previously defined and described columns
  • Set the newly-added SYSTEM_VERSIONING table attribute to ON

The following code creates a new temporal table Product in the schema dbo in the database WideWorldImporters:

USE WideWorldImporters; 
CREATE TABLE dbo.Product 
( 
   ProductId INT NOT NULL CONSTRAINT PK_Product PRIMARY KEY, 
   ProductName NVARCHAR(50) NOT NULL, 
   Price MONEY NOT NULL, 
   ValidFrom DATETIME2 GENERATED ALWAYS AS ROW START NOT NULL, 
   ValidTo DATETIME2 GENERATED ALWAYS AS ROW END NOT NULL, 
   PERIOD FOR SYSTEM_TIME (ValidFrom, ValidTo) 
) 
WITH (SYSTEM_VERSIONING = ON); 

You can identify all four elements related to temporal table creation from the above list: two period columns, the period, and the SYSTEM_VERSIONING attribute. Note that all elements marked bold are predefined and you must write them exactly like this; data type, nullability, and default values for both period columns are also predefined and you can only choose their names and define data type precision. The data type must be DATETIME2; you can only specify its precision. Furthermore, the period definition itself is predefined too; you must use period column names you chose in the previous step. By defining period columns and the period, you have created the infrastructure required for implementing temporal tables. However, if you create a table with them but without the SYSTEM_VERSIONING attribute, the table will not be temporal. It will contain an additional two columns with values that are maintained by the system, but the table will not be a system-versioned temporal table.

The final, fourth, part is to set the attribute SYSTEM_VERSIONING to ON. When you execute the above code, you implicitly instruct SQL Server to automatically create a history table for the temporal table dbo.Product. The table will be created in the same schema (dbo), and with a name according to the following format MSSQL_TemporalHistoryFor_<current_temporal_table_object_id>_[suffix]. The suffix is optional and it will be added only if the first part of the table name is not unique. Figure 7.4 shows what you will see when you open SQL Server Management Studio and find the dbo.Product table :

Creating temporal tables

Figure 7.4: Temporal table in SQL Server Management Studio

You can see that all temporal tables have a small clock icon indicating temporality. Under the table name, you can see its history table. Note that columns in both tables are identical (column names, data types, precision, nullability), but also that the history table does not have constraints (primary key).

Note

Period columns must have DATETIME2 as their data type. If you try with DATETIME, you will get an error. Standard does not specify data type precision, so this is not strictly implemented according to standard. This is very important when you migrate your existing temporal solution to new temporal tables in SQL Server 2016. Usually, columns that you were using are DATETIME data type and you have to extend this to DATETIME2.

You can also specify the name of the history table only and let SQL Server create it with the same attributes as described earlier. Use the following code to create a temporal table with a user-defined history table name:

USE WideWorldImporters; 
CREATE TABLE dbo.Product2 
( 
   ProductId INT NOT NULL CONSTRAINT PK_Product2 PRIMARY KEY, 
   ProductName NVARCHAR(50) NOT NULL, 
   Price MONEY NOT NULL, 
   ValidFrom DATETIME2 GENERATED ALWAYS AS ROW START NOT NULL, 
   ValidTo DATETIME2 GENERATED ALWAYS AS ROW END NOT NULL, 
   PERIOD FOR SYSTEM_TIME (ValidFrom, ValidTo) 
) 
WITH (SYSTEM_VERSIONING = ON (HISTORY_TABLE = dbo.ProductHistory2)); 
 

What storage type is used for the automatically created history table? By default, it is a rowstore table with the clustered index on the period columns. The table is compressed with PAGE compression, if it can be enabled for compression (has no SPARSE or (B)LOB columns). To find out the storage type, use the following code:

SELECT temporal_type_desc, p.data_compression_desc  
FROM sys.tables t 
INNER JOIN sys.partitions p ON t.object_id = p.object_id 
WHERE name = 'ProductHistory2'; 
 

The result of the above query shows that PAGE compression has been applied:

temporal_type_desc    data_compression_desc 
--------------------- --------------------- 
HISTORY_TABLE         PAGE 

You can also see that the history table has a clustered index. The following code extracts the index name and the columns used in the index:

SELECT i.name, i.type_desc, c.name, ic.index_column_id 
FROM sys.indexes i 
INNER JOIN sys.index_columns ic on ic.object_id = i.object_id 
INNER JOIN sys.columns c on c.object_id = i.object_id AND ic.column_id = c.column_id 
WHERE OBJECT_NAME(i.object_id) = 'ProductHistory2'; 

The output of this query shows that the automatically created history table has a clustered index on the period columns and the name in the following ix_<history_tablename> format:

name                  type_desc     name               index_column_id 
--------------------- ------------- -----------        ------------
ix_ProductHistory2    CLUSTERED     ValidFrom          1
ix_ProductHistory2    CLUSTERED     ValidTo            2

However, if the predefined implementation of the history table (rowstore, period columns in clustered index) doesn't meet your criteria for historical data, you can create your own history table. Of course, you need to respect all constraints and limitations listed at the beginning of the chapter. The following code first creates a history table, then a temporal table, and finally assigns the history table to it. Note that, in order to proceed with the code execution, you need to remove the temporal table created in the first example in this chapter:

USE WideWorldImporters; 
ALTER TABLE dbo.Product SET (SYSTEM_VERSIONING = OFF);    
ALTER TABLE dbo.Product DROP PERIOD FOR SYSTEM_TIME;    
DROP TABLE IF EXISTS dbo.Product; 
DROP TABLE IF EXISTS dbo.ProductHistory; 
GO 
CREATE TABLE dbo.ProductHistory 
( 
   ProductId INT NOT NULL, 
   ProductName NVARCHAR(50) NOT NULL, 
   Price MONEY NOT NULL, 
   ValidFrom DATETIME2 NOT NULL, 
   ValidTo DATETIME2 NOT NULL 
); 
CREATE CLUSTERED COLUMNSTORE INDEX IX_ProductHistory ON dbo.ProductHistory; 
CREATE NONCLUSTERED INDEX IX_ProductHistory_NC ON dbo.ProductHistory(ProductId, ValidFrom, ValidTo); 
GO 
CREATE TABLE dbo.Product 
( 
   ProductId INT NOT NULL CONSTRAINT PK_Product PRIMARY KEY, 
   ProductName NVARCHAR(50) NOT NULL, 
   Price MONEY NOT NULL, 
   ValidFrom DATETIME2 GENERATED ALWAYS AS ROW START NOT NULL, 
   ValidTo DATETIME2 GENERATED ALWAYS AS ROW END NOT NULL, 
   PERIOD FOR SYSTEM_TIME (ValidFrom, ValidTo) 
) 
WITH (SYSTEM_VERSIONING = ON (HISTORY_TABLE = dbo.ProductHistory)); 

You will learn how to alter and drop system-versioned tables in more detail later in this chapter. Here, you should focus on the fact that you can create your own history table with a clustered columnstore index on it. Figure 7.5 shows what you will see when you look at SQL Server Management Studio and find the created temporal table:

Creating temporal tables

Figure 7.5: Temporal table in SQL Server Management Studio with user-defined history table

You can see that the table created by you acts as a history table and has a clustered columnstore index.

Period columns as hidden attributes

Period columns are used to support temporality of data and do not have business logic value. By using the HIDDEN clause, you can hide the new PERIOD columns to avoid impacting on existing applications that are not designed to handle new columns.

Converting non-temporal to temporal tables

In SQL Server 2016, you can create temporal tables from scratch, but you can also alter an existing table and add attributes to it to convert it to a system-versioned temporal table. All you need is to add period columns, define the SYSTEM_TIME period on them, and set the temporal attribute. The following code example demonstrates how to convert the Department table in the AdventureWorks2016CTP3 database to a temporal table:

USE AdventureWorks2016CTP3; 
ALTER TABLE HumanResources.Department 
ADD ValidFrom DATETIME2 GENERATED ALWAYS AS ROW START HIDDEN NOT NULL CONSTRAINT DF_Validfrom DEFAULT SYSDATETIME(), 
   ValidTo DATETIME2 GENERATED ALWAYS AS ROW END HIDDEN NOT NULL CONSTRAINT DF_ValidTo DEFAULT '99991231 23:59:59.9999999', 
   PERIOD FOR SYSTEM_TIME (ValidFrom, ValidTo); 
GO 
ALTER TABLE HumanResources.Department SET (SYSTEM_VERSIONING = ON (HISTORY_TABLE = HumanResources.DepartmentHistory));  

You have to use two ALTER statements: with the first statement, you define the period columns and the period, while the second statement sets the SYSTEM_VERSIONING attribute to ON and the name of the history table that will be created by the system. You should be aware that you need to provide default constraints for both columns, since they must be non-nullable. You can even use a date value from the past for the default constraint for the first period column; however, you cannot set values in the future.

Note

Adding a non-nullable column with a default constraint is a metadata operation in the Enterprise Edition only; in all other editions that means a physical operation with the allocation space to update all table rows with newly added columns. For large tables, this can take a long time and be aware that, during this action, the table is locked.

As you can see, it is very easy and straightforward to add temporal functionality to an existing non-temporal table. It works transparently; all your queries and commands will work without changes. But, what if you did not follow best practice and recommendations and used SELECT * in your statements? Since SELECT * includes all columns from tables involved in the FROM clause, two additional columns could lead to exceptions and breaking changes in the application.

To avoid this situation, you are allowed to define period date columns with the HIDDEN attribute, as shown in the above code. That means that these two columns will not be returned as part of the result set for a query that uses the SELECT * operation. Of course, it is still, and will always be, recommended to use a named list of columns (and only required ones) instead of SELECT *.

Note

I need, here, to express my concerns about another implementation which takes care of solutions where SELECT * is implemented. I can understand that the vendor does not want to introduce a feature that can break customers' existing applications, but on the other hand, you cannot expect a developer to stop using SELECT * when new features and solutions don't sanction bad development habits.

Hidden attributes allow you to convert any non-temporal table (which does not violate the temporal table limitations listed in the previous section) to a temporal table without worrying about breaking changes in your solutions.

Migration existing temporal solution to system-versioned tables

Most probably, you have had to deal with historical data in the past. Since there was no out-of-the-box feature in previous SQL Server versions, you had to create a custom temporal data solution. Now that the feature is available, you might think to use it for your existing temporal solutions. You saw earlier in this chapter that you can define your own history table. Therefore, you can also use an existing and populated historical table. If you want to convert your existing solution to use system-versioned temporal tables in SQL Server 2016, you have to prepare both tables so that they fill all requirements for temporal tables. To demonstrate this, you will again use the AdventureWorks2016CTP3 database and create both current and history tables by using tables that exist in this database. Use the following code to create and populate the tables:

USE WideWorldImporters; 
CREATE TABLE dbo.ProductListPrice 
( 
   ProductID INT NOT NULL CONSTRAINT PK_ProductListPrice PRIMARY KEY, 
   ListPrice MONEY NOT NULL, 
); 
INSERT INTO dbo.ProductListPrice(ProductID,ListPrice) 
SELECT ProductID,ListPrice FROM AdventureWorks2016CTP3.Production.Product; 
GO 
CREATE TABLE dbo.ProductListPriceHistory 
( 
   ProductID INT NOT NULL, 
   ListPrice MONEY NOT NULL, 
   StartDate DATETIME NOT NULL, 
   EndDate DATETIME   NULL, 
   CONSTRAINT PK_ProductListPriceHistory PRIMARY KEY CLUSTERED  
   ( 
         ProductID ASC, 
         StartDate ASC 
   ) 
); 
INSERT INTO dbo.ProductListPriceHistory(ProductID,ListPrice,StartDate,EndDate) 
SELECT ProductID, ListPrice, StartDate, EndDate FROM AdventureWorks2016CTP3.Production.ProductListPriceHistory;  

Consider the rows for the product with ID 707 in both tables:

SELECT * FROM dbo.ProductListPrice WHERE ProductID = 707; 
SELECT * FROM dbo.ProductListPriceHistory WHERE ProductID = 707; 

Here are the rows in the current and history tables respectively:

ProductID   ListPrice
---------   ---------
707         34,99
    
ProductID   ListPrice    StartDate               EndDate
---------   ---------    -------------------     -------------------
707         33,6442      2011-05-31 00:00:00.000 2012-05-29 00:00:00.000
707         33,6442      2012-05-30 00:00:00.000 2013-05-29 00:00:00.000
707         34,99        2013-05-30 00:00:00.000 NULL

Assume that this data has been produced by your temporal data solution and that you want to use system-versioned temporal tables in SQL Server 2016 instead of it, but to also use the same tables. The first thing you have to do is align the columns in both tables. Since the current table has no date columns, you need to add two period columns and define the period. The columns should have the same name as the counterpart columns from the history table. Here is the code that creates the temporal infrastructure in the current table:

ALTER TABLE dbo.ProductListPrice 
ADD StartDate DATETIME2 GENERATED ALWAYS AS ROW START HIDDEN NOT NULL CONSTRAINT DF_StartDate1 DEFAULT SYSDATETIME(), 
   EndDate DATETIME2 GENERATED ALWAYS AS ROW END HIDDEN NOT NULL CONSTRAINT DF_EndDate1 DEFAULT '99991231 23:59:59.9999999', 
   PERIOD FOR SYSTEM_TIME (StartDate, EndDate); 
GO 

The next steps are related to the history table. As you can see from the sample data, your current solution allows gaps in the history table and also contains the current value with the undefined end date. As mentioned earlier in this chapter, the history table only contains historical data and there are no gaps between historical entries (the new start date is equal to the previous end date). Here are the steps you have to implement in order to prepare the dbo.ProductLisPriceHistory table to act as a history table in a system-versioned temporal table in SQL Server 2016:

  • Update the non-nullable EndDate column to remove the gap between historical values described earlier and to support the open-closed interval
  • Update all rows where the EndDate column is null to the StartDate of the rows in the current table
  • Remove the primary key constraint
  • Change the data type for both date columns StartDate and EndDate to DATETIME2

Here is the code that implements all these requests:

--remove gaps 
UPDATE dbo.ProductListPriceHistory SET EndDate = DATEADD(day,1,EndDate); 
--update EndDate to StartDate of the actual record 
UPDATE dbo.ProductListPriceHistory SET EndDate = (SELECT MAX(StartDate) FROM dbo.ProductListPrice) WHERE EndDate IS NULL; 
--remove constraints 
ALTER TABLE dbo.ProductListPriceHistory DROP CONSTRAINT PK_ProductListPriceHistory; 
--change data type to DATETIME2 
ALTER TABLE dbo.ProductListPriceHistory ALTER COLUMN StartDate DATETIME2 NOT NULL; 
ALTER TABLE dbo.ProductListPriceHistory ALTER COLUMN EndDate DATETIME2 NOT NULL; 

Now, both tables are ready to act as a system-versioned temporal table in SQL Server 2016:

ALTER TABLE dbo.ProductListPrice SET (SYSTEM_VERSIONING = ON (HISTORY_TABLE = dbo.ProductListPriceHistory,  DATA_CONSISTENCY_CHECK = ON)); 

The command has been executed successfully and the dbo.ProductListPriceHistory table  is now a system-versioned temporal table. Note that the option DATA_CONSISTENCY_CHECK = ON is used to check that all rows in the history table are valid from a temporal data point of view (no gaps, the end date not before the start date). Now, you can check the new functionality by using the UPDATE statement. You will update the price for the product with the ID 707 to 50 and then check the rows in both tables:

UPDATE dbo.ProductListPrice SET Price = 50 WHERE ProductID = 707; 
SELECT * FROM dbo.ProductListPrice WHERE ProductID = 707; 
SELECT * FROM dbo.ProductListPriceHistory WHERE ProductID = 707; 

Here are the rows for this product in both tables: 

Migration existing temporal solution to system-versioned tables

You can see another row in the history table (compare with previous result). Of course, when you try these examples, you will get different values for the columns StartDate and EndDate since they are managed by the system.

As you can see, it is not so complicated to migrate an existing solution to a system-versioned table in SQL Server 2016, but it is not a single step. You should take into account, that most probably it will take time to update the data type to DATETIME2. However, by using the system-versioned temporal tables feature, your history tables are completely and automatically protected by changes from anyone except the system. This is a great, out-of-the-box, data consistency improvement.

Altering temporal tables

You can use the ALTER TABLE statement to perform schema changes on system-versioned temporal tables. When you use it to add a new data type, change a data type, or remove an existing column, the system will automatically perform the action against both the current and the history table. To check this, run the following code to create a temporal table from the previous section:

USE WideWorldImporters; 
CREATE TABLE dbo.Product 
( 
   ProductId INT NOT NULL CONSTRAINT PK_Product PRIMARY KEY, 
   ProductName NVARCHAR(50) NOT NULL, 
   Price MONEY NOT NULL, 
   ValidFrom DATETIME2 GENERATED ALWAYS AS ROW START NOT NULL, 
   ValidTo DATETIME2 GENERATED ALWAYS AS ROW END NOT NULL, 
   PERIOD FOR SYSTEM_TIME (ValidFrom, ValidTo) 
) 
WITH (SYSTEM_VERSIONING = ON (HISTORY_TABLE = dbo.ProductHistory)); 

The following code adds a new column named Color into both the current and the history table:

ALTER TABLE dbo.Product ADD Color NVARCHAR(15); 

Since the column accepts null values, the action is done instantly.

When you add a non-nullable column, the situation is a bit different. First, you need to provide a default constraint and then you should take into account that this operation is an offline operation in all editions of SQL Server, except in the Enterprise Edition. The following code adds a new column Category that requires a value:

ALTER TABLE dbo.Product ADD Category SMALLINT NOT NULL CONSTRAINT DF_Category DEFAULT 1; 

This action will be online (metadata operation) in the Enterprise Edition only. In all the other editions, all rows in both the current and the history table will be touched to add additional columns with its value.

However, adding LOB and BLOB columns will cause a mass update in both the current and the history table in all SQL Server editions. The next code examples add a new LOB column Description:

ALTER TABLE dbo.Product ADD Description NVARCHAR(MAX) NOT NULL CONSTRAINT DF_Description DEFAULT N'N/A'; 

This action will internally update all rows in both tables. For large tables, this can take a long time and during this time, both tables are locked.

You can also use the ALTER TABLE statement to add or remove the HIDDEN attribute to period columns or to remove it. This code line adds the HIDDEN attribute to the columns ValidFrom and ValidTo:

ALTER TABLE dbo.Product ALTER COLUMN ValidFrom ADD HIDDEN; 
ALTER TABLE dbo.Product ALTER COLUMN ValidTo ADD HIDDEN;  

Clearly, you can also remove the HIDDEN attribute:

ALTER TABLE dbo.Product ALTER COLUMN ValidFrom DROP HIDDEN; 
ALTER TABLE dbo.Product ALTER COLUMN ValidTo DROP HIDDEN; 

However, there are some changes that are not allowed for temporal tables:

  • Adding an IDENDITY or computed column
  • Adding a ROWGUIDCOL column or changing an existing column to it
  • Adding a SPARSE column or changing an existing column to it, when the history table is compressed

When you try to add a SPARSE column, you will get an error, as in the following example:

ALTER TABLE dbo.Product ADD Size NVARCHAR(5) SPARSE;  

The command ends up with the following error message:

Msg 11418, Level 16, State 2, Line 20
Cannot alter table 'ProductHistory' because the table either contains sparse columns or a column set column which are incompatible with compression. 

The same happens when you try to add an identity column, as follows:

ALTER TABLE dbo.Product ADD ProductNumber INT IDENTITY (1,1); 

And here is the error message:

Msg 13704, Level 16, State 1, Line 26
System-versioned table schema modification failed because history table 'WideWorldImporters.dbo.ProductHistory' has IDENTITY column specification. Consider dropping all IDENTITY column specifications and trying again.

If you need to perform schema changes to a temporal table not supported in the ALTER statement, you have to set its SYSTEM_VERSIONING attribute to false to convert the tables to non-temporal tables, perform the changes, and then convert back to a temporal table. The following code demonstrates how to add the identity column ProductNumber and the sparse column Size into the temporal table dbo.Product:

ALTER TABLE dbo.ProductHistory REBUILD PARTITION = ALL WITH (DATA_COMPRESSION=NONE);  
GO 
BEGIN TRAN    
ALTER TABLE dbo.Product SET (SYSTEM_VERSIONING = OFF);    
ALTER TABLE dbo.Product ADD Size NVARCHAR(5) SPARSE;    
ALTER TABLE dbo.ProductHistory ADD Size NVARCHAR(5) SPARSE;    
ALTER TABLE dbo.Product ADD ProductNumber INT IDENTITY (1,1);    
ALTER TABLE dbo.ProductHistory ADD ProductNumber INT NOT NULL DEFAULT 0;    
ALTER TABLE dbo.Product SET(SYSTEM_VERSIONING = ON (HISTORY_TABLE = dbo. ProductHistory));    
COMMIT;    

To perform ALTER TABLE operations, you need to have CONTROL permission on the current and history tables. During the changes, both tables are locked with schema locks.

Dropping temporal tables

You cannot drop a system-versioned temporal table. Both current and history tables are protected until the SYSTEM_VERSIONING attribute  of the current table is set to ON. When you set it to OFF, both tables automatically become non-temporal tables and are fully independent of each other. Therefore, you can perform all operations against them that are allowed according to your permissions. You can also drop period if you definitely want to convert a temporal table to a non-temporal one. The following code converts the Product table into a non-temporal table and removes the defined SYSTEM_TIME period:

ALTER TABLE dbo.Product SET (SYSTEM_VERSIONING = OFF); 
ALTER TABLE dbo.Product DROP PERIOD FOR SYSTEM_TIME; 

Note that the period columns ValidFrom and ValidTo remain in the table and will be further updated by the system. However, the history table will not be updated when data in the current table is changed.

Data manipulation in temporal tables

In this section, what happens when temporal data is inserted, updated, or deleted will be demonstrated. Note that the history table is protected; only the system can write into it. Even members of the sysadmin server role cannot insert, update, or delete rows from a history table of a system-versioned temporal table in SQL Server 2016.

In this section, you will insert into, and manipulate data from, the dbo.Product temporal table created in the previous section. Assume you have added the Fog product with the 150 price into the table on 21st November 2016. Here is the code for this action and the state of the current and history tables after the statement's execution:

INSERT INTO dbo.Product(ProductId, ProductName, Price)  
VALUES(1, N'Fog', 150.00) ;-- on 21st November 

Data manipulation in temporal tables

Note that the dates in the ValidFrom and ValidTo columns are displayed in short format for clarity: their actual value is of DATETIME2 data type.

As you can see after an INSERT into a temporal table:

  • In the current table, a new row has been added with attributes from the INSERT statement, the period start date column is set to the system date, the period end date column is set to the max date
  • The history table is not affected at all

Now, assume that the price for the product has been changed to 200 and that this change was entered into the database on 28th November 2016. Here is the code for this action and the state of the current and history tables after the statement's execution:

UPDATE dbo.Product SET Price = 200.00 WHERE ProductId = 1;-- on 28th November 

Data manipulation in temporal tables

We remind you again that values in the ValidFrom and ValidTo columns are displayed in short format for clarity. The value in the ValidTo column in the history table is identical to the ValidFrom value in the current table: there are no gaps.

Now, assume that you have reduced the price the next day to 180. Here is the code for this action and the state of the current and history tables after the statement's execution:

UPDATE dbo.Product SET Price = 180.00 WHERE ProductId = 1;-- on 29th November 

Data manipulation in temporal tables

You can see another entry in the history table indicating that the price 200 was valid for one day. What would happen if you execute the same statement again, say, on 30th November? There is no real change; no business logic attributes are changed, but what does it mean for temporal tables? Here is the code for this action and the state of the current and history tables after the statement's execution:

UPDATE dbo.Product SET Price = 180.00 WHERE ProductId = 1;-- on 30th November 

Data manipulation in temporal tables

As you can see in the history table, even if there is no real change to the attributes in the current table, an entry in the history table is created and period date columns are updated.

Note

You can think about this as a bug, but although no attributes have been changed, if you use temporal tables for auditing you probably want to see all attempts of data manipulation in the main table.

Here is how an UPDATE of a single row in a temporal table affects the current and history tables:

  • Values in the row of the current table are updates to those provided by the UPDATE statement, the period start date column is set to system date, the period end date column is set to max date
  • From the current table before updating is copied to the history table, only the period end date column is set to the system date (exactly the same value as in the period start date column of the current row after the update)
  • Every UPDATE statement issued against a single row in the current table will generate an entry in the history table

You can also see that there are no gaps in the dates in the same row in the history table. Even duplicates are possible: the history table does not have constraints to prevent them! Therefore, it is possible to have multiple records for the same row with the same values in period columns. Moreover, even values in period columns can be identical! The only constraint that is enforced is that the date representing the period end date column cannot be before the date representing the period start date (therefore, it is guaranteed that ValidFrom <= ValidTo).

You will finally remove the row from the current table in order to demonstrate how the DELETE statement affects temporal tables. Here is the code for this action and the state of the current and history tables after the statement's execution:

DELETE FROM dbo.Product WHERE ProductId = 1;-- on 1st December 

Data manipulation in temporal tables

As you expected, there is no row in the current table, but another row has been added into the history table. After executing the DELETE statement against a single row in a temporal table:

  • Current table: The row has been removed
  • History table: The row from the current table before deleting is copied to the history table, only the period end date column is set to the system date

Querying temporal data in SQL Server 2016

System-versioned tables are primarily intended for tracking historical data changes. Queries on system-versioned tables often tend to be concerned with retrieving table content as of a given point in time or between any two given points in time. As you saw, Microsoft has implemented them according to SQL:2011 standard, which means that two physical tables exist: a table with actual data and a history table. In order to simplify queries against temporal tables, SQL:2011 standard introduced a new SQL clause; FOR SYSTEM_TIME. In addition to it, some new temporal-specific sub-clauses have been added too. SQL Server 2016 has not only implemented these extensions, but added two more extensions. Here is the complete list of clauses and extensions you can use to query temporal data in SQL Server 2016:

  • FOR SYSTEM_TIME AS OF
  • FOR SYSTEM_TIME FROM TO
  • FOR SYSTEM_TIME BETWEEN AND
  • FOR SYSTEM TIME CONTAINED_IN
  • FOR SYSTEM_TIME ALL

Retrieving temporal data at a specific point in time

When you want to retrieve temporal data that was valid at a given point in time, the resulting set could contain both actual and historical data. For instance, the following query would return all rows from the People temporal table in the WideWorldImporters sample database that were valid at 20th March 2016 at 8 A.M.:

SELECT PersonID, FullName, OtherLanguages, ValidFrom, ValidTo  
FROM Application.People WHERE ValidFrom <= '2016-03-20 08:00:00' AND ValidTo > '2016-03-20 08:00:00'  
UNION ALL 
SELECT PersonID, FullName, OtherLanguages, ValidFrom, ValidTo  
FROM Application.People_Archive WHERE ValidFrom <= '2016-03-20 08:00:00' AND ValidTo > '2016-03-20 08:00:00'; 

The query returns 1,109 rows. For a single person, only one row is returned: either the actual or a historical record. A record is valid if its start date was before or exactly on the given date and its end date is greater than the given date.

The new FOR SYSTEM_TIME clause with the AS OF sub-clause can be used to simplify the preceding query. Here is the same query with temporal Transact-SQL extensions:

SELECT PersonID, FullName, OtherLanguages, ValidFrom, ValidTo  
FROM Application.People FOR SYSTEM_TIME AS OF '2016-03-20 08:00:00'; 

Of course, it returns the same result set and the execution plans are identical, as shown in Figure 7.6:

Retrieving temporal data at a specific point in time

Figure 7.6. Execution plans for point in time queries against temporal tables

Under the hood, the query processor touches both tables and retrieves data, but the query looks simpler.

A special case of a point-in-time query against a temporal table is a query where you specify the actual date as the point in time. The following query returns actual data from the same temporal table:

DECLARE @Now AS DATETIME = CURRENT_TIMESTAMP; 
SELECT PersonID, FullName, OtherLanguages, ValidFrom, ValidTo  
FROM Application.People FOR SYSTEM_TIME AS OF @Now; 

The query is logically equivalent to this one:

SELECT PersonID, FullName, OtherLanguages, ValidFrom, ValidTo  
FROM Application.People; 

However, when you look at the execution plans (see Figure 7.7), for the execution of the first query, both tables have been processed, while the non-temporal query had to retrieve data from the current table only:

Retrieving temporal data at a specific point in time

Figure 7.7: Compare execution plans for temporal and non-temporal queries that retrieve current data only

Therefore, you should not use temporal queries with the FOR SYSTEM_TIME AS clause to return data from the current table.

Retrieving temporal data from a specific period

You can use a new FOR SYSTEM_TIME clause to retrieve temporal data that was or is valid between two points in time. These queries are typically used for getting changes to specific rows over time. To achieve this, you could use one of two SQL:2011 standard specified sub-clauses:

  • FROM…TO returns all data that started before or at the beginning of a given period and ended after the end of the period (closed-open interval)
  • BETWEEN…AND returns all data that started before or at the beginning of a given period and ended after or at the end of the period (closed-closed interval)

As you can see, the only difference between these two sub-clauses is how data with a starting date to the right side of the given period is interpreted: BETWEEN includes this data, FROM…TO does not.

The following queries demonstrate usage and the difference between them:

--example using FROM/TO 
SELECT PersonID, FullName, OtherLanguages, ValidFrom, ValidTo  
FROM Application.People FOR SYSTEM_TIME FROM '2016-03-20 08:00:00' TO '2016-05-31 23:14:00' WHERE PersonID = 7; 
 
--example using BETWEEN 
SELECT PersonID, FullName, OtherLanguages, ValidFrom, ValidTo  
FROM Application.People FOR SYSTEM_TIME BETWEEN '2016-03-20 08:00:01' AND '2016-05-31 23:14:00' WHERE PersonID = 7; 
 

Here are the result sets generated by the preceding queries:

Retrieving temporal data from a specific period

As you can see, the second query returns three rows: it includes the row where the start date is equal to the value of the right boundary of the given period.

These two sub-clauses return row versions that overlap with a specified period. If you need to return rows that existed within specified period boundaries, you need to use another extension CONTAINED IN. This extension (an implementation of one of the Allen's Operators) is not defined in the SQL:2011 standard, it is implemented in SQL Server 2016. Rows that either start or end outside a given interval will not be part of a result set when the CONTAINED IN sub-clause is used. When you replace the sub-clause BETWEEN with it in the above example, only rows whose whole life belongs to the given interval will survive:

SELECT PersonID, FullName, OtherLanguages, ValidFrom, ValidTo  
FROM Application.People FOR SYSTEM_TIME CONTAINED IN ('2016-03-20 08:00:01','2016-05-31 23:14:00') WHERE PersonID = 7; 

Instead of three rows by using BETWEEN or two with FROM…TO sub-clauses, this time only one row is returned:

Retrieving temporal data from a specific period

Although this extension is not standard, its implementation in SQL Server 2016 is welcomed: it covers a reasonable and not-so-rare use case and simplifies the development of database solutions based on temporal tables.

Retrieving all temporal data

Since temporal data is separated into two tables, to get all temporal data you need to combine data from both tables. However, there is no sub-clause defined in SQL:2011 standard for that purpose. However, the SQL Server team has introduced the extension (sub-clause) ALL to simplify such queries.

Here is a temporal query that returns both actual and historical data for the person with the ID 7:

SELECT PersonID, FullName, OtherLanguages, ValidFrom, ValidTo  
FROM Application.People FOR SYSTEM_TIME ALL 
WHERE PersonID = 7; 

The query returns 14 rows, since there are 13 historical rows and one entry in the current table. Here is the logically equivalent, standard query but it's a bit more complex:

SELECT PersonID, FullName, OtherLanguages, ValidFrom, ValidTo  
FROM Application.People  
WHERE PersonID = 7 
UNION ALL 
SELECT PersonID, FullName, OtherLanguages, ValidFrom, ValidTo  
FROM Application.People_Archive 
WHERE PersonID = 7; 

The only purpose of this sub-clause is to simplify queries against temporal tables. New extensions for querying temporal tables do not bring performance benefits, but they significantly simplify queries against them.

Performance and storage considerations with temporal tables

Introducing temporal tables and especially adding this functionality to existing tables can significantly increase the storage used for data in your system. Since there is no out-of-the-box solution for managing the retention of history tables, if you don't do something, data will remain there forever. This can be painful in terms of storage costs, maintenance, and the performance of queries on temporal tables, especially if rows in your current tables are heavily updated.

History data retention

Unfortunately, you cannot configure a history table to automatically remove data according to a user-defined retention policy. That means that you have to implement data retention manually. This is not very complicated, but it is not a trivial action. I had expected it as a part of the implementation and will be disappointed if it is not delivered in the next SQL Server version. As mentioned, you can use some other SQL Server features to implement data retention for history tables:

  • Stretch databases allows you to move an entire part of historical data transparently to Azure.
  • Partitioning of history tables allows you to easily truncate the oldest historical data by implementing a sliding window approach.
  • Custom Cleanup doesn't require any other features. You use Transact-SQL scripts to convert a temporal table to non-temporal, delete old data, and convert the table back to a temporal table.

Note

More details about data retention can be found in Books Online at the following address: https://msdn.microsoft.com/en-us/library/mt637341.aspx.

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset