History table physical implementation

You saw in one of the previous sections that you can create your own history table and associate it with a current table or you can let SQL Server do it for you. When SQL Server creates a history table, by default it creates a row stored clustered table with a clustered index on period columns. If the current table does not contain data types that prevents the usage of data compression, the table is created with PAGE compression. Is this OK? Well, it depends on the use case. This approach is good, if dominant temporal queries are based on date range that is, return a snapshot for all rows that were valid at the given point in time. However, if your temporal queries usually look for historical records for individual items, a clustered index on primary key columns followed by period columns would be a better solution.

Finally, if you plan to process a lot of data in temporal queries or to aggregate them, the best approach is to create your own history table with a clustered columnstore index and eventual additional non-clustered normal B-tree indexes.

History table overhead

Converting a non-temporal table into a temporal table is very easy. With two ALTER TABLE statements, you get the full temporal table functionality. However, this cannot be completely free; adding a temporal feature to a table brings a performance overhead in data modification operations. According to my measurements, a single update operation against a row in a temporal table is about 30% slower than it would be when the table is non-temporal. I performed a simple test where I updated a different number of rows in a table before it was converted to a temporal table and after that. Figure 7.8 displays the results of this test. For a small amount of rows (<1000) locking, calculating, and inserting data into the history table slows down the update about two times. Updating 50,000 rows in a temporal table takes about 8 times longer than the same operation against the same table when it is implemented as a non-temporal table.

History table overhead

Figure 7.8: Temporal table updates overhead

Finally, massive updates (>100K rows) are 10 times slower for temporal tables compared to those against their non-temporal counterparts.

Temporal tables with memory-optimized tables

System-versioned temporal tables are also supported for memory-optimized tables. You can assign or let SQL Server create a history table for your memory-optimized table. The history table must be a disk table, but this is exactly what you want: frequently used (hot) data remains in memory, while cold data can reside in disk tables. By taking this approach, you can use all the benefits provided by memory-optimized tables (high transactional throughput, lock-free concurrency) and save their historical data on disk-based tables and leave memory for active datasets only.

Figure 7.9 shows the architecture of system-versioned memory-optimized tables. It is taken from the Books Online page at https://msdn.microsoft.com/en-us/library/mt590207.aspx.

Temporal tables with memory-optimized tables

Figure 7.9: System-Versioned Temporal Tables with Memory-Optimized Tables Architecture

As you can see, system-versioned temporal tables are implemented with three tables:

  • Current table is a memory-optimized table and all native compiled operations are supported
  • Recent history table is an internal memory-optimized table that handles changes in the current table synchronously and enables DMLs to be executed from natively compiled code
  • History table is a disk table that contains changes in the current table and manages them asynchronously

Historical data is a union of data in the recent history and history tables. A history row is either in the staging memory table or in the disk table: it cannot be in both tables. The following code example creates a new memory-optimized temporal table:

USE WideWorldImporters; 
CREATE TABLE dbo.Product 
( 
   ProductId INT NOT NULL PRIMARY KEY NONCLUSTERED, 
   ProductName NVARCHAR(50) NOT NULL, 
   Price MONEY NOT NULL, 
   ValidFrom DATETIME2 GENERATED ALWAYS AS ROW START HIDDEN NOT NULL, 
   ValidTo DATETIME2 GENERATED ALWAYS AS ROW END HIDDEN NOT NULL, 
   PERIOD FOR SYSTEM_TIME (ValidFrom, ValidTo) 
) 
WITH (MEMORY_OPTIMIZED = ON, DURABILITY = SCHEMA_AND_DATA, SYSTEM_VERSIONING = ON (HISTORY_TABLE = dbo.ProductHistory)); 

After the execution of this query, you can see that one memory-optimized table is:

SELECT CONCAT(SCHEMA_NAME(schema_id),'.', name) AS table_name, is_memory_optimized, temporal_type_desc FROM sys.tables WHERE name IN ('Product', 
 'ProductHistory'); 

The result of the preceding query is as follows:

table_name        is_memory_optimized temporal_type_desc
----------------- ------------------- ------------------------------
dbo.Product         1                 SYSTEM_VERSIONED_TEMPORAL_TABLE
dbo.ProductHistory  0                 HISTORY_TABLE

As mentioned earlier, SQL Server creates a third table automatically: an internal memory-optimized table. Here is the code that you can use to find its name and properties:

SELECT CONCAT(SCHEMA_NAME(schema_id),'.', name) AS table_name, internal_type_desc FROM  sys.internal_tables WHERE name = CONCAT('memory_optimized_history_table_', OBJECT_ID('dbo.Product')); 

And here is its output:

Temporal tables with memory-optimized tables

Only durable, memory-optimized tables can be system-versioned temporal tables and history tables must be disk-based. Since all current rows are in memory, you can use natively compiled modules to access this data. Use the following code to create a native compiled stored procedure that handles the inserting and updating of products:

CREATE OR ALTER PROCEDURE dbo.SaveProduct   
(    
@ProductId INT, 
@ProductName NVARCHAR(50), 
@Price MONEY 
)    
WITH NATIVE_COMPILATION, SCHEMABINDING, EXECUTE AS OWNER    
AS     
   BEGIN ATOMIC WITH    
   (TRANSACTION ISOLATION LEVEL = SNAPSHOT, LANGUAGE = N'English')    
   UPDATE dbo.Product SET ProductName = @ProductName, Price = @Price    
   WHERE ProductId = @ProductId 
   IF @@ROWCOUNT = 0 
         INSERT INTO dbo.Product(ProductId,ProductName,Price) VALUES (@ProductId, @ProductName, @Price); 
END 
GO 

Now you can, for instance, add two rows and update one of them by using the above procedure:

EXEC dbo.SaveProduct 1, N'Home Jersey Benfica', 89.95; 
EXEC dbo.SaveProduct 2, N'Away Jersey Juventus', 89.95; 
EXEC dbo.SaveProduct 1, N'Home Jersey Benfica', 79.95; 

Under the hood, everything works perfectly: both the current and history tables are updated. Here are the resulting datasets:

Temporal tables with memory-optimized tables

Note that values in the column ValidFrom and ValidTo are shortened to fit in a single line.

The querying of historical data is effectively under the SNAPSHOT isolation level and always returns a union between the in-memory staging buffer and the disk based table without duplicates. Since temporal queries (queries that use the FOR SYSTEM_TIME clause) are touching memory-optimized and disk tables, they can be used only in the interop mode; it is not possible to use them in native compiled procedures.

Data from the internal memory-optimized staging table is regularly moved to the disk-based history table by the asynchronous data flush task. This data flush mechanism has the goal of keeping the internal memory buffers at less than 10% of the memory consumption of their parent objects.

When you add system-versioning to an existing non-temporal table, expect a performance impact on update and delete operations because history table is updated automatically. Every update and delete is recorded into the internal memory-optimized history table, so you may experience unexpected memory consumption if your workload uses those two operations massively.

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset