IT systems today are required to deliver maximum performance with zero downtime and maximum flexibility for developers and administrators. As developers, we all know that if the performance of our applications is somehow lacking, then the cause of the slowdown must be in the database!
Major causes of performance problems in database projects can be traced back to database design mistakes, to the limitations built into the database engine to ensure data consistency for concurrent connections, or to a mixture of the two.
Microsoft SQL Server adheres to the ACID properties. Put simply, every change made to a database must fulfill four properties: Atomicity (a transaction's changes are applied completely or not at all), Consistency (a transaction takes the database from one valid state to another), Isolation (concurrent transactions do not interfere with each other's intermediate states), and Durability (once committed, changes survive a system failure).
To adhere to these ACID properties, Microsoft SQL Server employs the pessimistic concurrency model. Pessimistic concurrency assumes that there will be conflicting concurrent transactions attempting to change the same data simultaneously; therefore, transactions must (temporarily) lock the data they are changing to prevent other transactions from interfering with the isolation of those changes.
However, a system with enough concurrent users will eventually hit a throughput limit that is inherent in the design of the pessimistic concurrency model. Even assuming unlimited hardware resources, the locking of data under pessimistic concurrency becomes the bottleneck for transaction processing inside SQL Server.
Microsoft has developed a solution to this logical limitation, by implementing a method for employing the optimistic concurrency model. This model uses row versioning to avoid the need to lock and block data changes between concurrent users. Their implementation was made available in SQL Server 2014 under the code name Hekaton (Greek for 100-fold, hinting at the possible performance increases) and the official feature name In-Memory OLTP.
This chapter will introduce the In-Memory OLTP feature as it was initially made available in SQL Server 2014, which is also the version we will use for the examples. We will discuss the feature's implementation and limitations before the specific additions and improvements in SQL Server 2016 are expanded upon in Chapter 12, In-Memory OLTP Improvements in SQL Server 2016.
In-Memory OLTP is the name of the optimistic concurrency model implementation offered from SQL Server 2014 onwards. The challenge that In-Memory OLTP is designed to solve is processing massively concurrent transaction workloads while removing the logical limitations of the pessimistic concurrency model present in the traditional transaction processing engine in SQL Server. Microsoft also wanted to introduce In-Memory OLTP as an additional transaction processing engine and not as a replacement for the standard one. This decision ensures that existing applications running on SQL Server remain compatible with newer SQL Server versions, while allowing new applications to take advantage of the new engine without separating the two systems.
In this section, we will take a look at the architecture of the In-Memory OLTP engine and see how data is stored and processed.
The first major difference in In-Memory OLTP is the storage structure of both tables and indexes. The traditional storage engine in SQL Server was optimized for disk storage, especially for the block storage of hard disk subsystems. In-Memory OLTP, however, was designed from the ground up to be memory resident, or memory-optimized. This small but important difference allowed a design with byte-addressable storage in mind. This means that memory-optimized tables avoid the need for the data pages and extents that we know from the normal, disk-optimized tables in SQL Server. Through this change, the significant overhead of page and extent management is removed, which provides a major performance increase in itself.
Rows in memory-optimized tables are stored as heaps. These heaps are not to be confused with heaps in the traditional storage engine; they are memory structures storing the row data itself. These rows have no specific physical ordering and no requirement or expectation of being located near other rows in memory. Nor do the rows know about related rows themselves; they solely store row data. The connections between rows are controlled via the indexes created on a memory-optimized table. Because an index stores and controls the connections between the memory-optimized row data heaps, a table must have at least one index, and this index (or indexes) must be defined during table creation.
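The requirement to declare at least one index at creation time can be seen in a minimal table definition. The table, column, and index names below are hypothetical, but the syntax follows SQL Server 2014's CREATE TABLE options for memory-optimized tables:

```sql
-- A minimal sketch; object names are illustrative only.
-- All indexes on a memory-optimized table must be declared inline.
CREATE TABLE dbo.SessionState
(
    SessionId INT NOT NULL
        PRIMARY KEY NONCLUSTERED HASH WITH (BUCKET_COUNT = 1048576),
    UserId INT NOT NULL,
    LastAccess DATETIME2 NOT NULL,
    -- A second, range-searchable (non-clustered) index
    INDEX ix_UserId NONCLUSTERED (UserId)
)
WITH (MEMORY_OPTIMIZED = ON, DURABILITY = SCHEMA_AND_DATA);
```

Note that there is no way to add a covering heap-only table first and index it later, as there is with disk-based tables; the index definitions are part of the table's creation.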
In the introduction to the chapter, we discussed that the In-Memory OLTP system is an implementation of the optimistic concurrency model. This model uses row versioning to allow concurrent connections to change data and still provide the necessary isolation. Row versioning is an important portion of how memory-optimized tables are structured. Let's look at how a row is structured for a memory-optimized table.
In Figure 11.1, we can see a depiction of a memory-optimized table, which is split into a row header (information about the row) and payload (the actual row data). The row header structure is expanded in the lower part of the diagram:
Figure 11.1: Memory-optimized table row structure - Header and Payload
The row header is split into five sections: Begin Ts, the timestamp of the transaction that inserted the row; End Ts, the timestamp of the transaction that deleted the row (set to infinity while the row version is still valid); StmtId, the ID of the statement within the inserting transaction that created the row; IdxLinkCount, a count of how many indexes reference the row; and a set of index pointers, one per index on the table, each pointing to the next row in that index's chain.
The row payload is where the actual row data is stored. The key columns and all other columns are stored here. The structure of the payload depends upon the columns and data types that make up the table, as defined in the CREATE TABLE command.
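The begin and end timestamps in the row header are what drive row versioning: each row version is valid for a timestamp interval, and a reader sees only versions whose interval covers its own transaction timestamp. A minimal sketch of this idea follows (a simplified model for illustration, not SQL Server's actual implementation):

```python
import math

# A toy model of timestamp-based row visibility. Each row version
# carries the timestamp of the transaction that created it (begin_ts)
# and of the one that deleted it (end_ts, infinity while current).
class RowVersion:
    def __init__(self, payload, begin_ts, end_ts=math.inf):
        self.payload = payload
        self.begin_ts = begin_ts
        self.end_ts = end_ts

    def visible_to(self, read_ts):
        # A version is visible when the reader's timestamp falls
        # inside the version's validity interval.
        return self.begin_ts <= read_ts < self.end_ts

# An UPDATE at timestamp 20 closes the old version and creates a new one;
# no lock is taken, both versions simply coexist in memory.
old = RowVersion("London", begin_ts=10, end_ts=20)
new = RowVersion("Lingen", begin_ts=20)

print(old.visible_to(15))  # True  - this reader started before the update
print(new.visible_to(15))  # False
print(new.visible_to(25))  # True  - a later reader sees the new version
```

This is why concurrent readers and writers do not block each other: each reader simply picks the version that matches its timestamp.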
All memory-optimized tables must have at least one index. The indexes on a memory-optimized table store the connections between the rows. It is also important to remember that, as the name suggests, In-Memory OLTP indexes reside solely in non-permanent memory (RAM) and are never written to disk. The only portions written to disk are the data rows of the base memory-optimized table and the data changes, which are written to the transaction log (a requirement to fulfill the durability property of ACID).
There are two types of index available for memory-optimized tables: the non-clustered index and the hash index.
To be able to discuss the structure of a non-clustered index itself, we first need to understand how the storage engine creates, references, and processes these indexes. The name of this index type should be well known to all SQL Server developers, and a non-clustered index on a memory-optimized table is broadly similar to a non-clustered index in the traditional storage engine. This type of index uses a variation of the B-tree structure called a Bw-tree and is very useful for searching on ranges of data in a memory-optimized table. A Bw-tree is essentially a B-tree without the locking and latching used to maintain a traditional B-tree. The index contains ordered key values which themselves store pointers to the data rows. With the Bw-tree, these pointers do not point to a physical page number (those data page structures don't exist for memory-optimized tables); instead, they point to a logical Page ID (PID), which references an entry in a Page Mapping Table. The Page Mapping Table stores the physical memory address of the row data. This behavior mimics a covering index from the traditional disk-based non-clustered indexes, because the index contains the key columns of the index.

As mentioned, memory-optimized tables run under optimistic concurrency and use versioning to avoid the need for locks and latches. A memory-optimized non-clustered index never updates a page in place when a change occurs; an updated page is simply replaced by adding a new page to the index and entering its physical memory address in the Page Mapping Table.
In Figure 11.2, we see an example of a memory-optimized non-clustered index:
Figure 11.2: Memory-optimized non-clustered index - Bw-tree structure
The Bw-tree looks a lot like a B-tree, except that we have a connection to the Page Mapping Table (instead of an Index Allocation Map as for disk-based tables), which translates the logical index pages into the physical memory locations of the row data. The root page and the non-leaf pages store key values and the PID of the page "below" them in the tree. Particularly noteworthy here is that the key value stored is the highest value on the page at the next level down and not, as with a traditional B-tree, the lowest value. The leaf level also stores the PID that resolves to the memory address of the row that the index references. If multiple data rows have the same key, they are stored as a "chain of rows", in which the rows reference the same physical address for the value. Data changes in a memory-optimized non-clustered index are processed through a delta record process: rather than modifying a page, a delta record describing the change is linked in front of it. The value that was originally referenced is then no longer referenced by the index and can be physically removed by the Garbage Collector (GC).
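The indirection through the Page Mapping Table is what makes latch-free page replacement possible: parent pages only ever store logical PIDs, so replacing a page means atomically swapping one mapping entry rather than updating every referencing page. A toy sketch of this mechanism (illustration only, not SQL Server's internal layout):

```python
# A toy Page Mapping Table: it translates a logical page ID (PID)
# to whatever object currently represents that page in memory.
page_mapping_table = {}

def install_page(pid, page):
    page_mapping_table[pid] = page

def resolve(pid):
    return page_mapping_table[pid]

# A leaf page referenced by the (hypothetical) PID 7.
install_page(7, {"keys": [10, 20, 30]})

# A parent page stores the PID, never a physical address.
parent = {"child_pid": 7}

# An "update" builds a replacement page and swaps the mapping entry;
# the parent page itself is never touched.
new_leaf = {"keys": [10, 20, 25, 30]}
install_page(7, new_leaf)

print(resolve(parent["child_pid"])["keys"])  # [10, 20, 25, 30]
```

Because only the mapping entry changes, readers traversing from the parent either see the old page or the new one, never a half-updated page, and no latch is required.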
A memory-optimized non-clustered index page follows a familiar pattern, with each page having a header which carries uniform control information:
For pages lower down the Bw-tree structure, that is, internal and leaf, there are additional fields containing more detailed information required for deeper traversal than the root page:
The second type of index for memory-optimized tables is the hash index. Here, an array of pointers is created, with each element in the array representing a single hash value; this element is known as a hash bucket. A hash function is applied to the index key column, and the resulting hash value determines into which bucket a row is placed. Key values that generate the same hash value are stored in the same hash bucket as a linked chain of values. To access this data, we pass in the key value we wish to search for; this value is also hashed to determine which hash bucket should be accessed. The hash index translates the hash value to the pointer for that bucket and allows us to retrieve all the duplicate values in the chain.
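The mechanics of bucketization can be sketched in a few lines. The hash function below is a deliberately crude stand-in (SQL Server's actual hash function is internal and not exposed); only the bucket-and-chain behavior is the point:

```python
# A simplified sketch of hash-index bucketization. Rows whose keys
# hash to the same value land in the same bucket and form a chain.
BUCKET_COUNT = 8

def toy_hash(key):
    # Deliberately crude stand-in for the real (internal) hash function.
    return sum(ord(c) for c in key) % BUCKET_COUNT

buckets = [[] for _ in range(BUCKET_COUNT)]

def insert(key, row):
    buckets[toy_hash(key)].append((key, row))

def lookup(key):
    # Hash the probe value, then walk the chain in that bucket,
    # filtering out keys that merely collide on the same hash value.
    return [row for k, row in buckets[toy_hash(key)] if k == key]

insert("London", {"company": "Contoso"})
insert("London", {"company": "Fabrikam"})
insert("Lingen", {"company": "Tailspin"})

print(len(lookup("London")))  # 2 - both rows chained under the same key
```

Note that a lookup must still compare the actual key values along the chain, because different keys can share a hash value.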
In Figure 11.3, we can see a very simplified representation of the hashing and retrieval of data. The row header has been removed for clarity's sake.
Figure 11.3
In Figure 11.3, we can see a table storing company address information. We are interested in the city column, which is the key for a hash index. If we insert a value of London into this hash index, we get a hash value of 23 (the hash function is simplified here for illustration purposes). The hash bucket for 23 then stores the key value London and the memory address of the corresponding row of the memory-optimized table. If we then insert a second row with a city value of Lingen, and our hash function also hashes Lingen to 23, this row is added to the hash bucket for the value 23 and appended to the chain started with London. Each hash index on a memory-optimized table adds an index pointer column (shown here with a dashed outline). This is an internal column that stores the pointer to the next row in the chain. Every row of a chain has a pointer to the next row, except the last row, which has no following chain row. When no pointer is present in this column, the retrieval routine knows it has reached the end of the chain and can stop loading rows.
The creation of a hash index requires some thought. We have seen that multiple values can hash to the same hash value, which creates chains of records; however, the idea of a hash function is to keep these chains as short as possible. An optimal hash index has enough hash buckets to give each unique key value its own bucket, but no more buckets than necessary, because each bucket consumes memory for the index. Having more buckets than necessary will not improve performance either, but rather hinder it, as each additional bucket makes a scan across the buckets slower. We must decide how many buckets to create when we initially create the index, which we will see with a sample script shortly. It is also important to realize that the hash index and the hash function consider all key columns of the index together, so a multi-column index increases the number of distinct key combinations and can therefore require more hash buckets to keep the chains short.
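The trade-off described above can be made concrete with a back-of-the-envelope simulation (illustrative only; the numbers are hypothetical, not a SQL Server benchmark). With far fewer buckets than distinct keys, unrelated keys are forced to share chains; once the bucket count comfortably exceeds the number of distinct keys, chains shrink towards a single row each:

```python
import statistics

# Simulate average chain length for a given number of distinct key
# values squeezed into a given number of hash buckets.
def average_chain_length(distinct_keys, bucket_count):
    chains = [0] * bucket_count
    for key in range(distinct_keys):
        chains[hash(key) % bucket_count] += 1
    # Average over the buckets that are actually in use.
    used = [c for c in chains if c > 0]
    return statistics.mean(used)

# 10,000 distinct keys: an undersized vs. a comfortably sized index.
print(average_chain_length(10_000, 1_024))   # roughly 9-10 rows per chain
print(average_chain_length(10_000, 16_384))  # about 1 row per chain
```

A longer chain means more key comparisons per lookup, which is exactly the cost an appropriately sized bucket count avoids, at the price of the memory the extra buckets occupy.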