With columnar storage, the CPU can become a bottleneck. SQL Server solves these problems with batch mode processing. In batch mode processing, SQL Server processes data in batches rather than processing one row at a time. A batch represents roughly 900 rows of data. Each column within a batch is stored as a vector in a separate memory area, meaning that batch mode processing is vector-based. Batch mode processing interrupts a processor with metadata only once per batch rather than once per row, as in row mode processing, which lowers the CPU burden substantially. This means that batch mode spreads the metadata access costs over all of the 900 rows in a batch.
Batch mode processing is orthogonal to columnar storage. This means SQL Server can use it with many different operators, no matter whether the data is stored in row or column storage. However, batch mode processing gives the best results when combined with columnar storage. DML operations, such as insert, update, or delete, work in row mode. Of course, SQL Server can mix batch and row mode operators in a single query.
SQL Server introduced batch mode also in version 2012. The batch mode operators in this version include the following:
- Filter
- Project
- Scan
- Local hash (partial) aggregation
- Hash inner join
- Batch hash table build, but only in-memory, no spilling
- Bitmap filters limited to a single column, data types represented with a 64-bit integer
In SQL Server 2014, the following batch mode operators were added:
- All join types
- Union all
- Scalar aggregation
- Spilling support
- Complex bitmap filters, all data types supported
SQL Server 2016 added the following improvements to batch mode processing:
- Single-threaded queries
- Sort operator
- Multiple distinct count operations
- Left anti-semi join operators
- Window aggregate functions
- Window analytical functions
- String predicate and aggregate pushdown to the storage engine
- Row-level locking on index seeks against a nonclustered index and rowgroup-level locking on full table scans against the columnstore
Finally, SQL Server 2017 added two more improvements to batch mode processing:
- Batch mode adaptive joins
- Batch mode memory grant feedback
The following table summarizes the most important features and limitations of columnar storage and batch mode processing in SQL Server versions 2012 to 2106:
Columnstore Index/Batch Mode Feature |
SQL 2012 |
SQL 2014 |
SQL 2016 |
SQL 2017 |
Batch execution for multi-threaded queries |
yes |
yes |
yes |
yes |
Batch execution for single-threaded queries |
yes |
yes |
||
Batch mode adaptive joins |
yes |
|||
Batch mode memory grant feedback |
yes |
|||
Archival compression |
yes |
yes |
yes |
|
Snapshot isolation and read-committed snapshot isolation |
yes |
yes |
||
Specify CI when creating a table |
yes |
yes |
||
AlwaysOn supports CIs |
yes |
yes |
yes |
yes |
AlwaysOn readable secondary supports read-only NCCI |
yes |
yes |
yes |
yes |
AlwaysOn readable secondary supports updateable CIs |
yes |
yes |
||
Read-only NCCI on heap or B-tree |
yes |
yes |
yes |
yes |
Updateable NCCI on heap or B-tree |
yes |
yes |
||
NCCI online build and rebuild |
yes |
|||
B-tree indexes allowed on a table that has a NCCI |
yes |
yes |
yes |
yes |
Updateable CCI |
yes |
yes |
yes |
|
CCI LOB columns support |
yes |
|||
B-tree index on a CCI |
yes |
yes |
||
CCI on a memory-optimized table |
yes |
yes |
||
Filtered NCCI |
yes |
yes |
You can check whether SQL Server uses row or batch mode for an operator by analyzing the properties of the operator in the execution plan. Before checking the batch mode, the following code adds nine time as many rows to the test fact table:
DECLARE @i AS INT = 1; WHILE @i < 10 BEGIN SET @i += 1; INSERT INTO dbo.FactTest (SaleKey, CustomerKey, Customer, CityKey, City, DateKey, StockItemKey, Product, Quantity, TotalAmount, Profit) SELECT @i * 1000000 + f.[Sale Key] AS SaleKey, cu.[Customer Key] AS CustomerKey, cu.Customer, ci.[City Key] AS CityKey, ci.City, f.[Delivery Date Key] AS DateKey, s.[Stock Item Key] AS StockItemKey, s.[Stock Item] AS Product, f.Quantity, f.[Total Excluding Tax] AS TotalAmount, f.Profit FROM Fact.Sale AS f INNER JOIN Dimension.Customer AS cu ON f.[Customer Key] = cu.[Customer Key] INNER JOIN Dimension.City AS ci ON f.[City Key] = ci.[City Key] INNER JOIN Dimension.[Stock Item] AS s ON f.[Stock Item Key] = s.[Stock Item Key] INNER JOIN Dimension.Date AS d ON f.[Delivery Date Key] = d.Date; END;
Let's check how much space this table uses now:
EXEC sys.sp_spaceused N'dbo.FactTest', @updateusage = N'TRUE'; GO
The result is as follows:
Name rows reserved data index_size unused ------------ ------- --------- --------- ---------- ------- dbo.FactTest 2279810 502152 KB 498528 KB 2072 KB 1552 KB
Now let's start querying this table. Before executing the following query, make sure you turn the actual execution plan on:
SELECT f.StockItemKey, SUM(f.TotalAmount) AS Sales FROM dbo.FactTest AS f WHERE f.StockItemKey < 30 GROUP BY f.StockItemKey ORDER BY f.StockItemKey;
You can hover the mouse over any of the operators. For example, the following screenshot shows the details of the Hash Match (Partial Aggregate) operator:
You can see that SQL Server used row mode processing. As mentioned, batch mode is not strictly bound to columnar storage; however, it is much more likely that SQL Server would use it as you use the columnar storage. The following code creates a filtered nonclustered columnstore index. It is actually empty:
CREATE NONCLUSTERED COLUMNSTORE INDEX NCCI_FactTest ON dbo.FactTest (SaleKey, CustomerKey, Customer, CityKey, City, DateKey, StockItemKey, Product, Quantity, TotalAmount, Profit) WHERE SaleKey = 0; GO
Now, execute the same query again. As you can see from the following screenshot, this time batch mode is used for the Hash Match (Partial Aggregate) operator: