Chapter 4. Transact-SQL Enhancements

Each new SQL Server version brings numerous extensions and improvements to the Transact-SQL language. Most of them are used to support newly added database engine features, but some of them address missing functionality and limitations in previous versions. SQL Server 2016 comes with many features that require extensions to Transact-SQL: temporal tables, JSON support, improvements for memory-optimized tables, columnstore tables and indexes, new security enhancements, and so on. They will be explored in detail in the chapters dedicated to those features.

This chapter covers smaller Transact-SQL features that can make developers' work more productive, as well as enhancements that increase the availability of database objects and enlarge the scope of existing functionality that was limited in previous SQL Server versions.

This chapter is divided into the following three sections:

  • New and enhanced Transact-SQL functions and expressions
  • Enhanced DML and DDL statements
  • New query hints

In the first section, you will see new out-of-the-box functions and expressions that allow developers to manipulate strings, compress data with the GZIP algorithm, and work with session-scoped variables. In addition, the removal of restrictions and limitations on some existing functions makes them more interesting for use in SQL Server 2016.

The later sections cover enhancements in data manipulation and data definition statements. The most important one lets you change the data type or other attributes of a table column while the table remains available for querying and modifications. This is a very important feature for systems where continuous availability is required. You will also learn about other improvements that let you perform some actions faster or with less code.

Finally, there will be a demonstration of how to use the newly added query hints to improve query execution and avoid problems caused by the Spool operator or by inappropriate memory grants.

New and enhanced functions and expressions

SQL Server 2016 introduces several new functions that can help developers to be more productive and efficient. Additionally, the removal of limitations in some existing functions has enlarged their scope of usage. Now, SQL Server contains more than 300 built-in functions. Here is the list of new or changed functions and expressions in SQL Server 2016:

  • Two new string functions STRING_SPLIT and STRING_ESCAPE
  • A new date function DATEDIFF_BIG and a new expression AT TIME ZONE
  • Four new system functions COMPRESS, DECOMPRESS, SESSION_CONTEXT, and CURRENT_TRANSACTION_ID
  • Enhancements in the cryptographic function HASHBYTES
  • Four JSON-related functions ISJSON, JSON_VALUE, JSON_QUERY, and JSON_MODIFY, and one new rowset function OPENJSON

STRING_SPLIT

Since SQL Server does not support arrays, when multiple values need to be sent to it, developers use a list of values (usually comma-separated ones).

SQL Server 2008 introduced an excellent feature called Table-Valued Parameters (TVP), which allows you to pack values in a table and transfer them to SQL Server in table format. On the server side, stored procedures or queries use this parameter as a table variable and can leverage set-based operations to improve performance, compared to separate executions per single parameter value. Thus, from SQL Server 2008 onwards, it is strongly recommended to use TVP instead of a list of values in such cases.
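
As a reminder of what this looks like, here is a minimal sketch of a TVP-based procedure call. The type and procedure names are illustrative only; the Sales.Orders table from the WideWorldImporters sample database (used throughout this chapter) is assumed:

USE WideWorldImporters;
GO
-- A user-defined table type serves as the parameter type
CREATE TYPE dbo.OrderIdList AS TABLE (OrderID INT NOT NULL PRIMARY KEY);
GO
-- TVP parameters must be declared READONLY
CREATE PROCEDURE dbo.GetOrderDetails @orderIds dbo.OrderIdList READONLY
AS
SELECT o.OrderID, o.CustomerID, o.OrderDate
FROM Sales.Orders AS o
INNER JOIN @orderIds AS x ON x.OrderID = o.OrderID;
GO
-- The caller fills a variable of the table type and passes it as a whole
DECLARE @ids AS dbo.OrderIdList;
INSERT INTO @ids (OrderID) VALUES (1), (3), (7);
EXEC dbo.GetOrderDetails @orderIds = @ids;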

However, lists of values as parameters for stored procedures are still widely used, mainly for the following two reasons:

  • Missing support for TVP in JDBC drivers: Therefore, Java applications and services still have to use comma-separated lists or XML to transfer a list of values to SQL Server
  • Legacy code: Significant amount of Transact-SQL code from the previous versions, where TVP was not supported

When a list of values is transferred to SQL Server as a stored procedure parameter, the list should be converted to a table in the stored procedure body. Until SQL Server 2016, there was no built-in function that could perform this action. Developers had to write user-defined functions (UDF) or play with the FOR XML PATH extension for that purpose. An excellent overview and performance comparison of the existing user-defined functions for converting a string to a table can be found in the article Split strings the right way - or the next best way written by Aaron Bertrand. This article is available at http://sqlperformance.com/2012/07/t-sql-queries/split-strings.

Note

You might ask yourself why companies still have legacy code in production systems and why they don't migrate the old code so that they can use the benefits of new features and enhancements. For instance, why are old implementations with comma-separated lists not replaced by the recommended TVPs, when the migration is an easy task? It is true that the migration steps are not complex and every developer can perform them. However, in a medium or large company, developers cannot decide what should be done. Their responsibility relates to how, not to what. Therefore, in such cases, developers can suggest the migration to project managers or product owners, and the decision about the priority of the action is made on the business side. To migrate a comma-separated list to TVP, you need to change not only the body of the stored procedure, but also its parameters, and thus its interface. You also need to change the data access layer, touch the application code and unit tests, compile the project, and deploy it. Even if your tests are fully automated, this is not a trivial effort. On the other hand, the migration does not bring significant improvements for customers. Nowadays, development processes are mostly based on agile methodology, and the features most wanted and appreciated by customers have the highest priority. Therefore, such migration actions usually remain far down the to-do list, waiting for a miracle to happen.

Finally, the SQL Server development team has added the STRING_SPLIT function in the latest release. This is a table-valued function that converts a delimited string into a single-column table. The function accepts two input arguments:

  • String: This is an expression of any nondeprecated string data type that needs to be split
  • Separator: This is a single character used as a separator in the input string

Since it is a table-valued function, it returns a table. The returned table contains only one column, named value, with a data type and length that are the same as those of the input string.

Here is an example showing how this function can be used to produce a three-row table for a comma-separated list as input. Execute this code:

USE tempdb; 
SELECT value FROM STRING_SPLIT(N'Rapid Wien,Benfica Lisboa,Seattle Seahawks',','); 

The preceding query produces the following output:

value
------------------
Rapid Wien
Benfica Lisboa
Seattle Seahawks

The actual execution plan for the preceding query looks as follows:

Figure 4.1: Estimated Number of Rows for the STRING_SPLIT function

Notice that the Estimated Number of Rows is 50. This is always the case with this function: the estimated output is 50 rows, regardless of the number of string elements. Even when you specify the OPTION (RECOMPILE) query hint, the estimation remains the same. In the case of user-defined table-valued functions, the estimated number of rows is 100.

As a table-valued function, STRING_SPLIT can be used not only in the SELECT clause, but also in FROM, WHERE, and wherever a table expression is supported. To demonstrate its usage, you will use the new SQL Server sample database WideWorldImporters. The database is available for download at https://github.com/Microsoft/sql-server-samples/releases/tag/wide-world-importers-v1.0. The following query extracts stock items having the Super Value tag in the Tags attribute:

USE WideWorldImporters; 
SELECT StockItemID, StockItemName, Tags  
FROM Warehouse.StockItems  
WHERE '"Super Value"' IN (SELECT value FROM STRING_SPLIT(REPLACE(REPLACE(Tags,'[',''), ']',''), ','));   

This query produces the following result:

StockItemID StockItemName                             Tags
----------- ---------------------------------------- ----------------
150         Pack of 12 action figures (variety)      ["Super Value"]
151         Pack of 12 action figures (male)         ["Super Value"]
152         Pack of 12 action figures (female)       ["Super Value"]

The following code example demonstrates how this function can be used to return details about orders for IDs provided in a comma-separated list.

USE WideWorldImporters; 
DECLARE @orderIds AS VARCHAR(100) = '1,3,7,8,9,11'; 
SELECT o.OrderID, o.CustomerID, o.OrderDate FROM Sales.Orders AS o 
INNER JOIN STRING_SPLIT(@orderIds,',') AS x ON x.value = o.OrderID; 

This produces the following output:

OrderID     CustomerID  OrderDate
----------- ----------- ----------
1           832         2013-01-01
3           105         2013-01-01
7           575         2013-01-01
8           964         2013-01-01
9           77          2013-01-01
11          586         2013-01-01

Note that since the function returns a column of string data type, there is an implicit conversion between the columns involved in the JOIN clause.
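
If you want to avoid the implicit conversion on the order table's key column, you can add an explicit cast on the split values instead; here is a minimal variation of the previous query:

USE WideWorldImporters;
DECLARE @orderIds AS VARCHAR(100) = '1,3,7,8,9,11';
SELECT o.OrderID, o.CustomerID, o.OrderDate FROM Sales.Orders AS o
-- Cast the split string values explicitly, so the conversion happens on this side of the join
INNER JOIN STRING_SPLIT(@orderIds,',') AS x ON CAST(x.value AS INT) = o.OrderID;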

The function returns an empty table if the input string is not provided, as you can see in the next code example:

DECLARE @input AS NVARCHAR(20) = NULL; 
SELECT * FROM STRING_SPLIT(@input,','); 

This is the output produced by the preceding command:

value
--------

The STRING_SPLIT function requires that the database is in compatibility level 130. If this is not the case, you will get an error. The next code example demonstrates an attempt to use this function under compatibility level 120:

USE WideWorldImporters; 
ALTER DATABASE WideWorldImporters SET COMPATIBILITY_LEVEL = 120; 
GO 
SELECT value FROM STRING_SPLIT('1,2,3',','); 
/*Result: 
Msg 208, Level 16, State 1, Line 65 
Invalid object name 'STRING_SPLIT'. 
*/ 
--back to the original compatibility level 
ALTER DATABASE WideWorldImporters SET COMPATIBILITY_LEVEL = 130; 

This is a handy function and will definitely have its use cases. However, as you will have already noticed, there are some limitations:

  • Single character separator: The function accepts only a single-character separator; if you have a separator with more characters, you still need to write your own user-defined function (a simple workaround is sketched after this list).
  • Single output column: The output is a single column table, without the position of the string element within the delimited string. Thus, you can only sort the output by the element name.
  • String data type: When you use this function to delimit a string of numbers, although all the values in the output column are numbers, their data type is string, and when you join them to numeric columns in other tables, a data type conversion is required.
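
For the first limitation, a common workaround is to reduce a multi-character separator to a single character with REPLACE before splitting. This is just a sketch; it assumes that the chosen single character does not already appear in the data:

DECLARE @input AS NVARCHAR(100) = N'Rapid Wien||Benfica Lisboa||Seattle Seahawks';
-- Collapse the two-character separator '||' into a single '|', then split
SELECT value FROM STRING_SPLIT(REPLACE(@input, N'||', N'|'), N'|');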

If these limitations are acceptable to you, you should use the function in future development. I would always suggest a built-in function rather than a user-defined one when they are similar from a performance point of view; a built-in function is predeployed and available in all databases. Once again, take note that the database must be in the latest compatibility mode (130).

STRING_ESCAPE

The STRING_ESCAPE function is a scalar function and escapes special characters in input text according to the given formatting rules. It returns input text with escaped characters. The function accepts the following two input arguments:

  • Text: This is an expression of any nondeprecated string data type
  • Type: This must have the JSON value, since SQL Server 2016 currently supports only JSON as the formatting type

The return type of the function is nvarchar(max). The STRING_ESCAPE function is a deterministic function; it always returns the same result for the same input parameters.

Note

The data types text, ntext, and image have been marked as deprecated since SQL Server 2005. This means they can be removed in one of the next versions. The fact that they are still supported, despite being deprecated for more than a decade, is down to legacy code; Microsoft does not want to risk actions that could cause breaking changes in customer applications. However, as you can see with these two functions, all new features, functions, and expressions that manipulate strings do not accept these data types. This is an implicit way of forcing you to use the recommended data types varchar(max), nvarchar(max), and varbinary(max) instead of their deprecated counterparts.

JSON's escaping rules are defined in the ECMA 404 standard specification. Table 4.1 provides the list of characters that must be escaped according to this specification and their JSON conform representation:

Special character    JSON conform character
-----------------    ----------------------
Double quote         \"
Backspace            \b
Solidus              \/
Reverse solidus      \\
Form feed            \f
Tabulation           \t
Carriage return      \r
New line             \n

Table 4.1: JSON escaping rules

In addition to this, all control characters with character codes in the range 0-31 need to be escaped, too. In JSON output, they are represented in the format \u<code>. Thus, the control character CHAR(0) is escaped as \u0000, while CHAR(31) is represented by \u001f.

The following several examples demonstrate how this function works. Suppose you need to escape the following string: a\bc/de"f. According to JSON's escaping rules, three characters should be escaped: the backslash, solidus, and double quote. You can check this by calling the STRING_ESCAPE function with this string as the input argument:

SELECT STRING_ESCAPE('a\bc/de"f','JSON') AS escaped_input; 

Here is the result returned by the function:

escaped_input
-------------
a\\bc\/de\"f

The following example demonstrates the escaping of the control characters with the codes 0, 4, and 31:

SELECT  
  STRING_ESCAPE(CHAR(0), 'JSON') AS escaped_char0,  
  STRING_ESCAPE(CHAR(4), 'JSON') AS escaped_char4,  
  STRING_ESCAPE(CHAR(31), 'JSON') AS escaped_char31; 

This function call produces the following output:

escaped_char0      escaped_char4      escaped_char31
-------------      -------------      --------------
\u0000             \u0004             \u001f

The next example shows that a horizontal tab, whether written as a literal tab character in a string or produced with the CHAR function, is escaped with the same sequence:

SELECT  
  STRING_ESCAPE(CHAR(9), 'JSON') AS escaped_tab1,  
  STRING_ESCAPE('	', 'JSON') AS escaped_tab2; -- the string literal contains a tab character

Both expressions are escaped with the same sequence:

escaped_tab1    escaped_tab2
------------    --------------
\t              \t

The function returns a NULL value if the input string is not provided. To check this, run the following code:

DECLARE @input AS NVARCHAR(20) = NULL; 
SELECT STRING_ESCAPE(@input, 'JSON') AS escaped_input; 

You will get the expected result:

escaped_input
--------------
NULL

Escaping occurs both in the names of properties and in their values. Consider the following example, where one of the keys in the JSON input string contains a special character:

SELECT STRING_ESCAPE(N'key:1, i\d:4', 'JSON') AS escaped_input;

Here is the output:

escaped_input
---------------
key:1, i\\d:4

The STRING_ESCAPE function is internally used by the FOR JSON clause to automatically escape special characters and to represent control characters in the JSON output. It can also be used for formatting paths, especially those destined for Unix systems (relevant with R integration and SQL Server on Linux) or for CMD commands, where a backslash sometimes needs to be doubled or converted to a forward slash. Unlike the STRING_SPLIT function, this function is available in a SQL Server 2016 database even under older database compatibility levels.
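
For instance, escaping a Windows-style path doubles every backslash:

SELECT STRING_ESCAPE('C:\Temp\Logs\sql.json', 'JSON') AS escaped_path;

The query returns the path with all backslashes doubled:

escaped_path
--------------------------
C:\\Temp\\Logs\\sql.json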

COMPRESS

The COMPRESS function is a scalar function and compresses the input variable, column, or expression using the GZIP algorithm. The function accepts an expression, which can be either string or binary, but again, the deprecated data types text, ntext, and image are not supported.

The return type of the function is varbinary(max).

Use this function with wide text columns, especially when you do not plan to query them often. For large strings, the compression rate can be significant, particularly when the original string is an XML. Here is an example of significant compression. The example uses the output of the system Extended Events session system_health to check the compression rate when you use the COMPRESS function on the target_data column. Here is the code:

SELECT 
  target_name, 
  DATALENGTH(xet.target_data) AS original_size, 
  DATALENGTH(COMPRESS(xet.target_data)) AS compressed_size, 
  CAST((DATALENGTH(xet.target_data) - DATALENGTH(COMPRESS(xet.target_data)))*100.0/DATALENGTH(xet.target_data) AS DECIMAL(5,2)) AS compression_rate_in_percent 
FROM sys.dm_xe_session_targets xet   
INNER JOIN sys.dm_xe_sessions xe ON xe.address = xet.event_session_address   
WHERE xe.name = 'system_health';  

The following is the output generated by this query on my test server. You might get a different output, but similar results.

target_name  original_size   compressed_size  compression_rate_in_pct
------------ --------------  ---------------- -----------------------
ring_buffer  8386188         349846           95.83
event_file   410             235              42.68

You can see a quite impressive compression rate of 96%. On the other hand, the compressed representation of a short string can be even longer than the original. Consider the following example, where a short string with a size of 30 bytes is used as input:

DECLARE @input AS NVARCHAR(15) = N'SQL Server 2016'; 
SELECT @input AS input, DATALENGTH(@input) AS input_size, COMPRESS(@input) AS compressed, DATALENGTH(COMPRESS(@input)) AS comp_size; 

The result of this query (with abbreviated compressed value) is:

input            input_size  compressed                         comp_size
------------     ----------- ---------------------------------  ---------
SQL Server 2016  30          0x1F8B08000000000004000B660864F061 46

The COMPRESS function is not a replacement for row or page compression. It is invoked for a single expression, so optimizations across rows are not possible (for instance, when exactly the same string tokens exist in some other row or column, row or page compression can exploit that redundancy, but COMPRESS cannot).

To compare compression rates for Row and Page compression on one side and compression by the COMPRESS function on the other side, I have created four clone tables of the system table sys.messages. I have left one uncompressed and compressed the other three with ROW compression, PAGE compression, and the COMPRESS function respectively. The complete code for creating and populating the tables, as well as for a comparison between compression methods, can be found in the code accompanying this book. Figure 4.2 displays the result of this comparison:

Figure 4.2: Compare compression rates between Row, Page, and Compress

You can see that (slightly) more compression can be achieved using Row and Page compression, but a notable compression is also obtained using the COMPRESS function.

Use this function when you want to save some storage space or to compress data that needs to be archived or logged and is thus rarely queried. Since it uses a common and well-known GZIP algorithm, you can compress/decompress data not only in SQL Server but also in client applications and tools communicating with SQL Server.
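
A typical pattern is to compress a payload once at insert time and decompress it only on the rare occasions it is read; here is a minimal sketch with a hypothetical log table, using the DECOMPRESS function described in the next section:

-- Illustrative table for rarely queried, archived payloads
CREATE TABLE dbo.MessageLog (
  MessageLogID INT IDENTITY(1,1) PRIMARY KEY,
  MessageCompressed VARBINARY(MAX) NOT NULL
);
GO
DECLARE @payload AS NVARCHAR(MAX) = N'<order><id>42</id><status>shipped</status></order>';
-- Compress on the way in...
INSERT INTO dbo.MessageLog (MessageCompressed) VALUES (COMPRESS(@payload));
-- ...and decompress (and cast back to the original type) on the way out
SELECT CAST(DECOMPRESS(MessageCompressed) AS NVARCHAR(MAX)) AS Message
FROM dbo.MessageLog;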

DECOMPRESS

The DECOMPRESS function decompresses compressed input data in binary format (variable, column, or expression) using the GZIP algorithm.

The return type of the function is varbinary(max). Yes, you read that right: the result of the decompression is still a varbinary data type, and if you want to get the original data type, you need to cast the result explicitly.

Consider the following example, where the input string is first compressed and then decompressed with the same algorithm:

DECLARE @input AS NVARCHAR(100) = N'SQL Server 2016 Developer''s Guide'; 
SELECT DECOMPRESS(COMPRESS(@input));  

Since the function DECOMPRESS is logically complementary to the COMPRESS function, you would expect to get the original input string as the result. The result is, however, in the binary format:

input
---------------------------------------------------------------------
0x530051004C00200053006500720076006500720020003200300031003600200044006500760065006C006F0070006500720027007300200047007500690064006500

To get the input string back, you need to convert the result data type to the initial data type:

DECLARE @input AS NVARCHAR(100) = N'SQL Server 2016 Developer''s Guide'; 
SELECT CAST(DECOMPRESS(COMPRESS(@input)) AS NVARCHAR(100)) AS input; 

Now, you will get the expected result:

input
----------------------------------
SQL Server 2016 Developer's Guide

Note

The input parameter for the DECOMPRESS function must be a binary value that was previously compressed with the GZIP algorithm. If you provide any other binary data, the function will return NULL.

Notice an interesting phenomenon if you get the original type wrong and cast to varchar instead of nvarchar:

DECLARE @input AS NVARCHAR(100) = N'SQL Server 2016 Developer''s Guide'; 
SELECT CAST(DECOMPRESS(COMPRESS(@input)) AS VARCHAR(100)) AS input; 

When you use the Results to Text option to display query results, the following result is shown in SQL Server Management Studio:

input
------------------------------------------------------------------
S Q L   S e r v e r   2 0 1 6   D e v e l o p e r ' s   G u i d e 

However, when the Results to Grid option is your choice, the output looks different, as shown in Figure 4.3:

Figure 4.3: Side effect of an incorrect data type casting

Moreover, if you swap the data types, declaring the input string as varchar and casting the decompressed result to the Unicode data type nvarchar, the result is very strange:

DECLARE @input AS VARCHAR(100) = N'SQL Server 2016 Developer''s Guide'; 
SELECT CAST(DECOMPRESS(COMPRESS(@input)) AS NVARCHAR(100)) AS input; 

The output looks the same in all display modes, as shown here:

(Figure: the query output rendered as a string of unreadable Unicode characters)

As a comment on this bizarre behavior: keep in mind that it is always good to cast to the original data type and not to rely on conversion internals.

CURRENT_TRANSACTION_ID

The CURRENT_TRANSACTION_ID function, as its name suggests, returns the transaction ID of the current transaction. The scope of the transaction is the current session. It has the same value as the transaction_id column in the dynamic management view sys.dm_tran_current_transaction. The function has no input arguments and the returned value is of type bigint.

Multiple calls of this function outside an explicit transaction will return different transaction numbers, since every single statement then runs in its own autocommit transaction; within an explicit transaction, repeated calls return the same number:

SELECT CURRENT_TRANSACTION_ID(); 
SELECT CURRENT_TRANSACTION_ID(); 
BEGIN TRAN 
SELECT CURRENT_TRANSACTION_ID(); 
SELECT CURRENT_TRANSACTION_ID(); 
COMMIT 

The result on my machine is (you will definitely get different numbers, but with the same pattern):

-------------
921170382
921170383
921170384
921170384

There is also the SESSION_ID function, which returns the current session ID, but it works only in Azure SQL Data Warehouse and in Parallel Data Warehouse. When you call it in an on-premises instance of SQL Server 2016, instead of the current session ID, you will see the following error message:

Msg 195, Level 15, State 10, Line 1
'SESSION_ID' is not a recognized built-in function name.

You can use the CURRENT_TRANSACTION_ID function to check your transaction in active transactions as follows:

SELECT * FROM sys.dm_tran_active_transactions WHERE transaction_id = CURRENT_TRANSACTION_ID(); 

SESSION_CONTEXT

Using and maintaining session variables or data within a user session in SQL Server has never been straightforward. With the SET CONTEXT_INFO statement, you can set a 128-byte binary value, and you can read it with the CONTEXT_INFO function. However, a single value within the scope of a session is a huge limitation. SQL Server 2016 brings more functionality for working with session-scoped data.
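
To illustrate the old approach: everything has to be packed into one binary value and manually cast back when read. A minimal sketch:

-- Pack a single value into the 128-byte session buffer
DECLARE @ctx AS VARBINARY(128) = CAST(N'German' AS VARBINARY(128));
SET CONTEXT_INFO @ctx;
-- Read it back; the caller must know the original type and perform the cast,
-- and any structure within the value must be decoded manually
SELECT CAST(CONTEXT_INFO() AS NVARCHAR(64)) AS session_language;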

The SESSION_CONTEXT function returns the value of the specified key in the current session context. This value is previously set using the sys.sp_set_session_context procedure. It accepts the nvarchar data type as an input parameter. Interestingly, the function returns a value with the sql_variant data type.

Use the following code to set the value for the language key and then call the SESSION_CONTEXT function to read the value of the session key:

EXEC sys.sp_set_session_context @key = N'language', @value = N'German'; 
SELECT SESSION_CONTEXT(N'language'); 

The result of this action is shown as follows:

-------------
German

As mentioned earlier, the input data type must be nvarchar. An attempt to call the function with a different data type (including varchar and nchar!) ends up with an exception.

SELECT SESSION_CONTEXT('language'); 

You get the following message:

Msg 8116, Level 16, State 1, Line 51
Argument data type varchar is invalid for argument 1 of session_context function.

The function argument does not need to be a literal; you can put it in a variable, as shown in the following code example:

DECLARE @lng AS NVARCHAR(50) = N'language'; 
SELECT SESSION_CONTEXT(@lng); 

The size of the key cannot exceed 256 bytes and the limit for the total size of keys and values in the session context is 256 KB.
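
The sys.sp_set_session_context procedure also accepts an optional third parameter, @read_only. When a key is set with this flag, its value cannot be changed again for the lifetime of the session:

EXEC sys.sp_set_session_context @key = N'user_id', @value = 42, @read_only = 1;
SELECT SESSION_CONTEXT(N'user_id') AS user_id;
-- The following call fails with an error, since the key is marked as read-only
EXEC sys.sp_set_session_context @key = N'user_id', @value = 99;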

The system stored procedure sys.sp_set_session_context and the function SESSION_CONTEXT allow you to create and maintain session variables within SQL Server and overcome limitations from previous SQL Server versions. The SESSION_CONTEXT function is used as a part of the Row-Level Security feature, and it will be explored in more detail in Chapter 8, Tightening the Security.

DATEDIFF_BIG

The DATEDIFF function returns the number of time unit boundaries crossed between two specified dates. The function accepts the following three input arguments:

  • datepart: This is the time unit (year, quarter, month... second, millisecond, microsecond, and nanosecond)
  • startdate: This is an expression of any date data type (date, time, smalldatetime, datetime, datetime2, and datetimeoffset)
  • enddate: This is also an expression of any date data type (date, time, smalldatetime, datetime, datetime2, and datetimeoffset)

The return type of the function is int. This means that the maximum returned value is 2,147,483,647. Therefore, if you specify minor units (milliseconds, microseconds, or nanoseconds) as the first parameter of the function, you can get an overflow exception for huge date ranges. For instance, this function call will still work, as follows:

SELECT DATEDIFF(SECOND,'19480101','20160101') AS diff; 

And it returns this result:

diff
-----------
2145916800

However, the following example will not work:

SELECT DATEDIFF(SECOND,'19470101','20160101') AS diff; 

The result of this call is the following error message:

Msg 535, Level 16, State 0, Line 392
The datediff function resulted in an overflow. The number of dateparts separating two date/time instances is too large. Try to use datediff with a less precise datepart.

Due to the aforementioned data type limit, the maximum date difference that DATEDIFF can calculate with second as the date part is about 68 years. In Table 4.2, you can find the list of date part units and the maximum supported date difference for each of them.

Date part       Maximal supported date difference
---------       ---------------------------------
Hour            250,000 years
Minute          4,086 years
Second          68 years
Millisecond     25 days
Microsecond     36 minutes
Nanosecond      2.14 seconds

Table 4.2: Maximal supported date difference per date part for the function DATEDIFF
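
The 68-year figure for the second date part is easy to verify: divide the maximum int value by the number of seconds in a year:

SELECT 2147483647 / (365.25 * 24 * 60 * 60) AS max_years;
-- Returns approximately 68.05, in line with Table 4.2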

In order to cover a greater date range for small date parts, the SQL Server development team has added a new function in SQL Server 2016: DATEDIFF_BIG.

It has exactly the same interface as DATEDIFF; the only difference is its return type of bigint. This means that the maximum returned value is 9,223,372,036,854,775,807. With this function, you will not get an overflow even when you specify a huge date range and choose a small date part. The following code calculates the difference between the minimum and maximum values supported by the datetime2 data type in microseconds:

SELECT DATEDIFF_BIG(MICROSECOND,'010101','99991231 23:59:59.999999999') AS diff; 

The following is the very large number representing this difference:

diff
--------------------
252423993599999999

However, even with the DATEDIFF_BIG function, you can get an exception if you call it for the same dates and choose the date part nanosecond.

SELECT DATEDIFF_BIG(NANOSECOND,'010101','99991231 23:59:59.999999999') AS diff; 

The maximum value of the bigint data type is not large enough to hold this difference, and the call results in an overflow:

Msg 535, Level 16, State 0, Line 419
The datediff_big function resulted in an overflow. The number of dateparts separating two date/time instances is too large. Try to use datediff_big with a less precise datepart.

Of course, the last two statements are listed just for demonstration purposes; I cannot imagine a reasonable use case where you will need to represent 10,000 years in microseconds or nanoseconds. Therefore, you can say that DATEDIFF_BIG meets all the reasonable requirements related to date difference calculations.

AT TIME ZONE

The AT TIME ZONE expression can be used to represent time in a given time zone. It converts an input date to the corresponding datetimeoffset value in the target time zone. It has the following two arguments:

  • inputdate: This is an expression of the following date data types: smalldatetime, datetime, datetime2, and datetimeoffset.
  • timezone: This is the name of the target time zone. The allowed zone names are listed in the sys.time_zone_info catalog view.

The return type of the expression is datetimeoffset in the target time zone.

Use the following code to display the UTC time and the local times in New York and Vienna:

SELECT  
  CONVERT(DATETIME, SYSDATETIMEOFFSET()) AS UTCTime, 
  CONVERT(DATETIME, SYSDATETIMEOFFSET() AT TIME ZONE 'Eastern Standard Time') AS NewYork_Local, 
  CONVERT(DATETIME, SYSDATETIMEOFFSET() AT TIME ZONE 'Central European Standard Time') AS Vienna_Local; 

This query generates the following result:

UTCTime                 NewYork_Local           Vienna_Local        
------------------      -------------------     ------------------- 
2016-06-18 15:09:09.410 2016-06-18 10:09:09.410 2016-06-18 16:09:09.410

As mentioned earlier, the values supported for the time zone can be found in a new system catalog, sys.time_zone_info. This is exactly the same list as in the registry: HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Windows NT\CurrentVersion\Time Zones.
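
Besides the name, the catalog view also exposes the current UTC offset of each zone and a flag indicating whether daylight saving time is currently active:

SELECT name, current_utc_offset, is_currently_dst
FROM sys.time_zone_info
WHERE name IN (N'UTC', N'Central European Standard Time', N'Pacific Standard Time');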

The target time zone does not need to be a literal; it can be wrapped in a variable and parameterized. The following code displays the time in four different time zones:

SELECT name, CONVERT(DATETIME, SYSDATETIMEOFFSET() AT TIME ZONE name) AS local_time  
FROM sys.time_zone_info 
WHERE name IN (SELECT value FROM STRING_SPLIT('UTC,Eastern Standard Time,Central European Standard Time,Russian Standard Time',',')); 

Note that another new function, STRING_SPLIT, is used in this query. The result of the previous query is as follows:

name                                 local_time
------------------------------------ -----------------------
Eastern Standard Time                2016-06-18 11:10:46.257
UTC                                  2016-06-18 15:10:46.257
Central European Standard Time       2016-06-18 17:10:46.257
Russian Standard Time                2016-06-18 18:10:46.257

By using AT TIME ZONE, you can convert a simple datetime value without a time zone offset to any time zone by using its name. What time is it in Seattle when a clock in Vienna shows 22:33 today (15th August 2016)? Here is the answer:

SELECT CAST('20160815 22:33' AS DATETIME)  
AT TIME ZONE 'Central European Standard Time'  
AT TIME ZONE 'Pacific Standard Time' AS seattle_time; 

Note that you must cast the string literal to the datetime data type explicitly. Although a string literal formatted as YYYYMMDD HH:mm is usually implicitly interpreted as a valid datetime value, the AT TIME ZONE expression does not accept a string literal as its input date.

HASHBYTES

The HASHBYTES built-in function is used to hash the string of characters using one of the seven supported hashing algorithms. The function accepts the following two input arguments:

  • algorithm: This is a hashing algorithm for hashing the input. The possible values are: MD2, MD4, MD5, SHA, SHA1, SHA2_256, and SHA2_512, but only the last two are recommended in SQL Server 2016.
  • input: This is an input variable, column, or expression that needs to be hashed. The data types that are allowed are varchar, nvarchar, and varbinary.

The return type of the function is varbinary(8000).

This function has been available in SQL Server since version 2005, but it has been enhanced in SQL Server 2016. The most important enhancement is the removal of the limit on the input size: prior to SQL Server 2016, input values were limited to 8,000 bytes; now, no limit is defined. In addition to this significant enhancement, the five old algorithms MD2, MD4, MD5, SHA, and SHA1 are marked for deprecation. The SHA2_256 and SHA2_512 algorithms are stronger; they require more storage space and hash calculation is slower, but their collision probability is very low.

To demonstrate the importance of the removed input limit, the former standard sample Adventure Works database will be used. Execute the following code in a SQL Server 2014 instance with this sample database installed to calculate a hash value for the XML representation of the first six orders in the SalesOrderHeader table:

USE AdventureWorks2014; 
SELECT HASHBYTES('SHA2_256',(SELECT TOP (6) * FROM Sales.SalesOrderHeader FOR XML AUTO)) AS hashed_value; 

The following hashed value is produced by the previous command:

hashed_value
------------------------------------------------------------------
0x26C8A739DB7BE2B27BCE757105E159647F70E02F45E56C563BBC3669BEF49AAF

However, when you want to include the seventh row in the hash calculation, use the following code:

USE AdventureWorks2014; 
SELECT HASHBYTES('SHA2_256',(SELECT TOP (7) * FROM Sales.SalesOrderHeader FOR XML AUTO)) AS hashed_value; 

Instead of the hashed value, this query executed in a SQL Server 2014 instance generates a very well known and very user-unfriendly error message:

Msg 8152, Level 16, State 10, Line 2
String or binary data would be truncated.

Clearly, the reason for this error is the size of the input string that exceeds the limit of 8,000 bytes. You can confirm this by executing the following query:

SELECT DATALENGTH(CAST((SELECT TOP (7) * FROM Sales.SalesOrderHeader FOR XML AUTO) AS NVARCHAR(MAX))) AS input_length; 

Indeed, the size of the input argument for the HASHBYTES function exceeds 8,000 bytes:

input_length
--------------------
8754

In SQL Server 2016, this limitation has been removed:

USE AdventureWorks2016CTP3; 
SELECT HASHBYTES('SHA2_256',(SELECT TOP (7) * FROM Sales.SalesOrderHeader FOR XML AUTO)) AS hashed_value; 

The preceding hash query returns the following result:

hashed_value
------------------------------------------------------------------
0x864E9FE792E0E99165B46F43DB43E659CDAD56F80369FD6D2C58AD2E8386CBF3

Prior to SQL Server 2016, if you wanted to hash more than 8 KB of data, you had to split the input data into 8 KB chunks and then combine them to get a final hash for the input entry. Since the limit no longer exists, you can use an entire table as an input parameter now. You can slightly modify the initial query to calculate the hash value for the entire order table:

USE AdventureWorks2016CTP3; 
SELECT HASHBYTES('SHA2_256',(SELECT * FROM Sales.SalesOrderHeader FOR XML AUTO)) AS hashed_value; 

This generates the following output:

hashed_value
------------------------------------------------------------------
0x2930C226E613EC838F88D821203221344BA93701D39A72813ABC7C936A8BEACA

I played around with it and could successfully generate a hash value even for an expression with a size of 2 GB. It was slow, of course, but it did not break. I just wanted to check the limit; it does not make much sense to use HASHBYTES to detect changes in such a large table.

Note

Note that I have used the old standard SQL Server sample database Adventure Works. The version for SQL Server 2016 is available under the name AdventureWorks2016CTP3 at https://www.microsoft.com/en-us/download/details.aspx?id=49502. Yes, it contains CTP3 in the name and it seems that it will not be officially presented as a sample database in SQL Server 2016, but in this example, we had to use it to compare two identical queries with different behaviors in the previous and current SQL Server versions.

This function can be very useful for checking whether relatively static tables have changed, or for comparing them between two instances. With the following query, you can check the status of products in the AdventureWorks database:

USE AdventureWorks2016CTP3; 
SELECT HASHBYTES('SHA2_256',(SELECT *  
 FROM  
  Production.Product p 
  INNER JOIN Production.ProductSubcategory sc ON p.ProductSubcategoryID = sc.ProductSubcategoryID 
  INNER JOIN Production.ProductCategory c ON sc.ProductCategoryID = c.ProductCategoryID 
  INNER JOIN Production.ProductListPriceHistory ph ON ph.ProductID = p.ProductID 
  FOR XML AUTO)) AS hashed_value; 

The following is the output generated by this query:

hashed_value
------------------------------------------------------------------
0xAFC05E912DC6742B085AFCC2619F158B823B4FE53ED1ABD500B017D7A899D99D

If you want to check whether any of the product attributes defined by the previous statement are different for two or more instances, you can execute the same query against the other instances and compare the values only, without loading and comparing the entire datasets.

With no limit for the input string, you do not need to implement workarounds for large inputs anymore and the fact that you can easily generate a hash value for multiple joined tables can increase the number of use cases for the HASHBYTES function.
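
One simple change-detection pattern built on this: calculate the hash once, store it, and later compare the stored value against a freshly calculated one. The snapshot table used here is purely illustrative:

USE AdventureWorks2016CTP3;
-- Assume the hash was previously saved in a snapshot table (illustrative name)
DECLARE @stored_hash AS VARBINARY(8000) =
  (SELECT TableHash FROM dbo.TableHashSnapshot WHERE TableName = N'Production.Product');
DECLARE @current_hash AS VARBINARY(8000) =
  HASHBYTES('SHA2_256', (SELECT * FROM Production.Product FOR XML AUTO));
IF @current_hash = @stored_hash
  PRINT 'No changes detected.';
ELSE
  PRINT 'Table content has changed since the last snapshot.';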

JSON functions

SQL Server 2016 introduces JSON data support. In order to implement this support, four JSON functions and one rowset function have been added to allow manipulation of JSON data:

  • ISJSON: This checks whether an input string represents valid JSON data.
  • JSON_VALUE: This extracts a scalar value from a JSON string.
  • JSON_QUERY: This extracts a JSON fragment from the input JSON string for the specified JSON path.
  • JSON_MODIFY: This modifies JSON data: updates the value of an existing property, adds a new element to an existing array, inserts a new property and its value, and deletes a property.
  • OPENJSON: This provides a rowset view over a JSON document. This table-valued function converts JSON text into tabular data.

These functions will be explored in more detail in Chapter 5, JSON Support in SQL Server.
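
As a small preview, the following calls show three of these functions applied to a tiny JSON fragment:

DECLARE @json AS NVARCHAR(200) = N'{"album":"Wish You Were Here", "year":1975, "songs":["Have a Cigar"]}';
SELECT
  ISJSON(@json) AS is_valid_json,        -- returns 1 for valid JSON
  JSON_VALUE(@json, '$.album') AS album, -- extracts a scalar value
  JSON_QUERY(@json, '$.songs') AS songs; -- extracts a JSON fragment (the array)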
