Chapter 9. Performance Analysis and Tuning

IN THIS CHAPTER

Physical Database Design, Analysis, and Tuning

Troubleshooting Locking and Blocking

Query Performance Tuning

Designing Applications for Performance

Performance Monitoring and Tuning Tools

Performance analysis and tuning is a challenging exercise that requires an understanding of the tools and techniques used to measure and troubleshoot performance issues. Some of the top performance killers include poor application design, inferior physical/logical database design, and costly, slow-running queries. Insufficient hardware resources and improper configuration of the operating system and the database system are further causes of performance degradation.

SQL Server 2005 changes the performance analysis and tuning paradigm by introducing innovative features and tools that facilitate efficient physical database design, query optimization, locking and blocking avoidance, and so on. In this chapter, you will learn how to use these features and tools to proactively design, monitor, and tune a database application.

Let’s begin by looking at some of the new techniques for optimizing the physical database design.

Physical Database Design, Analysis, and Tuning

Physical database design refers to specifying how on-disk data structures are stored and accessed. It plays an important role in maximizing the performance of databases and simplifying their management. In general, physical database design involves deciding where and what kind of indexes to use, how to partition the data, how to lay out the data and log across files and filegroups, and so on.

Imagine that you are working with developers to optimize the physical database design of a database that contains 50 tables. One of the tables has a million rows and grows at a rate of approximately 50,000 rows a weekday. There are about 1,500 users entering data into the system and about 50 users generating and viewing reports most of the time. Let’s say you have gone through the index analysis process, but still the performance is not as expected. The following sections describe some new things that you can try in SQL Server 2005 to improve the physical database design.

Creating Indexes with Included Columns

One of the best recommendations for nonclustered indexes is to create a narrow and efficient index. For best performance, it is recommended that you avoid using too many columns in an index. A narrow index can accommodate more rows into an 8KB index page than a wide index, thereby reducing I/O, reducing storage requirements, and improving database caching.

On the other hand, database professionals often try to gain performance by creating a covering index. A covering index is a nonclustered index built on all the columns required to satisfy a SQL query without going to the base table. However, an index key cannot exceed 900 bytes or 16 columns.

SQL Server 2005 introduces the new concept of including columns with nonclustered indexes. This yields small, narrow, efficient keys while still providing the benefits of a covering index, and it removes the 900 bytes/16 columns key size restriction. When you create a nonclustered index by using a CREATE NONCLUSTERED INDEX DDL statement, you can use the INCLUDE clause to specify up to 1,023 non-key columns to be added at the leaf level of the index. The included columns are not part of the index key, which keeps the key size small and efficient. However, because they are available in the leaf pages of the index, queries that can be satisfied by the included columns avoid touching the base table.

In a nutshell, including non-key columns helps keep the key size small and efficient, provides the benefit of a covering index to improve the performance, and removes the 900 bytes/16 columns key size restriction because the included columns are not considered part of the index key.


Note

Columns of data type text, ntext, and image are not allowed as non-key included columns.


Let’s look at an example of how included non-key columns can be used to create efficient nonclustered indexes and to improve performance. The following statements create a sample table and insert some test data into the table:

SET NOCOUNT ON;
USE AdventureWorks;
GO

IF OBJECT_ID('dbo.tblTest') IS NOT NULL
   DROP TABLE dbo.tblTest;
GO
CREATE TABLE dbo.tblTest (C1 INT, C2 INT, C3 CHAR(255));
GO
DECLARE @counter INT;
SET @counter = 1;
WHILE @counter <= 99999
BEGIN
   INSERT INTO dbo.tblTest VALUES(@counter%500, @counter, @counter);
   SET @counter = @counter + 1;
END;

The table does not have any indexes yet. Next, you should query the table:

SET STATISTICS IO ON;
SELECT C1, C2, C3 FROM dbo.tblTest WHERE C1 = 2;
GO

After you run this query, you should notice that the number of logical reads is 3,449 and that the execution plan shows the table scan.

You can optimize the preceding query by creating a nonclustered index:

CREATE NONCLUSTERED INDEX idxtblTest ON dbo.tblTest(C1);
GO

SELECT index_type_desc, index_depth, index_level, page_count, record_count
FROM sys.dm_db_index_physical_stats (DB_ID(),
   OBJECT_ID('dbo.tblTest'), NULL /* index_id: all indexes */, NULL, 'DETAILED')
WHERE index_id > 1;

SELECT C1, C2, C3 FROM dbo.tblTest WHERE C1 = 2;
GO

These Transact-SQL (T-SQL) statements create a narrow nonclustered index on column C1. The first SELECT query here uses the sys.dm_db_index_physical_stats dynamic management function (DMF) to find out the total number of pages used by the index. (An index name has no object ID, so the index_id parameter is passed as NULL, and the nonclustered index is selected by the WHERE index_id > 1 filter.) The index uses about 224 pages.

After you run the preceding query, you should notice that the number of logical reads comes down to 204 and that the execution plan shows the use of an index and avoids a table scan. The execution plan shows RID lookup, which is a bookmark lookup accompanied by a nested loop join.

You can further optimize the query by creating a covering index:

CREATE NONCLUSTERED INDEX idxtblTest ON dbo.tblTest(C1, C2, C3)
   WITH DROP_EXISTING;
GO

SELECT index_type_desc, index_depth, index_level, page_count, record_count
FROM sys.dm_db_index_physical_stats (DB_ID(),
   OBJECT_ID('dbo.tblTest'), NULL /* index_id: all indexes */, NULL, 'DETAILED')
WHERE index_id > 1;

SELECT C1, C2, C3 FROM dbo.tblTest WHERE C1 = 2;
GO

The index key now contains all the columns queried in the SELECT statement. Because the nonclustered index key now contains more columns, the index uses a larger number of pages. The DMF SELECT query confirms this: you should notice that the index now uses about 3,581 pages.

The SELECT query on the table is optimized, and you should notice that the number of logical reads has now come down to 12 and that the execution plan shows the index seek as 100% of the cost.

Here’s what happens when you use non-key included columns instead of a covering index:

CREATE NONCLUSTERED INDEX idxtblTest ON dbo.tblTest(C1)
   INCLUDE (C2, C3)
   WITH DROP_EXISTING;
GO

SELECT index_type_desc, index_depth, index_level, page_count, record_count
FROM sys.dm_db_index_physical_stats (DB_ID(),
   OBJECT_ID('dbo.tblTest'), NULL /* index_id: all indexes */, NULL, 'DETAILED')
WHERE index_id > 1;

SELECT C1, C2, C3 FROM dbo.tblTest WHERE C1 = 2;
GO

Note that the index key contains only the C1 column; C2 and C3 are included non-key columns. The DMF SELECT statement indicates that the index is using 3,463 pages, and the SELECT query on the table shows 11 logical reads and shows the index seek as 100% of the cost. (Compare this with the 3,581 index pages and 12 logical reads for the covering index.) On top of this, the index key size is small and efficient, and you can also overcome the 900 bytes/16 columns index key restriction by using non-key included columns.

The data for non-key columns resides in the base table data pages and is duplicated in the leaf-level pages of the index. Therefore, you should avoid including unnecessary columns as non-key index columns. Because the included column data is duplicated, an index with included columns consumes more disk space, especially if the included columns are of the varchar(max), nvarchar(max), varbinary(max), or xml data type. Fewer index rows fit on an index page, which might increase I/O and decrease database cache efficiency. Index maintenance overhead may also grow, increasing the time required to perform inserts, updates, and deletes against the underlying table. You should analyze and test to determine whether the gains in query performance outweigh the effect on performance during data modification and the additional disk space requirements.

Creating Indexes on Computed Columns

A computed column’s value is calculated from an expression that uses other columns in the same table. In SQL Server 2000, computed columns are always virtual columns, not physically stored in the table. In order to create an index on such a column, the column expression must be deterministic and precise. In other words, an index cannot be created on a computed column if the column expression uses a nondeterministic function such as USER_ID() or GETDATE(), or if the column expression results in an imprecise value such as a floating-point number.

SQL Server 2005 introduces the ability to persist the computed column values. In some situations, this can improve performance because the computed column value is already available in the data pages and is not calculated at runtime. In addition, having persisted computed columns allows you to create indexes on columns that are imprecise. Here’s an example of this:

SET NOCOUNT ON;
USE AdventureWorks;
GO
IF OBJECT_ID('dbo.tblTest') IS NOT NULL
   DROP TABLE dbo.tblTest;
GO
CREATE TABLE dbo.tblTest
    (C1 float, C2 float, C3 AS C1*C2 PERSISTED, C4 AS C1*C2);
GO
CREATE INDEX idxTest ON dbo.tblTest(C3);
GO
--CREATE INDEX idxTest2 ON dbo.tblTest(C4);
GO
SELECT is_persisted, * FROM sys.computed_columns
   WHERE [object_id] = OBJECT_ID('dbo.tblTest');
GO
SELECT * FROM sys.indexes WHERE name = 'idxTest';
GO

This script creates a sample table with two computed columns. Column C3 is a persisted computed column, whereas C4 is a virtual computed column. Creating an index on the imprecise computed column C3 succeeds because it is persisted; if you uncomment the statement that creates an index on C4, it will fail. You can use the sys.columns system catalog view to find out whether a column is computed, and you can then use the sys.computed_columns catalog view to check whether the computed column is persisted.
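
For instance, you could run the following quick checks against the dbo.tblTest table created above:

SELECT name, is_computed FROM sys.columns
   WHERE [object_id] = OBJECT_ID('dbo.tblTest');

SELECT name, is_persisted FROM sys.computed_columns
   WHERE [object_id] = OBJECT_ID('dbo.tblTest');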

Indexing XML Data

SQL Server 2005 introduces a new data type named xml that you can use to store XML documents inside a database. The data stored in xml type columns can be queried by using the XQuery syntax and xml type methods such as exist(), query(), and value(). The XML data is internally stored as binary large objects (BLOBs), and this internal binary representation of XML data cannot exceed 2GB. Without an index on such columns, these BLOBs are parsed and shredded at runtime to evaluate a query, which can have a significant adverse impact on the query performance.

You can use the CREATE XML INDEX DDL statement to create an index on an xml type column. Each table can have up to 249 XML indexes. Here is the T-SQL syntax for creating XML indexes:

CREATE [ PRIMARY ] XML INDEX index_name
    ON <object> ( xml_column_name )
    [ USING XML INDEX xml_index_name
        [ FOR { VALUE | PATH | PROPERTY } ] ]
    [ WITH ( <xml_index_option> [ ,...n ] ) ]
[ ; ]

      <xml_index_option> ::=
      {
          PAD_INDEX = { ON | OFF }
        | FILLFACTOR = fillfactor
        | SORT_IN_TEMPDB = { ON | OFF }
        | STATISTICS_NORECOMPUTE = { ON | OFF }
        | DROP_EXISTING = { ON | OFF }
        | ALLOW_ROW_LOCKS = { ON | OFF }
        | ALLOW_PAGE_LOCKS = { ON | OFF }
        | MAXDOP = max_degree_of_parallelism
      }

The primary XML index is a shredded and persisted representation of the XML BLOBs in the xml data type column. Using a primary XML index avoids the need to shred the XML BLOBs at runtime. After a primary XML index is created and when the shredded representation is available, the query performance can be further improved by creating secondary XML indexes on XML tags, values, and paths. Each xml column in a table can have one primary XML index and multiple secondary XML indexes. A primary XML index is required before any secondary index can be created. There are three types of secondary XML indexes—PATH, VALUE, and PROPERTY—and they are discussed later in this section.

A primary XML index requires a clustered index to be present on the primary key of the table. If you ever need to modify the primary key clustered index, then all XML indexes on the user table must be dropped first. When a primary XML index is created, SQL Server internally creates a clustered index, a B+ tree, with the clustered key formed from the clustering key of the user table and an XML node identifier. For each XML BLOB in the column, the index creates several rows of data.

Let’s assume that the following XML instance is stored into an xml type column:

<book subject="Security" ISBN="0735615882">
   <title>Writing Secure Code</title>
   <author>
      <firstName>Michael</firstName>
      <lastName>Howard</lastName>
   </author>
   <price currency="USD">39.99</price>
</book>

When a primary XML index is created on a column containing this XML document, SQL Server internally creates a clustered index that contains the columns and rows listed in Table 9.1.

Table 9.1. Primary XML Index Internal Representation


Table 9.1 contains partial columns and rows indicating how the primary XML index is stored internally. The first column, PK, is the value of the primary key column in the base table. The second column, XID, is an internal identifier generated for each node in the XML tree. These first two columns (PK, XID) together form a clustered index key for the primary XML index. The third column is the node ID (NID) and the XML tag name (TAG), the fourth column (TID) indicates the node type, the fifth column (VALUE) contains the node value, and the last column (HID) shows the hierarchical ID, which is an internal tokenized representation of the reversed path from a node to the root of the tree.

After the primary XML index is created, you can further optimize query performance by creating secondary XML indexes. There are three types of secondary XML indexes:

PATH index—Secondary indexes built on the paths are useful for path queries such as /person/address/zip. A PATH secondary XML index consists of the HID, VALUE, PK, and XID columns of the primary XML index.

VALUE index—These secondary indexes are useful for queries where you know the node value but don't know the location of the node. A VALUE index is useful for queries such as //city[.="Dallas"] or /item/@*[.="NA"]. A VALUE secondary XML index consists of the VALUE, HID, PK, and XID columns of the primary XML index.

PROPERTY index—These secondary indexes are useful for “property extraction” scenarios, where queries retrieve multiple, sometimes related, values from individual XML instances.

The following example shows how XML indexes can be used to optimize the queries on xml type columns. You need to first create a sample table and insert some test XML data into the table:

SET NOCOUNT ON;
USE AdventureWorks;
GO
IF OBJECT_ID('dbo.tblXMLTest') IS NOT NULL
   DROP TABLE dbo.tblXMLTest;
GO
CREATE TABLE dbo.tblXMLTest
   (c1 int IDENTITY(1, 1) NOT NULL PRIMARY KEY,
    c2 xml NOT NULL);
GO
DECLARE @counter INT;
SET @counter = 1
WHILE @counter < 1000
BEGIN
   INSERT INTO dbo.tblXMLTest (c2)
      VALUES ('<book id="1"><category>Security</category></book>');
   INSERT INTO dbo.tblXMLTest (c2)
      VALUES ('<book id="2"><category>Architecture</category></book>');
   INSERT INTO dbo.tblXMLTest (c2)
      VALUES ('<book id="3"><category>Design Patterns</category></book>');
   INSERT INTO dbo.tblXMLTest (c2)
      VALUES ('<book id="4"><category>Certification</category></book>');
   SET @counter = @counter + 1;
END;
GO

The table does not yet have any index on the xml column. You can click the "Include Actual Execution Plan" toolbar button in Management Studio, run the following query, and study the execution plan:

SELECT * FROM dbo.tblXMLTest
   WHERE c2.exist('/book/@id[. = "3"]') = 1;
GO

SQL Server parses and shreds the XML BLOB data for each row, and it runs the XPath expression to filter rows where the id attribute value is 3. The execution plan shows the use of the "Table Valued Function XML Reader with XPath Filter" operator, and the estimated subtree cost for the SELECT operator (approximately 4,012) is very high.

Next, you should create a primary XML index to create a persisted, shredded representation of the XML data and then execute the same SELECT statement again:

CREATE PRIMARY XML INDEX pxmlidxTest ON dbo.tblXMLTest (c2);
GO

SELECT * FROM dbo.tblXMLTest
   WHERE c2.exist('/book/@id[. = "3"]') = 1;
GO

Now, the execution plan no longer contains the XML Reader operator, which proves that XML parsing and shredding is not required. Instead of the XML Reader operator, the execution plan now shows a clustered index scan using the primary XML index, and the estimated subtree cost for the SELECT operator (approximately 0.19) is significantly reduced.

You can further optimize the query by creating a PATH secondary index:

CREATE XML INDEX sxmlidxPathTest ON dbo.tblXMLTest (c2)
   USING XML INDEX pxmlidxTest FOR PATH;

SELECT * FROM dbo.tblXMLTest
   WHERE c2.exist('/book/@id[. = "3"]') = 1;
GO

You should notice that the PATH secondary XML index results in a further simplified execution plan with a clustered index seek, and the estimated subtree cost for the SELECT operator (approximately 0.06) is further reduced.

Figure 9.1 shows the preceding three execution plans.

Figure 9.1. XML indexes avoid shredding and parsing XML BLOBs, and the PATH secondary XML index further optimizes the query.


You can find more details about XML support in SQL Server 2005 in Chapter 10, “XML and Web Services Support in SQL Server 2005.”
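
For completeness, the VALUE and PROPERTY secondary XML indexes use the same USING XML INDEX syntax as the PATH index; here is a sketch based on the table and primary XML index from this example (the secondary index names are illustrative):

CREATE XML INDEX sxmlidxValueTest ON dbo.tblXMLTest (c2)
   USING XML INDEX pxmlidxTest FOR VALUE;
GO
CREATE XML INDEX sxmlidxPropertyTest ON dbo.tblXMLTest (c2)
   USING XML INDEX pxmlidxTest FOR PROPERTY;
GO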

Optimizing Physical Design Structures by Using Database Engine Tuning Advisor

As a database’s workload, data, and application change over time, the existing indexes may not be entirely appropriate, and new indexes might be required. To help in this process, SQL Server 2000 provided a tool called the Index Tuning Wizard (ITW) that could be used to identify an optimal set of indexes for a given workload. The workload could be a SQL trace saved to a trace file (.trc), a SQL trace saved to a trace table, or a SQL script file (.sql). ITW accepted the workload as input, used the query processor to determine the most effective indexes for the workload by simulating index configurations, and made recommendations on dropping and adding indexes. These recommendations could be saved in a .sql script file to be analyzed and executed later.

SQL Server 2000 ITW had the following limitations:

• It could tune only one database during a tuning session.

• The wizard-based interface was not well suited to an iterative tuning and analysis process.

• ITW recommendations were solely based on performance, and not on improving manageability. Therefore, ITW did not make recommendations on how tables/indexes should be partitioned.

• ITW did not provide the ability to limit the tuning time.

• ITW did not scale well with very large databases and workloads. In addition, ITW could not tune workloads that used T-SQL constructs, such as table-valued functions, temporary tables, or triggers.

• ITW allowed a developer to apply the recommendations immediately or at a later time, but it did not allow a developer to evaluate recommendations to perform “what-if” analysis.

• Only members of the sysadmin fixed server role could use the ITW to tune databases.

SQL Server 2005 introduces a new tool called Database Engine Tuning Advisor (DTA) that fixes the problems with ITW and offers recommendations on various physical design structures, including indexes, indexed views, and partitioning. Unlike ITW, which provided a wizard-based interface, DTA is a full-fledged application that provides a session-based analysis and tuning approach. DTA allows for the tuning of multiple databases per session and is designed to scale well with very large databases and workloads. You can find more details on DTA in Chapter 5, “SQL Server 2005 Tools and Utilities.”

In addition to the physical database design enhancements discussed so far, SQL Server 2005 offers improvements in the way indexed views and partitioning work. The SQL Server 2005 query optimizer can match more queries to indexed views than previous versions could, including queries that contain scalar expressions, scalar aggregates, user-defined functions, and so on. The table and index partitioning feature is enhanced in SQL Server 2005 to simplify the administration, design, and development of a partitioned data set. A developer can optimize query performance by aligning a table with its indexes and by aligning related tables. Alignment refers to SQL Server’s ability to group rows that have the same partitioning key. With partitioned tables in SQL Server 2005, a table can be designed (using a partition function and a partition scheme) such that all rows that have the same partitioning key are placed directly on (and will always go to) a specific location. When multiple tables use the same function (but not necessarily the same scheme), rows that have the same partitioning key are grouped similarly, as the following sketch illustrates.
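
Here is a minimal sketch of the alignment idea; the partition function, scheme, and table names are hypothetical and not part of AdventureWorks:

CREATE PARTITION FUNCTION pfOrderDate (datetime)
   AS RANGE RIGHT FOR VALUES ('20050101', '20060101');
GO
CREATE PARTITION SCHEME psOrderDate
   AS PARTITION pfOrderDate ALL TO ([PRIMARY]);
GO
-- Both tables are partitioned on the same function, so rows with the
-- same OrderDate value land in corresponding partitions.
CREATE TABLE dbo.tblOrders
   (OrderID int NOT NULL, OrderDate datetime NOT NULL)
   ON psOrderDate (OrderDate);
CREATE TABLE dbo.tblOrderDetails
   (OrderID int NOT NULL, OrderDate datetime NOT NULL, LineTotal money)
   ON psOrderDate (OrderDate);
GO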

Troubleshooting Locking and Blocking

For a lot of database applications, performance degrades as the number of concurrent users increases. One of the common reasons for this behavior is blocking. Whenever two connections or system process IDs (SPIDs) try to access a common database resource in a conflicting way, the SQL Server lock manager ensures that the second SPID waits until the first SPID completes its work. This helps in maintaining data consistency. Poor query and transaction design can lead to excessive locking and blocking, significantly degrading system performance and affecting the scalability of the system.

Traditional techniques for blocking analysis include using sysprocesses and syslockinfo system tables, sp_lock and sp_who2 system stored procedures, the DBCC INPUTBUFFER statement, and Profiler trace. Microsoft released a sample script (see Knowledge Base article 271509) that uses these techniques to monitor blocking. Some of the techniques used to reduce blocking include keeping transactions short, optimizing queries by using indexes, using covering indexes, using a different (lower) isolation level, partitioning the data, and so on. In SQL Server 2005, you can continue to use these techniques to minimize the blocking and thereby increase the concurrency. The following sections outline some new techniques that you can implement in SQL Server 2005 to proactively monitor and troubleshoot blocking scenarios.

Using the Blocked Process Threshold Option

SQL Server 2005 introduces a new advanced system configuration (sp_configure) option called "blocked process threshold." You can use this option to proactively monitor blocking and deadlocking instances. You can use sp_configure to set the blocked process threshold to an integer value between 0 (the default) and 86400 (24 hours). Let’s assume that you set the blocked process threshold configuration value to 10. Now, if a process remains blocked for more than 10 seconds, SQL Server raises an event that can be captured by using the SQL Server event notification mechanism to perform an action; the event can also be seen in SQL Profiler. The event is raised every 10 seconds after that until the blocking is resolved.

Let’s look at an example of using the blocked process threshold configuration option to monitor blocking. In this example, the blocked process threshold is set to 10, and then a blocking scenario is created. In response to a blocking event generated by SQL Server, an entry is made in an audit table. You could handle the event differently, such as notifying a DBA or taking an action to resolve the blocking. For simplicity, this example makes an entry in an audit table whenever the blocked process threshold event is raised.

The following code block shows how you set the blocked process threshold configuration option to 10 seconds:

USE master;
GO

EXEC master.dbo.sp_configure 'show advanced options', 1;
RECONFIGURE WITH OVERRIDE;
EXEC master.dbo.sp_configure 'blocked process threshold', 10;
RECONFIGURE WITH OVERRIDE;
EXEC master.dbo.sp_configure 'show advanced options', 0;
RECONFIGURE WITH OVERRIDE;

Next, you should create an audit table that will be used to record the blocking instances:

IF OBJECT_ID('dbo.tblBlocking') IS NOT NULL
   DROP TABLE dbo.tblBlocking;
GO

CREATE TABLE dbo.tblBlocking
   (id INT IDENTITY(1,1) NOT NULL PRIMARY KEY,
    server_name VARCHAR(100) NULL,
    blocker_spids VARCHAR(8000) NULL,
    event_info xml NULL,
    date_time DATETIME NOT NULL DEFAULT GETDATE());
GO

SQL Server raises the BLOCKED_PROCESS_REPORT event when a process has been blocked for 10 seconds and again every 10 seconds after that until the blocking is resolved. This event can be captured by using the event notification mechanism introduced in SQL Server 2005. The event notification mechanism captures the event and notifies a Service Broker service by placing a message in the queue. In this example, a stored procedure is associated with the queue and activated whenever there is a message in the queue. Service Broker is discussed in great detail in Chapter 14, "SQL Server 2005 Service Broker."

In summary, SQL Server raises the BLOCKED_PROCESS_REPORT event, which is captured by the event notification mechanism, which places a message in the Service Broker queue, which in turn activates the stored procedure, which finally makes an entry into the audit table. Here is the stored procedure that is activated whenever there is a new message in the queue (created later in this example) in response to the BLOCKED_PROCESS_REPORT event:

IF OBJECT_ID('dbo.sp_NotifyBlocking') IS NOT NULL
   DROP PROCEDURE dbo.sp_NotifyBlocking;
GO

CREATE PROCEDURE dbo.sp_NotifyBlocking
AS
BEGIN
   --Find blocked process chain
   DECLARE @probclients TABLE(spid SMALLINT, ecid SMALLINT,
              blocked SMALLINT, waittype BINARY(2), dbid SMALLINT,
              PRIMARY KEY (blocked, spid, ecid));

   INSERT @probclients
      SELECT spid, ecid, blocked, waittype, dbid
      FROM master.dbo.sysprocesses
      WHERE blocked != 0 OR waittype != 0x0000;

   INSERT @probclients
      SELECT DISTINCT blocked, 0, 0, 0x0000, 0
         FROM @probclients
         WHERE blocked != 0 AND blocked NOT IN
          (SELECT spid FROM @probclients);

   DECLARE @HeadBlockerSPIDs varchar(4000)
   SELECT @HeadBlockerSPIDs =
      COALESCE(@HeadBlockerSPIDs + ', ', '') + CAST(spid AS varchar(5))
      FROM @probclients
      WHERE blocked = 0 AND
         spid in (SELECT blocked FROM @probclients WHERE spid != 0);

   --Receive message from the queue
   DECLARE @QMessage XML
   DECLARE @SBMessage TABLE (msgText XML);
   RECEIVE CAST(message_body AS XML) AS msgText
      FROM AdventureWorks.dbo.BlockingMonitoringQueue INTO @SBMessage;
   SELECT TOP 1 @QMessage = msgText FROM @SBMessage;

   --Insert into audit table
   INSERT INTO master.dbo.tblBlocking (server_name, blocker_spids, event_info)
      VALUES (@@SERVERNAME, @HeadBlockerSPIDs, @QMessage);
END;
GO

This stored procedure is activated in response to the blocking event. It uses the master.dbo.sysprocesses backward compatibility view to determine the blocking chain, including the head blocker process. Toward the end, the stored procedure retrieves the message from the Service Broker queue and inserts the queue message, along with blocked process chain information, into an audit table.

The final configuration step required is to enable Service Broker in the user database, create Service Broker objects such as the service and the queue, and set up the event notification to respond to the BLOCKED_PROCESS_REPORT event. Here’s how you do that:

IF NOT EXISTS (SELECT * FROM sys.databases
               WHERE name = 'AdventureWorks'
               AND is_broker_enabled = 1)
   ALTER DATABASE AdventureWorks SET ENABLE_BROKER ;
GO

USE AdventureWorks;
GO

IF EXISTS(SELECT * FROM sys.services WHERE name = 'BlockingMonitoringService')
    DROP SERVICE BlockingMonitoringService ;
GO

IF OBJECT_ID('[dbo].BlockingMonitoringQueue') IS NOT NULL AND
   EXISTS(SELECT * FROM sys.service_queues
           WHERE name = 'BlockingMonitoringQueue')
  DROP QUEUE [dbo].BlockingMonitoringQueue;
GO

CREATE QUEUE dbo.BlockingMonitoringQueue
   -- Activation turned on
   WITH STATUS = ON,
   ACTIVATION (
      -- The name of the proc to process messages for this queue
      PROCEDURE_NAME = master.dbo.sp_NotifyBlocking,
      -- The maximum number of copies of the proc to start
      MAX_QUEUE_READERS = 5,
      -- Start the procedure as the user who created the queue.
      EXECUTE AS SELF )
   ON [DEFAULT] ;
GO

CREATE SERVICE
   [BlockingMonitoringService] ON QUEUE BlockingMonitoringQueue
   ([http://schemas.microsoft.com/SQL/Notifications/PostEventNotification]);
GO

IF EXISTS(SELECT * FROM sys.server_event_notifications
          WHERE name = 'EventNotifyBlocking')
  DROP EVENT NOTIFICATION EventNotifyBlocking ON SERVER ;
GO

CREATE EVENT NOTIFICATION EventNotifyBlocking
   ON SERVER
   FOR BLOCKED_PROCESS_REPORT
   TO SERVICE 'BlockingMonitoringService', 'current database';
GO

These statements first enable Service Broker in the AdventureWorks sample database, in which the blocking scenario will be created later in this example. Next, they create a Service Broker queue with activation enabled, as well as a service. Toward the end, an event notification mechanism is set up to monitor BLOCKED_PROCESS_REPORT and send the event notification to the BlockingMonitoringService Service Broker service in the current database.

You can run the preceding script statements to set up the blocking monitoring process. Next, you should open two query windows in SQL Server Management Studio and run the following two scripts simultaneously to create the blocking scenario:

Connection 1:

USE [AdventureWorks];
GO
SET TRANSACTION ISOLATION LEVEL READ COMMITTED;
GO
BEGIN TRANSACTION;
GO

Connection 2:

USE [AdventureWorks];
GO
SET TRANSACTION ISOLATION LEVEL READ COMMITTED;
GO
BEGIN TRANSACTION;
GO
UPDATE Production.ProductInventory SET Quantity = Quantity - 1
   WHERE ProductID = 1;
GO

Next, you should switch back to Connection 1, run the following SELECT statement, and notice that it is blocked by Connection 2 because Connection 2 updated the information inside a transaction that Connection 1 is trying to access:

Connection 1:

SELECT SUM(Quantity) FROM Production.ProductInventory
   WHERE ProductID = 1;
GO


Note

This SELECT statement does not cause blocking if row versioning (discussed next) is enabled. You should run the following statement and ensure that it returns 0:

SELECT is_read_committed_snapshot_on
FROM sys.databases
WHERE name = 'AdventureWorks';

If this statement returns 1, you know that row versioning is enabled, and in such cases, the preceding SELECT statement will run successfully, without any blocking. Then you should turn off the READ_COMMITTED_SNAPSHOT database option by using the ALTER DATABASE statement to disable row versioning.


You should let Connection 1 be blocked for about 40 to 50 seconds. Then you should switch to Connection 2 and run ROLLBACK TRANSACTION; after this, you should notice that Connection 1 is no longer blocked. You should run ROLLBACK TRANSACTION in Connection 1 as well to end the transaction. Then, when you select the rows from the master.dbo.tblBlocking audit table, you should see a few rows in this table indicating one or more processes in the chain that caused the blocking. While Connection 1 is blocked, you may also run Profiler and select "Blocked Process Report" under "Errors and Warnings." Every time SQL Server raises the BLOCKED_PROCESS_REPORT event, you should see an entry in Profiler showing the blocking information in XML format.

When you’re done with this example, you can reset the "blocked process threshold" sp_configure setting to 0 and drop the Service Broker objects, event notification, and audit table, as shown in the sketch below.
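
Here is one way such a cleanup script might look (a sketch that assumes the object names used in this example):

USE master;
GO
EXEC master.dbo.sp_configure 'show advanced options', 1;
RECONFIGURE WITH OVERRIDE;
EXEC master.dbo.sp_configure 'blocked process threshold', 0;
RECONFIGURE WITH OVERRIDE;
EXEC master.dbo.sp_configure 'show advanced options', 0;
RECONFIGURE WITH OVERRIDE;
GO
DROP EVENT NOTIFICATION EventNotifyBlocking ON SERVER;
GO
USE AdventureWorks;
GO
DROP SERVICE BlockingMonitoringService;
DROP QUEUE dbo.BlockingMonitoringQueue;
GO
USE master;
GO
DROP TABLE dbo.tblBlocking;
DROP PROCEDURE dbo.sp_NotifyBlocking;
GO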

The blocked process threshold option is an example of how SQL Server 2005 is changing the performance monitoring and tuning paradigm for you to be proactive so that you no longer simply react to performance issues.

Row Versioning and Snapshot Isolation

The SQL Server 2005 row versioning feature can significantly reduce the occurrences of blocking and deadlocking. You can enable the row versioning feature for a database by doing either of the following:

• Setting the READ_COMMITTED_SNAPSHOT database option to ON while using the “read committed” transaction isolation level, which is the default isolation level

• Setting the ALLOW_SNAPSHOT_ISOLATION database option to ON while using the new "snapshot" transaction isolation level (see the sketch after this list)
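
For example, the snapshot isolation variant might be enabled and used as follows (a minimal sketch):

ALTER DATABASE AdventureWorks SET ALLOW_SNAPSHOT_ISOLATION ON;
GO
USE AdventureWorks;
GO
SET TRANSACTION ISOLATION LEVEL SNAPSHOT;
BEGIN TRANSACTION;
   -- This read sees a transactionally consistent snapshot of the data
   -- as of the start of the transaction, without taking shared locks.
   SELECT SUM(Quantity) FROM Production.ProductInventory
      WHERE ProductID = 1;
COMMIT TRANSACTION;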

When row versioning is enabled, whenever a transaction modifies a row, SQL Server uses the tempdb system database to maintain a copy of the original row (that is, a before image of the row). If multiple transactions modify a row, multiple versions of the row are stored in a version chain. For short-running transactions, a version of a modified row may get cached in the buffer pool without being written to the tempdb database; if the versioned row is needed only briefly, it is simply dropped from the buffer pool. A read request is served by traversing the version chain to retrieve the last version of each row that was committed when the reading transaction or statement started.

By keeping versions of updated rows, SQL Server avoids the need to lock the data while still maintaining data consistency. When row versioning is enabled, the number of deadlocks is reduced, and the number of locks required by a transaction is reduced, thereby reducing the system overhead required to manage locks; fewer lock escalations take place as well. If row versioning is enabled, you must ensure that there is adequate space in tempdb to maintain the row version store.
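
To keep an eye on the version store, you could query the tempdb-related DMVs, as in this sketch (sizes will vary):

-- Approximate size of the version store, in KB (8KB per page)
SELECT SUM(version_store_reserved_page_count) * 8 AS version_store_kb
   FROM sys.dm_db_file_space_usage;

-- Raw record count in the version store
SELECT COUNT(*) AS version_store_records
   FROM sys.dm_tran_version_store;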

Let’s look at an example of how row versioning helps in reducing blocking and increasing concurrency. You should run the following statement to see whether row versioning is enabled for the AdventureWorks sample database:

SELECT is_read_committed_snapshot_on FROM sys.databases
   WHERE name = 'AdventureWorks';

If this statement returns 1, it indicates that transactions that use the READ COMMITTED (the default) isolation level will use row versioning to prevent readers from blocking the writers and vice versa, without locking any rows. If the preceding statement returns 0, you should run the following command to enable row versioning for the AdventureWorks sample database:

ALTER DATABASE AdventureWorks SET READ_COMMITTED_SNAPSHOT ON;

Before you run this statement, you need to make sure no users are connected to the AdventureWorks sample database. After you enable the READ_COMMITTED_SNAPSHOT option for the AdventureWorks database, you should run the preceding SELECT statement on the sys.databases catalog view to verify that row versioning is enabled.

Next, you should try to create the same blocking scenario explained earlier in this chapter, in the section “Using the Blocked Process Threshold Option.” You will notice that, even though transactions are using the READ COMMITTED isolation level, readers are not blocking the writers and vice versa.

Analyzing Deadlocks by Using SQL Profiler

In SQL Server 2005, SQL Profiler has been enhanced to better assist in troubleshooting deadlock scenarios. You can use the new "Deadlock Graph" trace event under the "Locks" event class to view deadlock information graphically or as an XML document (see Figure 9.2).

Figure 9.2. Profiler shows the deadlock information graphically and as an XML document.


Chapter 5 contains a script that can be used to simulate a deadlock scenario. As shown in Figure 9.2, Profiler shows the SPIDs involved in the deadlock. When you hover the mouse over each process (shown as a circle in a graph), Profiler shows the T-SQL statements that the process was executing when the deadlock happened. The circle with the blue X lines indicates the process that was selected as a deadlock victim. The TextData column contains the deadlock information as an XML document. You can right-click this column and select “Extract Event Data” to save the deadlock XML document to a disk file.

Locking and Blocking Monitoring Tools

In SQL Server 2005, Activity Monitor has been enhanced to support filtering and automatic refreshing. You can launch Activity Monitor from SQL Server Management Studio by connecting to a SQL Server instance by using Object Explorer, expanding the Management folder, and then double-clicking the Activity Monitor node. Figure 9.3 shows the redesigned Activity Monitor window.

Figure 9.3. Activity Monitor now supports filtering and automatic refreshing.


You can continue to use the T-SQL constructs sp_who, sp_who2, sp_lock, DBCC INPUTBUFFER, and so on to monitor and troubleshoot blocking. You can also use the following DMVs and dynamic management functions (DMFs) to further obtain the information on processes and locks:

sys.dm_tran_locks—You can use this DMV to obtain information about the current state of locking. Each row represents a currently active request to the lock manager that has either been granted or is waiting to be granted (that is, the request is blocked by an already granted request).

sys.dm_exec_connections—This DMV contains a row describing each connection established to SQL Server.

sys.dm_exec_requests—This DMV returns information about each request executing within SQL Server.

sys.dm_exec_sql_text—The fn_get_sql() function is being deprecated in SQL Server 2005. The recommended approach is to use the sys.dm_exec_sql_text DMF. Like the fn_get_sql() function, the sys.dm_exec_sql_text DMF accepts a SQL handle as an input parameter and returns a result set that contains the database ID, the object ID, a number for numbered stored procedures, a flag indicating whether text is encrypted, and the query text itself. The SQL handle can be obtained by using the sql_handle column from the sys.dm_exec_requests DMV results.

sys.dm_exec_query_plan—This DMF accepts a plan handle as an input parameter and returns a result set that contains the database ID, the object ID, a number for numbered stored procedures, a flag indicating whether the text is encrypted, and the compile-time showplan representation, in XML format, of the query execution plan specified by plan_handle. The plan handle can be obtained from the plan_handle column in the sys.dm_exec_requests DMV results.

sys.dm_exec_query_stats—This DMV returns aggregate performance statistics for cached query plans. The returned information includes logical and physical reads and writes, elapsed time, and CLR time.

sys.dm_exec_sessions—Similar to the master.dbo.sysprocesses backward compatibility view, this DMV contains a row for each process (client or system) running on a current SQL Server 2005 instance. This DMV is available from any database, and it contains more information than the sysprocesses view.

Let’s look at an example of using these DMVs and DMFs to obtain information about long-running queries. You can run the following script to create a stored procedure that performs a long-running operation:

Connection 1:

USE AdventureWorks;
GO
IF OBJECT_ID ('dbo.sprocTest') IS NOT NULL
   DROP PROCEDURE dbo.sprocTest;
GO
CREATE PROCEDURE dbo.sprocTest
AS
BEGIN
   DECLARE @i INT
   SET @i = 0
   WHILE @i < 5000
   BEGIN
       SELECT TOP (@i) Name FROM Sales.Store;
       SET @i = @i + 1;
   END;
END;
GO

Then you open another query window and type the following T-SQL batch, but don't run it yet:

Connection 2:

USE AdventureWorks;
GO

DECLARE @SessionID INT
DECLARE @SQLHandle varbinary(64)
DECLARE @PlanHandle varbinary(64)

SELECT TOP 1 @SessionID = session_id FROM sys.dm_exec_requests
   ORDER BY total_elapsed_time DESC;

SELECT @SQLHandle = sql_handle FROM sys.dm_exec_requests
   WHERE session_id = @SessionID;

SELECT @PlanHandle = plan_handle FROM sys.dm_exec_requests
   WHERE session_id = @SessionID;

SELECT @SQLHandle AS SQLHandle;
SELECT @PlanHandle AS PlanHandle;

SELECT DB_NAME(dbid) AS [db_name],
       OBJECT_NAME(objectid) AS obj_name,
       number, encrypted, [text]
   FROM sys.dm_exec_sql_text(@SQLHandle);

SELECT query_plan FROM sys.dm_exec_query_plan(@PlanHandle);

SELECT * FROM sys.dm_exec_query_stats
   WHERE plan_handle = @PlanHandle;

SELECT * FROM sys.dm_tran_locks
   WHERE request_session_id = @SessionID;
GO

This T-SQL script first uses the sys.dm_exec_requests DMV to obtain the SPID or session ID for the longest-running request. This session ID is then used to obtain the plan handle and SQL handle, which are then used to obtain the query text, query plan, and query stats information.

Now you execute the stored procedure in Connection 1:

Connection 1:

EXEC dbo.sprocTest;
GO

Then you should run the T-SQL batch in Connection 2 to obtain the information about the longest-running request.

Query Performance Tuning

One of the common techniques for troubleshooting a performance issue is to use SQL Profiler to obtain the top n worst-performing queries. You can do this by looking at the Duration, CPU, Reads, and Writes columns in Profiler. After you identify such queries, some of the techniques you can use to tune the workload include generating and analyzing graphical and text query execution plans, obtaining statistics (such as STATISTICS PROFILE, STATISTICS IO, and STATISTICS TIME), changing the queries to effectively use the indexes and produce a better execution plan, providing query hints to the optimizer, rewriting stored procedures to avoid the use of cursors, avoiding recompilations, and so on. You can continue to use these techniques in SQL Server 2005. The following sections describe some new features introduced in this release that you can use for query performance tuning.

Plan Guides

As mentioned earlier in this chapter, one of the techniques you can use to improve query performance is to use the OPTION clause to specify query hints such as MAXDOP, LOOP JOIN, MERGE JOIN, and KEEP PLAN. As a general rule, you should avoid using optimizer hints because they overrule the decision-making process of the optimizer. The SQL Server 2005 query optimizer is smart enough to generate highly efficient execution plans, and therefore it is recommended that hints be used only as a last resort and only by experienced developers and DBAs.

Let’s assume that you have a third-party vendor application that you are trying to tune and optimize. You used SQL Profiler to obtain some of the queries that are not performing well. After you identified such queries, you tried to tune the queries by using Management Studio, providing query hints and analyzing the query execution plans. Now assume that you are sure that one or more query hints can optimize the query or fix a problem. But what if you cannot change the query? What if the query is hard-coded into the application and there is no way to change the query to provide optimizer hints? Enter plan guides.

In the Standard and Enterprise Editions of SQL Server 2005, you can run the sp_create_plan_guide system stored procedure in order to create plan guides to optimize the performance of queries by providing hints at runtime to queries when it is not possible or desirable to change the text of the query directly. In other words, if you can create a plan guide for a query or a type of query and specify the query hints while creating a plan guide, whenever such queries are executed, SQL Server attaches the query hints to them. This way, you don’t have to change the original query text, but you can still provide hints to the optimizer whenever such queries are executed. For plan guides to work, the query text executed on the server and specified with sp_create_plan_guide must match exactly character-by-character. If there is a difference of even a single character, the plan guide will not work, and the hints will not be attached with the query.

It is important to note that plan guides are not just for tuning third-party applications where you don’t have access to source code to change the query. Even in scenarios where you have access to source code and could possibly change the query, it is recommended that you create a plan guide for the query instead of directly specifying the optimizer hints with the query. You should consider plan guides a recommended approach whenever you have to do optimizer hinting. This allows you to change the query behavior without changing the application and also simplifies application maintenance and future database upgrades.

Here is the T-SQL syntax for the sp_create_plan_guide stored procedure:

sp_create_plan_guide [ @name = ] N'plan_guide_name',
  [ @stmt = ] N'statement_text',
  [ @type = ] N' { OBJECT | SQL | TEMPLATE }',
  [ @module_or_batch = ]
      {
        N'[ schema_name.]object_name'
        | N'batch_text'
        | NULL
      },
  [ @params = ] { N'@parameter_name data_type [,...n ]' | NULL },
  [ @hints = ] { N'OPTION ( query_hint [,...n ] )' | NULL };

The @type parameter can have one of following three values:

OBJECT—This value indicates that the specified statement_text appears in the context of a T-SQL stored procedure, scalar function, multistatement table-valued function, or T-SQL DML trigger in the current database.

SQL—This value indicates that the specified statement_text appears in the context of a standalone statement or batch that can be submitted to SQL Server through any mechanism.

TEMPLATE—This value indicates that the plan guide applies to any query that parameterizes to the form indicated in statement_text. If TEMPLATE is specified, then only the PARAMETERIZATION { FORCED | SIMPLE } query hint can be specified in the @hints argument. (A sketch of creating a TEMPLATE plan guide follows this list.)
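
For the TEMPLATE type, the sp_get_query_template system stored procedure can generate the parameterized form of a query for you. Here is a hedged sketch (the plan guide name is illustrative):

DECLARE @stmt nvarchar(max);
DECLARE @params nvarchar(max);
EXEC sp_get_query_template
   N'SELECT * FROM Sales.SalesOrderHeader WHERE SalesOrderID = 43659;',
   @stmt OUTPUT,
   @params OUTPUT;
EXEC sp_create_plan_guide
   N'templateguideTest',
   @stmt,
   N'TEMPLATE',
   NULL,
   @params,
   N'OPTION (PARAMETERIZATION FORCED)';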

Plan guides are database scoped, and you can use the sys.plan_guides catalog view to obtain a list of plan guides created in the current database. You can enable, disable, or drop plan guides by using the sp_control_plan_guide system stored procedure.
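
For example, you could disable, re-enable, or drop a plan guide by name (a sketch using the plan guide name created later in this section):

-- Disable, re-enable, and finally drop a plan guide by name
EXEC sp_control_plan_guide N'DISABLE', N'planguideTest';
EXEC sp_control_plan_guide N'ENABLE', N'planguideTest';
EXEC sp_control_plan_guide N'DROP', N'planguideTest';

-- List plan guides in the current database
SELECT name, is_disabled, scope_type_desc FROM sys.plan_guides;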


Note

Plan guides cannot be created against stored procedures, functions, or DML triggers that specify the WITH ENCRYPTION clause. Attempting to drop or alter a function, stored procedure, or DML trigger referenced by a plan guide, either enabled or disabled, results in an error. Attempting to drop a table with a trigger defined on it that is referenced by a plan guide also results in an error.


Let’s look at an example of a plan guide that attaches the MAXDOP and LOOP JOIN optimizer hints to a SQL statement at runtime. First, you need to start SQL Server Management Studio and connect to a SQL Server 2005 instance by using the Query Editor. Then you should click the "Include Actual Execution Plan" toolbar button and run the following T-SQL statements:

SET NOCOUNT ON;
USE AdventureWorks;
GO
SELECT h.SalesOrderID, h.OrderDate, h.DueDate, h.ShipDate,
       h.Status, h.SalesOrderNumber, c.FirstName, c.LastName, h.TotalDue,
       a.City, a.PostalCode
  FROM Sales.SalesOrderHeader h JOIN Person.Contact c
  ON c.ContactID = h.CustomerID
     LEFT OUTER JOIN Person.Address AS [a]
     ON [a].AddressID =
        (SELECT TOP 1 AddressID FROM Sales.CustomerAddress
        WHERE CustomerID = h.CustomerID)
ORDER BY c.LastName, c.FirstName;
GO

Figure 9.4 shows the execution plan for the above query. It shows that the query uses multiple processors on the server to execute in parallel. The query processor also uses a hash match (inner join) operator.

Figure 9.4. The execution plan for a sample query shows that the query is using parallelism and the hash match operator.


Let’s assume that you want to disable parallelism and that you want the optimizer to use a loop join operator for the preceding query. You run the following T-SQL statement to create a plan guide to provide MAXDOP 1 and LOOP JOIN hints for the query:

IF EXISTS(SELECT * FROM sys.plan_guides WHERE name = 'planguideTest')
   EXEC sp_control_plan_guide N'DROP', N'planguideTest';
GO

EXEC sp_create_plan_guide
@name = N'planguideTest',
@stmt = N' SELECT h.SalesOrderID, h.OrderDate, h.DueDate, h.ShipDate,
       h.Status, h.SalesOrderNumber, c.FirstName, c.LastName, h.TotalDue,
       a.City, a.PostalCode
  FROM Sales.SalesOrderHeader h JOIN Person.Contact c
  ON c.ContactID = h.CustomerID
     LEFT OUTER JOIN Person.Address AS [a]
     ON [a].AddressID =
        (SELECT TOP 1 AddressID FROM Sales.CustomerAddress
         WHERE CustomerID = h.CustomerID)
ORDER BY c.LastName, c.FirstName;',
@type = N'SQL',
@module_or_batch = NULL,
@params = NULL,
@hints = N'OPTION (LOOP JOIN, MAXDOP 1)';
GO

SELECT * FROM sys.plan_guides;
GO

This T-SQL script uses sys.plan_guides to check whether the named plan guide already exists. If it does, the script uses sp_control_plan_guide to drop the plan guide. Next, it creates a plan guide by using the sp_create_plan_guide system stored procedure.

Now if you run the same SELECT statement again, you should notice that at runtime, SQL Server attaches MAXDOP 1 and LOOP JOIN operator hints with the query. You can see this by looking at an execution plan generated for exactly the same query (see Figure 9.5).

Figure 9.5. After you create a plan guide, the same query generates a different execution plan with no parallelism and also uses a loop join instead of a hash join.


In summary, plan guides are a powerful feature that you can use to tune targeted queries for improved or more stable performance.

New Query Hints

SQL Server 2005 introduces four new query hints that you can directly specify with the queries (not recommended) or specify by using plan guides (the recommended approach, discussed in the preceding section). These are the new query hints:

RECOMPILE—SQL Server 2000 provided three ways to recompile a stored procedure: by using the sp_recompile stored procedure, by specifying the WITH RECOMPILE option while creating a stored procedure, and by specifying the WITH RECOMPILE option while executing the stored procedure. However, there was no way to recompile a subset of a stored procedure or just a few queries inside the stored procedure. SQL Server 2005 fixes this by introducing a RECOMPILE hint that instructs SQL Server to discard the plan generated for the query after it executes, forcing the query optimizer to recompile a query plan the next time the same query is executed. By using this hint, you can have only a subset of a stored procedure recompiled instead of the entire stored procedure. (A short example of this hint appears after this list.)

OPTIMIZE FOR (@variable = value [,...n])—If you know that a local variable will have a certain value or a certain range of values, you can specify the OPTIMIZE FOR hint with the query to instruct SQL Server to use the specified value during the query optimization process.

USE PLAN 'xml plan'—You can use this query hint to force the query optimizer to use a specified query plan for a query. A common situation where this hint might be used is when a query plan chosen by the query optimizer in an earlier product version is preferred over the one chosen after the upgrade. You can obtain the 'xml plan' to specify with USE PLAN by using the SET SHOWPLAN_XML or SET STATISTICS XML statements, by using the query_plan column of the sys.dm_exec_query_plan DMF, or by using the SQL Profiler Showplan XML event class, the Showplan XML Statistics Profile event class, or the Showplan XML For Query Compile event class.

PARAMETERIZATION { SIMPLE | FORCED }—Using parameters or parameter markers in T-SQL statements increases the ability of the relational engine to match new SQL statements with existing, unused execution plans. If a SQL statement is executed without parameters, SQL Server 2005 parameterizes the statement internally (similarly to sp_executesql) to increase the possibility of matching it against an existing execution plan. This process is called simple parameterization. SQL Server parameterizes a relatively small class of queries. If you want all queries in a database to be parameterized, subject to certain limitations, you can use the ALTER DATABASE statement and change the PARAMETERIZATION setting from the default SIMPLE to FORCED. However, if you want to keep the database PARAMETERIZATION setting at the default but would like a particular query to be forced to be parameterized, you can use the PARAMETERIZATION query hint.
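
As a quick illustration of the RECOMPILE and OPTIMIZE FOR hints, consider this sketch against an AdventureWorks table used earlier in the chapter (the values are arbitrary):

DECLARE @ProductID int;
SET @ProductID = 1;
-- Optimize for a known typical value, and discard the plan after use
SELECT SUM(Quantity) FROM Production.ProductInventory
   WHERE ProductID = @ProductID
   OPTION (OPTIMIZE FOR (@ProductID = 1), RECOMPILE);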

New Database Options

The previous section discusses a new database option, PARAMETERIZATION, that you can use to choose between simple and forced query parameterization behavior. The following sections discuss some other database options that you can set by using the ALTER DATABASE statement and that you can use to tune databases.

The AUTO_UPDATE_STATISTICS_ASYNC Option

By default, the AUTO_UPDATE_STATISTICS_ASYNC database option is turned off, which means that if, at compile time, the optimizer sees that statistics are out-of-date, the query waits until the statistics are updated so that a better query plan can be generated based on up-to-date statistics. This can result in different response times for the same query. If you turn this option on and statistics are found to be out-of-date at compile time, the query does not wait; it continues to use the existing statistics while a new thread updates the statistics in parallel. The next time the query is called, it may use the updated statistics. This results in stable and consistent query response times. The AUTO_UPDATE_STATISTICS option must be ON for the AUTO_UPDATE_STATISTICS_ASYNC option to work. In addition, the sampling formula for statistics updates has changed; SQL Server 2005 makes sure that the optimizer samples at least 8MB of data.

You can use the is_auto_update_stats_async_on column in the sys.databases catalog view to find out whether the AUTO_UPDATE_STATISTICS_ASYNC option is turned on or off.
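
For example, you might turn the option on and verify it as follows (a sketch):

ALTER DATABASE AdventureWorks SET AUTO_UPDATE_STATISTICS_ASYNC ON;
GO
SELECT is_auto_update_stats_async_on FROM sys.databases
   WHERE name = 'AdventureWorks';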

The DATE_CORRELATION_OPTIMIZATION Option

By default, the DATE_CORRELATION_OPTIMIZATION database option is turned off, which means that SQL Server does not maintain any correlation statistics for tables in the database that are linked by a foreign key constraint and have datetime columns. You can turn on this database option to improve the performance of queries that perform an equi-join between two tables whose datetime columns are correlated and that specify a date restriction in the query predicate. For instance, the OrderDate column of the Purchasing.PurchaseOrderHeader table can be considered correlated with the DueDate column of the Purchasing.PurchaseOrderDetail table. If the application workload has queries on these tables, using equi-joins, and uses DueDate or OrderDate columns in the WHERE condition, turning on the DATE_CORRELATION_OPTIMIZATION option can increase the performance of such queries.

You can use the is_date_correlation_on column in the sys.databases catalog view to find out whether the DATE_CORRELATION_OPTIMIZATION option is turned on or off.
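
Similarly, a sketch of enabling and verifying this option (note that SQL Server requires that no other connections be active in the database while this option is being changed):

ALTER DATABASE AdventureWorks SET DATE_CORRELATION_OPTIMIZATION ON;
GO
SELECT is_date_correlation_on FROM sys.databases
   WHERE name = 'AdventureWorks';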

The ENABLE_BROKER Option

SQL Server 2005 introduces a new asynchronous messaging platform for creating scalable, queued, distributed database applications. You can set the ENABLE_BROKER database option for a database to allow Service Broker communication from and to that database. Service Broker is discussed in great detail in Chapter 14.

Statement-Level Recompilation

The SQL Server 2005 performance-related enhancements discussed so far in this chapter require you to make the right design decisions and use the right techniques to maximize performance. This section presents one example of enhancements that are built into the SQL Server 2005 database engine to improve query performance without you making any changes to your application.

As discussed in Chapter 2, “SQL Server 2005 Database Engine Architectural Changes,” in order to optimize query execution performance, SQL Server uses a part of the memory buffer pool to cache the query execution plans. This space is called the plan cache. The execution plan consists of two main components:

Query plan—The query plan is a read-only data structure that can be used by any number of users of a query. The query plan is reentrant (that is, all users share the plan, and no user context information is stored in the plan). SQL Server keeps only one copy of the plan for serial executions and, if needed, another plan for parallel executions.

Execution context—Each user currently executing the query has an execution context data structure that holds data specific to that execution, such as parameter values.

When any SQL statement is executed in SQL Server 2005, the relational engine first looks through the plan cache to see whether an execution plan for the same SQL statement already exists. SQL Server 2005 reuses any existing plan it finds, which saves the overhead of recompiling the SQL statement. If no plan is found, SQL Server 2005 generates a new execution plan for the query. Certain changes in a database can make an execution plan inefficient or invalid, given the new state of the database. SQL Server detects the changes that invalidate an execution plan and marks such a plan as invalid. A new plan must then be recompiled for the next connection that executes the query.

In SQL Server 2000, whenever a statement within a batch causes recompilation, the entire batch, whether submitted via a stored procedure, a trigger, an ad-hoc batch, or a prepared statement, is recompiled. In SQL Server 2005, only the statement inside the batch that causes recompilation is recompiled. Because of this “statement-level recompilation” feature, SQL Server 2005 spends less CPU time and memory on batch recompilations, and it obtains fewer compile locks, which increases the overall performance.

Another benefit of statement-level recompilation is that it is no longer necessary to break a long stored procedure into multiple short stored procedures just to reduce the recompilation penalty of the long stored procedure.

You can use a new Profiler event called SQL:StmtRecompile, which is under the TSQL event class, to trace statement-level recompilations. In addition, the existing SP:Recompile event under the Stored Procedures event class has been updated to show what query caused the recompilation in the TextData column on the SP:Recompile row.

The following is an example of statement-level recompilation:

IF OBJECT_ID('dbo.tblTest') IS NOT NULL
    DROP TABLE dbo.tblTest;
CREATE TABLE dbo.tblTest
    (C1 INT IDENTITY(1,1) NOT NULL PRIMARY KEY,
    C2 INT, C3 VARCHAR(20), C4 CHAR(5));

IF OBJECT_ID('dbo.sprocTest') IS NOT NULL
   DROP PROCEDURE dbo.sprocTest;
GO
CREATE PROCEDURE dbo.sprocTest
AS
BEGIN
   SELECT * FROM sysobjects WHERE 1 = 2;
   SELECT * FROM master.dbo.sysprocesses;

   SELECT * FROM dbo.tblTest WHERE C2 % 10 = 0 ORDER BY C4 DESC;
END;
GO

EXEC dbo.sprocTest;          -- First execution; the plan is compiled and cached
CREATE INDEX idxTest ON dbo.tblTest (C4);  -- Schema change invalidates the cached plan
EXEC dbo.sprocTest;          -- Only the SELECT on tblTest is recompiled
DROP INDEX tblTest.idxTest;  -- Another schema change
EXEC dbo.sprocTest;          -- Again, only the affected statement is recompiled

This script creates a sample table that is accessed in the stored procedure. After the stored procedure is executed the first time, an index is created on column C4, and the stored procedure is executed again. In SQL Server 2000, the entire stored procedure was recompiled during the second execution; in SQL Server 2005, only the SELECT statement on tblTest is recompiled. To see this in action, you can start a SQL Server 2000 Profiler trace, select the SP:Recompile event under the Stored Procedures event class, and select a few columns, including the TextData column, from the Data Columns tab. Then you can run the preceding script in Query Analyzer. You should see an SP:Recompile entry in Profiler, but note that its TextData column is empty. You have to select another Profiler event, such as SP:Starting or SP:Completed, to find out which stored procedure is being recompiled.

Next, you can start SQL Server 2005 Profiler and select the SP:Recompile event (under the Stored Procedures event class) and the SQL:StmtRecompile event (under the TSQL event class). When you run the preceding script in a query window in Management Studio, Profiler shows that only the SELECT statement is recompiled: it lists SP:Recompile followed by SQL:StmtRecompile, and the TextData column for both rows contains the SELECT statement that caused the recompilation (because the schema of the table being accessed changed).

Designing Applications for Performance

As mentioned at the beginning of this chapter, poor application design is one of the top reasons applications run slowly and don’t scale well. Application designers and architects should think about performance objectives early in the application life cycle. Designers should proactively make key design choices to maximize performance and scalability, while balancing other quality-of-service (QoS) considerations, such as availability, manageability, integrity, and security.

The following sections present a brief overview of new developer features introduced in SQL Server 2005 to create applications that perform and scale well.

Service Broker and Asynchronous Messaging

With just a few lines of T-SQL code, you can now build triggers that execute asynchronously, execute stored procedures in parallel, scale out with minimal or no application changes, defer part of the processing to spread the load evenly or to enable batch processing, and increase availability by distributing or splitting the server-side processing. The platform, or framework, that supports this inside SQL Server 2005 is Service Broker.

If you want to build scalable database applications, you should seriously consider using Service Broker in your applications. You can use T-SQL statements to create Service Broker queues, services, and routes; send and receive messages; and so on. Service Broker, which is built into the SQL Server 2005 database engine, provides the infrastructure required for queuing and communicating messages asynchronously, reliably, and securely. Refer to Chapter 14 for complete details on Service Broker.
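The following is a minimal sketch of the basic Service Broker objects and a single round-trip message; all object names here are hypothetical, and a real application would add error handling and conversation lifetime management:

-- Messaging objects: a message type, a contract, queues, and services
CREATE MESSAGE TYPE TestMessage VALIDATION = WELL_FORMED_XML;
CREATE CONTRACT TestContract (TestMessage SENT BY INITIATOR);
CREATE QUEUE dbo.InitiatorQueue;
CREATE QUEUE dbo.TargetQueue;
CREATE SERVICE InitiatorService ON QUEUE dbo.InitiatorQueue;
CREATE SERVICE TargetService ON QUEUE dbo.TargetQueue (TestContract);
GO
-- Begin a dialog and send a message from the initiator to the target
DECLARE @dialog UNIQUEIDENTIFIER;
BEGIN DIALOG CONVERSATION @dialog
   FROM SERVICE InitiatorService
   TO SERVICE 'TargetService'
   ON CONTRACT TestContract
   WITH ENCRYPTION = OFF;
SEND ON CONVERSATION @dialog
   MESSAGE TYPE TestMessage ('<Test/>');
GO
-- The target picks up the message asynchronously from its queue
RECEIVE TOP (1) CONVERT(XML, message_body) AS Body
   FROM dbo.TargetQueue;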

.NET Runtime Integration

T-SQL is an interpreted language, and it is not well suited for complex computational and procedural tasks. The SQL Server 2005 database engine can host the .NET runtime to facilitate running compiled .NET code inside SQL Server. This feature is called the .NET Common Language Runtime (CLR) integration, or SQLCLR, and it can be used to write stored procedures, triggers, and user-defined functions, types, and aggregates, using any .NET programming language. You can use SQLCLR to implement computational and procedural tasks that may yield better performance than their T-SQL counterparts. SQLCLR is discussed in Chapter 11, “SQL Server 2005 and .NET Integration.”

T-SQL Enhancements

The previous section describes how you can use .NET to extend T-SQL. If something is not possible using T-SQL, you can implement it by using .NET (as a stored procedure, a function, a trigger, a type, or an aggregate) and execute it inside SQL Server. For data access, however, T-SQL is still the premier and most efficient language. SQL Server 2005 introduces enhancements to the T-SQL language that database professionals can leverage to write efficient and manageable queries. Common table expressions, the PIVOT/UNPIVOT operators, and ranking functions are some examples of T-SQL enhancements. Refer to Chapter 6, “Transact-SQL Enhancements,” for a detailed description of new T-SQL features in SQL Server 2005.
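For instance, the following sketch (assuming the AdventureWorks sample database) combines a common table expression with the ROW_NUMBER ranking function to return each customer's most recent order:

-- Rank each customer's orders by date, newest first, and
-- keep only the top-ranked row per customer:
WITH RankedOrders AS
(
   SELECT CustomerID, SalesOrderID, OrderDate,
          ROW_NUMBER() OVER (PARTITION BY CustomerID
                             ORDER BY OrderDate DESC) AS RowNum
   FROM Sales.SalesOrderHeader
)
SELECT CustomerID, SalesOrderID, OrderDate
   FROM RankedOrders
   WHERE RowNum = 1;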

Multiple Active Result Sets

Before SQL Server 2005, a connection supported only a single request at a time. If there was a pending request or an open result set (that is, a firehose cursor), the connection could not execute another statement until the first one finished. SQL Server 2005, together with the new SQL Native Client data access API or the .NET 2.0 SqlClient provider, changes this by introducing a feature called Multiple Active Result Sets (MARS). As the name suggests, MARS makes it possible to have more than one pending request on a given SQL Server connection. MARS enables the interleaved execution of multiple requests within a single connection: it allows a batch to run and, during its execution, allows other requests to execute. Note, however, that MARS is defined in terms of interleaving, not parallel execution. In many situations, MARS can provide an alternative to server-side cursors and yield performance improvements.

Query Notifications

Query notifications are a new feature introduced in SQL Server 2005 that allows an application to request a notification from SQL Server when the results of a query change. The most common scenario for this feature is caching a result set. Instead of polling the database or refreshing the cache periodically, an application can enlist with SQL Server 2005 to be notified whenever the cached result set changes. This can reduce the number of round trips to the database and improve overall system throughput. The database engine uses Service Broker to deliver query notification messages; therefore, Service Broker must be enabled in the database where the application requests the subscription.

HTTP SOAP Support

The SQLXML add-on to SQL Server 2000 allows you to invoke stored procedures over HTTP as web service methods. However, this requires the Microsoft Internet Information Services (IIS) web server and an ISAPI extension DLL. This layering can have some impact on performance when web service methods are invoked.

HTTP SOAP web services support is now built into the SQL Server 2005 database engine, eliminating the need for any web server or any ISAPI extension DLL, resulting in better performance. The HTTP endpoints can be created, and web service methods can be mapped to stored procedures, by using the CREATE ENDPOINT T-SQL statement. Refer to Chapter 10 for more details on this.
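The following sketch maps a stored procedure to a SOAP web method; the endpoint, site, and procedure names are assumptions for illustration:

-- Expose a hypothetical stored procedure as a SOAP web method
-- over an HTTP endpoint hosted by the database engine itself:
CREATE ENDPOINT sql_orders_ep
   STATE = STARTED
AS HTTP
(
   PATH = '/sql/orders',
   AUTHENTICATION = (INTEGRATED),
   PORTS = (CLEAR),
   SITE = 'MyServer'
)
FOR SOAP
(
   WEBMETHOD 'GetOrders' (NAME = 'AdventureWorks.dbo.uspGetOrders'),
   WSDL = DEFAULT,
   DATABASE = 'AdventureWorks',
   NAMESPACE = 'http://MyServer/sql/orders'
);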

Performance Monitoring and Tuning Tools

Tools play an important role in monitoring, troubleshooting, and tuning database performance. The basic performance monitoring toolset in SQL Server 2000 included System Monitor (also known as Performance Monitor), SQL Profiler, the Current Activity folder in Enterprise Manager, system stored procedures and DBCC statements, and the showplan and statistics display in Query Analyzer. The ITW was the only tuning tool available in SQL Server 2000.

SQL Server 2005 continues to support and enhance SQL Profiler, Activity Monitor, the showplan and statistics display, and Performance Monitor. In addition, the new dynamic management objects add a powerful capability to monitor the system state. Besides graphical and textual execution plans, SQL Server 2005 can also generate execution plans in XML format through Profiler or SQL Server Management Studio.

As mentioned earlier in this chapter, the ITW has been replaced with a more powerful, full-fledged application called DTA. Plan guides (by using sp_create_plan_guide) are a new tuning tool available in SQL Server 2005. Chapter 5 provides a detailed description of these tools.

SQL Server 2005 contains about 78 DMVs and about 9 dynamic management functions (DMFs). All the dynamic management objects belong to the sys schema, and at least two-part names are required to access these objects. DMVs can be referenced by using two-part, three-part, or four-part names, and DMFs can be referenced by using two-part or three-part names.

DMVs and DMFs are grouped into the following 12 categories:

• .NET CLR

• Execution

• Full-text services

• Index

• IO

• Query notifications

• Replication

• Service Broker

• SQLOS

• Database

• Database mirroring

• Transaction

DMV and DMF names begin with "dm_" and are named consistently to identify the category they belong to. For instance, all the dynamic management objects in the .NET CLR category begin with "dm_clr_", execution-related objects begin with "dm_exec_", index-related objects begin with "dm_db_index_", and I/O-related objects begin with "dm_io_".
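To see the full list of dynamic management objects available on your instance, along with their category prefixes, you can run a quick query such as the following:

-- List all dynamic management views and functions by name:
SELECT name, type_desc
   FROM sys.system_objects
   WHERE name LIKE 'dm[_]%'
   ORDER BY name;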

A complete description of all the dynamic management objects is beyond the scope of this chapter. However, Table 9.2 describes a few dynamic management objects that you can use to monitor and troubleshoot performance issues.

Table 9.2. Dynamic Management Objects

Object                            Description
sys.dm_exec_cached_plans          Returns a row for each query execution plan currently cached by SQL Server
sys.dm_exec_connections           Returns information about the connections established to the server
sys.dm_exec_sessions              Returns one row per authenticated session on the server
sys.dm_os_loaded_modules          Returns one row for each module loaded into the server address space
sys.dm_os_memory_clerks           Returns the memory currently being managed by each memory clerk
sys.dm_os_performance_counters    Returns the SQL Server performance counters and their current values
sys.dm_os_tasks                   Returns one row for each task active in the instance
sys.dm_clr_properties             Returns properties related to the hosted CLR integration

To try out some of the dynamic management objects, you can start SQL Server Management Studio and run the following script in a query window:

SELECT SUM(size_in_bytes) / 1024 AS 'Plan Cache (KB)'
   FROM sys.dm_exec_cached_plans;

SELECT protocol_type, count(*) AS 'Total Connections'
FROM sys.dm_exec_connections
GROUP BY protocol_type;

SELECT login_name, client_interface_name, count(*) AS 'Total Sessions',
   sum(memory_usage) AS 'Memory Usage'
FROM sys.dm_exec_sessions
GROUP BY login_name, client_interface_name;

SELECT name, description, company, file_version, product_version
FROM sys.dm_os_loaded_modules;

SELECT [type], sum(single_pages_kb) AS 'Total Single Pages (KB)',
   sum(multi_pages_kb) AS 'Total Multi Pages (KB)',
   sum(virtual_memory_reserved_kb) AS 'Total Virtual Memory Reserved (KB)',
   sum(virtual_memory_committed_kb) AS 'Total Virtual Memory Committed (KB)',
   sum(awe_allocated_kb) AS 'Total AWE Allocated Memory (KB)'
  FROM sys.dm_os_memory_clerks
  GROUP BY [type]
  ORDER BY 2 DESC, 3 DESC;

SELECT * FROM sys.dm_os_performance_counters;

SELECT * FROM sys.dm_os_tasks
   ORDER BY context_switches_count DESC;

SELECT * FROM sys.dm_clr_properties;

The first SELECT statement uses the sys.dm_exec_cached_plans DMV to calculate total memory taken by the plan cache. The second SELECT statement uses the sys.dm_exec_connections DMV to total the connections to the SQL Server over each protocol. The third SELECT statement uses the sys.dm_exec_sessions DMV to group sessions by login name and displays total sessions and total memory used by each login. The next query displays all the loaded modules by using the sys.dm_os_loaded_modules DMV. The next SELECT statement uses the sys.dm_os_memory_clerks DMV to display the total single-page, multi-page, virtual, and AWE memory being managed by each clerk. The final three SELECT statements use the sys.dm_os_performance_counters, sys.dm_os_tasks, and sys.dm_clr_properties DMVs to display the current SQL Performance Monitor counter data, tasks, and CLR properties, respectively.

Summary

Performance tuning is often an iterative process: you identify and fix bottlenecks, apply one set of configuration changes at a time, and then test and collect data to see whether the application meets its performance objectives; if it does not, you continue the tuning-testing cycle. Proactive monitoring and tuning can save significant time, money, and resources. This chapter assumes that you are somewhat familiar with SQL Server 2000 performance monitoring and tuning, and it discusses the new techniques and features that you can use to proactively design, monitor, and tune database applications to maximize performance.

The chapter begins with a discussion on optimizing physical database design, which is an important aspect of maximizing the performance and manageability of any database system. Next, some new techniques are presented to troubleshoot locking and blocking issues. The subsequent two sections describe techniques to optimize queries and application design. Because Chapter 5 already contains a detailed description of SQL Server 2005 tools, including performance monitoring and tuning tools, this chapter discusses only DMVs in detail.

The next few chapters introduce you to developer and business intelligence features to show you how they affect you as a DBA. Chapter 10 starts with an overview of XML and web services support in SQL Server 2005.
