Chapter 2. SQL Server 2005 Database Engine Architectural Changes

IN THIS CHAPTER

The SQL Server Operating System (SQLOS)

Non-Uniform Memory Access (NUMA) Architecture Support

Query Optimization and Execution

Row Versioning

System Metadata Access and the Security Architecture

Storage Engine and I/O Changes

SQL Server 2005 is loaded with features related to scalability, reliability, performance, high availability, programmability, security, and manageability. These features and enhancements are the primary drivers behind major overhauls in several areas of the database engine, so that they integrate seamlessly throughout the product. Microsoft has made changes to SQL Server’s memory management, thread scheduling, statistics management, parallelism, system metadata management, query optimization and execution, transactions, and other areas to better support new hardware architectures.

SQL Server has a single code base that can scale from a small application that supports a few users to a large and complex enterprise application that supports multi-terabyte databases and thousands of concurrent users. While the foundation for the architecture that can achieve such linear scalability was laid in SQL Server 7.0 and further enhanced in SQL Server 2000, SQL Server 2005 offers significant changes and enhancements, taking SQL Server further into the mission-critical enterprise database space.

This chapter provides an overview of the architectural changes made to the SQL Server 2005 database engine. If you are interested only in the functional operations of SQL Server 2005 and want to consider engine behavior as a black box operation, you can safely skip this chapter. If you want to first understand new features and enhancements introduced in this release, you can come back to this chapter after you have become more familiar with SQL Server 2005’s functionality. This chapter covers the relational database engine architecture details only. Systems such as Analysis Services, Reporting Services, SQL Server Integration Services (formerly known as Data Transformation Services), and Notification Services have their own architectural enhancements, which are beyond the scope of this chapter.

The SQL Server Operating System (SQLOS)

An instance of a running application, known as a process, is merely a container for one or more threads that are scheduled to run on a processor for a time slice determined by the operating system. Threads allow applications to make more effective use of a CPU, even on a computer that has a single CPU. Windows is a preemptive, multitasking operating system: to run application code, the operating system grants the thread a time slice, and when the time slice is over or a higher-priority thread needs to run, the operating system saves the contextual information for the current thread, preempts (stops) the running thread, and loads the contextual information of the other thread so that it can run. This approach is designed to keep a single application from taking over the system and to provide even performance across the system.

User Mode Scheduler (UMS)

SQL Server efficiently leveraged the previously discussed Windows scheduling facility up through SQL Server 6.5. However, this general-purpose, one-size-fits-all scheduling approach limited the scalability heights that SQL Server was trying to achieve. The Windows preemptive scheduling approach results in context switches (stopping one thread and running another thread), which are expensive operations that involve switching from user mode to kernel mode. In a process such as SQL Server that uses a lot of threads, excessive context switching hurts overall performance and limits the scalability of SQL Server. This is why the SQL Server team decided that SQL Server should handle its own scheduling. SQL Server knows its own scheduling needs better than Windows does, so it can do a better job than Windows of implementing efficient thread scheduling and avoiding context switching.

SQL Server 7.0 first introduced the notion of the User Mode Scheduler (UMS), a thin layer above the operating system whose primary job is to optimize SQL Server thread management by minimizing context switches and keeping as much of the SQL Server scheduling process as possible in user mode. UMS functionality is provided in a file named ums.dll under the binn folder. In addition to taking over scheduling from Windows, UMS also abstracts operating system–dependent features such as fibers and asynchronous I/O. But how does UMS take over scheduling from the preemptive operating system? UMS leads Windows to believe that all threads except the one that UMS wants to run are not viable. A thread is in an infinite wait state if it calls WaitForSingleObject and passes INFINITE for the timeout value; Windows considers such a thread not viable for scheduling purposes and ignores it. The only way to awaken such a sleeping thread is to signal the thread’s event object. To the operating system, only one SQL Server thread per processor generally appears to be active at a time. So, even though the server may have hundreds of worker threads at any given time, only one of them for each processor on the server appears to Windows to be actually doing anything.

The Windows scheduler is a preemptive scheduler. UMS, on the other hand, follows the cooperative model and is a non-preemptive scheduler: it relies on threads to yield voluntarily. UMS takes this approach to keep from involving the Windows kernel any more than absolutely necessary. UMS cooperative scheduling requires more careful coding on the part of the SQL Server development team, but it can actually be more efficient than a preemptive model because the scheduling process can be tailored to the specific needs of the application. When a thread yields, either because it has finished the task at hand or because it has executed code with an explicit call to one of the UMS yield functions, it checks the list of threads that are ready to run and signals the first thread in the list that it can run. This way, everything happens in user mode, avoiding a switch to kernel mode.

When SQL Server starts, one UMS scheduler is created for each processor in the machine. Each scheduler maintains the following five lists:

Worker list—This is a list of available threads or fibers. The number of available threads is based on the max worker threads sp_configure configuration value. If you set max worker threads to 255 on an eight-processor machine, each processor, and hence each UMS scheduler, can host a maximum of approximately 32 workers. A single UMS worker can service multiple user connections or SPIDs.

Runnable list—This is a list of UMS workers that are ready to execute an existing work request. When any UMS worker yields, as part of yielding it checks the scheduler’s runnable list for a ready worker and signals that worker’s event so that it can run.

Resource waiter list—When a UMS worker requests a resource that is owned by another worker, it puts itself on the waiter list for the resource and enters an infinite wait state. When the worker that owns the resource is ready to release it, it scans the waiter list for workers that are waiting on the resource and moves them to the runnable list. When the worker owning the resource yields, it signals the event object of the first worker on the runnable list.

I/O list—This is a list of outstanding asynchronous I/O requests.

Timer list—This is a list of UMS timer requests, each of which encapsulates a timed work request, such as waiting on a resource for a specific amount of time before timing out.

In SQL Server 7.0 and 2000, you used the undocumented DBCC SQLPERF(umsstats) statement to monitor the health of each visible scheduler on the system. In SQL Server 2005, you can access the sys.dm_os_schedulers dynamic management view (DMV) to list statistics for both visible and hidden schedulers on the system. DMVs are a new type of metadata view provided for monitoring and troubleshooting purposes. They essentially provide a real-time snapshot of internal memory structures, indicating the server state. DMVs are discussed in Chapter 9, “Performance Analysis and Tuning.”
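
For reference, the following is a minimal query against sys.dm_os_schedulers; the column list is limited to a few of the DMV’s documented columns, and the output naturally varies by machine:

-- List visible and hidden schedulers along with a few health indicators
-- (this replaces the old DBCC SQLPERF(umsstats) approach).
SELECT scheduler_id,
     cpu_id,
     [status],                  -- for example, VISIBLE ONLINE or HIDDEN ONLINE
     current_tasks_count,
     runnable_tasks_count,
     active_workers_count,
     work_queue_count,
     load_factor
   FROM sys.dm_os_schedulers
   ORDER BY scheduler_id;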

Introducing SQLOS

The UMS-based thread management architecture made SQL Server self-managing and easy to scale as new CPUs are added to the machine. During the SQL Server 7.0 rewrite, when UMS was introduced, the memory manager, storage engine, and relational engine were also upgraded to have built-in adaptive algorithms and self-tuning capabilities. When SQL Server 7.0 was shipped in 1998, it truly was the first enterprise database engine that was capable of automatic configuration and dynamic self-tuning—a concept that other database vendors initially downplayed as “not for the enterprise” but are now aggressively pursuing.

In SQL Server 2005, Microsoft has taken the self-managing and self-tuning paradigm to a much higher level. As a result, a component named SQLOS was born. SQLOS is a layer that sits on top of the underlying operating system and is responsible for managing operating system resources that are specific to SQL Server. SQLOS gives SQL Server the ability to serve its internal resource requirements much more efficiently and comprehensively. Each component within SQLOS is dedicated to performing specific functions as well as working with other components in a harmonious manner, providing large-scale performance with the ease of adapting to different hardware configurations, such as 32-bit, 64-bit, x64, dual core chips, and large memory addressability. In other words, no configuration changes are necessary within SQL Server in order for SQLOS to adapt to hardware resources while providing unprecedented scalability.

You will notice that SQL Server 2005 does not ship ums.dll anymore. UMS is now referred to as the “Non Preemptive Scheduler,” and it is one of the components of SQLOS. Figure 2.1 shows various SQLOS components.

Figure 2.1. SQLOS is a user mode operating system layer that sits between the SQL Server engine and the operating system and provides services such as thread management, memory management, and hosting services.


As Figure 2.1 shows, the two big components of SQLOS are non-preemptive scheduling (formerly known as UMS) and memory management. Other components include the resource monitor, exception handler, and hosting subsystems. SQLOS brings all these system components together and provides a cohesive API that the SQL Server development team can use to easily exploit hardware and operating system features.

Non-Preemptive Scheduling

The non-preemptive scheduling component of SQLOS is used for scheduling and synchronizing concurrent tasks without having to call the Windows kernel. It is responsible for scheduling Windows threads or fibers effectively and for ensuring that only one thread per processor is active at any given time. The sys.dm_os_schedulers DMV can be used to view a list of schedulers. A collection of schedulers that provides an abstraction layer over a group of CPUs is called a scheduling node. The term task refers to a unit of work that is scheduled by SQL Server. A Transact-SQL (T-SQL) statement batch can map to one or more tasks. For instance, a parallel query is executed by multiple tasks. The tasks are executed by worker threads. A worker thread represents a logical thread in SQL Server that is internally mapped (1:1) to either a Windows thread or, if lightweight pooling is on, to a fiber. The worker thread–to–Windows thread mapping is maintained until the worker thread is deallocated, either because of memory pressure or because it has been idle for a long time. Worker threads are executed and managed by system threads.
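
To see how tasks map to schedulers and workers at runtime, you can query the sys.dm_os_tasks DMV, as in the following sketch (the column list shown here is an assumption; adjust it to match the columns your build exposes):

-- Show active tasks, the scheduler each one is running on, and its worker.
SELECT session_id,
     exec_context_id,           -- greater than 0 for the extra tasks of a parallel query
     task_state,
     scheduler_id,
     worker_address
   FROM sys.dm_os_tasks
   WHERE session_id IS NOT NULL
   ORDER BY session_id, exec_context_id;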

The max worker threads Option

Each instance of SQL Server maintains a pool of either Windows threads or fibers for processing user queries. This thread pool helps optimize performance when large numbers of clients are connected to the server. The maximum size of this pool is controlled by the max worker threads server configuration option.

The minimum and default values for the max worker threads advanced option have changed in SQL Server 2005. In SQL Server 2000, the minimum value that can be set for the max worker threads option is 32; it is 128 in SQL Server 2005. In SQL Server 2000, the max worker threads option defaults to 255. In SQL Server 2005, this option is by default set to 0, which allows SQL Server to automatically configure the number of worker threads at startup. If the max worker threads option is left at the default of 0, SQL Server 2005 uses the formula shown in Table 2.1 to set the max worker threads configuration value.

Table 2.1. The max worker threads Formula

Number of CPUs            32-bit Platform                  64-bit Platform
4 or fewer processors     256                              512
More than 4 processors    256 + ((# of CPUs – 4) * 8)      512 + ((# of CPUs – 4) * 16)

For instance, on an eight-processor 32-bit machine, max worker threads defaults to 256+((8–4)*8), which equals 288. On an eight-processor 64-bit machine, max worker threads defaults to 512+((8–4)*16), which equals 576. It is important to note that SQL Server does not actually create all the threads, but it reserves the memory required for creating the number of threads specified by the max worker threads option. When the actual number of user connections is smaller than the number set in max worker threads, one thread handles each connection. However, when the actual number of connections exceeds the number set in max worker threads, SQL Server pools the worker threads so that the next available worker thread can handle the request.
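
You can check both the configured setting and the value the server actually computed at startup; the following sketch assumes that the max_workers_count column of the sys.dm_os_sys_info DMV reflects the effective value:

-- Configured value (0 means let SQL Server decide at startup).
EXEC sp_configure 'show advanced options', 1;
RECONFIGURE;
EXEC sp_configure 'max worker threads';

-- Effective value computed by the server.
SELECT max_workers_count
   FROM sys.dm_os_sys_info;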

Routing Tasks to a Scheduler

SQL Server 2000 chose the scheduler in a round-robin fashion at connection time. All batches on a connection were run on the same scheduler, which could lead to imbalances, especially with long-lived connections. In SQL Server 2005, the decision to choose a scheduler is based on how busy the scheduler is. In other words, unlike in SQL Server 2000, where a task was routed to a particular scheduler, in SQL Server 2005, the routing decision is made based on the load on the scheduler. At connection time, SQL Server chooses a scheduler based on the load on the schedulers. Each subsequent batch on the same connection uses the same algorithm, and a new scheduler is chosen based on the load, with a bias toward the current scheduler for the connection. That is, if the load on the current scheduler (where the last batch for the connection was executed) is not appreciably higher than the average load, SQL Server does not look for a new scheduler; instead, it utilizes the warm cache and minimizes the CPU cycles used to find the least loaded scheduler. Columns such as load_factor and runnable_tasks_count in sys.dm_os_schedulers can be used to determine how busy a scheduler is.
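
The following query sorts the visible schedulers by those same load indicators; it is only a monitoring aid, not the routing algorithm itself, and it assumes the 'VISIBLE ONLINE' status literal:

-- Compare the relative load on each visible scheduler.
SELECT scheduler_id,
     load_factor,
     runnable_tasks_count,
     current_tasks_count
   FROM sys.dm_os_schedulers
   WHERE [status] = 'VISIBLE ONLINE'
   ORDER BY load_factor DESC, runnable_tasks_count DESC;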

Memory Management

On a 32-bit x86 platform, Windows gives all processes a 4GB virtual address space (VAS), which is divided into two partitions of 2GB each: the user mode partition and the kernel mode partition. If an application demands more than 2GB of user mode VAS, you can place a /3GB switch in the system’s BOOT.INI file to limit the kernel VAS to 1GB and provide 3GB of user mode VAS to the application. Windows XP and Windows 2003 support an additional /USERVA switch in the BOOT.INI file that provides a finer degree of control by allowing you to specify exactly how much address space to set aside for user mode access. If even 3GB of memory is not sufficient, you can use a Pentium Pro or later processor, place the /PAE switch in your BOOT.INI file, and leverage the Address Windowing Extensions (AWE) facility provided by Windows to access up to 64GB of physical memory. Note that by reducing the kernel portion of VAS to 1GB, you reduce the space available to support internal management structures. If there is more than 16GB of physical memory available on a computer, the operating system needs 2GB of virtual address space for system purposes and therefore can support only a 2GB user mode address space. In order for AWE to use the memory range above 16GB, you need to be sure that the /3GB parameter is not in the BOOT.INI file. If it is, the operating system cannot address any memory above 16GB.

SQL Server divides its VAS into two regions: the buffer pool and reserved address space. The buffer pool, also known as the BPool, is a primary memory allocation source for SQL Server, and it is used as a data and index page cache and for memory allocations smaller than 8KB. The BPool size is governed by the min server memory and max server memory configuration options. The BPool never drops its memory allocation below the level specified in min server memory, and it never acquires more memory than the level specified in max server memory. Objects that use the BPool include the buffer cache, the procedure cache, system-level structures, locks, log caches, and connection contexts.

The second region in SQL Server VAS is often called MemToLeave, which actually is an incorrect term; the correct term for this memory region is reserved address space. The reserved address space region is used for internal SQL Server allocations that exceed 8KB of contiguous space and for allocations made by external consumers, such as extended stored procedures, OLE DB providers, in-process COM objects, SQLCLR (.NET common language runtime integration with SQL Server) assemblies, and so on. The reserved address space is a contiguous allocation of memory of size 256MB+(max worker threads*512KB). For instance, if max worker threads is set to 256, the reserved address space size is 256MB+(256*512KB), which equals 384MB.
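
As a quick sanity check of this formula, the following sketch computes the approximate reserved address space size from the effective worker thread count (assuming the max_workers_count column of sys.dm_os_sys_info reflects that count):

-- 256MB + (max worker threads * 512KB), expressed in MB.
SELECT 256 + (max_workers_count * 512) / 1024 AS ApproxReservedAddressSpaceMB
   FROM sys.dm_os_sys_info;
-- With max worker threads = 256: 256 + (256 * 512) / 1024 = 384 (MB).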

As shown in Figure 2.1, the SQL Server 2005 memory management architecture consists of several components, such as memory nodes, memory clerks, caches, pools, and memory objects. The memory node component is responsible for providing the locality of allocations. It consists of the single-page allocator, the multi-page allocator, the large page allocator, the reserved page allocator, the virtual allocator, and the shared memory allocator. Memory clerks, which are the key to granular memory management in SQL Server 2005, enable SQLOS to track and control the amount of memory consumed by a component. The sys.dm_os_memory_clerks DMV can be used to figure out memory distribution and to see a list of active memory clerks and the amounts of different kinds of memory allocated by each clerk. For example, the following query uses the sys.dm_os_memory_clerks DMV to list each type of memory clerk along with the sum of reserved virtual memory, committed virtual memory, and single and multi-pages allocated by each clerk:

SELECT [type],
     SUM(virtual_memory_reserved_kb) AS TotalVirMemReservedKB,
     SUM(virtual_memory_committed_kb) AS TotalVirMemCommittedKB,
     SUM(multi_pages_kb) AS TotalMultiPagesKB,
     SUM(single_pages_kb) AS TotalSinglePagesKB
   FROM sys.dm_os_memory_clerks
   GROUP BY [type]
   ORDER BY TotalVirMemCommittedKB DESC, TotalVirMemReservedKB DESC,
         TotalMultiPagesKB DESC, TotalSinglePagesKB DESC;

You can use the sys.dm_os_virtual_address_dump DMV to obtain detailed information about the VAS.

In addition to the architectural change just discussed, the following are some more changes that have been introduced to memory management in SQL Server 2005:

Dynamic AWE—As described earlier, the AWE mechanism allows 32-bit systems to access memory over 4GB. Instances of SQL Server 2000 running in AWE memory mode allocated the full amount of memory specified in max server memory on server startup. This has changed in SQL Server 2005. During startup, SQL Server 2005 running under Windows Server 2003 now reserves only a small portion of AWE mapped memory. As additional AWE mapped memory is required, the operating system dynamically allocates it to SQL Server. Dynamic AWE enables the buffer pool to dynamically manage AWE mapped memory to balance SQL Server memory use with the overall system requirements.

It is important to note that when AWE is enabled, SQL Server 2005 always attempts to allocate its memory through the AWE mechanism; in other words, in AWE mode, most allocations are made through AWE. Another important point to remember is that although SQL Server can dynamically release AWE mapped memory, AWE mapped memory can never be swapped out to the page file; it becomes available to the operating system or to other instances only after the objects that use it are released and the memory is freed. Therefore, you may consider setting max server memory for SQL Server to guarantee additional memory for other applications operating on the computer. (A configuration sketch appears after this list.)

Hot add memory—Dynamic AWE also allows SQL Server to increase memory if additional memory is added to a computer that supports hot add memory. Available in Windows Server 2003, Enterprise and Datacenter Editions, hot add memory allows memory to be added while the computer is running, and it requires special hardware supported by the hardware vendor. Hot add memory is available only for 64-bit SQL Server and for 32-bit SQL Server when AWE is enabled. To use hot add memory, you must start SQL Server 2005 with the -h option. Note that removing physical memory from the system still requires you to restart the server. Dynamic AWE is not supported on SQL Server 2005 Standard, Workgroup, and Express Editions.

The common caching framework—SQL Server 2000 included two types of caches: the data page cache and the procedure cache. These two caches were tightly coupled, and each relied on the other cache’s eviction mechanism to control its size. SQL Server 2005, on the other hand, supports many more types of caches in order to better support new features and new requirements. You can use the type column in the sys.dm_os_memory_cache_* DMVs (such as sys.dm_os_memory_cache_entries) to see some of the cache types (see the query after this list). To better support these new caches, SQL Server 2005 introduces the notion of the common caching framework, which controls the behavior of multiple caches. This framework provides a uniform caching mechanism and common costing policies to cache different types of data.

The resource monitor—In SQL Server 2005, the lazy writer is responsible for freeing data pages only. SQL Server 2005 introduces a new thread, called the resource monitor, that manages the caches and clerks. The resource monitor responds to memory pressure by broadcasting the notification to appropriate memory clerks. In addition, the resource monitor also ensures that a single cache does not monopolize the buffer pool and that the overall cache memory does not exceed 75% of the buffer pool.

Another important function that the resource monitor performs is to respond to VAS pressure. In SQL Server 2000, it was hard for the server to recover once it came under VAS pressure; a server restart was often the only option in such a scenario. In SQL Server 2005, if a memory node fails to allocate a region of 4MB or less, or when the resource monitor’s probe to VAS for a 4MB region fails, the resource monitor broadcasts a notification asking all the memory clerks that can do so to shrink. For instance, when such a broadcast notification is sent, a CPU node might try to shrink its threads, a CLR clerk might unload appdomains that are currently not in use, network libraries might shrink their network buffers, and so on.
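
Two of the items in the preceding list lend themselves to quick illustrations. First, here is a typical way to enable AWE and cap SQL Server’s memory; the 12GB cap is purely illustrative, and the SQL Server service account also needs the Lock Pages in Memory privilege:

-- Enable AWE and cap the buffer pool (the values here are illustrative only).
EXEC sp_configure 'show advanced options', 1;
RECONFIGURE;
EXEC sp_configure 'awe enabled', 1;
EXEC sp_configure 'max server memory (MB)', 12288;
RECONFIGURE;
-- The awe enabled setting takes effect after the instance is restarted.

Second, the following query offers a rough view of how many distinct cache types the common caching framework manages; it relies only on the type column mentioned earlier:

-- List the distinct cache types and how many entries each currently holds.
SELECT [type],
     COUNT(*) AS cache_entries
   FROM sys.dm_os_memory_cache_entries
   GROUP BY [type]
   ORDER BY cache_entries DESC;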

Non-Uniform Memory Access (NUMA) Architecture Support

Today’s critical enterprise applications demand a larger number of processors with increased clock speed. In the past, hardware vendors provided support for more and faster CPUs, but with a single shared system bus and its memory latency, the additional processing power was not fully utilized. Such an architecture, where all memory accesses are posted to the same shared memory bus, is known as symmetric multiprocessing (SMP). To circumvent this limitation, the concept of a large on-chip L3 cache was introduced, but it too was a limited solution. NUMA is designed to overcome the scalability limitations of SMP architecture.

NUMA hardware architecture includes more than one system bus, each serving a small set of processors. Each group of processors has its own memory and, possibly, its own I/O channels. Each group is called a NUMA node. For example, a 16-processor machine may have 4 NUMA nodes, each node having 4 CPUs, its own system bus, and, possibly, its own I/O channels. This allows for a greater memory locality for that group of schedulers when tasks are processed on the node. Note that each CPU can, however, access memory associated with other groups in a coherent way. Non-uniform memory access means that it takes longer to access some regions of memory (for example, remote memory on a different node) than others (for example, local memory on the same node). The main benefit of NUMA is scalability for high-end machines (generally eight or more processors). In summary, NUMA reduces memory contention by having several memory buses and only a small number of CPUs competing for a shared memory bus.

SQL Server 2005 is NUMA-aware. This means it can perform better on NUMA hardware without special configuration. When a thread running on a specific NUMA node has to allocate memory, SQL Server’s memory manager tries to allocate memory from the memory associated with the NUMA node for locality of reference. Each NUMA node has an associated I/O completion port that is used to handle network I/O. SQLOS, SQL Server startup, network binding, and BPool management have been designed to make effective use of NUMA.

SQL Server Configuration Manager allows you to associate a TCP/IP address and port to a single or multiple NUMA nodes. You can configure NUMA affinity so that clients can connect to specific nodes. NUMA affinity is configured as a server setting in SQL Server Configuration Manager. To set a TCP/IP address and port to a single or multiple nodes, you append a node identification bitmap (an affinity mask) in square brackets after the port number. Nodes can be specified in either decimal or hexadecimal format. For instance, both 1453[0x3] and 1453[3] map port 1453 to NUMA nodes 0 and 1. The affinity mask value [3] translates to binary 00000011, and because the 0th and 1st bits are set, the client requests on port 1453 are served on 0th and 1st NUMA nodes. Let’s look at another example. 1453[0x11] and 1453[17] both map the port 1453 to NUMA nodes 0 and 4 because the binary representation of hex 11 or decimal 17 is 00010001 and the 0th and 4th bits are set. The default node identification bitmap is -1, which means listen on all nodes. To configure a TCP/IP port to one or more NUMA nodes, you follow these steps:

  1. Start SQL Server Configuration Manager.
  2. Expand SQL Server 2005 Network Configuration.
  3. Expand Protocols for <instance_name> and then click TCP/IP.
  4. In the details pane, right-click the IP address to configure and then click Properties.
  5. Specify the NUMA node identifier in brackets after the port numbers in the TCP Port box (for example, 1600[-1], 1460[0x11], 1400[4]). This maps port 1600 to all NUMA nodes, port 1460 to NUMA nodes 0 and 4, and port 1400 to NUMA node 2.
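
As a quick way to verify which nodes a given affinity mask selects, the following T-SQL sketch decodes a mask into node numbers (the mask value is simply the 0x11 example used above; the special value -1, which means all nodes, is not decoded here):

-- Decode a NUMA affinity mask into the node numbers it selects.
DECLARE @mask int, @node int;
SET @mask = 17;              -- 0x11 = binary 00010001
SET @node = 0;
WHILE @mask > 0
BEGIN
    IF (@mask & 1) = 1
        PRINT 'Served by NUMA node ' + CAST(@node AS varchar(10));
    SET @mask = @mask / 2;   -- shift right by one bit
    SET @node = @node + 1;
END;
-- Prints nodes 0 and 4, matching the 0x11 examples above.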

Setting the NUMA affinity and having different clients served by different NUMA nodes is an easy-to-manage alternative to traditional multi-instance-based server consolidation and load balancing approaches.

Query Optimization and Execution

Query processing, which involves parsing, compiling, optimizing, and execution, is one of the most critical functions of a relational database engine. SQL Server 2005 continues to build on the award-winning cost-based optimizer of SQL Server 2000, and it introduces several new features to improve the overall query performance. The following sections discuss the new enhancements made to SQL Server’s query optimization and execution.

Statistics Management in SQL Server 2005

SQL Server automatically creates and updates statistics. The query optimizer uses these statistics during query execution to determine things such as whether using an index would speed up a query and to develop an optimal query execution plan. The statistics include information such as cardinality (number of unique values), density (uniqueness of values within a dataset), selectivity (number of rows that will be returned by a particular query criterion), and so on, for one or more columns. In SQL Server 2000, the statistics were saved in the statblob column in the sysindexes system table. In SQL Server 2005, these statistics are now saved in an internal system table, sys.sysobjvalues. Therefore, the statblob column in the sys.sysindexes backward-compatibility view is now always returned as NULL.


Note

You cannot access system internal tables in multiuser mode. You have to start SQL Server in single-user mode, make the Resource database (discussed later) the current database, and then use the three-part naming convention (for example, AdventureWorks.sys.sysobjvalues) to view the system table contents.


Missing or out-of-date statistics can have a negative impact on the query optimizer’s ability to choose the most efficient plan. Out-of-date or missing statistics are indicated as warnings (with the table name in red text) when the execution plan of a query is graphically displayed in SQL Server Management Studio.

SQL Server 2005 introduces several enhancements that improve the management of statistics and also help the query optimizer choose the most efficient query plan for a broader range of queries. Here are some of these enhancements:

Column-level change tracking—In SQL Server 2000, statistics updates were triggered by the number of row changes. SQL Server 2005, on the other hand, tracks data changes at the column level and avoids automatically updating statistics on columns that have not changed enough to warrant a statistics update. This reduces the number of automatic statistics updates and hence may improve query performance.

The AUTO_UPDATE_STATISTICS_ASYNC database option—In SQL Server 2000, if a query caused an automatic statistics update, the query was blocked until the statistics were refreshed. This sometimes caused query timeout errors and unpredictable query response times. For such scenarios, SQL Server 2005 provides a databasewide option called AUTO_UPDATE_STATISTICS_ASYNC that, when turned on, results in statistics being updated in the background, without blocking the query. The query that caused the statistics update proceeds with the old statistics, while SQL Server starts a new thread to update the statistics. This provides more predictable query response time for some workloads. The AUTO_UPDATE_STATISTICS_ASYNC database option is turned off by default, and it is recommended that you perform enough analysis and testing before turning it on to ensure that it does not adversely affect query performance or any other part of the engine. You can use the is_auto_update_stats_async_on column in the sys.databases catalog view to determine whether the option is turned on or off (see the example after this list).

Parallelism—SQL Server 2005 supports creating full scan statistics in parallel for both partitioned and non-partitioned tables. This can potentially lead to much faster statistics creation times.

String statistics—For character columns, SQL Server now gathers information about the frequency distribution of substrings. In other words, SQL Server 2005 gathers a new kind of statistics called “tries” that aid the optimizer in better estimating the selectivity of conditions that use the LIKE operator. In addition, SQL Server now allows columns of large object types such as text, ntext, image, nvarchar(max), varchar(max), and varbinary(max) to be specified as statistics columns.

Computed columns—SQL Server 2000 only partially supported (and did not document) creating and updating statistics on computed columns. SQL Server 2005 now fully supports and documents computed column statistics.

A change in the sampling size formula—As in earlier releases, to minimize the cost of automatic statistics updates, SQL Server samples the data instead of analyzing all of it. For tables that have more than 1,024 pages (that is, tables over 8MB), SQL Server 2005 now ensures that a minimum of 8MB of data is sampled during statistics gathering.

Statistics on up to 32 columns—The limit on the number of columns in a statistics object has been increased to 32, from 16.

Enhanced DBCC SHOW_STATISTICS results—The DBCC SHOW_STATISTICS statement provides the option to restrict the output to the header, the density vector, the histogram, or a combination of these. If no options are specified, all three result sets are returned. In addition, the output has been enhanced to display the name of the statistics object being displayed and whether it is a string index.
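
The following sketch exercises two of these enhancements against the AdventureWorks sample database; the index name is an assumption and should be replaced with one that exists in your copy of the database:

-- Turn on asynchronous statistics updates and confirm the setting.
ALTER DATABASE AdventureWorks SET AUTO_UPDATE_STATISTICS_ASYNC ON;
SELECT name, is_auto_update_stats_async_on
   FROM sys.databases
   WHERE name = 'AdventureWorks';

-- Restrict DBCC SHOW_STATISTICS output to just the histogram.
DBCC SHOW_STATISTICS ('Sales.SalesOrderDetail', 'IX_SalesOrderDetail_ProductID')
    WITH HISTOGRAM;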

Statement-Level Recompilation

Before a query, batch, stored procedure, trigger, prepared statement, or dynamic SQL statement (henceforth, “batch”) begins execution on a SQL Server database, the batch is compiled into a plan. The plan is then executed for its effects or to produce results. Such compiled plans are stored in a part of SQL Server’s memory called the plan cache. If SQL Server is later able to reuse a compiled plan from the plan cache, it avoids the compilation costs, which improves the overall performance.


Note

The area of memory where compiled plans are cached is sometimes referred to as the procedure cache. However, the term plan cache is more accurate because this cache stores plans of other queries in addition to those of stored procedures.


When a cacheable batch is submitted to SQL Server 2005 for execution, it is compiled, and a query plan for it is put in the plan cache. A query plan is a read-only reentrant structure that is shared by multiple users. There are at most two instances of a query plan at any time in the plan cache: one for all the serial executions and one for all the parallel executions. Each user concurrently executing a batch has an execution context that holds data (such as parameter values) specific to his or her execution. Although execution contexts are reused, they are not reentrant; that is, they are single threaded. At any point in time, an execution context can be executing only one batch submitted by a session, and while the execution is happening, the context is not given to any other session or user. A query plan and multiple associated execution contexts can coexist in the plan cache. However, an execution context cannot exist in the plan cache without an associated query plan; whenever a query plan is removed from the plan cache, all the associated execution contexts are removed along with it. Also, query plan reuse does not necessarily imply execution context reuse, and execution contexts for parallel plans are not cached. An execution plan is the combination of a query plan and an execution context. You can query the sys.dm_exec_cached_plans DMV or the sys.syscacheobjects backward-compatibility view to obtain information about the query execution plans that are cached by SQL Server.
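
As a simple illustration, the following query lists cached plans along with their reuse counts and the text of the batch each plan was compiled from (a minimal sketch; sys.dm_exec_sql_text accepts the plan handle):

-- Show cached plans, how often each has been reused, and the originating batch text.
SELECT cp.objtype,
     cp.cacheobjtype,
     cp.usecounts,
     cp.size_in_bytes,
     st.[text] AS batch_text
   FROM sys.dm_exec_cached_plans AS cp
   CROSS APPLY sys.dm_exec_sql_text(cp.plan_handle) AS st
   ORDER BY cp.usecounts DESC;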

Before executing a query plan, SQL Server checks the correctness and optimality of that query plan. If one of the checks fails, the statement corresponding to the query plan or the entire batch is compiled again, and possibly a different query plan is produced. This process is known as recompilation. Recompilations are often performed for good reasons. However, sometimes excessive recompilations can degrade performance. In such cases, it becomes necessary to analyze the reasons for recompilations and try to reduce the occurrences of recompilations.

When a batch was recompiled in SQL Server 2000, all the statements in the batch were recompiled—not just the one that triggered the recompilation. SQL Server 2005 improves on this behavior by compiling only the statement that caused the recompilation—not the entire batch. This statement-level recompilation feature improves SQL Server 2005’s recompilation behavior when compared to that of SQL Server 2000. In particular, SQL Server 2005 spends less CPU time and memory during batch recompilations, and it obtains fewer compile locks. You can use a new Profiler trace event called SQL:StmtRecompile under the T-SQL event class to trace statement-level recompilations.

Calculating Query Plan Cost

In addition to stored procedures and triggers, SQL Server can cache the query plans for dynamic SQL, prepared queries, sp_executesql queries, ad hoc queries, auto-parameterized queries, and other batches. Every query plan and execution context has a cost associated with it. When the cost reaches 0, the plan or context becomes a candidate for deletion from the plan cache. In other words, the cost partially controls how long the plan or context lives in the plan cache. These costs are calculated and manipulated differently in SQL Server 2005 than they were in SQL Server 2000.

In SQL Server 2000, costs were calculated and manipulated as follows:

Query plan cost c = f(cpuTime, pagesRead, pagesWritten) / pagesUsedInMem

In SQL Server 2000, the query plan cost was a mathematical function that used four factors: CPU time spent generating the plan, number of pages read from disk, number of pages written to disk, and number of memory pages occupied by the query plan of the batch. In SQL Server 2000, the lazy writer thread occasionally swept through the plan cache and decremented costs by dividing them by 4. In case of memory pressure, query plans and execution contexts with costs of 0 were deleted from the plan cache. When a query plan or an execution context was reused, its cost was reset back to its compilation (or execution context generation) cost.

In SQL Server 2005, costs are calculated and manipulated as follows:

Query plan cost c = I/O cost + context switch cost + memory cost

In SQL Server 2005, the query plan cost is calculated in terms of the number of ticks, with a maximum of 31 ticks. It is a sum of the following:

I/O cost—Two I/Os cost 1 tick, with a maximum of 19 ticks

Context switch cost—Two context switches cost 1 tick, with a maximum of 8 ticks

Memory cost—Sixteen memory pages (128KB) cost 1 tick, with a maximum of 4 ticks
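
For example, under these rules, a plan whose compilation incurred 10 I/Os and 4 context switches and that occupies 64 memory pages (512KB) would enter the cache at roughly 5 + 2 + 4 = 11 ticks, with each of the three components still subject to its own cap.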

In SQL Server 2005, the lazy writer does not decrement costs. Instead, as soon as the size of the plan cache reaches 50% of the BPool size, the next plan cache access decrements the ticks of all the plans by 1 each. If the sum of the sizes of all the caches in SQL Server 2005 reaches or exceeds 75% of the BPool size, a dedicated resource monitor thread gets activated, and it decrements the tick counts of all the objects in all the caches. As in SQL Server 2000, query plan reuse causes the query plan cost to be reset to its initial value.

Parallel Query Processing

Depending on factors such as available memory, type and estimated cost of query, number of schedulers on the server, load on the schedulers, and so on, SQL Server might decide to execute a query in parallel. You can use the max degree of parallelism sp_configure option to limit the number of processors to use in parallel plan execution. The default value of 0 uses all available processors. In addition to queries, SQL Server 2005 considers parallel execution plans for index DDL operations and static and keyset-driven cursor population. You can override the max degree of parallelism value in queries by specifying the MAXDOP query hint in the query statement.

The relational engine has been enhanced to perform more types of tasks in parallel. As mentioned earlier, SQL Server can now update statistics in parallel. Another example of improved support for parallelism is parallel index operations. In SQL Server 2005 Enterprise Edition, indexes can be created or rebuilt in parallel. DDL statements such as CREATE/ALTER/DROP INDEX and ALTER TABLE now accept a MAXDOP setting that you can provide to override the max degree of parallelism value.
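
The following sketch shows both forms of the override against the AdventureWorks sample database; the table and the MAXDOP values are illustrative:

-- Cap a single query at four CPUs, regardless of the server-wide setting.
SELECT ProductID, COUNT(*) AS OrderLines
   FROM Sales.SalesOrderDetail
   GROUP BY ProductID
   OPTION (MAXDOP 4);

-- Rebuild indexes using at most two CPUs (Enterprise Edition).
ALTER INDEX ALL ON Sales.SalesOrderDetail
    REBUILD WITH (MAXDOP = 2);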

SQL Server 2005 avoids the intra-query deadlock bug (see Microsoft Knowledge Base article 837983) present in SQL Server 2000. SQL Server 2000 Service Pack 3 first fixed this problem (see Microsoft Knowledge Base article 315662) by detecting a deadlock that involves the threads that are used by the same parallel query and returning error message 8650. SQL Server 2005 is designed so that intra-query deadlock will not happen.

Row Versioning

The SQL Server 2005 database engine implements a new functionality called row versioning that is designed to improve performance by avoiding reader-writer blocking. Whenever a transaction modifies a row, SQL Server uses the tempdb system database to maintain a copy of the original row (that is, the before image of the row). If multiple transactions modify a row, multiple versions of the row are stored in a version chain. The versioning information is saved in 14 bytes of the row header. These 14 bytes include the transaction sequence number (XSN) and are added to the header when the row is modified for the first time. Because a transaction can read the before versions of a row from the version store in tempdb, the reader does not block the writer.

The following are some of the new features in SQL Server 2005 that depend on row versioning functionality:

Snapshot isolation level—This is a new transaction isolation level that provides functionality similar to that of the REPEATABLE READ isolation level, without the issue of readers blocking writers.

New implementation of the read-committed isolation level—SQL Server 2005 provides a new database option called READ_COMMITTED_SNAPSHOT that, when turned on, uses row versioning to provide the same functionality as the READ COMMITTED (the default) isolation level, without the issue of readers blocking writers.

Multiple active result sets (MARS)—MARS is an ADO.NET 2.0 feature that is supported by SQL Server 2005 to allow the execution of multiple statements on a single connection. Row versioning is used to version the rows affected by data modification statements issued by a MARS session.

Online index operation—This is a new high-availability feature introduced in SQL Server 2005 that allows you to create, rebuild, and drop indexes, while allowing concurrent access to the underlying table or clustered index data and any associated nonclustered indexes.

Row versioning is turned off by default. You can enable it by turning on the ALLOW_SNAPSHOT_ISOLATION database option and using the new snapshot isolation feature or by turning on the READ_COMMITTED_SNAPSHOT database option and using the default READ COMMITTED isolation level. When row versioning is enabled, you must ensure that the tempdb database has adequate space to hold the version store.
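
For example, the two database options are enabled as shown below; note that setting READ_COMMITTED_SNAPSHOT requires that there be no other active connections in the database, and the table used in the snapshot transaction is just an AdventureWorks example:

-- Enable snapshot isolation (versioning begins for write transactions).
ALTER DATABASE AdventureWorks SET ALLOW_SNAPSHOT_ISOLATION ON;

-- Make the default READ COMMITTED level use row versioning.
ALTER DATABASE AdventureWorks SET READ_COMMITTED_SNAPSHOT ON;

-- A session can then opt in to snapshot isolation explicitly.
USE AdventureWorks;
SET TRANSACTION ISOLATION LEVEL SNAPSHOT;
BEGIN TRANSACTION;
SELECT COUNT(*) FROM Sales.SalesOrderHeader;   -- reads a consistent snapshot
COMMIT TRANSACTION;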

You can monitor row versioning usage by using DMVs such as sys.dm_tran_current_transaction, sys.dm_tran_top_version_generators, sys.dm_tran_version_store, sys.dm_tran_active_snapshot_database_transactions, and sys.dm_tran_transactions_snapshot, as well as various Performance Monitor counters—such as Free Space in tempdb, Snapshot Transactions, and Version Store Size—under the SQLServer:Transactions object. The database engine periodically removes from the version store in tempdb rows that are no longer needed to support snapshot or read-committed transactions.

More details on row versioning and snapshot isolation can be found in Chapter 9.

System Metadata Access and the Security Architecture

System metadata refers to information such as the number and names of tables or views in a database, information about columns in a table, details on constraints and indexes defined for a table, users and login information, and so on. Many applications and scripts access this system metadata information for various purposes. SQL Server 2000 supported rich metadata exposure through SQL-92 INFORMATION_SCHEMA views, system tables at the server and database level, and system stored procedures. However, there were some limitations, such as the following:

• Although it was not recommended, SQL Server 2000 did allow direct updates to system objects.

• Even if a user did not have any permission on the object, the user could still see the object metadata by using one of the previously mentioned metadata access mechanisms.

• Developers and DBAs spent a lot of time understanding the strange bitmasks used in system tables.

• Last, but definitely not least, upgrades and service packs involved dropping and re-creating system objects. This process was very time-consuming, and, more importantly, it made rolling back a service pack very difficult, if not impossible.

Metadata access and management have been completely redesigned in SQL Server 2005. The following sections explain a new system database called the Resource database and the role it plays, describe how the new security architecture restricts access to system metadata, and then discuss the concept of schemas, a major change in the SQL Server security architecture.

The Resource Database

The SQL Server 2005 database engine makes use of a new system database called the Resource database (with database ID 32767). This is a hidden, read-only database that contains all the system objects that are included with SQL Server 2005. These system objects were present in the master database in previous releases. The Resource database consists of definitions of system stored procedures, views, functions, assemblies, and so on. The system objects are persisted in the Resource database, but they logically appear in every database. These system objects are present in a schema named sys. (The concept of schemas is discussed later in this chapter.) The Resource database does not show up in graphical tools or when you access the sys.databases catalog view (discussed later in this chapter). The Resource database files (that is, mssqlsystemresource.mdf and mssqlsystemresource.ldf) are present in the same folder where the master database files reside.


Caution

Do not move or rename the Resource database file, or SQL Server will not start. Also, do not put the Resource database in either compressed or encrypted NTFS file system folders. Doing so hinders performance and prevents upgrades.


Note that the Resource database does not contain any user data or user metadata. The system-level information for an instance of SQL Server is still saved in the master database. Therefore, it is not necessary to back up the Resource database (unless changes are made to the Resource database to apply a Quick Fix Engineering (QFE) update based on instructions from a Microsoft Customer Support Services specialist).

The Resource database makes upgrading to a new version of SQL Server an easier and faster procedure. In earlier versions, upgrading involved dropping and creating system objects. Because the Resource database file contains all system objects, an upgrade is accomplished by copying the single Resource database file to the local server. This copy method also means that all that is required to roll back a service pack is an overwrite operation of the current version of the Resource database with the identified older version.

You can access the ResourceVersion and ResourceLastUpdateDateTime server properties to determine the Resource database version and the date and time that the Resource database was last updated:

SELECT SERVERPROPERTY('ResourceVersion') AS ResourceVersion,
       SERVERPROPERTY('ResourceLastUpdateDateTime') AS ResourceLastUpdateDateTime;

The only way to access the Resource database is to start SQL Server in single-user admin mode (using the -m startup parameter) and then run the USE mssqlsystemresource SQL statement. Once the Resource database is the current database, you can directly query the system base tables, such as master.sys.sysdbreg in the master database or sys.sysidxstats in any user database. Note that these base tables are not the system tables available in previous releases; those have been replaced with backward-compatibility views, as explained in the next section. The system base tables are used by the SQL Server 2005 database engine and by Microsoft personnel during the troubleshooting process.
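
A minimal sketch of that sequence, after the instance has been started with -m, follows; the comments describe only what the table names suggest, because these base tables are undocumented:

-- Works only when connected to an instance started in single-user admin mode (-m).
USE mssqlsystemresource;
GO
-- Base table in master that holds database registration information.
SELECT TOP (10) * FROM master.sys.sysdbreg;
-- Base table in a user database that holds index and statistics metadata.
SELECT TOP (10) * FROM AdventureWorks.sys.sysidxstats;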


Note

When DBCC CHECKDB is executed against the master database, a second CHECKDB is also run internally on the Resource database. This means that running DBCC CHECKDB on the master database can return extra results. The command returns extra result sets when no options are set or when either the PHYSICAL_ONLY or ESTIMATEONLY option is set.


Metadata Access and Visibility

The new system metadata architecture in SQL Server 2005 is designed to meet the following goals:

• Provide a simple, consistent, secure, and efficient way to access system metadata

• Disallow direct updates to system tables

• Provide maximum backward compatibility

• Restrict access to metadata and provide a permission-based solution to allow viewing the metadata

To meet these design goals, SQL Server 2005 introduces the changes described in the following sections.

Introducing Catalog Views

The system tables available in previous releases have been removed and are now shipped as backward-compatibility views with the same names; their use is discouraged. SQL Server 2005 introduces the notion of catalog views, and using them is the preferred way to access system metadata. There are four different types of system metadata views in SQL Server 2005: catalog views, backward-compatibility views, DMVs, and information schema views. Catalog views are the most efficient and recommended way to access system metadata, and they are the only way to access metadata for new features such as Service Broker; the backward-compatibility views and information schema views do not cover features introduced in SQL Server 2005.

There are about 286 system metadata views, which include catalog views, DMVs, information schema views, and backward-compatibility views. You can run the following query to obtain a list of all the system metadata views available in SQL Server 2005:

SELECT * FROM sys.all_views
   WHERE is_ms_shipped = 1
   ORDER BY [name];

All the system metadata objects belong to the sys or INFORMATION_SCHEMA schemas. Catalog views use a naming convention wherein sys.% views describe a user’s metadata, sys.system_% views describe system objects, sys.all_% views return the union of system objects and user objects, and sys.server_% views describe server-level metadata. Consider the following SQL statements:

USE [AdventureWorks];
SELECT * FROM sysobjects ORDER BY [name];
SELECT * FROM dbo.sysobjects ORDER BY [name];
SELECT * FROM sys.sysobjects ORDER BY [name];
SELECT * FROM sys.objects ORDER BY [name];
SELECT * FROM sys.all_objects ORDER BY [name];
SELECT * FROM sys.system_objects ORDER BY [name];

The first SELECT statement should look familiar. However, note that sysobjects is no longer a system table; rather, it is a backward-compatibility view. The next two statements continue to use the sysobjects backward-compatibility view but illustrate that you can use the dbo or sys schema while accessing the backward-compatibility views. The next SELECT statement uses the sys.objects catalog view, which returns a list of user-defined objects and base system tables present in the database. The sys.all_objects catalog view returns system objects in addition to user-defined objects. These system objects are persisted in the Resource database but are accessible in every database. Finally, the sys.system_objects catalog view returns only the system objects.

Catalog views are designed using the inheritance model, where the base or parent view (such as sys.objects) contains a subset of columns and a superset of rows, while the derived or child views (such as sys.tables and sys.views) return a superset of columns and a subset of rows. The inheritance hierarchy is illustrated in Figure 2.2.

Figure 2.2. The catalog views design is based on the inheritance model.


Try out the following SQL statements and notice that sys.objects returns a subset of columns and a superset of rows, whereas sys.tables, sys.views, and sys.procedures return a superset of columns but a subset of rows:

USE AdventureWorks;
SELECT * FROM sys.objects;
SELECT * FROM sys.tables;
SELECT * FROM sys.procedures;
SELECT * FROM sys.views;

Catalog Security

SQL Server 2005 implements a metadata catalog security system that is similar in spirit to what the ANSI SQL-99 specification calls for. The basic idea is pretty simple: You can see metadata for objects you have access to; you get back an empty set for objects you don’t have access to. A new security layer has been added over the persisted system metadata. All the metadata access mechanisms go through this security layer. Therefore, if a user does not have permission on an object, metadata information for that object is not returned when the catalog is accessed.

For instance, let’s say a user X is created in the AdventureWorks sample database and no permissions are assigned to this user. If this user X tries to access the sys.objects catalog view, the sys.sysobjects backward compatibility view, or the INFORMATION_SCHEMA.TABLES view, an empty result set is returned. Let’s now assume that user X is granted SELECT or VIEW DEFINITION permission on the Sales.Store table. Now, when the same metadata query is run by user X, the results containing metadata for Sales.Store are returned. In other words, metadata information is filtered and made visible only on a need-to-know basis; that is, catalog security returns rows for objects on which the user has some permission. This is a huge change from previous releases, where metadata information was available to all users, regardless of whether they had permission on the object.
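
The scenario just described can be reproduced with a throwaway user, as in the following sketch; the user name is hypothetical, and EXECUTE AS is used to impersonate it:

USE AdventureWorks;
CREATE USER MetadataTestUser WITHOUT LOGIN;

-- With no permissions, the user sees no user-object metadata.
EXECUTE AS USER = 'MetadataTestUser';
SELECT name FROM sys.objects;          -- returns an empty (or nearly empty) result set
REVERT;

-- Grant a permission on one table and check again.
GRANT VIEW DEFINITION ON Sales.Store TO MetadataTestUser;
EXECUTE AS USER = 'MetadataTestUser';
SELECT name FROM sys.objects;          -- now includes Sales.Store
REVERT;

DROP USER MetadataTestUser;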

Note that sa has access to all the systemwide metadata, and dbo has access to all the databasewide metadata. Also, some metadata is accessible by all database users. Typically, this is for things such as filegroups that have no grantable permissions. Therefore, anyone can query the sys.filegroups catalog view and obtain the database filegroups information. Figure 2.3 shows the new metadata architecture.

Figure 2.3. A new security layer on top of persisted metadata ensures that the metadata access mechanism, such as catalog views, backward-compatibility views, and so on, get the filtered rows based on permissions assigned to the user.


More details on metadata security can be found in Chapter 7, “SQL Server 2005 Security.”

The allow updates Option

Because direct updates to system tables are not supported, the system configuration option allow updates is meaningless in SQL Server 2005.

User–Schema Separation

User–schema separation is another important change to the security and management architecture of SQL Server. As per the ANSI SQL-92 standard, the purpose of a schema is to act like a namespace to group related database entities and to avoid name collisions. For instance, if you want to group related objects, such as all the sales objects or all the human resources objects, or if you want to have two objects with the same name, the SQL-92 standard’s answer is to use database schemas.

In earlier versions of SQL Server, a database user was used to form a namespace and to avoid name collisions. The name of the owner of the object was used, along with the object name, to uniquely identify an object (for example, userX.table1, userY.table1). This approach of treating users as schemas caused a lot of database management headaches. For instance, let’s assume that you wanted to drop a user. Before you did that, either you had to drop all the objects that the user owned or assign the object ownership to a different user. Dropping objects was often not a practical solution, and changing the owner often required changes in the application (for example, the application had to be updated to access userY.table1 instead of userX.table1). Applications often used the dbo user as the object owner, but that approach had its own problems (as discussed in Chapter 7).

SQL Server 2005 breaks the tight coupling between users and schemas and no longer treats users as schemas. It introduces support for database schemas as defined by the SQL-92 standard. Object names in SQL Server 2005 therefore no longer follow the <owner_name>.<object_name> pattern; instead, they follow the <schema_name>.<object_name> pattern, where <schema_name> refers to a schema created with the CREATE SCHEMA DDL statement and owned by a database user. Database users who have schema creation permission can create schemas, and users who have object creation permission can create objects in a schema. Now, if you have to drop a user, all that is required is to assign the schemas that the user owns to other users. No application change is required because the application continues to access objects as <schema_name>.<object_name>, no matter who owns the schema. Also, instead of changing the ownership of hundreds of database objects, you now change the ownership of only a few database schemas. Figure 2.4 shows the old and new approaches to naming objects.

Figure 2.4. SQL Server 2005 introduces true support for database schemas, simplifying the security and management architecture.

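The following sketch illustrates the pattern; the schema, table, and user names (Reporting, MonthlySales, userX, userY) are hypothetical, and the script assumes a test database. Because applications refer to Reporting.MonthlySales rather than to an owner-qualified name, transferring the schema requires no application change.

-- Hypothetical users; in practice these would map to real logins
CREATE USER userX WITHOUT LOGIN;
CREATE USER userY WITHOUT LOGIN;
GO
-- CREATE SCHEMA must be the first statement in its batch
CREATE SCHEMA Reporting AUTHORIZATION userX;
GO
CREATE TABLE Reporting.MonthlySales (SalesMonth datetime, Amount money);
GO
-- To drop userX, transfer the schema instead of re-owning every object
ALTER AUTHORIZATION ON SCHEMA::Reporting TO userY;
GO
DROP USER userX;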

Every database user has a default schema that is used for name resolution. If you do not specify one, SQL Server 2005 assigns dbo as the default schema for the new user. When an object name is not qualified with a schema name, SQL Server first looks for the object in the user's default schema. If it does not find the object there, it looks in the dbo schema.
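As a minimal sketch (userZ and the Reporting schema are hypothetical), you can assign the default schema when you create the user, and unqualified names then resolve against it first:

-- Assign a default schema other than dbo when creating the user
CREATE USER userZ WITHOUT LOGIN WITH DEFAULT_SCHEMA = Reporting;
GO
-- For userZ, an unqualified reference such as
--    SELECT * FROM MonthlySales;
-- resolves to Reporting.MonthlySales first and falls back to dbo.MonthlySales
-- only if no such object exists in the Reporting schema.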

User–schema separation is discussed in detail in Chapter 7.

Storage Engine and I/O Changes

The storage engine is an integral component of the SQL Server architecture, and it is responsible for managing database files, building and reading physical pages, controlling concurrency, handling logging and recovery, carrying out physical I/O, handling table and index traversal, and so on. SQL Server 2005 introduces several enhancements to the storage engine and physical I/O. Here is an overview of these enhancements:

Instant file initialization—On Windows XP and Windows Server 2003, SQL Server 2005 can instantly initialize data files, without filling the reclaimed disk space with zeros. This can be a huge time saver when you’re creating very large databases (for instance, while creating databases for a restore operation). This feature is discussed in detail in Chapter 8, “Reliability and High Availability in the Database Engine.”

Data file autogrow—In SQL Server 2000, the default FILEGROWTH setting for data files was 10%. SQL Server 2005 changes this to 1MB. The log file FILEGROWTH setting of 10% remains unchanged.
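If the 1MB default is too small for a busy database, you can specify the growth increment explicitly when you create the database; the database name, file paths, and sizes below are placeholders:

CREATE DATABASE SalesArchive
ON PRIMARY
   (NAME = SalesArchive_data,
    FILENAME = 'C:\Data\SalesArchive.mdf',
    SIZE = 500MB,
    FILEGROWTH = 100MB)      -- explicit growth instead of the 1MB default
LOG ON
   (NAME = SalesArchive_log,
    FILENAME = 'C:\Data\SalesArchive.ldf',
    SIZE = 100MB,
    FILEGROWTH = 10%);       -- the 10% log file default is unchanged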

Efficient read-ahead—The read-ahead algorithm has been updated to dynamically detect whether read-ahead is required, despite bad estimates from the optimizer.

Multi-path I/O (MPIO)—SQL Server 2005 leverages the Windows Server 2003 MPIO capability, which allows multiple host bus adapters (HBAs) to be used for concurrent data access. MPIO enables applications to distribute requests across multiple paths to storage, avoiding a single point of failure, and it dynamically balances the I/O workload.

Allocation units (AUs)—The SQL Server 2000 storage engine used Index Allocation Map (IAM) pages to track the extents (that is, groups of eight pages) used by a heap (that is, a table with no clustered index) or an index. A heap or an index had at least one IAM page for each file on which it had allocated extents. The storage engine could read the IAM chain to build a sorted list of the disk addresses that needed to be read. There was a single IAM chain per heap or index in SQL Server 2000.

In SQL Server 2005, each heap or index can have multiple IAM chains. This change has been made to support large object (LOB) data types, such as varchar(max), and also to support rows whose variable-length data exceeds 8,060 bytes. SQL Server 2005 relaxes the restriction of a maximum of 8,060 bytes per row for rows containing varchar, nvarchar, varbinary, sql_variant, or CLR user-defined type columns. Therefore, the following batch would fail in SQL Server 2000 (the inserted row exceeds the 8,060-byte limit), but it succeeds in SQL Server 2005:

USE tempdb;
GO
-- The maximum row size (3 x 4,000 bytes plus overhead) exceeds 8,060 bytes
CREATE TABLE dbo.tblTest(
   c1 varchar(4000),
   c2 varchar(4000),
   c3 varchar(4000));
GO
-- SQL Server 2005 moves the overflowing column data to ROW_OVERFLOW_DATA pages
INSERT INTO dbo.tblTest VALUES
   (REPLICATE('*', 4000), REPLICATE('*', 4000), REPLICATE('*', 4000));
GO

In other words, a SQL Server 2005 table row can contain more than 8,060 bytes. To support this and to support LOB types, SQL Server 2005 introduces two additional types of IAM chains. SQL Server also introduces a new term, allocation unit, which refers to the set of pages owned by an IAM chain. These are the three types of allocation units:

• An allocation unit of type IN_ROW_DATA is used to manage data or index rows that contain all data except LOB data.

• An allocation unit of type LOB_DATA is used to manage LOB data of types text, ntext, image, xml, varchar(max), nvarchar(max), varbinary(max), or CLR user-defined data types.

• An allocation unit of type ROW_OVERFLOW_DATA is used to manage variable-length data stored in varchar, nvarchar, varbinary, or sql_variant columns that exceed the 8,060-byte row size limit.

You can use the sys.allocation_units and sys.system_internals_allocation_units catalog views to obtain information about each allocation unit in a database. Note that sys.system_internals_allocation_units is for internal use only and may change in the future.
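For example, assuming the dbo.tblTest table created earlier still exists in tempdb, a query along the following lines shows which allocation unit types the table uses (IN_ROW_DATA plus ROW_OVERFLOW_DATA for the oversized row):

USE tempdb;
GO
SELECT au.type_desc, au.total_pages, au.used_pages
FROM sys.allocation_units AS au
JOIN sys.partitions AS p
   ON (au.type IN (1, 3) AND au.container_id = p.hobt_id)    -- IN_ROW_DATA, ROW_OVERFLOW_DATA
   OR (au.type = 2 AND au.container_id = p.partition_id)     -- LOB_DATA
WHERE p.object_id = OBJECT_ID('dbo.tblTest');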

Table and index partitioning—You can significantly enhance the manageability and scalability of large tables and indexes by horizontally partitioning the data or index into multiple partitions. This feature supersedes the partitioned views functionality available in earlier releases.

Partitioning enables index and table data to be spread across multiple filegroups in partitions. A partition function defines how the rows of a table or an index are mapped to a set of partitions based on the values of certain columns, referred to as partitioning columns. A partition scheme maps each partition specified by the partition function to a filegroup. This lets you develop archiving strategies that enable tables to be scaled across filegroups and, therefore, physical devices. In some situations, partitioning may also improve performance. Partitioned tables and indexes are supported in the Enterprise and Developer Editions of SQL Server 2005 only.

SQL Server 2005 introduces new DDL statements, such as CREATE PARTITION FUNCTION and CREATE PARTITION SCHEME, and enhances existing DDL statements for creating tables and indexes, to allow you to set up table/index partitioning. A complete description of partitioning is beyond the scope of this chapter. Refer to SQL Server 2005 Books Online and whitepapers on the Microsoft website for more details.
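The following is a minimal sketch of the DDL involved, assuming a test database that already contains four filegroups named fg1 through fg4; the function, scheme, and table names are hypothetical:

-- Partition function: maps OrderDate values to four date ranges
CREATE PARTITION FUNCTION pfOrderDate (datetime)
AS RANGE RIGHT FOR VALUES ('20030101', '20040101', '20050101');
GO
-- Partition scheme: maps each of the four partitions to a filegroup
CREATE PARTITION SCHEME psOrderDate
AS PARTITION pfOrderDate TO (fg1, fg2, fg3, fg4);
GO
-- The table is created on the partition scheme instead of on a single filegroup
CREATE TABLE dbo.OrdersArchive
   (OrderID int NOT NULL,
    OrderDate datetime NOT NULL)
ON psOrderDate (OrderDate);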

Summary

SQL Server 2005 is a significant step up from its predecessors when it comes to database engine architecture enhancements. This chapter covers SQL Server 2005’s architectural enhancements and innovations. The chapter begins with an explanation of a core SQL Server engine component called SQLOS, followed by information on NUMA support in SQL Server 2005. The chapter then discusses query processing and optimization enhancements, followed by the new scalability and performance functionality called row versioning. Next, changes to the metadata and security architectures are discussed. The final section in this chapter presents an overview of storage engine enhancements.

It is important that you understand the features and enhancements that necessitated the architectural changes discussed in this chapter. Chapter 3, “SQL Server 2005 Setup and Deployment,” covers information related to installing and upgrading SQL Server. The rest of the book explores new features introduced in SQL Server 2005.
