Chapter 4. A Tour of New Features in SQL Server 2005

IN THIS CHAPTER

Setup Enhancements

Database Administration and Management

Database Engine Enhancements

T-SQL Enhancements

SQL Server 2005 Security

Performance Monitoring and Tuning Enhancements

High-Availability Features

Business Intelligence Redefined

Notification Services

Full-Text Search Enhancements

Documentation Improvements

For over 11 years, Microsoft SQL Server has been delivering an enterprise-ready, scalable, reliable, and cost-effective data management platform. SQL Server 2005 takes this premise to the next level by introducing several innovative features related to enterprise data management, developer productivity, and business intelligence. This chapter provides an overview of all these new features, which are then discussed in greater detail throughout the rest of the book.

Setup Enhancements

As discussed in Chapter 3, “SQL Server 2005 Setup and Deployment,” SQL Server 2005 setup is based on Windows Installer 3.0 (MSI) technology. Here are some of the benefits offered by the MSI-based setups:

Rollback support—Windows Installer setups are transaction based; a rollback occurs if an installation fails for any reason.

Logging and troubleshooting—Windows Installer technology gives unprecedented control over the level of detail for logging and the name and location for the log file. A verbose setup log can be generated to troubleshoot setup issues.

Self-healing—This MSI feature ensures that all the files installed by the setup are available for the application to run. If any of the installed files are removed by accident, Windows Installer reinstalls the files.

Reliable installation and uninstallation—Windows Installer provides consistent and reliable version rules, which prevent newer files from being overwritten by old files. MSI features systemwide management of shared resources (files and registry keys) that Windows Installer–based applications use. This ensures that when you uninstall a product, Windows Installer does not remove any component that has other applications in its client list.

Ease of customizing and command-line control—An administrator can create MSI Transforms (MSTs) to customize the Windows Installer setup. Also, a large number of command-line switches are available with the MSI setup execution engine (msiexec.exe), and they can be used to customize the setup behavior.

By using Windows Installer technology for the setup, SQL Server 2005 leverages all these features.

SQL Server 2005 setup reduces the likelihood of setup failures by introducing a startup process known as the System Configuration Checker (SCC). This process ensures that all the prerequisites for the installation are met before the actual installation kicks off.

The SQL Server 2000 CD includes separate setups for the relational engine and Analysis Services. In addition, the Notification Services and Reporting Services add-ons for SQL Server 2000 have their own setup interface. SQL Server 2005 changes that by introducing a unified, integrated setup environment that presents all the components in a single feature tree. SQL Server 2005 setup offers granular control over individual components that you can install or uninstall.

SQL Server 2005 setup is enhanced to support up to 50 instances each of SQL Server, Analysis Services, Notification Services, and Reporting Services.

In case of a setup error, the Internet Help feature introduced in SQL Server 2005 allows you to click the Help button on the setup error dialog and browse to the documentation page on the Microsoft website for more details on the error and to get some troubleshooting guidelines.

One of the common causes of setup failure and reboots with SQL Server 2000 has to do with MDAC version incompatibilities. SQL Server 2005 avoids these problems by not installing MDAC on the server. The new data access application programming interface (API), known as SQL Native Client (SNAC), is the key to breaking the dependency on a particular version of MDAC. This is discussed in more detail in Chapter 3.

SQL Server 2005 setup supports scripted cluster installation. It also allows you to selectively install the features in a clustered environment, rather than a canned feature set, as is the case with SQL Server 2000 clustered setup.

Database Administration and Management

SQL Server is well known for its graphical management tools that allow administrators, developers, and business intelligence professionals to easily manage data and to monitor and tune servers. These graphical tools and wizards make database professionals more productive, and hence they are one of the key elements in reducing the total cost of ownership (TCO).

However, there are a few issues with the tools in releases prior to SQL Server 2005:

• Database administrators and developers have to use multiple tools to get a job done. For instance, they need to use Query Analyzer to run queries, Enterprise Manager to manage the server, Service Manager to manage services, separate client and server network utilities, and so on.

• Enterprise Manager suffers performance issues with servers that have a large number of databases and database objects.

• All the dialog boxes in Enterprise Manager are modal. Hence if you open a dialog, such as the Server Properties dialog, you cannot go back to Enterprise Manager until you close that dialog.

• The Profiler tool only allows tracing activity on the relational server; it does not have any knowledge of Analysis Services.

• The tools do not offer project/solution management or integration with a source code control system.

• SQL Server 2000 Query Analyzer allows editing and running Transact-SQL (T-SQL) queries only. There is no built-in editor (other than the MDX sample application) to run Analysis Services Multidimensional Expressions (MDX) queries.

Some of the design goals of SQL Server 2005 are to address these problems; to give more power to DBAs, developers, and information workers; and to simplify the management of SQL Server by providing integrated tools.

Introducing SQL Server Management Studio

SQL Server Management Studio is a new integrated tool that combines the functionality of three tools: Query Analyzer, Enterprise Manager, and Analysis Manager. In addition, it allows you to manage all other components, such as Reporting Services, SQL Server Integration Services (SSIS; formerly known as DTS), Notification Services, and SQL Server Mobile.

SQL Server Management Studio defers loading all databases and database objects in the tree, and it expands the nodes asynchronously. This is very useful if you are working on a server with a large number of databases and database objects. In addition, Management Studio allows filtering of the database objects in the tree by name, schema, and creation date. All the dialog boxes in Management Studio are non-modal, resizable, schedulable, and scriptable.

In addition to T-SQL queries, the Query Editor in Management Studio allows you to work with MDX, Data Mining Extensions (DMX), and XML for Analysis (XMLA) queries.

Management Studio supports the notion of solutions and projects and provides full integration with source code control systems such as Microsoft Visual SourceSafe.

Figure 4.1 shows various windows inside the Management Studio, such as Registered Servers, Object Explorer, Query Editor, Solution Explorer, Execution Plan, and Properties.

Figure 4.1. SQL Server 2005 Management Studio is a new integrated management environment.

image

In a nutshell, Management Studio is an integrated environment for accessing, configuring, managing, and administering all components of SQL Server. Management Studio and other tools are discussed in more detail in Chapter 5, “SQL Server 2005 Tools and Utilities.”

SQL Server Configuration Manager

SQL Server Configuration Manager is an MMC snap-in (like Enterprise Manager in SQL Server 2000) that is intended to replace Server Network Utility, Client Network Utility, and Service Manager. SQL Server Configuration Manager was referred to as SQL Computer Manager in earlier SQL Server 2005 beta builds.

As shown in Figure 4.2, SQL Server Configuration Manager can be used to manage SQL Server 2005 services and service accounts, including Analysis Services, Agent, Reporting Services, Notification Services, SQL Server Browser, and full-text search. It can also be used to manage client and server network connectivity options.

Figure 4.2. The SQL Server Configuration Manager combines Client Network Utility, Server Network Utility, and Service Manager into a single integrated application.

image

SQL Server Configuration Manager ships with all SQL Server 2005 editions, including Express Edition.

SQLCMD and Dedicated Administrator Connection

SQLCMD is a command-prompt utility introduced in SQL Server 2005 that supersedes isql.exe and osql.exe. Like ISQL and OSQL, SQLCMD.exe can be used to execute T-SQL statements and scripts. Unlike ISQL and OSQL, SQLCMD has extensive support for scripting and variables. SQLCMD.exe uses OLE DB to connect and execute the T-SQL batches.

You can use the -A switch with SQLCMD.exe in situations where SQL Server is not responding or is otherwise unavailable. This switch establishes a dedicated administrator connection with SQL Server 2005 and can be very helpful while troubleshooting.
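
For example, the following command (run from a command prompt; the server name MyServer is hypothetical) opens a dedicated administrator connection by using Windows authentication and runs a quick diagnostic query:

sqlcmd -S MyServer -E -A -Q "SELECT session_id, status, command FROM sys.dm_exec_requests"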

DDL Triggers and Event Notifications

With SQL Server 2000, the only way to audit the activity at the server level or at the database level for DDL events (such as CREATE/DROP/ALTER TABLE) was to use SQL Profiler. However, no DBA would prefer running SQL Profiler all the time on the production server as an auditing solution.

SQL Server 2005 introduces two solutions to this problem: DDL triggers and event notifications.

Almost all RDBMSs, including SQL Server, support DML triggers. You can write a T-SQL module called a trigger that will automatically execute if an INSERT, DELETE, or UPDATE occurs on a table on which the trigger is defined. This trigger executes synchronously in the scope of the same transaction as the action that caused the trigger to fire, and if you decide to, you can roll back the entire transaction.

DDL triggers are new in SQL Server 2005. You can define a DDL trigger at the database level or at the server level. A database-level DDL trigger allows you to capture events such as CREATE/ALTER/DROP TABLE, VIEW, USER, ROLE, and other DDL statements. The server-level DDL trigger allows you to respond to events such as CREATE/ALTER/DROP LOGIN, CERTIFICATE, and other server-level DDL statements.

Much like DML triggers, DDL triggers are also created by using the CREATE TRIGGER T-SQL statement, and they execute in the same transaction context as the action that fired them. You can roll back an entire transaction from the trigger, and you can nest DDL triggers. You can use the EVENTDATA() function inside a DDL trigger to access information about the event that fired the DDL trigger. The inserted and deleted tables are not created or available inside DDL triggers.
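
Here is a minimal sketch of a database-level DDL trigger (the trigger name and message text are arbitrary) that uses EVENTDATA() to report which table a DROP TABLE statement removed:

CREATE TRIGGER trgAuditDropTable
ON DATABASE
FOR DROP_TABLE
AS
BEGIN
   DECLARE @eventData xml;
   SET @eventData = EVENTDATA();
   PRINT 'Table dropped: ' +
      @eventData.value('(/EVENT_INSTANCE/ObjectName)[1]', 'nvarchar(256)');
   --ROLLBACK TRANSACTION; --uncomment to prevent tables from being dropped
END;
GO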

However, if you would like to respond to DDL events asynchronously or process them on a different or remote server, SQL Server 2005 introduces an alternative to DDL triggers: event notifications. In addition to DDL events, you can also use the event notification mechanism to respond to various profiler trace events. For instance, an event notification can be set up to notify a DBA whenever a LOCK_DEADLOCK trace event is raised, indicating a deadlock scenario.

Unlike DDL triggers, event notifications can be processed asynchronously and, if required, on a different server. The CREATE EVENT NOTIFICATION T-SQL statement is used to set up an event notification to respond to DDL or trace events or event groups. When an event notification is raised, a message is posted to a Service Broker queue. Service Broker is discussed later in this chapter and in detail in Chapter 15, “SQL Server 2005 Service Broker.” Note that because an event notification is not executed in the same transaction scope as the action that raised it, the transaction cannot be rolled back. The details about the event are delivered as XML (in the same format that the EVENTDATA() function returns) in the body of the Service Broker message.
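
The following sketch (the queue, service, and notification names are all hypothetical, and it assumes Service Broker is enabled in the database) shows one way a Service Broker queue and service might be set up to receive table-related DDL event notifications in the current database:

USE AdventureWorks;
GO

CREATE QUEUE dbo.DDLEventQueue;
GO

CREATE SERVICE DDLEventService
   ON QUEUE dbo.DDLEventQueue
   ([http://schemas.microsoft.com/SQL/Notifications/PostEventNotification]);
GO

CREATE EVENT NOTIFICATION NotifyTableDDL
   ON DATABASE
   FOR DDL_TABLE_EVENTS
   TO SERVICE 'DDLEventService', 'current database';
GO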

Database Mail

The SQL Mail functionality in SQL Server 2000 can be used to send and receive emails from within T-SQL code. However, it has a few limitations:

• SQL Mail uses extended MAPI and requires the Microsoft Outlook 2000 client or later to be installed on the SQL Server machine. (Refer to Microsoft KB Articles 263556 and 281293 for more information about this.) This poses extra configuration headaches. But more than that, having an email client installed on a SQL Server box is considered a security risk.

• SQL Mail is not supported on 64-bit platforms.

• SQL Mail does not scale well.

• SQL Mail does not impose any restriction on attachment file size or attachment file extensions.

SQL Server 2005 fixes all these problems by introducing a new SMTP-based emailing solution called Database Mail. Database Mail was referred to as SQLiMail in earlier SQL Server 2005 beta builds.

Database Mail does not require MAPI or Outlook to be installed; rather, it uses SMTP to send emails. Database Mail is scalable because it uses Service Broker to send emails asynchronously. The MaxFileSize and ProhibitedExtensions parameters in the msdb..sysmail_configuration table govern the attachment file size and prohibited attachment file extensions. Management Studio provides a user interface to tune these settings. Database Mail is supported on both 32-bit and 64-bit platforms. However, note that Database Mail is not a 100% replacement for SQL Mail in this release because it does not support reading emails. Chapter 5 covers Database Mail in great detail.
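
As a quick illustration, once a Database Mail profile has been configured, mail can be sent from T-SQL by using the sp_send_dbmail system procedure (the profile name and addresses shown here are hypothetical):

EXEC msdb.dbo.sp_send_dbmail
   @profile_name = 'DBA Mail Profile',
   @recipients = 'dba@example.com',
   @subject = 'Nightly backup status',
   @body = 'All backups completed successfully.';
GO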

Management APIs: SMO, RMO, AMO, and WMI

SQL Server 2000 included two options for programmatically administering SQL Server: using COM-based SQL Distributed Management Objects (SQL-DMO) and using the Windows Management Instrumentation (WMI) API.

SQL-DMO was first introduced in SQL Server 6.5, where it was called SQLOLE; it was then renamed SQL-DMO in SQL Server 7. Enterprise Manager in SQL Server 7 and SQL Server 2000 internally uses SQL-DMO, and hence you could use SQL-DMO to automate all the functionality provided by Enterprise Manager.

SQL-DMO is still supported, but it has been deprecated in SQL Server 2005. It has not been updated to support the new features in this release. SQL-DMO is being superseded by a new .NET-based class library called SQL Server Management Objects (SMO). Administrators can use SMO to programmatically administer SQL Server 7, 2000, and 2005. SQL Server Management Studio internally uses SMO, and hence you can use SMO to automate all the functionality provided by Management Studio to manage a relational server.

In addition to providing maximum coverage of SQL Server 2005 features, the SMO API contains several other improvements over SQL-DMO. These changes include a cached object model, delayed instantiation of objects for improved scalability and performance, enhanced scripting capabilities, and improved ease of use. Further details on the SMO can be found in Chapter 5.

The replication part of the management API is now available in a separate .NET object library called Replication Management Objects (RMO). RMO allows you to programmatically configure and manage the replication architecture.

As mentioned earlier, SQL Server Management Studio internally uses the SMO API to provide server administration and management functionality. To manage Analysis Services, it uses another .NET-based object library called Analysis Management Objects (AMO). AMO is also used by the Business Intelligence Studio tool. AMO is the successor of the COM-based API called Decision Support Objects (DSO), which is available in previous releases. DSO is still available in SQL Server 2005, but AMO is the recommended API for programmatically managing Analysis Services.

WMI Overview

WMI is a component of the Microsoft Windows operating system. WMI provides an industry-standard programming interface for managing an enterprise environment. WMI is based on the Web-Based Enterprise Management (WBEM) initiative. WBEM is a set of standard technologies designed to allow administrators to use a consistent method of managing different systems in an enterprise. The standards that comprise WBEM are developed by the Distributed Management Task Force (DMTF; www.dmtf.org).

An entity managed through WMI is called a managed object. A managed object can be a hardware entity, such as memory or a disk drive, or a software entity, such as a user account or SQL Server. WMI provides a consistent way to monitor and control managed objects, regardless of the type of object. For instance, you can use WMI to monitor the amount of free space on a disk drive, to change the permissions for files on the disk drive, to manage virtual directories and permissions on an IIS Web server, and so on. You can access the WMI API from various programming languages, such as VBScript, C++, Visual Basic, and .NET Framework languages, and you can create applications and scripts to automate administrative tasks.

WMI, like SQL-DMO, is a COM API. The WMI API communicates with the WMI Windows service, which in turn interacts with various WMI providers, such as the SQL Server Administration provider, the Active Directory provider, the NT Event Log provider, and so on. A WMI provider is a COM object that monitors a managed object for WMI. For instance, you can write a VBScript that uses the NT Event Log WMI provider to monitor Application event log entries that come from SQL Server. As soon as SQL Server logs anything in the Application log, your VBScript can then send an email or a pager notification.

The WMI API also supports a querying language, WMI Query Language (WQL). By using WQL queries, you can request only the data of interest from the provider rather than incur the overhead of enumerating over all the objects.

WMI and SQL Server 2000

SQL Server 2000 was the first release to introduce the WMI API support. The WMI SQL Administration provider can be used to automate various administration tasks and monitor SQL Server events.

SQL Server WMI support is not installed by default. The SQL Server 2000 CD contains the setup files you need in order to use the WMI to administer SQL Server. After you have run the WMI setup, you can then run WMI scripts and applications to automate and monitor SQL Server.

WMI and SQL Server 2005

WMI support is enabled by default in SQL Server 2005, and it is further extended to support SQL Server 2005 features such as Reporting Services. SQL Server Configuration Manager internally uses the WMI API. Hence you can use WMI to automate all the functionality provided by SQL Server Configuration Manager. WMI may seem like a complex API to use, but the new SMO .NET object library provides an object-oriented wrapper around WMI that is simpler to use. Finally, the WMI enhancements introduced in SQL Server 2005 allow you to generate SQL Server Agent alerts in response to WMI events. For instance, you can write a T-SQL script to monitor WMI disk events, and if disk space on a particular drive drops below a specified size, a SQL Server Agent alert can be raised, which in turn notifies the administrator and may also start shrinking the database or log files.
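
As a small sketch of the alerting piece, the following call to sp_add_alert (assuming a default instance and a running SQL Server Agent service) creates an alert that fires whenever the WMI Provider for Server Events raises a DEADLOCK_GRAPH event; a similar pattern applies to other WMI namespaces and queries:

EXEC msdb.dbo.sp_add_alert
   @name = N'WMI deadlock alert',
   @wmi_namespace = N'\\.\root\Microsoft\SqlServer\ServerEvents\MSSQLSERVER',
   @wmi_query = N'SELECT * FROM DEADLOCK_GRAPH';
GO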

WMI support in SQL Server 2005 is discussed in more detail, with examples, in Chapter 5.

Database Engine Enhancements

The two primary components of the SQL Server database engine are the relational engine and the storage engine. The relational engine parses and processes a query, optimizes the execution plan, talks with the storage engine to obtain the results, and formats the results to be sent to the client. The storage engine is responsible for physical I/O and concurrency control; it manages the database files and physical data pages. SQL Server 2005 contains significant changes to these two engine components to support new functionality and to improve on existing functionality.

The New xml Data Type

SQL Server 2000 first introduced support for XML by allowing you to turn a relational rowset into XML by using the FOR XML clause of the SELECT statement and to convert XML into a relational rowset by using the OPENXML function. The support for XML in SQL Server 2000 was further extended with the SQLXML web releases. However, if you needed to store XML data inside a database table, the only option with SQL Server 2000 was to use the char/nchar/varchar/nvarchar or text/image data types. In addition, there was no built-in support for working with XML text stored in the database. For instance, if you stored XML data in a varchar column, there was no way to execute XPath or XQuery queries to locate data inside the XML or to efficiently update the XML data.

SQL Server 2005 fills this gap by introducing a new data type called xml. In SQL Server 2005 you can have columns, variables, and parameters of xml data type. There are methods available to query and update XML data. You can store up to 2GB of XML data in a column. A typed XML column is one that has an XML Schema (XSD) associated with it; otherwise, it is called an untyped XML column.

Here is some sample T-SQL code that illustrates the xml data type:

USE TempDB;
GO

CREATE TABLE dbo.tblTest(id INT IDENTITY(1,1) PRIMARY KEY NOT NULL,
   logtext xml not null);
GO

INSERT INTO dbo.tblTest (logtext) VALUES ('<log result="1" />');
GO

DECLARE @varTest xml;
SET @varTest = '<log result="0" />';
INSERT INTO dbo.tblTest (logtext) VALUES (@varTest);
GO

SELECT * FROM dbo.tblTest;
GO

SELECT * FROM dbo.tblTest
WHERE logtext.value('(/log/@result)[1]', 'int') = 1;
GO

DROP TABLE dbo.tblTest;
GO

This T-SQL script first creates a table that contains a column of the xml data type. It then inserts some sample data, first by directly providing the XML text string and then by using an xml type variable. The final SELECT statement illustrates querying an xml type column by using the value() method. If you run this code in the Management Studio Query Editor and view the results in Grid mode, you can see that the xml type column is shown as a hyperlink, and if you click it, you see formatted XML data in a new window.

XML Indexes

To optimize access to XML data, you can create indexes on xml type columns. Without an index, when a query is run on an xml type column, all the XML data is shredded at runtime to evaluate the query, which can be a significant performance hit. Indexes on an xml type column are a little different from those on other columns. First, you have to create a primary XML index on the xml type column, which is essentially a shredded representation of the XML data; then, if you like, you can create secondary XML indexes to optimize XPath and XQuery queries on the xml type column.
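
For example, assuming the dbo.tblTest table from the earlier example still exists (that is, before its DROP TABLE statement runs), a primary XML index and a secondary PATH index could be created as follows (the index names are arbitrary):

CREATE PRIMARY XML INDEX PXML_tblTest_logtext
   ON dbo.tblTest(logtext);
GO

CREATE XML INDEX SXML_tblTest_logtext_path
   ON dbo.tblTest(logtext)
   USING XML INDEX PXML_tblTest_logtext FOR PATH;
GO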

In addition to using standard XML querying mechanisms using XPath and XQuery, you can also define full-text search on xml type columns and run full-text queries.

In addition to the XML standard, SQL Server 2005 supports various other W3C standards, such as XPath, XQuery, XML namespaces, and XML Schema (XSD).

The xml data type and XQuery are discussed in complete detail in Chapter 10, “XML and Web Services Support in SQL Server 2005.”

Large Value Data Types

With SQL Server 2000, if you stored more than 8000 bytes of text or binary data in a column, the only option was to use text or image data types. However, these two data types are not easy to work with because they require different programming techniques than the other text or binary types, such as varchar and varbinary, and they have some very serious limitations (for example, many string functions do not work on text/image columns, you cannot define a variable of type text/image, and so on). To fix this, SQL Server 2005 introduces a new max specifier. The max specifier can be used with varchar, nvarchar, and varbinary types, and it allows you to store up to 2GB of data into a column, variable, or parameter, without any different handling requirements.

Here is an example of T-SQL code that illustrates the use of the max specifier:

USE TempDB;
GO

CREATE TABLE dbo.tblTest(id INT IDENTITY(1,1) PRIMARY KEY NOT NULL,
   logtext varchar(max) not null);
GO

DECLARE @char varchar(max);
DECLARE @varTest varchar(max);
SET @char = '-';
SET @varTest = REPLICATE(@char, 100000);
INSERT INTO dbo.tblTest (logtext) VALUES (@varTest);
GO

SELECT * FROM dbo.tblTest;
SELECT LEFT(logtext, 10) FROM dbo.tblTest;
SELECT SUBSTRING(logtext, 99990, 1) FROM dbo.tblTest;
SELECT LEN(logtext) FROM dbo.tblTest;
GO

DROP TABLE dbo.tblTest;
GO

This T-SQL batch first creates a table containing a column of type varchar(max). It then inserts 100,000 characters into this column by using a varchar(max) variable and uses some string functions to illustrate that varchar(max) columns are no different to work with than other character columns. In addition, SQL Server 2005 also relaxes the 8,060-bytes-per-row restriction. The following script will not work in SQL Server 2000, but it runs successfully in SQL Server 2005:

USE tempdb;
CREATE TABLE dbo.tblTest(
   c1 varchar(4000),
   c2 varchar(4000),
   c3 varchar(4000));
GO
INSERT INTO dbo.tblTest VALUES
   (REPLICATE('*', 4000), REPLICATE('*', 4000), REPLICATE('*', 4000));
GO

.NET Framework Integration

One of the biggest features introduced in SQL Server 2005 is the .NET Framework integration, better known as SQLCLR. The SQL Server 2005 database engine hosts the .NET Framework CLR, and hence it allows .NET code to run from within the T-SQL script. SQLCLR integration enables developers to extend the T-SQL programming language and type system. The SQLCLR integration allows developers to write stored procedures, triggers, and user-defined functions, and to create new types and aggregates by using .NET.

.NET Framework Overview

The Microsoft .NET Framework, introduced in July 2000, is a development platform that allows the building of Windows, web, and mobile applications. To rapidly build and deploy .NET applications, developers can use Visual Studio .NET tools and a programming language of their choice, such as Visual Basic .NET, C#, C++, or J#, to make use of thousands of classes provided by the .NET Framework Base Class Library.

The .NET programming model is simplified by an execution engine, known as the common language runtime (CLR), which handles a lot of tasks that developers had to do previously. These tasks include allocating and managing memory, thread management, lifetime management or garbage collection, security, and enforcement of type safety.

The code written in .NET languages and executed by the CLR is called managed code. Developers can still use legacy COM objects or call a Win32 API from within .NET code, but because the COM objects and Win32 functions are not controlled by the CLR, they are referred to as unmanaged code. The .NET Framework code is compiled into intermediate language (IL) and saved into DLLs or EXEs known as assemblies.

Introducing SQLCLR

In SQL Server 2000, the only way to extend T-SQL was to write extended stored procedures, which required C++ programming knowledge and understanding of the Open Data Services (ODS) API. In addition, a poorly written extended stored procedure could potentially cause a number of unpredictable problems with SQL Server.

The SQLCLR integration in SQL Server 2005 allows developers to write stored procedures, triggers, and functions and to extend the type system and aggregates by using any of the .NET languages and the familiar ADO.NET programming model. The .NET security framework, the CLR, and the SQLCLR hosting layer security make sure that the managed code running inside SQL Server does not compromise the integrity of the SQL Server process.

SQLCLR easily outperforms T-SQL in scenarios such as doing a lot of procedural coding (looping, branching, and so on), string manipulation, or computational tasks in T-SQL code. However, for data access, using T-SQL is still the best approach. The SQLCLR assembly that is imported into the database contains compiled .NET code in IL format. When this code is executed, it is just-in-time (JIT) compiled into machine language. T-SQL, by contrast, is an interpreted language, and compiled code generally executes faster than interpreted code.

Developers can use Visual Studio .NET and any of the .NET Framework languages to build assemblies, which are compiled .NET code that can be imported into a SQL Server database by using the CREATE ASSEMBLY T-SQL statement. Depending on what the assembly implements, a T-SQL stored procedure, function, trigger, user-defined type, or aggregate can be mapped to .NET code in the assembly.

Let’s say a developer gives you a .NET assembly called Payroll.dll that contains a managed stored procedure that accepts employee ID and year as the parameters, does some complex calculation, and returns a rowset that contains complete tax details. Here is how you would import the assembly, map a T-SQL stored procedure to the .NET method, and then eventually call the stored procedure:

CREATE ASSEMBLY [CLR_Payroll]
   FROM 'F:\FromDev\Payroll.dll'
   WITH PERMISSION_SET = SAFE;
GO

CREATE PROCEDURE dbo.usp_GetTaxDetails
(
   @EmployeeID INT,
   @Year INT
)
AS EXTERNAL NAME [CLR_Payroll].[StoredProcedures].[GetTaxDetails]
GO

EXEC dbo.usp_GetTaxDetails 10, 2004;
GO

The important point to note in this sample script is the WITH PERMISSION_SET = SAFE keyword. You as a DBA can control what an imported assembly can or cannot do by using one of the three permission buckets that SQL Server 2005 provides: SAFE, EXTERNAL_ACCESS, and UNSAFE.

SQLCLR Permission Buckets: SAFE, EXTERNAL_ACCESS, and UNSAFE

The assemblies that are imported with the SAFE permission set can work with data and use some of the CLR utility classes, but they cannot access external resources (such as the file system or the network). The code must be verifiably type safe by the CLR, and things like COM interop, PInvoke (Win32 API access), and multithreading are not allowed. This is the default permission set.

The assemblies imported with EXTERNAL_ACCESS have the same restrictions as those imported with SAFE, but they are also allowed to access external resources, such as the file system, by using the .NET Framework class library.

The assemblies imported with UNSAFE can do virtually anything. They can even call unmanaged code such as COM objects and Win32 functions. It is recommended that UNSAFE permissions be granted only to highly trusted assemblies by experienced developers or administrators. Only members of the sysadmin fixed server role can create UNSAFE assemblies.

The SQLCLR is the subject of Chapter 11, “SQL Server 2005 and .NET Integration.” Refer to that chapter for more details and examples of using the .NET integration features.

The New Catalog Views

With SQL Server 2000, the two ways to access SQL catalog or metadata were by using system tables or SQL-99 INFORMATION_SCHEMA views. SQL Server 2000 even allowed updating of system tables, which was not recommended but possible. In general, it is not recommended to access system tables directly. In addition, in SQL Server 2000 there was no real security implemented on metadata access. This meant that even though a database user did not have permissions to see the data, the user could still see what objects existed in the database (sysobjects) or see the structure of objects (sp_help).

With SQL Server 2005, the system tables are deeply hidden inside, and direct access to them is not allowed. They are not visible at all. The recommended way to access SQL metadata is to use the new catalog views. For backward compatibility, SQL Server provides compatibility views that have the same name and return the same data as the system tables in previous releases. In other words, system tables from previous releases are now compatibility views.

To summarize, SQL Server 2005 completely hides the system tables. There are three ways to access SQL metadata: through catalog views, compatibility views, and SQL-99 INFORMATION_SCHEMA views. Of these three, only catalog views cover all the features introduced in SQL Server 2005. For example, there is no INFORMATION_SCHEMA or compatibility view available to explore Service Broker objects. In addition, using catalog views is the most efficient way to access system metadata. Because SQL Server 2005 restricts access to the base system tables, the allow updates sp_configure option in SQL Server 2005 has no effect.

SQL Server 2005 implements a security layer on top of the metadata views that provides permission-based metadata access. For example, if a user executes a SELECT statement on the sysobjects compatibility view, the sys.objects catalog view, or the INFORMATION_SCHEMA.Tables view, SQL Server returns only the objects on which the user has permission to select or modify data, unless the VIEW DEFINITION permission has been granted on the object.
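
For example, the following statements use a catalog view to list user tables and then grant metadata visibility on a single object to a hypothetical database user named TestUser:

USE AdventureWorks;
GO

SELECT name, schema_id, type_desc, create_date
FROM sys.objects
WHERE type = 'U'
ORDER BY name;
GO

GRANT VIEW DEFINITION ON OBJECT::Sales.Store TO TestUser;
GO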

The Resource Database

SQL Server 2005 also introduces the Resource database. It is an actual .mdf and .ldf file–based SQL Server 2005 database (mssqlsystemresource.mdf and mssqlsystemresource.ldf), but it is not visible in Management Studio or through metadata views. In previous SQL Server releases, the metadata (schema definitions, system stored procedure text, and so on) for system objects was stored in the master database, and user database–specific system objects were stored in the user database. This posed major issues during upgrade and patching of SQL Server metadata. To expedite and simplify the upgrade process, Microsoft decided to store the system objects metadata inside a hidden Resource database. With this approach, instead of running multiple scripts, the upgrade and rollback process is as simple as copying new or old Resource database files.

The Resource database contains metadata for all system objects, but the system objects logically appear in each database under the sys schema. The master database still holds the actual instance-level data; the user databases still hold the database-specific information, and so on. However, the system objects’ schema/metadata and system stored procedure text are stored in the Resource database.
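
Although the Resource database itself is hidden, you can check its version and the date it was last updated by using the SERVERPROPERTY function, as in this quick sketch:

SELECT SERVERPROPERTY('ResourceVersion') AS ResourceVersion,
       SERVERPROPERTY('ResourceLastUpdateDateTime') AS ResourceLastUpdated;
GO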

The new metadata architecture is discussed in Chapters 2, “SQL Server 2005 Database Engine Architectural Changes,” and 7, “SQL Server 2005 Security.”

Scalability, Performance, and Reliability Enhancements

Scalability refers to the ability of a system to maintain and possibly improve performance as the load increases in a predictable manner. SQL Server 2005 introduces several enhancements to the database engine and other subsystems to make sure that the platform scales and performs well even with terabytes of data and thousands of users. In the following sections, you will see some of these new features, and you’ll learn more details about them in Chapters 8, “Reliability and High Availability in the Database Engine,” and 9, “Performance Analysis Tuning.”

Row Versioning and the Snapshot Isolation Level

SQL Server 2000 supports all four isolation levels defined by the SQL-92 standard: READ UNCOMMITTED, READ COMMITTED, REPEATABLE READ, and SERIALIZABLE. The default isolation level in SQL Server 2000 and SQL Server 2005 is READ COMMITTED, which ensures that statements cannot read modified data that is not yet committed and hence prevents dirty reads.

Isolation Levels

READ UNCOMMITTED is the lowest isolation level. It allows statements to read data that has been modified by other transactions but not yet committed. SERIALIZABLE is the highest isolation level, and it completely isolates two transactions, avoiding all the concurrency issues such as dirty reads, nonrepeatable reads, and phantom reads. In between these two are the READ COMMITTED and REPEATABLE READ levels. Like the READ COMMITTED isolation level, the REPEATABLE READ isolation level avoids dirty reads by disallowing reading data that is updated but not committed. In addition, it makes sure that no transaction can modify data that has been read by the current transaction, and hence readers can block writers. The REPEATABLE READ isolation level avoids dirty reads and nonrepeatable reads, but it can still result in phantom reads.

Snapshot Isolation

SQL Server 2005 introduces a new isolation level called snapshot isolation that is like REPEATABLE READ without the issue of readers blocking the writers. In other words, while the snapshot isolation level prevents dirty reads and nonrepeatable reads, other transactions can continue to update the data that is read by the current transaction. How does SQL Server achieve this? The answer is row versioning.

Row Versioning in Detail

Row versioning is an alternative technique to locking to ensure the integrity of transactions and to maintain the consistency of data when multiple users are accessing the data at the same time. With row versioning, SQL Server uses tempdb to maintain a version of each row that is modified. Whenever a transaction modifies a row, SQL Server copies the original row into the tempdb database. This technique is used to enable REPEATABLE READ behavior without the issue of readers blocking writers, because readers can now read the original row from the tempdb database. By avoiding blocking, snapshot isolation enables applications to scale and perform better. Applications with more read activity than write activity benefit the most from the snapshot isolation level.

To see snapshot isolation in action, you can start two query window sessions in Management Studio and execute the following two T-SQL batches, which create the blocking scenario by using the REPEATABLE READ and READ COMMITTED isolation levels:

Connection 1:

USE [AdventureWorks];
SET TRANSACTION ISOLATION LEVEL REPEATABLE READ;
BEGIN TRANSACTION;
SELECT * FROM Sales.Store;

Connection 2:

USE [AdventureWorks];
SET TRANSACTION ISOLATION LEVEL READ COMMITTED;
BEGIN TRANSACTION;
UPDATE Sales.Store SET SalesPersonID = 280 WHERE SalesPersonID = 281;

Notice that the UPDATE statement is blocked because some other transaction in the REPEATABLE READ isolation level has already read the data.

Run the ROLLBACK TRANSACTION statement in the first connection and then in the second Query Editor window.

Let’s now see how using the snapshot isolation level instead of repeatable read fixes the blocking problem. You need to change the T-SQL in the connection 1 window to use the snapshot isolation level:

Connection 1:

ALTER DATABASE AdventureWorks SET ALLOW_SNAPSHOT_ISOLATION ON;
GO

USE [AdventureWorks];
GO

SET TRANSACTION ISOLATION LEVEL SNAPSHOT;
BEGIN TRANSACTION;
SELECT * FROM Sales.Store;

Next, you should run this script in connection 1 and run the same UPDATE statement inside a transaction in connection 2. Note that this time the UPDATE statement is not blocked. If you switch back to connection 1 and run just the SELECT statement again, you should notice that it still shows the original values, which are similar to the REPEATABLE READ behavior. You should roll back the transactions in both the connections and close the query windows.

You can use a T-SQL statement similar to the following in order to find out whether snapshot isolation is enabled on the database:

SELECT snapshot_isolation_state, snapshot_isolation_state_desc
FROM sys.databases WHERE name = 'AdventureWorks';

Note that there is a cost involved with using snapshot isolation and with row versioning in general. Because versions of rows are stored in tempdb, you have to make sure that sufficient space is available in the tempdb database, and you also need to consider the performance impact of the tempdb I/O caused by row versioning. Updates and deletes can be a little slower because row versions must be written to tempdb, and while reading data, SQL Server may have to traverse the version chain to determine which version of the row to read.

In addition to the snapshot isolation level, SQL Server 2005 also introduces a new flavor of the READ COMMITTED isolation level that makes use of row versioning to avoid blocking. This is achieved by turning on the READ_COMMITTED_SNAPSHOT option on the database, as discussed in detail in Chapter 8.
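
As a quick sketch, the option is enabled with an ALTER DATABASE statement (this statement needs exclusive access to the database, so it is best run when no other connections are active):

ALTER DATABASE AdventureWorks SET READ_COMMITTED_SNAPSHOT ON;
GO

SELECT name, is_read_committed_snapshot_on
FROM sys.databases WHERE name = 'AdventureWorks';
GO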

Service Broker

Service Broker is a new scalability technology introduced in SQL Server 2005 that allows you to build reliable, asynchronous, queued database applications. All you have to worry about is sending and receiving messages by using T-SQL, while the rest of the messaging framework, including routing and security, is built into the database engine.

Asynchronous Messaging

One of the important aspects of building scalable applications is performing parts of the operations asynchronously. For example, when an order is submitted, you want to process inventory, shipping, and accounting. If these three tasks are performed synchronously when an order record is inserted, the order submission process takes longer, and the application will not scale. However, if these three tasks are done asynchronously, the order process finishes very quickly, and the application can scale. When an order record is inserted, it just posts a message and returns; this message is later retrieved asynchronously and then the inventory, shipping, and accounting tables are updated appropriately.

This asynchronous messaging breaks the tight coupling between components and allows applications to perform better, scale out, distribute the load, and defer work for batch processing. However, the challenge with asynchronous messaging is that the infrastructure must guarantee that each message is delivered, that it is delivered once and only once, and that messages in a batch are delivered in the order in which they were sent.

On the application and middle tiers, these services are provided by specialized technologies such as Microsoft Message Queuing (MSMQ), IBM WebSphere MQ (formerly MQSeries), and so on. The SQL Server Service Broker brings these services into the database engine so that you can do asynchronous messaging at the database level.

For example, you can use Service Broker to implement asynchronous triggers. In other words, when a trigger is fired, it posts a message into a queue and returns immediately. This message is later received and processed by a Service Broker application, which can be a T-SQL or SQLCLR stored procedure or an external application.
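
The following abbreviated sketch (all the object names are hypothetical, error handling is omitted, and it assumes Service Broker is enabled in the database) shows the basic pattern of creating Service Broker objects and then sending and receiving a message within a single database:

USE AdventureWorks;
GO

CREATE MESSAGE TYPE [//AW/OrderSubmitted] VALIDATION = WELL_FORMED_XML;
CREATE CONTRACT [//AW/OrderContract] ([//AW/OrderSubmitted] SENT BY INITIATOR);
CREATE QUEUE dbo.OrderQueue;
CREATE SERVICE [//AW/OrderService] ON QUEUE dbo.OrderQueue ([//AW/OrderContract]);
GO

DECLARE @dialog UNIQUEIDENTIFIER;
BEGIN DIALOG CONVERSATION @dialog
   FROM SERVICE [//AW/OrderService]
   TO SERVICE '//AW/OrderService'
   ON CONTRACT [//AW/OrderContract]
   WITH ENCRYPTION = OFF;
SEND ON CONVERSATION @dialog
   MESSAGE TYPE [//AW/OrderSubmitted] (N'<Order Id="42" />');
GO

WAITFOR (RECEIVE TOP(1) CAST(message_body AS xml) AS OrderXml
         FROM dbo.OrderQueue), TIMEOUT 5000;
GO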

More details on Service Broker can be found in Chapter 15.

Nonclustered Indexes with Included Columns

Designing and using proper indexes is one of the keys to maximizing query performance. One of the recommendations for designing indexes is to keep the index key size small, which tends to make the index efficient. On the other hand, if a query can obtain all the data it needs from a covering index, without touching the clustered index or table data pages, the result is less I/O and better query throughput.

You might wonder whether you should include more columns in the index to maximize the number of queries it covers or whether you should keep the index key small and efficient. With SQL Server 2000, you had to choose between these two options, based on your performance benchmark results. SQL Server 2005 introduces the concept of including non-key columns in an index; this is designed to provide the best of both worlds: a smaller key size and, at the same time, a greater chance of the index covering a query.

You can use the INCLUDE keyword, along with a list of one or more columns, in the CREATE INDEX statement to indicate that you want one or more non-key columns to be stored at the leaf level of the index. This way, there is a better chance that a query will find everything it needs in the index pages, without any further lookups. However, you should carefully determine when and which columns should be included. Including non-key columns in an index increases disk space requirements, means that fewer index rows fit on each index page, and causes the data to be stored and updated twice (once in the base table and again at the leaf level of the index). More details on this can be found in Chapter 9.
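
For example, the following index on the AdventureWorks Sales.SalesOrderDetail table (the index name is arbitrary) keeps the key small while still covering queries that retrieve quantity and price by product:

CREATE NONCLUSTERED INDEX IX_SalesOrderDetail_Product_Incl
   ON Sales.SalesOrderDetail (ProductID)
   INCLUDE (OrderQty, UnitPrice);
GO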

Persisted Computed Columns

A computed column’s value is calculated from an expression by using other columns in the same table. With SQL Server 2000, computed columns are always virtual columns, not physically stored in the table. Their value is determined at runtime, based on the columns in the expression and the expression itself.

SQL Server 2005 introduces the ability to persist the computed column values. The PERSISTED keyword can be specified with the computed column definition in the CREATE TABLE or ALTER TABLE statement to indicate that a computed column’s value must be physically stored and updated if any column in the expression is updated. If a computed column is persisted in the data pages, it can speed retrieval for computing-intensive columns. In addition, persisted computed columns allow defining indexes in cases where virtual computed columns prohibit creation of indexes. For instance, if an expression is of type float or real, an index cannot be created on such virtual computed columns, but an index can be created on persisted computed columns. If computed columns are persisted, you can create NOT NULL, FOREIGN KEY, and CHECK constraints in addition to UNIQUE and PRIMARY KEY constraints.


Note

The persisted computed columns feature is not available in SQL Server 2005 Mobile Edition.


Here is a sample T-SQL script that illustrates persisted computed columns:

USE TempDB;
GO

CREATE TABLE dbo.tblTest(id INT IDENTITY(1,1) PRIMARY KEY NOT NULL,
   price float not null,
   qty float not null,
   cost as price * qty);
GO

INSERT INTO dbo.tblTest (price, qty) VALUES (100, 100);
INSERT INTO dbo.tblTest (price, qty) VALUES (200, 200);
GO

SELECT * FROM dbo.tblTest WHERE cost < 40000;
GO

--next stmt will fail; index can't be created on imprecise computed column
CREATE INDEX idTest ON dbo.tblTest(cost);
GO

DROP TABLE dbo.tblTest;
GO

CREATE TABLE dbo.tblTest(id INT IDENTITY(1,1) PRIMARY KEY NOT NULL,
   price float not null,
   qty float not null,
   cost as price * qty PERSISTED NOT NULL CHECK (cost > 0));
GO

INSERT INTO dbo.tblTest (price, qty) VALUES (100, 100);
INSERT INTO dbo.tblTest (price, qty) VALUES (200, 200);
GO
--next stmt will fail; CHECK constraint violation
INSERT INTO dbo.tblTest (price, qty) VALUES (200, -1);
GO

SELECT * FROM dbo.tblTest WHERE cost < 40000;
GO
--next stmt will succeed; index can be created on imprecise
--but persisted computed column
CREATE INDEX idTest ON dbo.tblTest(cost);
GO

SELECT * FROM dbo.tblTest WHERE cost < 40000;
GO

DROP TABLE dbo.tblTest;
GO

This script begins by creating a table containing a virtual computed column of float data type. SQL Server 2000 and 2005 do not allow creating an index on such a column because the computed column is not precise. But if the computed column is PERSISTED, SQL Server 2005 allows creating an index on that column, as well as defining other constraints such as a CHECK constraint, as illustrated in this script.

Partitioned Tables and Indexes

Partitioning of data and indexes is not a new concept. Especially for very large databases (VLDBs), partitioning of data and indexes has been one of the design strategies. Table and index partitioning is a performance, scalability, and manageability feature. Data in a partitioned table is horizontally split into physical units, which may be spread across more than one filegroup in a database.

What’s new in SQL Server 2005 is the simplicity and ease with which you can implement and manage partitioned tables and indexes. You define a partition function and a partition scheme for a table, and SQL Server automatically routes rows to the appropriate partitions, based on the partition key. SQL Server 2005 supports range partitioning, in which the partition boundaries are customized ranges of data values. This subject is discussed in more detail, with examples, in Chapter 8.
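
A minimal sketch of the new syntax (the object names and the date boundaries are arbitrary) looks like this:

CREATE PARTITION FUNCTION pfOrderDateRange (datetime)
   AS RANGE RIGHT FOR VALUES ('20030101', '20040101', '20050101');
GO

CREATE PARTITION SCHEME psOrderDateRange
   AS PARTITION pfOrderDateRange ALL TO ([PRIMARY]);
GO

CREATE TABLE dbo.tblOrders(
   OrderID int IDENTITY(1,1) NOT NULL,
   OrderDate datetime NOT NULL)
ON psOrderDateRange(OrderDate);
GO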

Miscellaneous Performance Improvement Techniques

To wrap up this section, the following are some other performance-related changes introduced in SQL Server 2005.

DATE_CORRELATION_OPTIMIZATION

SQL Server 2005 introduces a new database option, DATE_CORRELATION_OPTIMIZATION, which can improve the performance of queries that perform an equi-join between two tables whose datetime columns are correlated and that specify a date restriction in the query predicate. This option is turned off by default. When you turn on this option, SQL Server maintains correlation statistics between any two tables in the database that have datetime columns and are linked by a foreign key constraint. These statistics are later used during query optimization to improve performance. You can use an ALTER DATABASE statement to turn this option on or off, and you can check the is_date_correlation_on column in the sys.databases catalog view to find out whether it is enabled.
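
For example (note that this ALTER DATABASE statement requires that there be no other active connections to the database):

ALTER DATABASE AdventureWorks SET DATE_CORRELATION_OPTIMIZATION ON;
GO

SELECT name, is_date_correlation_on
FROM sys.databases WHERE name = 'AdventureWorks';
GO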

AUTO_UPDATE_STATISTICS_ASYNC

With SQL Server 2000, if the query optimizer determined that statistics were stale, it updated the statistics first and then used the updated statistics to generate the best query plan. However, whenever this happened, that query ran more slowly than usual, and sometimes client requests experienced timeouts. SQL Server 2005 introduces a database-level option that provides more predictable query response times. When the AUTO_UPDATE_STATISTICS_ASYNC option is turned on, if the query optimizer determines that statistics are outdated, SQL Server uses a background worker thread to update the statistics asynchronously. The current query continues to use the existing statistics, which might not produce the most efficient query plan, but the query might execute quickly because it does not have to wait for the statistics update to finish. Subsequent queries are then able to make use of the updated statistics when the worker thread finishes. You can use an ALTER DATABASE statement to turn on this option, and you can check the is_auto_update_stats_async_on column in the sys.databases catalog view to find out whether it is enabled.
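
The pattern is the same as for the previous option:

ALTER DATABASE AdventureWorks SET AUTO_UPDATE_STATISTICS_ASYNC ON;
GO

SELECT name, is_auto_update_stats_async_on
FROM sys.databases WHERE name = 'AdventureWorks';
GO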

RECOMPILE and OPTIMIZE FOR Hints

SQL Server supports specifying the WITH RECOMPILE option when creating stored procedures. This option tells SQL Server to discard the stored procedure plan and recompile the stored procedure every time it is invoked. With SQL Server 2000, it was all or nothing; with SQL Server 2005, however, you can use a RECOMPILE hint on an individual query inside a stored procedure to have only that query recompiled every time the stored procedure is called, instead of recompiling the entire stored procedure.

If you know that a parameter will often have a certain value, you can give this hint to SQL Server by using an OPTIMIZE FOR hint so that SQL Server will generate the most efficient plan, using the value specified.
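
Here is a hypothetical stored procedure (the procedure name, and the assumption that a status value of 5 is the most common, are illustrative only) that shows both hints against the AdventureWorks Sales.SalesOrderHeader table:

CREATE PROCEDURE dbo.usp_GetOrdersByStatus
   @Status tinyint
AS
BEGIN
   --recompile only this statement on every execution
   SELECT SalesOrderID, OrderDate, TotalDue
   FROM Sales.SalesOrderHeader
   WHERE Status = @Status
   OPTION (RECOMPILE);

   --optimize the plan for the most common parameter value
   SELECT SalesOrderID, OrderDate
   FROM Sales.SalesOrderHeader
   WHERE Status = @Status
   OPTION (OPTIMIZE FOR (@Status = 5));
END;
GO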

NUMA, Hyperthreading, 64-Bit Support, and Indexed Views

To maximize performance on high-end systems (those with eight or more processors), the SQL Server 2005 engine has been architected to take advantage of non-uniform memory access (NUMA) and hyperthreading. With symmetric multiprocessing (SMP), all memory accesses are posted to the same shared memory bus. This works fine for a relatively small number of CPUs, but the shared bus becomes a bottleneck when a larger number of CPUs compete for access to it. NUMA alleviates these bottlenecks by limiting the number of CPUs on any one memory bus and connecting the various nodes by means of a high-speed interconnect. SQL Server 2005 understands this architecture and does all it can to avoid sending memory accesses across the interconnect, in order to reduce contention. NUMA support is discussed in detail in Chapter 2.

SQL Server 2000 64-bit is supported only on Itanium 2 processors. SQL Server 2005 supports both the Itanium 2 and x64 processor platforms.

SQL Server 2005 also includes many improvements for indexed views; for example, it extends the list of constructs that can be used in indexed views. Indexed views and other performance improvements are discussed in Chapter 8.

Mirrored Backup Media Sets and Enhanced Error Detection

SQL Server 2005 can mirror a backup to as many as four media sets, writing the backup to all of them simultaneously. This increases the reliability of the system by providing extra copies of the backup in case one backup media set goes bad. In addition, SQL Server 2005 supports database and backup checksums to help ensure the reliability of the media set. Finally, unlike in SQL Server 2000, RESTORE VERIFYONLY in SQL Server 2005 performs much more thorough checks on the backup, coming as close as possible to an actual restore operation in verifying that the data is correct.
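
A minimal sketch (the file paths are hypothetical, and mirrored backup media sets require Enterprise Edition) looks like this:

BACKUP DATABASE AdventureWorks
   TO DISK = 'C:\Backups\AW_a.bak'
   MIRROR TO DISK = 'D:\Backups\AW_b.bak'
   WITH FORMAT, CHECKSUM;
GO

RESTORE VERIFYONLY FROM DISK = 'C:\Backups\AW_a.bak' WITH CHECKSUM;
GO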

XML Web Services Support

As mentioned earlier, basic XML support was first introduced in SQL Server 2000, providing the capability to retrieve a relational rowset as XML or to turn an XML string into a relational rowset. The XML support was further extended in SQLXML web releases. One of the features in SQLXML web releases is the ability to quickly expose a stored procedure or function as a web service method. If you are not familiar with XML web services, think of them as an integration API that can work over the Internet. The XML web services support in SQL Server 2005 is functionally similar to that provided by SQLXML web services, but it is more efficient, native, and secure, and it does not have any dependency on a web server such as IIS.

SQL Server 2005 allows you to expose stored procedures and functions as web service methods. The web services support in SQL Server 2005 does not require IIS because it uses the HTTP listener driver (http.sys) that is available on the Windows XP Service Pack 2 and Windows Server 2003 platforms. Hence, native web services in SQL Server 2005 work only on Windows XP Service Pack 2 and Windows Server 2003.

The CREATE ENDPOINT T-SQL statement can be used to define a web service and web methods. After you define a service and web methods that map to stored procedures or functions, you can invoke those methods from any platform or any language over the Internet by just using HTTP and SOAP.
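
For example, assuming the hypothetical usp_GetTaxDetails procedure from the earlier SQLCLR example exists in the AdventureWorks database, an HTTP SOAP endpoint for it might be sketched as follows (the endpoint name and path are arbitrary, and creating HTTP endpoints requires administrative privileges on the server):

CREATE ENDPOINT Payroll_SOAP
   STATE = STARTED
AS HTTP (
   PATH = '/sql/payroll',
   AUTHENTICATION = (INTEGRATED),
   PORTS = (CLEAR),
   SITE = 'localhost'
)
FOR SOAP (
   WEBMETHOD 'GetTaxDetails'
      (NAME = 'AdventureWorks.dbo.usp_GetTaxDetails'),
   WSDL = DEFAULT,
   DATABASE = 'AdventureWorks',
   BATCHES = DISABLED
);
GO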

Native web services support in SQL Server 2005 is discussed in the second half of Chapter 10.

SQL Server 2005 Express and Mobile Editions

SQL Server 2005 Express Edition is the successor of MSDE, and SQL Server Mobile Edition is the successor of SQL Server CE Edition.

Like MSDE, SQL Server 2005 Express Edition is free, and it is intended to be used for personal or simple data-driven applications. The Express Edition contains several enhancements over MSDE. Some of these enhancements include the availability of graphical tools (such as SQL Server Configuration Manager and Express Manager), XCopy deployment, a 4GB database size limit, and removal of the concurrent workload throttle (workload governor) mechanism that restricted scalability under heavy load.

SQL Server Mobile Edition supports more platforms than SQL Server CE, hence the name change. It supports all the mobile devices, including Pocket PCs, Tablet PCs, Embedded CE devices, and smart phones. There are several significant improvements to this edition. Some of the new features introduced in Mobile Edition include integration with Management Studio and Visual Studio .NET 2005; synchronization changes, including configurable compression levels, progress status, column-level sync tracking, and support for multiple subscriptions; multiuser support and SQLCEResultSet for efficient data access; and SSIS support.

T-SQL Enhancements

Now that SQL Server 2005 allows the writing of stored procedures, functions, triggers, and other database objects using .NET code, is T-SQL going away? Absolutely not! T-SQL is still the premier language for writing SQL Server data access batches, stored procedures, and so on. Even with SQLCLR or any other data access API, the query language used is still T-SQL. SQL Server 2005 introduces some very cool enhancements to T-SQL that allow you to write better queries and modules. The following sections introduce some of these enhancements, and Chapter 6, “Transact-SQL Enhancements,” discusses each of these new T-SQL features in detail.

Recursive and Non-recursive Common Table Expressions (CTEs)

You can think of CTEs as a simple yet more powerful alternative to derived tables. In some cases, CTEs may be used in places where you are currently using temporary tables, table variables, or views. The three important motivations for introducing CTEs in SQL Server 2005 are recursion; to provide alternative, simplified, readable, and manageable syntax for complex SQL statements, possibly making use of derived tables; and ANSI SQL-99 compliance (CTEs are defined in SQL-99).

A CTE can be defined as a temporary named result set, which is derived from a simple query and defined within the execution scope of a SELECT, INSERT, UPDATE, or DELETE statement. It is important to note that the scope of a CTE is just the statement in which it is declared. The CTE named result set is not available after the statement in which it is declared and used.

Here is a simple example that uses a CTE to count the number of direct reports for each manager in the AdventureWorks sample database:

USE AdventureWorks ;
GO
WITH DirReps(ManagerID, DirectReports) AS
(
    SELECT ManagerID, COUNT(*)
    FROM HumanResources.Employee AS e
    WHERE ManagerID IS NOT NULL
    GROUP BY ManagerID
)
SELECT * FROM DirReps ORDER BY ManagerID
GO

In the scope of a SELECT statement, this query declares a result set named DirReps, and then the outer SELECT statement retrieves everything from that result set. Nothing fancy here—just a simple example to give you a taste of CTE syntax.

The following CTE example is a bit more involved. This query is used to trend the sales data; for every year, it gets the total sales amount and compares it with the previous year’s sales:

WITH YearlyOrderAmtCTE(OrderYear, TotalAmount)
AS
(
     SELECT YEAR(OrderDate), SUM(OrderQty*UnitPrice)
     FROM Sales.SalesOrderHeader AS H JOIN Sales.SalesOrderDetail AS D
          ON H.SalesOrderID = D.SalesOrderID
     GROUP BY YEAR(OrderDate)
),
SalesTrendCTE(OrderYear, Amount, AmtYearBefore, AmtDifference, DiffPerc)
AS
(
SELECT thisYear.OrderYear, thisYear.TotalAmount,
        lastYear.TotalAmount,
        thisYear.TotalAmount - lastYear.TotalAmount,
        (thisYear.TotalAmount/lastYear.TotalAmount - 1) * 100
FROM YearlyOrderAmtCTE AS thisYear
      LEFT OUTER JOIN YearlyOrderAmtCTE AS lastYear
      ON thisYear.OrderYear = lastYear.OrderYear + 1
)
SELECT * FROM SalesTrendCTE
GO

This query essentially defines two CTEs. The first CTE, called YearlyOrderAmtCTE, groups the total sales by year. This CTE is then used in the second CTE, called SalesTrendCTE, and the outer or main query selects all the rows from SalesTrendCTE. Notice how SalesTrendCTE uses YearlyOrderAmtCTE to get the current and previous years’ total sales figures.

In the previous example, the second CTE refers to the previous CTE. If a CTE refers to itself, it is then called a recursive CTE, and this is where CTEs get interesting. Without any further ado, here is an example of a recursive CTE:

WITH MgrHierarchyCTE(EmployeeID, EmployeeName, ManagerID, Level)
AS
(
   SELECT e.EmployeeID, c.FirstName + ' ' + c.LastName, e.ManagerID, 0
   FROM HumanResources.Employee AS e
        JOIN Person.Contact AS c
        ON c.ContactID = e.ContactID
   WHERE e.EmployeeID = 111

   UNION ALL

   SELECT mgr.EmployeeID, co.FirstName + ' ' + co.LastName, mgr.ManagerID,
          Level + 1

   FROM HumanResources.Employee AS mgr
        JOIN Person.Contact AS co
        ON co.ContactID = mgr.ContactID
        JOIN MgrHierarchyCTE AS cte
        ON cte.ManagerID = mgr.EmployeeID
)
SELECT * FROM MgrHierarchyCTE;
GO

This recursive CTE example illustrates traversing up the management hierarchy all the way up to the topmost manager for the employee with the ID 111. The query returns a result set that contains a row for each manager in the hierarchy, starting with the immediate manager and going up to the top-level manager.

The first SELECT statement within the CTE definition finds the immediate manager details. This row is then combined, using UNION ALL, with another SELECT statement that self-references the CTE to traverse the management hierarchy.

The most common problem with recursion is infinite loops. SQL Server addresses this by limiting recursion to 100 levels by default. You can override this limit for an individual query by specifying the MAXRECURSION hint in the OPTION clause of the statement that uses the CTE.
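
For example, replacing the final SELECT of the recursive CTE shown earlier with the following statement caps that query at 10 levels of recursion:

SELECT * FROM MgrHierarchyCTE
OPTION (MAXRECURSION 10);
GO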

CTEs are discussed in more detail in Chapter 6.

Improved Error Handling

In previous releases, error handling inside T-SQL scripts was done by checking @@ERROR after each statement and by using GOTO and RETURN statements. SQL Server 2005 adopts modern structured error handling by introducing the TRY...CATCH construct. If you are familiar with Visual Basic .NET or C#, you can think of this as being equivalent to try...catch blocks in those languages; note, however, that there is no equivalent of the finally block.

Here is an example of the modern error handling paradigm in SQL Server 2005:

USE AdventureWorks
GO
BEGIN TRY
    -- Generate a constraint violation error.
    DELETE FROM Production.Product WHERE ProductID = 980;
END TRY
BEGIN CATCH
    SELECT ERROR_NUMBER() AS ErrorNumber,
           ERROR_SEVERITY() AS ErrorSeverity,
           ERROR_STATE() as ErrorState,
           ERROR_MESSAGE() as ErrorMessage;
END CATCH
GO

To handle errors, all you have to do is put T-SQL statements between BEGIN TRY and END TRY; if an error occurs, control passes to the CATCH block, where you can obtain detailed error information by using the various ERROR_XXXX functions. In addition to the functions shown in the preceding example, you can use the ERROR_LINE() and ERROR_PROCEDURE() functions inside a module to find the exact line where the error occurred and the name of the module.
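
Here is a minimal sketch of these two functions in action; the procedure name is hypothetical:

CREATE PROCEDURE dbo.uspDeleteProduct
    @ProductID INT
AS
BEGIN TRY
    -- Fails with a constraint violation for products referenced by other tables
    DELETE FROM Production.Product WHERE ProductID = @ProductID;
END TRY
BEGIN CATCH
    SELECT ERROR_PROCEDURE() AS ErrorProcedure,
           ERROR_LINE() AS ErrorLine,
           ERROR_MESSAGE() AS ErrorMessage;
END CATCH
GO

EXEC dbo.uspDeleteProduct @ProductID = 980;
GO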

Limiting a Result Set by Using TOP and TABLESAMPLE

The TOP keyword is not new to SQL Server 2005. It has been available since SQL Server 7.0 and can be used to limit a result set to a specified number or percentage of rows. However, it is enhanced in this release so that TOP can also be used with DML statements (INSERT, UPDATE, and DELETE). In addition, you no longer have to hard-code the number of rows or the percentage value; the TOP keyword now accepts an expression, such as a variable.
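
For instance, here is a minimal sketch of TOP with an UPDATE statement; because no ordering is involved, the 10 rows affected are arbitrary:

-- Increase the list price of an arbitrary 10 products by 1 percent
UPDATE TOP (10) Production.Product
SET ListPrice = ListPrice * 1.01;
GO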

In addition to TOP, SQL Server 2005 introduces the new TABLESAMPLE keyword, which you can also use to limit the number of rows returned. Unlike TOP, TABLESAMPLE returns a random set of rows from throughout the set of rows processed by the query, and TABLESAMPLE cannot be used with a view. You can specify either a number of rows or a percentage, but because the sampling works on whole data pages, the number of rows actually returned is approximate.

This sampling technique can be used to get better performance for queries on large databases where absolutely exact results are not desired. For example, if you want to find an approximate estimate for the average employee salary, you can use TABLESAMPLE to do aggregation on a sample of data rather than on an entire large dataset.

The SQL:2003 proposal includes two sampling methods: BERNOULLI and SYSTEM. SQL Server 2005 supports only the SYSTEM sampling method.

Here is an example script that passes a variable to the TOP keyword and illustrates the new TABLESAMPLE keyword:

USE AdventureWorks
GO

DECLARE @var INT;
SET @var = 3;
SELECT TOP (@var) * FROM Sales.Store;
GO

SELECT AVG(SickLeaveHours) FROM HumanResources.Employee
TABLESAMPLE SYSTEM (20 PERCENT)
REPEATABLE(3);
GO

Like all other T-SQL topics, TOP and TABLESAMPLE are also further discussed in Chapter 6.

Ranking Functions

SQL Server 2005 introduces four very useful functions—ROW_NUMBER(), RANK(), DENSE_RANK(), and NTILE()—that can be used to rank the rows in a partition. For example, you can generate a sequence number, generate a different number for each row, or rank rows based on some partition criteria:

ROW_NUMBER()—You can use this function to generate sequential row numbers. This can be very handy when you’re implementing scenarios such as paging in web pages.

RANK()—You can use this function to rank the rows within the partition of a result set. The rank of a row is 1 plus the number of ranks that precede the row in question.

DENSE_RANK()—This function is similar to RANK() except that it does not leave any gaps in ranking. The rank of a row is 1 plus the number of distinct ranks that precede the row in question.

NTILE()—You can use this function to distribute the rows in an ordered partition into a specified number of groups.

Here is an example of a T-SQL statement that makes use of all four of the ranking functions:

USE AdventureWorks;
GO
SELECT ROW_NUMBER() OVER(ORDER BY ListPrice DESC) as RowNum,
       ProductID, Name, ProductNumber, ListPrice,
       RANK() OVER (ORDER BY ListPrice DESC) as Rank,
       DENSE_RANK() OVER (ORDER BY ListPrice DESC) as DenseRank,
       NTILE(10) OVER (ORDER BY ListPrice DESC) as NTile_10,
       NTILE(20) OVER (ORDER BY ListPrice DESC) as NTile_20
FROM Production.Product;

If you try out this query in the Query Editor in Management Studio, you should notice that the first column, RowNum, is a sequential number starting with 1. The Rank column also starts with 1, but when the list price changes, the rank assigned to the next row is 1 plus the number of preceding rows, so gaps appear in the ranking. The DenseRank column, in contrast, simply moves to 2, then 3, and so on, with no gaps, each time the price changes. The NTile_10 and NTile_20 columns contain values from 1 through 10 and 1 through 20, dividing the result set into 10 and 20 groups, respectively.

PIVOT and UNPIVOT

PIVOT and UNPIVOT are two new keywords introduced in SQL Server 2005 that can be used with the SELECT statement. In simple terms, PIVOT can be used to turn rows into columns, and UNPIVOT can be used to turn columns into rows.

Let’s begin by looking at an example of the PIVOT keyword:

SELECT [316] AS Blade, [331] AS [Fork End]
FROM
   (SELECT ProductID, Quantity FROM Production.ProductInventory) AS pinv
   PIVOT
   (
      SUM (Quantity)
      FOR ProductID IN ([316], [331])
   ) AS pvt;
GO

This query essentially generates a cross-tabulation report to summarize the quantity on hand for two products. If you run the query in the Query Editor in Management Studio, you will notice that it returns two columns, named Blade and Fork End, and a single row that contains the total quantity available for these two products.

The counterpart of PIVOT is UNPIVOT, which can be used to turn columns into rows. Here is an example of UNPIVOT:

CREATE TABLE dbo.tblGrades
  (StudentID int NOT NULL PRIMARY KEY,
   Term1 CHAR(1) NOT NULL,
   Term2 CHAR(1) NOT NULL,
   Term3 CHAR(1) NOT NULL,
   Term4 CHAR(1) NOT NULL);
GO

INSERT INTO dbo.tblGrades SELECT 1, 'A', 'B', 'C', 'D';
INSERT INTO dbo.tblGrades SELECT 2, 'D', 'C', 'B', 'A';
GO

SELECT * FROM dbo.tblGrades;
GO

SELECT StudentID, Term, Grade
FROM
   (SELECT StudentID, Term1, Term2, Term3, Term4 FROM dbo.tblGrades) AS p
   UNPIVOT
      (Grade FOR Term IN
      (Term1, Term2, Term3, Term4)
   ) AS unpvt;
GO

IF OBJECT_ID('dbo.tblGrades') IS NOT NULL
BEGIN
   DROP TABLE dbo.tblGrades;
END
GO

This sample T-SQL code begins by creating a table that stores student grades for four terms. Each term is a column in the table. The UNPIVOT statement is then used to turn term grade columns into rows. Figure 4.3 shows the power of the UNPIVOT keyword.

Figure 4.3. The UNPIVOT keyword can be used to convert columns into rows.


SQL Server 2005 introduces several other T-SQL enhancements, such as the APPLY operator to invoke a table-valued function for each row in the outer rowset; the OUTPUT clause to retrieve the rows affected by DML statements; referential integrity enhancements to set a default value or NULL when a row is deleted from the parent table; and enhancements to FOR XML and OPENXML. Refer to Chapter 6 for complete details on these enhancements.
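
As a quick taste of the OUTPUT clause, here is a minimal sketch that returns the rows removed by a DELETE statement; the filter value is arbitrary:

-- Return the deleted rows to the client as a result set
DELETE FROM Production.ProductCostHistory
OUTPUT deleted.ProductID, deleted.StandardCost
WHERE StandardCost > 1000.00;
GO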

SQL Server 2005 Security

Microsoft’s dedication to trustworthy computing is clearly visible in SQL Server 2005. SQL Server 2005 is secure by design, secure by default, and secure by deployment.

To make sure that SQL Server 2005 is secure by design, every member of the SQL Server 2005 product team attended security training, and each component/feature, such as Service Broker and SQLCLR, has gone through threat analysis and testing. Code reviews, penetration testing, use of automated code tools, scenario-based testing, and low-privilege testing are some other techniques that the SQL Server team used to ensure that the product is secure by design. All existing features have been reevaluated to make them more secure, and every new feature has undergone thorough scrutiny to make sure it meets the security standards.

Secure by default means that when an administrator first installs SQL Server 2005, the data environment is in its most locked-down state, with secure settings for all configuration options. SQL Server 2005 setup installs only the necessary components, features such as SQLCLR are not enabled by default, services such as Agent and Full-Text Search are set to manual start, and Database Mail is not configured by default. These types of changes minimize the attack surface, making the system secure by default.

Secure by deployment means providing administrators with good tools for security assessment, auditing and reducing the attack surface, assisting in automating monitoring, and following the principle of least privilege. The SQL Server Management Studio and Surface Area Configuration tools, DDL triggers, the new auditing capabilities, and departure from the sysadmin requirement to run tools such as Profiler are certainly steps toward making SQL Server solutions secure by deployment.

Let’s now briefly look at some of the new security-related features introduced in SQL Server 2005.

Password Policy Enforcement and Strengthening of SQL Authentication

Windows Authentication is still the recommended authentication mechanism in SQL Server 2005. However, if you have to use SQL authentication, you should know that two big improvements have been made to enhance SQL authentication.

SQL Server 2005 makes sure that you specify a strong password when creating a SQL login. For instance, the CREATE LOGIN statement will fail if you specify a blank password, a password that is the same as the login name, or a common word such as "Admin" or "Administrator." None of the following statements will succeed:

CREATE LOGIN TEST1 WITH PASSWORD = '';
GO
CREATE LOGIN TEST1 WITH PASSWORD = 'TEST1';
GO
CREATE LOGIN TEST1 WITH PASSWORD = 'Admin';
GO
CREATE LOGIN TEST1 WITH PASSWORD = 'SA';
GO
CREATE LOGIN TEST1 WITH PASSWORD = 'Administrator';
GO

All these statements fail with a "Password validation failed" error.

In addition, if you are running on Windows Server 2003, which is, by the way, the recommended platform for SQL Server 2005, the same complexity and expiration policies used in Windows Server 2003 can be used for SQL authenticated logins.

The second enhancement is that SQL Server 2005 encrypts the login packets. In other words, unlike in previous releases, SQL Server 2005 never sends SQL authentication details in clear text. If SSL is available, SQL Server uses it; otherwise, it generates a 512-bit certificate and uses it for encryption.

User–Schema Separation

According to the ANSI SQL-92 standard, the sole purpose of a schema is to act as a namespace, to group related objects under one umbrella. However, SQL Server 2000 and previous releases did not differentiate between a schema and a user. A user was essentially treated as a schema. If an administrator wanted to delete a user, he or she needed to delete all the objects owned by that user or change the owner of all the objects owned by the user being deleted. Changing the object owner very often required changing the application code.

SQL Server 2005 breaks the user–schema tight coupling, treating schemas as per the ANSI SQL-92 standard—as meaningful names that can be used to group related objects. For example, in the AdventureWorks sample database, objects are grouped under schemas named Sales, HumanResources, Person, and so on.

Schemas are owned by users, such as dbo or some user UserX. Now, if an administrator wants to drop the user UserX, all the administrator has to do is change the owner of any schemas that UserX owns and then delete the user. As long as the user does not own any schemas and is not used for the EXECUTE AS context (discussed later in this chapter), the administrator can delete the user. Applications continue to access the objects by using schema names, such as Sales.Store or HumanResources.Employee, so no application code needs to change.
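
Here is a minimal sketch of that workflow; the login, user, and schema names are hypothetical:

-- UserX owns a schema that holds application objects
CREATE LOGIN UserX WITH PASSWORD = 'Str0ng!Passw0rd1';
CREATE USER UserX FOR LOGIN UserX;
GO
CREATE SCHEMA Reporting AUTHORIZATION UserX;
GO

-- Transfer schema ownership to dbo; the user can then be dropped
ALTER AUTHORIZATION ON SCHEMA::Reporting TO dbo;
GO
DROP USER UserX;
GO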

With SQL Server 2005, every user has a default schema, and if the user has appropriate permissions, he or she can create and own multiple schemas. Each schema can then contain zero or more objects. When resolving an object name, SQL Server follows a simple rule: it first looks for the object under the user’s default schema, and if the object is not found there, it looks in the dbo schema.

Note that many users can have the same schema as their default schema. No two objects in a schema can have the same name. However, it is possible to have tables with the same name in different schemas. When you create a new database, SQL Server 2005 creates several schemas in the database, such as schemas named dbo, INFORMATION_SCHEMA, sys, and guest; and also a schema is created for every database role. All the catalog views, dynamic management objects, and system-stored procedures reside in the sys schema. User–schema separation is discussed in more detail, with examples, in Chapter 7.

Encryption Support

SQL Server 2005 provides native support for symmetric keys, asymmetric keys, and certificates that can be used to encrypt and decrypt data. Keys and certificates can be stored inside the database and later used for various purposes, such as authentication and encryption/decryption. SQL Server 2005 provides functions such as EncryptByKey, EncryptByPassPhrase, EncryptByAsymKey, and EncryptByCert, along with their decryption counterparts, to encrypt and decrypt data by using keys, certificates, or pass phrases.
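
The simplest of these to demonstrate is pass phrase encryption, which requires no key or certificate setup. Here is a minimal sketch; the pass phrase and sample value are arbitrary:

DECLARE @cipher VARBINARY(256);

-- Encrypt a value using only a pass phrase
SET @cipher = EncryptByPassPhrase('An arbitrary pass phrase', N'4111-1111-1111-1111');

-- Decrypt it again; the CAST restores the original data type
SELECT CAST(DecryptByPassPhrase('An arbitrary pass phrase', @cipher) AS NVARCHAR(50)) AS ClearText;
GO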

Refer to Chapter 7 for more details on keys, certificates, and encryption support in SQL Server 2005.

Module Execution Context and Impersonation

With previous SQL Server releases, if a user had EXECUTE permission on a module, SQL Server did not check permissions on the objects accessed inside the module, as long as those objects were owned by the same user as the module. This concept is known as ownership chaining. As soon as a module refers to an object owned by a different user than the module owner, the ownership chain breaks, and SQL Server checks whether the user executing the module has permission on the object being accessed.

SQL Server 2005 introduces a new T-SQL construct, EXECUTE AS, that can be used in defining the modules (stored procedures, functions, triggers, and queues) to specify the execution context that the module should run under. SQL Server 2005 uses the specified context to impersonate and run the module. The default EXECUTE AS context is CALLER. This ensures that the engine behaves as in the previous release, supporting the ownership chain concept. However, the module creator can specify whether the module should run under the context of SELF, OWNER, or a specific user.

In addition to using EXECUTE AS while defining the modules, you can also use EXECUTE AS USER = '<username>' within T-SQL batches to change the execution context, and you can use REVERT to change the execution context back to the user used to connect to the instance.
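
Here is a minimal sketch of both forms; the procedure name is hypothetical, and the AppUser database user is assumed to already exist:

-- Module-level execution context: the procedure always runs as its owner
CREATE PROCEDURE dbo.uspGetStoreNames
WITH EXECUTE AS OWNER
AS
    SELECT Name FROM Sales.Store;
GO

-- Batch-level impersonation and reverting back
EXECUTE AS USER = 'AppUser';
SELECT USER_NAME();   -- Returns AppUser
REVERT;
SELECT USER_NAME();   -- Back to the original user
GO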

SQL Server 2005 introduces several other security-related features, such as granular permissions control, metadata or catalog security, and SQLCLR security. Chapter 7 is dedicated to SQL Server 2005 security and auditing.

In summary, the enhanced security model, granular permission control, secure-by-default deployment, password policy enforcement, and data encryption make SQL Server 2005 a secure and reliable platform for database and analytical applications.

Performance Monitoring and Tuning Enhancements

In SQL Server 2000, SQL Profiler and PerfMon were the two primary tools used to monitor and troubleshoot performance issues. Some other ways to monitor performance included system stored procedures such as sp_who2 and sp_lock, DBCC statements such as DBCC INPUTBUFFER and DBCC OPENTRAN, and the Current Activity node in Enterprise Manager. As far as performance tuning is concerned, the only tool available in SQL Server 2000 was the Index Tuning Wizard.

For SQL Server 2005, Microsoft has invested a lot of time and resources in improving the existing tools and introducing new tools to monitor and tune the performance. These include the following:

• Several improvements and new features have been introduced in the Profiler tool.

• The Index Tuning Wizard has been replaced with a full-fledged application called Database Engine Tuning Advisor that does a better job of making recommendations on physical database design, works well with large databases and large workloads, contains more functionality, and is more manageable.

• There are new dynamic management views (DMVs) and dynamic management functions (DMFs) to view the current state of the SQL Server 2005 system.

• There are new ways to monitor and avoid blocking and deadlocking.

• SQL Server 2005 provides new Performance Monitor counters.

The following sections present an overview of these changes, which are discussed in great detail in Chapter 9.

Profiler Enhancements

SQL Profiler is still the primary tool for monitoring activity on a SQL Server instance. What has changed in SQL Server 2005 is that Profiler now supports tracing of Analysis Services and Integration Services events, in addition to relational engine events.

A new feature called Performance Counters Correlation allows you to correlate Performance Monitor counter data with a Profiler trace collected at the same time.

SQL Server 2005 Profiler defines a new event called a deadlock graph, which can be captured to monitor and view deadlock data as a picture. Figure 4.4 shows a deadlock graph as traced by the Profiler.

Figure 4.4. SQL Server 2005 Profiler provides several new events, such as deadlock graph.


Several new event categories and events have been added to Profiler to support the tracing of the new functionality, such as Service Broker and online index operation. Profiler also defines event classes to trace OLE DB events and deprecated features.

Profiler allows capturing a showplan and saving it as XML. This XML showplan can later be analyzed by using the Query Editor in Management Studio. Third-party applications might become available in the future to support other operations, such as comparisons, on an XML showplan.

Profiler includes new replay options that allow you to control settings such as the maximum number of replay threads. Refer to Chapter 9 for more details on changes introduced in SQL Profiler.

Database Engine Tuning Advisor

The Index Tuning Wizard from previous releases of SQL Server has been replaced with a full-fledged GUI application (dtashell.exe) named Database Engine Tuning Advisor (DTA). DTA can also be invoked from the command prompt (dta.exe). It can be used to tune the physical database design to maximize performance and aid manageability.

Tuning is generally an iterative process. With the Index Tuning Wizard in previous releases, it was a little difficult to go through the steps multiple times and review the results, and there was no way to compare results or do what-if analysis. DTA is a full-blown application that provides the notion of sessions, which makes it easier to iterate the tuning process, review and evaluate the recommendations, and view reports. Figure 4.5 shows DTA in action.

Figure 4.5. DTA contains significant improvements over the Index Tuning Wizard in terms of functionality, quality of recommendations, and ability to handle large databases and workloads.


Here is a summarized list of some of the important tuning advancements introduced in SQL Server 2005:

• In addition to providing recommendations regarding indexes and indexed views, DTA can make several other recommendations, such as how to range partition the data, indexes, and indexed views; including non-key columns with nonclustered indexes; XML indexes; and index rebuilding/reorganization suggestions to reduce the cost of index scans and seeks.

• The Index Tuning Wizard was restricted to a workload on a single database and could tune only one database at a time. DTA, on the other hand, supports a workload that spans multiple databases, and in each DTA session, you can choose multiple databases to be considered for tuning.

• It is not recommended to run DTA on a production server during business hours. Depending on the workload and size of the database, DTA might impose significant tuning overhead on the server being tuned. The nice thing is that you can tell DTA when to stop. (Refer to Figure 4.5 and notice the very first option on the Tuning Options tab.)

• DTA session definitions and tuning recommendation results can be exported as XML documents. A session definition XML file can later be supplied to either the command-line tool (dta.exe) or the GUI application (dtashell.exe). The use of XML by DTA opens the door for third-party vendors to provide solutions that integrate with DTA to enhance and support additional functionality. The XSD schema for input and output XML is the same and is available in a file named dtaschema.xsd under the folder C:\Program Files\Microsoft SQL Server\90\Tools\Binn\schemas\sqlserver\2003\3\dta. The month and year in the path might change when SQL Server is released.

• DTA allows you to evaluate recommendations, and when you are satisfied with recommendations, you can specify which ones to implement.

• Unlike the Index Tuning Wizard, which only members of the sysadmin server role can use, DTA can be executed by any member of the db_owner database role.

DTA is discussed in detail in Chapter 9.

Dynamic Management Objects

SQL Server 2005 provides several dynamic management views and functions that essentially provide a real-time snapshot of internal memory structures that indicate the server state. In addition to Profiler and PerfMon, dynamic management objects are an essential tool for proactively monitoring SQL Server. If you run the following query in the Management Studio’s Query Editor, you should see a list of all the available DMVs and DMFs:

SELECT name, type_desc FROM sys.system_objects WHERE name like 'dm_%';

Like catalog views, dynamic management objects also belong to the sys schema.

Here is a sample T-SQL script that illustrates a DMV and a DMF:

USE AdventureWorks;
GO

SELECT * FROM sys.dm_tran_locks;
GO

SELECT * FROM sys.dm_db_index_physical_stats
   (DB_ID(), OBJECT_ID('Sales.Store'), NULL, 0, 'DETAILED');
GO

Further details on dynamic management objects can be found in Chapter 9.

The Blocked Process Threshold Setting

SQL Server 2005 introduces a new advanced sp_configure setting called blocked process threshold, which can be used to generate an event as soon as blocking happens on the server. The value of this configuration setting is initially 0, indicating that the feature is disabled. You can turn on the show advanced options setting of sp_configure, change the value of this parameter to a number of seconds between 1 and 86400 (24 hours), and run RECONFIGURE WITH OVERRIDE to immediately activate the option.
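
A minimal configuration sketch looks like this:

-- Raise a blocked process event after 10 seconds of blocking
EXEC sp_configure 'show advanced options', 1;
RECONFIGURE;
GO
EXEC sp_configure 'blocked process threshold', 10;
RECONFIGURE WITH OVERRIDE;
GO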

Let’s say you specified 10 as the value of this new configuration setting. Now if there is a blocking for 10 seconds, SQL Server 2005 generates an event that can be seen in Profiler and also can be captured by using the new event notification mechanism. Refer to Chapter 9 for more details and a demo of this feature.

High-Availability Features

Minimizing the maintenance window, avoiding unnecessary or unscheduled downtimes, and maximizing server availability are top priorities of the administrators. SQL Server 2000 supported three high-availability solutions: failover clustering, transactional replication, and log shipping. SQL Server 2005 builds on these proven technologies and introduces a few new features to assist DBAs in increasing server availability.

Failover Clustering Changes

A SQL Server 2000 cluster allowed a maximum of four nodes. SQL Server 2005 on Windows Server 2003 Enterprise or Datacenter Edition allows up to eight nodes in a cluster. In addition, SQL Server 2005 Analysis Services and Full-Text Search can now be clustered. The cluster setup now allows you to choose which components you want to install. Dynamic AWE memory is another big improvement in SQL Server 2005 clustering. In an active-active cluster scenario, SQL Server 2005 on Windows Server 2003 dynamically adjusts AWE memory usage after failover. Other clustering-related changes include mount point support, script-based setup, and other setup changes for clustering. These changes are discussed in Chapter 8.

Database Mirroring and Snapshot

Unlike failover clustering, which is a high-availability solution at the server level, log shipping is the database-level high-availability solution available in SQL Server 2000. Database mirroring is the successor to log shipping. Mirroring differs from log shipping in two ways: changes are reflected on the standby server in real time, and failover is automatic.

A database mirroring configuration is made up of three instances of SQL Server 2005: the principal database resides on the production server; the mirror database resides on a second, hot-standby server; and the witness server monitors the state of the principal and mirror servers.

All the SQL Server 2005 components, such as Service Broker and Database Mail, support database mirroring. The data access API has been updated to support transparent client redirection in case the mirror server takes over the role of the principal server.

Database snapshot, a new feature introduced in SQL Server 2005, allows you to create a new database that is a read-only, static view of another (source) database. The most common application of the database snapshot feature is in conjunction with database mirroring. The mirror database is always in a recovering state and is never available for client connections. However, SQL Server 2005 allows you to create database snapshots of the mirror database. These snapshots can then be used as reporting databases, separating OLTP from reporting workloads.
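
Here is a minimal sketch of creating a snapshot of the AdventureWorks database; the snapshot name, logical file name, and sparse file path are assumptions for illustration:

-- Create a read-only, point-in-time snapshot of AdventureWorks
CREATE DATABASE AdventureWorks_Snapshot
ON (NAME = AdventureWorks_Data,
    FILENAME = 'C:\Snapshots\AdventureWorks_Data.ss')
AS SNAPSHOT OF AdventureWorks;
GO

-- Query the snapshot like any other (read-only) database
SELECT COUNT(*) FROM AdventureWorks_Snapshot.Sales.Store;
GO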

Database mirroring and snapshot are discussed in great detail in Chapter 8.

Online Index Operations

Index fragmentation is one of the main reasons for performance degradation in SQL Server query execution. To fix fragmented indexes in SQL Server 2000, there were three options: drop and re-create the indexes manually, rebuild the indexes by using DBCC DBREINDEX, or defragment the indexes by using DBCC INDEXDEFRAG. Most DBAs chose DBCC DBREINDEX or INDEXDEFRAG. The problems with DBREINDEX are that it is an atomic operation (if you cancel it, the rollback might take a long time) and that table data cannot be updated while the index is being rebuilt. INDEXDEFRAG is an online operation and can be canceled; however, it does not utilize parallelism and can take a very long time to complete if an index is highly fragmented.

SQL Server 2005 supports the same three options as SQL Server 2000 for fixing index fragmentation. It also introduces three new options: ALTER INDEX...REORGANIZE to defragment an index, ALTER INDEX...REBUILD to rebuild an index, and ALTER INDEX...REBUILD WITH (ONLINE = ON) to rebuild an index online.
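
Here is a minimal sketch of the new syntax, using an AdventureWorks table for illustration:

-- Defragment all indexes on the table (always an online operation)
ALTER INDEX ALL ON Sales.SalesOrderDetail REORGANIZE;
GO

-- Rebuild all indexes on the table while allowing concurrent updates
ALTER INDEX ALL ON Sales.SalesOrderDetail REBUILD WITH (ONLINE = ON);
GO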

Online index rebuilding allows concurrent modifications to the underlying table or index. SQL Server 2005 Profiler has a new event class that can be used to track the progress of online index operation.

Online index operation and other features related to indexes are discussed in more detail in Chapter 8.

Instant File Initialization

Let’s say you have to restore a 500GB database from a backup. You begin by creating a new blank database that is 500GB or larger in size. With SQL Server 2000, this step alone could take a few hours because SQL Server reserved the space and then initialized the data file by filling it with zeros. SQL Server 2005 introduces a new feature called instant file initialization, which skips the step of zeroing out the file bytes; therefore, even very large databases are created almost instantly. Note that instant file initialization works for data files only, not for log files.

Online, Piecemeal, and Page Restore

SQL Server 2005 introduces new features so that, in case of recovery, a database can be made available as soon as possible. If a database contains multiple filegroups, you can bring the database online as soon as the primary filegroup is restored and restore the other filegroups in the background. Page restore is designed to restore torn pages, or pages that a checksum shows to be corrupt, without restoring the entire database.
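
Here is a minimal sketch of a page restore; the page ID (file 1, page 57) and the backup file path are hypothetical:

-- Restore a single damaged page from a full database backup
RESTORE DATABASE AdventureWorks
    PAGE = '1:57'
    FROM DISK = 'C:\Backups\AdventureWorks.bak'
    WITH NORECOVERY;
GO
-- Subsequent log backups must then be restored to bring the page current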

More discussion on high-availability features can be found in Chapter 8.

Replication

SQL Server Replication is a commonly used technology for copying and distributing data and database objects from one database to another, possibly on a different server, and then synchronizing between the databases to maintain consistency. SQL Server 2000 supported three replication types: snapshot, transactional, and merge. SQL Server 2005 builds on this solid foundation and introduces about 30 new features that extend the replication platform to provide new capabilities, simplify configuration and monitoring, and give better control to administrators. The following sections briefly discuss some of these new features, which are then discussed in more detail, along with other replication enhancements, in Chapter 8.

Replication of Schema Changes

With SQL Server 2000, administrators generally considered replication best suited for static environments, where there were no schema changes on the publisher. This was because SQL Server 2000 provided very limited support for making schema changes once an object was published. In many cases, administrators cannot afford to resynchronize subscriptions just to apply schema changes on the subscribers. The only two stored procedures available to accommodate schema changes on published objects were sp_repladdcolumn and sp_repldropcolumn. A combination of these two stored procedures could be used to implement some other column schema changes (such as changing a data type), in addition to adding or dropping a column in a published object.

SQL Server 2005 contains broader support for making schema changes on published objects, without any need to run a specialized stored procedure such as sp_repladdcolumn. SQL Server 2005 makes use of the DDL trigger mechanism to capture events such as ALTER TABLE and ALTER VIEW and then applies the schema changes made at the publisher on the subscribers.

Oracle Publishing

In snapshot and transactional replication topologies, SQL Server 2005 now supports publishing data from Oracle version 8.0.5 and later, running on any operating system. Nothing needs to be installed on the Oracle server, you can configure and administer everything from the SQL Server tools, and you don’t need deep Oracle expertise to configure an Oracle publisher.

Peer-to-Peer Transactional Replication

Peer-to-peer transactional replication is essentially an efficient way to implement scalable and highly available SQL Server solutions. Each node in the topology, a peer, can update the data, and the changes are synchronized to the other peers, ensuring that all peers have the same copy of the data. Workload can therefore be distributed among the peers, or it can fail over to another peer. Unlike merge replication, peer-to-peer transactional replication does not provide conflict detection and resolution. It is designed for environments where peers update different parts of the data but the same database has to be available on all the nodes.

Merge Replication over HTTPS

Merge replication subscribers pulling data over the Internet by using IIS on the publisher side is not new; SQL Server CE subscribers could always do that. However, a subscriber that was not SQL Server CE Edition could not connect to the publisher over HTTPS. SQL Server 2005 removes this limitation and allows both Mobile and Server editions to connect to the publisher over HTTPS.

There are many other replication improvements in SQL Server 2005 related to functionality, security, performance, monitoring, programming (RMO), and administration. Refer to Chapter 8 for more details.

Business Intelligence Redefined

Business intelligence refers to tools and techniques that support better decision making. By using these tools and techniques, you can turn raw data into insight. Data from possibly disparate source systems can be intelligently integrated to build a foundation for reporting, analysis, data mining, prediction, and expert systems.

SQL Server 7 first introduced OLAP Services, a data warehouse engine for online analytical processing. To continue the evolution of the business intelligence platform, SQL Server 2000 introduced several significant advancements to OLAP Services and renamed the technology Analysis Services.

Analysis Services 2000, in conjunction with data mining models and Data Transformation Services (DTS), which is Microsoft’s premier Extract-Transform-Load (ETL) tool, was a platform of choice for building data warehousing and business intelligence solutions.

SQL Server 2005 revolutionizes and redefines the business intelligence platform by introducing a completely redesigned ETL and integration platform known as SQL Server Integration Services (SSIS; formerly known as DTS), adding several new features and data mining models to Analysis Services, integrating a Web-based reporting solution, and providing integrated management and development tools.

The main theme of business intelligence in SQL Server 2005 is integrate, analyze, and report: You use SSIS for integration, Analysis Services 2005 and Data Mining for analyzing, and Reporting Services and Report Builder for presentation and delivery of reports. Figure 4.6 shows the core components of the SQL Server 2005 business intelligence framework.

Figure 4.6. SQL Server 2005 provides new tools and technologies to build business intelligence solutions.


The business intelligence framework in SQL Server 2005 is based on a solid foundation of a relational engine, and it is supported by innovative development and management tools.

SSIS has been redesigned from scratch, and it goes a step beyond being just an ETL tool. The new SSIS is designed to be an ultimate solution to integrate data from various sources, massage it, and send it to one or more destinations. SSIS is discussed in more detail in Chapter 12, “SQL Server Integration Services Overview.”

Microsoft really listened to the feedback it received on Analysis Services 2000, and it combined that with innovative techniques to make Analysis Services 2005 a highly scalable, available, and secure platform for building multidimensional solutions. Analysis Services 2000 is single-instance and cluster-unaware, whereas Analysis Services 2005 supports up to 50 instances and 8-node (32-bit) or 4-node (64-bit) failover clusters. Other improvements include enhancements to the Analysis Services engine and to cube and dimension handling, new data mining models, a new .NET Framework–based object model for programming, and native support for the XML for Analysis (XMLA) 1.0 specification. As you will see in Chapter 13, “What’s New in SQL Server Analysis Services 2005,” Analysis Services 2005 is loaded with new features as well as improvements to existing functionality.

In January 2004, Microsoft announced SQL Server 2000 Reporting Services, an enterprise reporting platform that supports the full reporting life cycle, including authoring, management, and delivery. Reporting Services can be used for both paper-oriented and interactive, Web-based reports. Reporting Services is now one of the core subsystems in SQL Server 2005. It incorporates the feedback that Microsoft received on SQL Server 2000 Reporting Services, plus it has tighter integration with SQL Server 2005 and Analysis Services 2005. SQL Server Management Studio, Business Intelligence Development Studio, and other tools have been enhanced to support Reporting Services. SQL Server 2005 Reporting Services supports rich client printing, multivalued parameters, and interactive sorting, and it contains an enhanced expression editor.

SQL Server 2005 also introduces a new ad hoc report design tool targeted at business users. This tool, known as SQL Server 2005 Reporting Services Report Builder, is based on business intelligence technology acquired from ActiveViews Inc. in April 2004. Business users can very easily create reports based on data in a SQL Server relational database or Analysis Services OLAP cubes, without understanding or writing any T-SQL or MDX queries. The reports created using Report Builder can be published to Reporting Services or SharePoint Server.

Introducing Business Intelligence Development Studio

In much the same way as SQL Server Management Studio integrates Enterprise Manager and Query Analyzer into one shell, Business Intelligence Development Studio is an integrated environment for designing and building end-to-end business intelligence solutions. Business Intelligence Development Studio is a single tool that information workers can use to define new Analysis Services data sources, cubes, dimensions, and data mining models; design ETL packages; build reports; and deploy an entire solution to a test or production environment. Essentially, it combines Analysis Manager and DTS Designer, and it adds several new capabilities.

Like Management Studio, Business Intelligence Development Studio is also based on Visual Studio .NET, and it allows you to work with projects and solutions with complete SourceSafe integration. Figure 4.7 shows the Business Intelligence Development Studio environment, with an Analysis Services project being developed.

Figure 4.7. Business Intelligence Development Studio is an integrated IDE for building and deploying business intelligence solutions.


You might wonder when to use SQL Server Management Studio versus Business Intelligence Development Studio. If you are developing or maintaining business intelligence solutions, such as designing SSIS packages, Reporting Services reports, or Analysis Services objects, you should use Business Intelligence Development Studio. However, if you are responsible for developing and administering relational database objects or for administering and configuring already deployed solutions that use business intelligence technologies, you should use SQL Server Management Studio.

Notification Services

Like Reporting Services, the Notification Services technology was first introduced as an add-on to SQL Server 2000. Released in August 2002, Notification Services is a highly scalable solution for monitoring events and sending alerts to millions of subscribers. It is a programming framework for building applications that generate and send notifications to subscribed users. MSN Alerts and The New York Times are examples of where this technology is already being used to send alerts to subscribers. The platform is now tightly integrated into SQL Server 2005: installation of Notification Services is part of SQL Server 2005 setup, Notification Services now supports the 64-bit platform, and SQL Server Management Studio provides a user interface for deploying and administering Notification Services instances, a task that required command-line tools with Notification Services for SQL Server 2000.

Full-Text Search Enhancements

Full-Text Search has been available since SQL Server 7. It is designed to allow fast and flexible indexing and querying of unstructured data in a database. SQL Server 2005 Full-Text Search enhancements are mainly related to performance and integration.

Full-Text Search now supports multiple instances installed on the same server. The engine has been optimized for faster indexing and querying: according to Microsoft, a full-text index over 20 million rows now takes about 10 hours to populate, whereas it used to take about 14 days, and full-text queries run 30% to 50% faster than in previous releases. New full-text DDL statements simplify configuration and administration. Running full-text queries over a linked server and searching across multiple selected columns are some of the new querying capabilities. Finally, the biggest improvement is that when you back up, restore, attach, or detach a database, the full-text catalogs can be included. There are several other new features, such as thesaurus support, XML data type support, and accent sensitivity/insensitivity.
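
Here is a minimal sketch of the new DDL; the catalog name is arbitrary, and the key index name is the one assumed for the AdventureWorks ProductDescription table:

-- Create a full-text catalog and index the Description column
CREATE FULLTEXT CATALOG ProductCatalog AS DEFAULT;
GO
CREATE FULLTEXT INDEX ON Production.ProductDescription(Description)
    KEY INDEX PK_ProductDescription_ProductDescriptionID
    ON ProductCatalog;
GO

-- Query the indexed column
SELECT ProductDescriptionID, Description
FROM Production.ProductDescription
WHERE CONTAINS(Description, 'aluminum');
GO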

Documentation Improvements

SQL Server Books Online is the first resource that database professionals and information workers refer to for any clarification or to understand SQL Server concepts. SQL Server 2005 Books Online has been significantly enhanced for better usability, searching, and organization of information. It is an integrated resource for both the relational engine and Analysis Services. The new tabbed interface, ability to save searches, improved full-text searching capabilities, new tutorials and samples, and search integration with the MSDN and Codezone Community websites are some of the features that enhance the usability of Books Online to quickly locate relevant information.

Summary

SQL Server 2005 includes hundreds of new features related to security, scalability, high availability, performance, monitoring and maintenance, developer productivity, standards compliance, business intelligence, and database administration. This chapter sets the stage for the rest of the book by giving an overview of these features. In Chapter 5, you will learn about new tools and utilities that you can use as a DBA to administer and manage SQL Server technologies.
