Chapter 11. Linked Servers and Distributed Transactions

Networking environments are becoming more and more complex. Organizations that managed with a single Microsoft SQL Server now need additional servers, or they need to integrate their existing server with other heterogeneous data sources. SQL Server 2005 provides several features for integrating SQL Server databases with other SQL Server databases and with other data sources, including distributed data, linked servers, and replication. This chapter focuses on linked servers and distributed data. Distributed data includes support for distributed queries, distributed transactions, and remote stored procedure execution. These distributed data features are handled through linked servers, which can be SQL Servers or non-SQL Servers. You will learn about replication in the next chapter, "Chapter 12".

Working with Linked Servers and Distributed Data

Before you use distributed data, you must configure the linked servers you want to use. Linked servers depend on OLE DB providers to communicate with one another. Through OLE DB, you can link instances of SQL Server to other instances of SQL Server as well as to other data sources.

You use linked servers to handle distributed queries, distributed transactions, remote stored procedure calls, and replication. Basically, queries and transactions are distributed when they make use of two or more database server instances. For example, if a client is connected to one server instance and starts a query that accesses a different server instance, the query is distributed. On the other hand, if the same client queries two different databases on the same server instance, the query is considered a local query and is handled internally.

Using Distributed Queries

When you execute a distributed query, SQL Server interprets the command and then breaks it down for the destination OLE DB provider using rowset requests. A rowset is a type of database object that enables OLE DB data providers to support data with a tabular format. As their name implies, rowset objects represent a set of rows and columns of data. After creating the rowset objects, the OLE DB provider calls the data source, opens the necessary files, and returns the requested information as rowsets. SQL Server then formats the rowsets as result sets and adds any applicable output parameters.

Note

Note

With SQL-92, user connections must have the ANSI_NULLS and ANSI_WARNINGS options before they can execute distributed queries. Be sure to configure these options, if necessary. For more information, see the section titled "Configuring User and Remote Connections" in Chapter 6.

You can create simple distributed queries quickly by making your own rowsets. To do this, you use the Openrowset function. When you use this function, you do not need to use linked servers, and you can use the Openrowset function in place of a table in a query if you pass parameters that identify the OLE DB data source and provider.

You use the Openrowset function in the same way that you use virtual tables; simply replace the virtual table reference with an Openrowset reference. Example 11-1 shows the syntax and usage of Openrowset.

Example 11-1. Openrowset Syntax and Usage

Syntax for SELECT with Table Alias

SELECT selection FROM OPENROWSET(rowset_options) AS table_alias

Syntax for Openrowset

OPENROWSET

( { 'provider_name' , { 'datasource' ; 'user_id' ; 'password   
   | 'provider_string' }                                        
      , { [ catalog. ] [ schema. ] object                      
   | 'query' }                                                 
   | BULK 'data_file' ,                                       
       { FORMATFILE = 'format_file_path' [ <bulk_options> ]    
       | SINGLE_BLOB | SINGLE_CLOB | SINGLE_NCLOB }                
} )                                                             
                                                               
<bulk_options> ::=                                             
   [ , CODEPAGE = { 'ACP' | 'OEM' | 'RAW' | 'code_page' }]      
   [ , ERRORFILE = 'file_name' ]                                
   [ , FIRSTROW = first_row ]                                   
   [ , LASTROW = last_row ]                                     
   [ , MAXERRORS = maximum_errors ]                             
   [ , ROWS_PER_BATCH = rows_per_batch ]                        
                                                    
Usage

USE pubs

GO                                                             
SELECT a.*                                                     
FROM OPENROWSET('SQLOLEDB','Pluto';'netUser';'totem12',        
'SELECT * FROM pubs.dbo.authors ORDER BY au_lname, au_fname')  
AS a                                                           
GO
SELECT o.*                                                   
FROM OPENROWSET('Microsoft.Jet.OLEDB.4.0','C:                 
northwind.mdb';'Admin';", 'Orders')                            
AS o

The BULK rowset provider is similar to the BULK INSERT statement. The data_file parameter is used to specify the data file from which data will be copied into the target table. A format file is required to define the column types in the result set, except when you use SINGLE_BLOB, SINGLE_CLOB, or SINGLE_NCLOB. SINGLE_BLOB returns the contents of the data file as a single-row, single-column rowset of type varbinary (max). SINGLE_CLOB reads the data file as ASCII text and returns the contents of the data file as a single-row, single-column rowset of type varchar (max). SINGLE_NCLOB reads the data file as Unicode text and returns the contents as a single-row, single-column rowset of type nvarchar (max). Both SINGLE_CLOB and SINGLE_NCLOB use the collation of the current database.

When the OPENROWSET BULK option is used with an INSERT statement, you can use standard table hints, such as TABLOCK, as well as the special BULK INSERT table hints: IGNORE_CONSTRAINTS, IGNORE_TRIGGERS, KEEPDEFAULTS, and KEEPIDENTITY. When you use the BULK rowset provider with OPENROWSET, you must either specify column aliases in the FROM clause or specify column names in the format file. The syntax for the SELECT statement with the table alias then becomes:

SELECT selection FROM OPENROWSET(BULK rowset_options) AS 
table_alias[(column1_alias, column2_alias,...)]

Using Distributed Transactions

Distributed transactions are transactions that use distributed queries or remote procedure calls (RPCs). As you might expect, distributed transactions are more involved than distributed queries, primarily because you need a mechanism that ensures that transactions are committed uniformly or rolled back on all the linked servers. For example, if you start a transaction that updates databases on three different server instances, you want to make certain that the transaction is committed when it has completed successfully or that the transaction is rolled back if an error occurs. In this way, you ensure the integrity of the databases involved in the distributed transaction.

On SQL Server, three components are required for distributed transactions to be handled properly:

  • Resource managers. You must configure resource managers, which are the linked servers used in the distributed transactions. For details about how to configure resource managers, see the section titled "Managing Linked Servers" later in this chapter.

  • Distributed Transaction Coordinator service. The Distributed Transaction Coordinator service must be running on all servers that are handling distributed transactions. If it is not, distributed transactions will not work properly.

  • Transaction manager. The transaction manager coordinates and manages distributed transactions. The transaction manager on SQL Server is the Distributed Transaction Coordinator.

Note

Note

Applications other than SQL Server can use the Distributed Transaction Coordinator. If you try to analyze Distributed Transaction Coordinator performance, you should note which applications besides SQL Server are using Distributed Transaction Coordinator.

Each server instance involved in a distributed transaction is known as a resource manager. Resource managers coordinate transactions through a transaction manager, such as the Microsoft Distributed Transaction Coordinator. You can use other transaction managers if they support the X/Open XA specification for distributed transaction processing.

You handle distributed transactions in much the same manner as local transactions. Applications start distributed transactions in several ways:

  • Explicitly, by using BEGIN DISTRIBUTED TRANSACTION

  • Explicitly, by using OLE DB methods or ODBC functions to join a distributed transaction started by the application

  • Implicitly, by executing a distributed query within a local transaction

  • Implicitly, by calling a remote stored procedure within a local transaction (provided the REMOTE_PROC_TRANSACTIONS option is set ON)

At the end of the transaction, the application requests that the transaction be either committed or rolled back. To ensure that the transaction is handled properly on all servers, even if problems occur during the transaction, the transaction manager uses a commit process with two phases:

  • Phase 1: The prepare phase. The transaction manager sends a prepare to commit request to all the resource managers involved in the transaction. Each resource manager performs any necessary preparatory tasks and then reports their success or failure to the transaction manager. If all the resource managers are ready to commit, the transaction manager broadcasts a commit message and the transaction enters phase 2, the commit phase.

  • Phase 2: The commit phase. The resource managers attempt to commit the transaction. Each resource manager then sends back a success or failure message. If all the resource managers report success, the transaction manager marks the transaction as completed and reports this to the application. If a resource manager fails in either phase, the transaction is rolled back and the failure is reported.

SQL Server applications manage distributed transactions either through Transact-SQL or through the SQL Server database application programming interface (API). SQL Server itself supports distributed transactions using the ITransactionLocal (local transaction) and ITransactionJoin (distributed transactions) OLE DB interfaces as well as the rowset objects discussed previously. If an OLE DB provider does not support ITransactionJoin, then only read-only procedures are allowed for that provider. Similarly, the types of queries you can execute on a linked server depend on the OLE DB provider you are using.

With distributed queries and transactions, you can use most data manipulation language (DML) commands, such as SELECT, INSERT, UPDATE, and DELETE. You cannot, however, use data definition language (DDL) commands, such as CREATE, DROP, or ALTER. If you need to use DDL commands on linked servers, you may want to create stored procedures and then execute these stored procedures remotely, as necessary.

You can use the EXECUTE statement to execute commands and stored procedures on a linked server. Example 11-2 shows the syntax for using EXECUTE in this way.

Example 11-2. EXECUTE at linked_server

Syntax

EXEC [UTE] ( { @string_variable | [ N ] 'command_string' } [ + ...n ] 
    [ {, { value | @variable [ OUTPUT ] } } [...n] ] )  
    [ AS { LOGIN | USER } = ' name ' ]  
    [ AT linked_server_name ] [;]

Usage

EXEC ( 'SELECT * FROM william.sales') AT ORADBSVR8;

Running the Distributed Transaction Coordinator Service

The Distributed Transaction Coordinator service must run on each server that handles distributed transactions, and you usually will want the service to start automatically when the system starts. This ensures that the distributed transactions are executed as expected. By using SQL Server Configuration Manager, you can control the Distributed Transaction Coordinator service just as you do other SQL Server-related services. For details, see the section titled "Configuring SQL Server Services" in Chapter 3.

You can view the Distributed Transaction Coordinator service in SQL Server Management Studio by completing the following steps:

  1. In SQL Server Management Studio, connect to the server instance you want to use.

  2. In Object Explorer view, expand the Management node. You will see the status of the Distributed Transaction Coordinator and Full-Text Search services displayed. A green circle with a triangle indicates that the service is running. A red circle with a square indicates that the service is stopped.

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset