Chapter 10. XML and Web Services Support in SQL Server 2005

IN THIS CHAPTER

XML Primer

The xml Data Type

FOR XML Enhancements

Native XML Web Services Support

Extensible Markup Language (XML) is a set of guidelines for describing structured data in plain text. These guidelines are defined as a specification by the World Wide Web Consortium (W3C; www.w3.org), the body that sets the standards for the Web. Because of its textual nature, XML has been widely adopted as a platform-independent format for data representation. Some of the current applications of XML include exchange of data over the Internet, application integration, content management and document publishing, use as configuration files, news/weblog syndication (RSS), and use as a file format such as Word documents stored as XML.

With the increased use of XML for representing documents and forms, more and more developers want to store XML documents directly into databases and also be able to effectively query and retrieve the XML data. SQL Server 2005 is the first release that has native support for XML storage and querying. This chapter shows you how to store XML data in a database column, index it, modify it, and query it by using another W3C specification, called XQuery.

In addition to the new XML data type and XQuery querying capabilities, SQL Server 2005 also includes enhancements to the OPENROWSET function to allow bulk loading of XML data, as well as several enhancements to the existing FOR XML clause.

The other topic discussed in this chapter is HTTP SOAP or web services support in SQL Server 2005. HTTP SOAP support in SQL Server 2005 allows you to execute T-SQL batches, stored procedures, and scalar-valued user-defined functions over the Internet, from any client platform. For instance, a Unix client application can use Perl to post a SOAP request envelope over the Internet (via HTTP or HTTPS) to a SQL Server 2005 web service method, which in turn can execute a batch or a stored procedure or a function on the server side and return a SOAP response envelope, which is an XML document that contains the results, which can be then parsed by the client Perl application on the Unix machine. This enables cross-platform application integration. You do not need Microsoft Internet Information Server (IIS) or any web server to enable and use the web services support in SQL Server 2005. The second part of this chapter discusses this in great detail.

Before looking at XML support in SQL Server 2005, here is a quick introduction to XML.

XML Primer

If you have ever seen or worked with HTML, you will find an XML document very similar to it. Both HTML and XML are markup languages based on tags within angled bracket. But unlike HTML, XML does not have a fixed set of tags. You can create your own tags, as long as the tags adhere to the rules defined by the XML specification, and then include your data between those tags. Because of this extensibility feature, today more than 450 other markup languages or standards are based on XML, which means they use XML syntax (for example, Scalable Vector Graphics [SVG], VoiceXML, Wireless Markup Language [WML], NewsML).

Whereas HTML’s primary focus is presentation, XML’s main focus is data and its structure. You can use technologies such as XSL Transformations (XSLT) and XSL Formatting Objects (XSL-FO) to present the data in an XML document in different formats. This is one of the reasons XML is being used heavily in content management and document publishing applications.


Note

Unlike HTML, XML is case-sensitive.


The textual nature of XML makes data highly portable, enabling you to send data and integrate applications across platforms. The most common use of XML is for “data on move”—that is, transferring data from one machine to another, cross-platform, crossnetworks, and over the Internet.

Let’s now see what an XML document looks like.

Well-Formed XML Document

As mentioned earlier, an XML document might look very similar to an HTML document. However, XML has more strict syntax rules than HTML. An XML document must meet the following requirements:

• All tags must be closed.

• Tags are case-sensitive.

• An XML document must have a single top-level root element.

• Elements must be nested properly, without overlap.

• No element may have two attributes with the same name.

• Attribute values must be enclosed in either single or double quotes.

If an XML document meets all these requirements, it is considered a well-formed XML document. Here is an example of a well-formed XML document that contains some details from a survey form:

<IndividualSurvey xmlns="http://schemas.microsoft.com/AW/IndividualSurvey">
  <TotalPurchaseYTD>8248.99</TotalPurchaseYTD>
  <DateFirstPurchase>2001-07-22Z</DateFirstPurchase>
  <BirthDate>1966-04-08Z</BirthDate>
  <MaritalStatus>M</MaritalStatus>
  <YearlyIncome>75001-100000</YearlyIncome>
  <Gender>M</Gender>
  <TotalChildren>2</TotalChildren>
  <NumberChildrenAtHome>0</NumberChildrenAtHome>
  <Education>Bachelors &amp; Masters</Education>
  <Occupation>Professional</Occupation>
  <HomeOwnerFlag>1</HomeOwnerFlag>
  <NumberCarsOwned>0</NumberCarsOwned>
  <CommuteDistance>&lt; 4 Miles</CommuteDistance>
</IndividualSurvey>

The XML specification identifies five characters (that is, <, >, &, ', and ") that have special meanings. If any of these characters is required, the alternative entity references (that is, &lt;, &gt;, &amp;, &apos;, and &quot;) must be used in its place in an XML document.

In addition to elements and attributes, an XML document may contain other specialpurpose tags, such as comments (enclosed in <!-- ... -->), processing instructions (enclosed in <? ... ?>) to provide some instructions to the XML parser, and CDATA sections (enclosed in <![CDATA[ ... ]]>). Everything inside a CDATA section is ignored by the parser, so you can include special characters such as <, >, and & within a CDATA section without escaping them as &lt;, &gt;, and &amp;.

Valid XML Documents

One of the primary goals of XML is to enable exchange of data between organizations and applications. To do this, the XML document format that is used for the exchange of information must first be defined and agreed upon. W3C provides a specification called XML Schema Definition (XSD) that can be used to define the structure of an XML document, the data types of elements and attributes, and other rules and constraints that an XML document should follow. If a well-formed XML document adheres to the structure and rules defined by a XSD schema document, that XML document is considered a valid XML document. You can find more details on XSD at www.w3.org/XML/Schema.

XML Namespaces

Namespaces are generally used for two purposes:

• To group related information under one umbrella

• To avoid name collisions between groups

XML namespaces also serve these two purposes. The following sample XML document declares two namespaces, and for each namespace, it defines a prefix (s1 and s2), which is a convenient name for the lengthy namespace URI. The first three child elements are in the s1 namespace, and the others are in the s2 namespace (see Listing 10.1).

Listing 10.1. Survey XML Document

<?xml version="1.0" encoding="UTF-8"?>
<StoreSurvey xmlns:s1="http://schemas.microsoft.com/Survey1"
             xmlns:s2="http://schemas.microsoft.com/Survey2">
  <s1:AnnualSales>300000</s1:AnnualSales>
  <s1:AnnualRevenue>30000</s1:AnnualRevenue>
  <s1:BankName>International Bank</s1:BankName>

  <s2:AnnualSales currency="USD">320000</s2:AnnualSales>
  <s2:AnnualSales currency="Euro">2000</s2:AnnualSales>
  <s2:AnnualRevenue>29000</s2:AnnualRevenue>
  <s2:BankName>National Bank</s2:BankName>

</StoreSurvey>

XML does not enforce the use of namespaces, and it is totally legal for elements to have the same name. So you could have AnnualSales or BankName elements repeated without using the namespace, but how would you identify in your code which one is which? This is where namespaces come handy. Note in Listing 10.1 that the root element (StoreSurvey) is in no namespace, which is the default namespace here.

Also note the use of a processing instruction, an XML declaration line at the top of Listing 10.1, which tells the XML parser about the XML specification version that the document is based on and the text encoding used for the document.

Navigating XML by Using XPath

XML Path (XPath) is yet another W3C specification (see www.w3.org/TR/xpath), and it is widely used to search through XML documents and to retrieve specific parts of an XML document. XPath is based on the notion that every XML document can be visualized as a hierarchical tree. XPath is a language for expressing paths through such trees, from one node to another. XPath query syntax is similar to the syntax you would use to locate a file or files on a Unix file system (that is, using forward slashes to indicate levels of hierarchy). In addition, XPath enables you to retrieve elements or attributes that satisfy a given set of criteria.

For instance, the following XPath expression can be used to retrieve the annual revenue element in the s2 namespace from Listing 10.1:

/StoreSurvey/s2:AnnualRevenue

The conditions are expressed in square brackets, and the attribute names are preceded with the @ symbol. For instance, the //s2:AnnualSales[@currency="USD"] XPath expression returns annual sales elements that have the currency attribute value USD.

In the following sections, you’ll see how SQL Server 2005 supports XML.

The xml Data Type

SQL Server 2005 introduces a new data type, xml, to store XML data inside a relational database. The xml data type can be used for a column in a table, as a variable, as a parameter to a stored procedure or function, and as a function return value. The SQL Server 2005 xml data type implements the ISO SQL-2003 standard xml data type. You can store an entire XML document or XML fragments (XML data without single top-level element) in xml data type columns and variables. By default, SQL Server does not enforce the storing of well-formed or valid XML documents, and it allows the storing of XML fragments in xml data type columns and variables.

As with other data types, you can assign defaults, define column-level and table-level constraints, create XML indexes, and define full-text indexes on xml data type columns. However, there are few restrictions:

• SQL Server internally stores XML data as a binary large object (BLOB), and hence the internal representation of XML in a column cannot exceed 2GB. The depth of the XML hierarchy is limited to 128 levels in SQL Server 2005.

• The xml data type column cannot be defined as a primary key or foreign key; in addition, the UNIQUE, COLLATE, and RULE constraints cannot be used on xml data type columns.

• The xml data type cannot be compared or sorted, except with the IS NULL operator. Hence, you cannot use an xml data type column with an ORDER BY or GROUP BY clause.

• An xml data type column cannot be part of a clustered or a nonclustered index. You can define XML indexes on xml type columns, as discussed later in this chapter.

• The XML declaration processing instruction (<?xml ... ?>) is not preserved when the instance is stored in the database. XML data is always stored and retrieved using UTF-16 encoding. If an XML document has other processing instructions besides the XML declaration line, they all are preserved.

• The order of attributes and insignificant whitespace is not preserved. Also, the single or double quotes around attributes are not preserved; SQL Server always returns attributes enclosed in double quotes.

• You cannot cast an xml data type column to either text or ntext. You can cast or convert an xml data type to varchar or nvarchar, including to varchar(max) and nvarchar(max), and then use that to implicitly or explicitly cast to text or ntext.

• Only ISNULL, COALESCE, and DATALENGTH scalar functions are supported with xml data type columns.

xml data type columns cannot be used in distributed partitioned views.


Caution

The xml data type is not supported on SQL Server 2005 Mobile Edition. If xml data type columns are synced to the SQL Server Mobile database, they will be converted to the ntext type.


Here is an example of using the xml data type in a table:

USE [AdventureWorks];
GO

--User-defined functions used later while creating a table
IF OBJECT_ID('dbo.IsXMLFragment') IS NOT NULL
   DROP FUNCTION dbo.IsXMLFragment;
GO
CREATE FUNCTION dbo.IsXMLFragment(@xmlData xml) returns bit
BEGIN
   RETURN
      CASE @xmlData.value('count(/*)', 'bigint')
         WHEN 1 THEN (@xmlData.exist('/text()'))
         ELSE 1
      END
END;
GO

IF OBJECT_ID('dbo.EmpIDPresent') IS NOT NULL
   DROP FUNCTION dbo.EmpIDPresent;

GO
CREATE FUNCTION dbo.EmpIDPresent(@xmlData xml) returns bit
BEGIN
   RETURN
     @xmlData.exist('/Employee/@ID')
END;
GO

IF OBJECT_ID('dbo.tblXMLTest') IS NOT NULL
   DROP TABLE dbo.tblXMLTest;
GO
CREATE TABLE dbo.tblXMLTest
   (id INT IDENTITY(1,1) PRIMARY KEY,
    col1 XML,
    col2 XML NOT NULL,
    col3 XML DEFAULT N'<Person />',
    col4 XML NOT NULL DEFAULT N'<Address />',
    col5 XML CHECK(dbo.IsXMLFragment(col5) = 0),
    col6 XML NULL CHECK(dbo.EmpIDPresent(col6) = 1) );
GO

This script is available in the file XMLDataTypeDemo.sql in the code download for this book. This script creates a table with six columns of xml data type. The second column, col2, specifies the NOT NULL constraint; the third column, col3, specifies the column default; the fifth column, col5, specifies a check constraint and uses a user-defined function to ensure that only well-formed XML documents are inserted into the column; and the sixth column, col6, also specifies a check condition and uses a user-defined function to ensure that the XML document or fragment has a top-level element called Employee and an attribute called ID. The exist() and value() methods used inside the user-defined functions, as well as other methods available with the xml data type, are discussed later in this chapter.

In the following script, the first three INSERT statements succeed, and the other four INSERTs fail:

INSERT INTO dbo.tblXMLTest(col2, col5) VALUES
(N'<col2Data />', N'<col5Data />'),
INSERT INTO dbo.tblXMLTest(col2, col5, col6) VALUES
(N'<col2Data />', N'<col5Data />', N'<Employee ID="123" />'),
INSERT INTO dbo.tblXMLTest(col2, col5, col6) VALUES
('<col2Data />', '<col5Data />', '<Employee ID="1"/><Employee ID="2"/>'),
GO

SELECT * FROM dbo.tblXMLTest;
GO

INSERT INTO dbo.tblXMLTest(col5, col6) VALUES
(N'<col5Data />', N'<Employee ID="123" />'),
INSERT INTO dbo.tblXMLTest(col2, col5, col6) VALUES
(N'<col2Data />', N'<col5Data /><col5Data />', N'<Employee ID="123" />'),
INSERT INTO dbo.tblXMLTest(col2, col5, col6) VALUES
(N'<col2Data />', N'<col5Data />', N'<Employee/>'),
INSERT INTO dbo.tblXMLTest(col2, col4, col5) VALUES
(N'<col2Data />', NULL, N'<col5Data />'),
GO

Of the four failed INSERTs toward the end, the first one fails because col2 does not allow NULLs, the second INSERT fails because col5 does not allow XML fragments, the third one fails because col6 does not have XML with the Employee top-level element having an ID attribute, and the final one fails because col4 is non-NULLable.

The following script shows how to declare a variable of the xml data type:

DECLARE @var1 XML
SET @var1 = '<TestVariables />'

INSERT INTO dbo.tblXMLTest(col2, col5) VALUES
(@var1, @var1);

SELECT * FROM dbo.tblXMLTest;
GO

The following script block illustrates how to pass an xml data type as a function parameter to implement a technique called “property promotion,” where some value from XML is copied to a separate table column, so that you don’t have to necessarily access and parse XML to access that value:

CREATE FUNCTION dbo.fn_GetID (@empData XML)
RETURNS INT
AS
BEGIN
   DECLARE @retVal INT
   -- do some more things here on the input XML
   SELECT @retVal = @empData.value('(/Employee/@ID)[1]', 'INT')
   RETURN @retVal
END;
GO

SELECT dbo.fn_GetID(col6) FROM dbo.tblXMLTest;

ALTER TABLE dbo.tblXMLTest ADD empNo AS dbo.fn_GetID(col6);

INSERT INTO dbo.tblXMLTest(col2, col5, col6) VALUES
('<col2Data />', '<col5Data />', '<Employee ID="999"/>'),
GO

SELECT * FROM dbo.tblXMLTest;
GO

IF OBJECT_ID('dbo.tblXMLTest') IS NOT NULL
   DROP TABLE dbo.tblXMLTest;
GO
IF OBJECT_ID('dbo.fn_GetID') IS NOT NULL
   DROP FUNCTION dbo.fn_GetID;
GO
IF OBJECT_ID('dbo.IsXMLFragment') IS NOT NULL
   DROP FUNCTION dbo.IsXMLFragment;
GO
IF OBJECT_ID('dbo.EmpIDPresent') IS NOT NULL
   DROP FUNCTION dbo.EmpIDPresent;
GO

This script begins with a user-defined function that accepts an xml data type parameter, uses the value function on the input XML parameter, and returns an integer value. Next, the ALTER TABLE statement adds a new computed column whose value is the integer returned by the fn_GetID function, passing the col6 column to it.

Validating XML Data by Using Typed XML

So far you have seen examples of storing untyped XML documents and XML fragments into xml data type columns, variables, and parameters. SQL Server 2005 supports associating XSD schemas with the xml data type so that the XML value for the column, variable, or parameter adheres to the structure and data types defined by the associated XSD schema. Having SQL Server validate the xml data type column, variable, or parameter by associating it with an XML schema collection is known as typed XML.

With typed XML, you can associate an XSD schema collection with the xml data type column, and the engine validates against the associated schema and generates an error if the validation fails. The typed XML has two main benefits. First, it ensures that the XML data in the column, variable, or parameter is according to the schema you desire. Second, it helps the engine to optimize storage and query processing. Also, when declaring the typed XML, you can specify the DOCUMENT clause, which ensures that the XML has only one top-level element and hence XML fragments are disallowed. The default is CONTENT, which allows XML fragments.


Tip

For performance reasons, it is recommended that you use typed XML. With untyped XML, the node values are stored as strings, and hence the engine has to do the data conversion when you extract the XML values or use node values in the predicate (for example, /person/age < 50); on the other hand, with typed XML, no data conversion takes place because the XML data values are internally stored based on types declared in the XSD schema. Typed XML makes parsing more efficient and avoids any runtime conversions.


Here’s an example of creating typed XML:

  1. Make sure you have an XSD schema in a XML schema collection. SQL Server 2005 provides the CREATE, ALTER, and DROP XML SCHEMA COLLECTION DDL statements, which can be used to manage the XML schema collections in a database. When you have your XSD schema in the database as part of an XML schema collection, it can be associated with any number of xml data type columns or variables to create typed XML.

    The following T-SQL statements create a schema collection called SampleXSDs that contains an XSD schema which defines the purchase order:

    USE AdventureWorks;
    GO

    IF EXISTS
    (SELECT schema_id FROM sys.xml_schema_collections
      WHERE name = 'SampleXSDs')
    BEGIN
        RAISERROR('Schema collection named SampleXSDs already exists.',
                16, 1);
        RETURN
    END;

    CREATE XML SCHEMA COLLECTION SampleXSDs AS
    N'<?xml version="1.0" encoding="UTF-16"?>
    <xsd:schema xmlns:xsd = "http://www.w3.org/2001/XMLSchema"
             targetNamespace = "http://schemas.sams.com/PO"
             elementFormDefault="qualified"
             attributeFormDefault="unqualified"
             xmlns = "http://schemas.sams.com/PO">

      <xsd:complexType name="USAddress">
        <xsd:sequence>
          <xsd:element name="name"   type="xsd:string"/>
          <xsd:element name="street" type="xsd:string"/>
          <xsd:element name="city"   type="xsd:string"/>
          <xsd:element name="state"  type="xsd:string"/>

          <xsd:element name="zip" type="xsd:decimal"/>
        </xsd:sequence>
        <xsd:attribute name="country" type="xsd:NMTOKEN"
                       fixed="US"/>
      </xsd:complexType>

      <xsd:complexType name="Items">
        <xsd:sequence>
          <xsd:element name="item" minOccurs="0" maxOccurs="unbounded">
            <xsd:complexType>
              <xsd:sequence>
                <xsd:element name="productName" type="xsd:string"/>
                <xsd:element name="quantity">
                  <xsd:simpleType>
                    <xsd:restriction base="xsd:positiveInteger">
                      <xsd:maxExclusive value="100"/>
                    </xsd:restriction>
                  </xsd:simpleType>
                </xsd:element>
                <xsd:element name="USPrice"  type="xsd:decimal"/>
                <xsd:element name="shipDate" type="xsd:date"
                             minOccurs="0"/>
              </xsd:sequence>
            </xsd:complexType>
          </xsd:element>
        </xsd:sequence>
      </xsd:complexType>

      <xsd:complexType name="PurchaseOrderType">
        <xsd:sequence>
          <xsd:element name="shipTo" type="USAddress"/>
          <xsd:element name="billTo" type="USAddress"/>
          <xsd:element name="items"  type="Items"/>
        </xsd:sequence>
        <xsd:attribute name="orderDate" type="xsd:date"/>
      </xsd:complexType>

      <xsd:element name="purchaseOrder" type="PurchaseOrderType"/>

    </xsd:schema>'

    The XSD schema first defines a type called USAddress, and then it defines a type called Items, and finally it defines a type called PurchaseOrderType, which makes use of the USAddress and Items types. The XSD schema defines a single root element called purchaseOrder that is of type PurchaseOrderType. Before creating a new schema collection, the script uses the sys.xml_schema_collections catalog view to check whether a schema collection with that name already exists. In this example, it raises an error and returns; alternatively, you can run ALTER XML SCHEMA COLLECTION and add an XSD schema to this existing schema collection.

  2. After you register an XSD schema in a schema collection, SQL Server 2005 stores the schema metadata in various system tables. You can use the XML catalog views to view the XSD schema collection details. Execute the following T-SQL batch to see the details of an XSD schema in the XML schema collection created in step 1:

    DECLARE @collection_id INT
    SELECT @collection_id = xml_collection_id
    FROM sys.xml_schema_collections
    WHERE name = 'SampleXSDs';

    SELECT * FROM sys.xml_schema_namespaces
    WHERE xml_collection_id = @collection_id;

    SELECT * FROM sys.xml_schema_elements
    WHERE xml_collection_id = @collection_id;

    SELECT * FROM sys.xml_ schema_attributes
    WHERE xml_collection_id = @collection_id;

    SELECT * FROM sys.xml_ schema_types
    WHERE xml_collection_id = @collection_id;

    SELECT XML_SCHEMA_NAMESPACE('dbo', 'SampleXSDs'),

    For each XML schema collection, there is an entry in the sys.xml_schema_ collections catalog view. One schema collection can contain multiple XSD schemas that have different target namespaces. For each such schema, there is an entry in the sys.xml_schema_namespaces catalog view. The types, elements, attributes, and so on are available via the catalog views sys.xml_schema_types, sys.xml_schema_elements, sys.xml_schema_attributes, and so on. The last statement shows how to view the registered XSD schema by using the XML_SCHEMA_ NAMESPACE function. Other schema collection catalog views include sys.xml_schema_components, sys.xml_schema_facets, sys.xml_schema_model_groups, sys.xml_schema_wildcards, and sys.xml_schema_wildcard_namespaces.

  3. When an XSD schema is available in an XML schema collection, associate the schema with the XML column to yield typed XML:

    IF OBJECT_ID('dbo.tblTypedXML') IS NOT NULL
       DROP TABLE dbo.tblTypedXML;
    GO

    CREATE TABLE dbo.tblTypedXML
       (id int IDENTITY(1,1) PRIMARY KEY,
        col1 XML (dbo.SampleXSDs));
    GO


    Note

    XML schema collections are database scoped and cannot be referenced across databases. Also, unlike temporary tables, there is no notion of temporary XML schema collections. Therefore, even if you precede a schema collection name with # or ##, it is still created in the current database and not in the tempdb database. If you are creating a temporary table that contains a typed XML column, the schema collection must first be created in the tempdb database.


    The first INSERT statement in the following T-SQL batch succeeds because it validates with the XSD schema specified for the typed XML column, but the second INSERT fails because it is missing the billTo element:

    INSERT INTO dbo.tblTypedXML (col1) VALUES
    (N'<purchaseOrder orderDate="2005-03-03Z"
                          xmlns="http://schemas.sams.com/PO">
       <shipTo country="US">
          <name>Alice Smith</name>
          <street>123 Maple Street</street>
          <city>Mill Valley</city>
          <state>CA</state>
          <zip>90952</zip>
       </shipTo>
       <billTo country="US">
          <name>Robert Smith</name>
          <street>8 Oak Avenue</street>
          <city>Old Town</city>
          <state>PA</state>
          <zip>95819</zip>
       </billTo>
       <items>
          <item>
             <productName>Lawnmower</productName>
             <quantity>1</quantity>
             <USPrice>148.95</USPrice>
          </item>
       </items>
    </purchaseOrder>
    '),

    PRINT 'Following statement will fail:'
    INSERT INTO dbo.tblTypedXML (col1) VALUES
    (N'<purchaseOrder orderDate="2005-03-03Z"
                         xmlns="http://schemas.sams.com/PO">
       <shipTo country="US">
          <name>Alice Smith</name>
          <street>123 Maple Street</street>
          <city>Mill Valley</city>
          <state>CA</state>
          <zip>90952</zip>
       </shipTo>

       <items>
          <item>
             <productName>Lawnmower</productName>
             <quantity>1</quantity>
             <USPrice>148.95</USPrice>
          </item>
       </items>
    </purchaseOrder>
    '),
    GO

  4. Finally, clean up:

    IF OBJECT_ID('dbo.tblTypedXML') IS NOT NULL
       DROP TABLE dbo.tblTypedXML;
    GO

    IF EXISTS
    (SELECT schema_id FROM sys.xml_schema_collections
               WHERE name = 'SampleXSDs')
    BEGIN
       DROP XML SCHEMA COLLECTION SampleXSDs
    END;

Bulk Loading XML Data

If you have XML data in a disk file and would like to load it into an xml data type column, you can use the BULK rowset provider with the OPENROWSET function and specify the SINGLE_CLOB option to read the entire file as a single-row, single-varchar(max) column value.

Let’s assume that you have the following XML text saved into a disk file called “c:PO.xml” on the SQL Server machine:

<purchaseOrder orderDate="1999-10-20" xmlns="http://schemas.sams.com/PO">
   <shipTo country="US">
      <name>Alice Smith</name>
      <street>123 Maple Street</street>
      <city>Mill Valley</city>
      <state>CA</state>
      <zip>90952</zip>
   </shipTo>
   <billTo country="US">
      <name>Robert Smith</name>
      <street>8 Oak Avenue</street>
      <city>Old Town</city>
      <state>PA</state>
      <zip>95819</zip>
   </billTo>
   <items>
      <item>
         <productName>Lawnmower</productName>
         <quantity>1</quantity>
         <USPrice>148.95</USPrice>
      </item>
   </items>
</purchaseOrder>

The following T-SQL statements illustrate creating a table with an untyped xml data type column and bulk loading the preceding XML file into this column:

USE AdventureWorks;
GO
IF OBJECT_ID('dbo.tblBulkLoadXML') IS NOT NULL
   DROP TABLE dbo.tblBulkLoadXML;
GO
CREATE TABLE dbo.tblBulkLoadXML
(id INT IDENTITY(1,1) NOT NULL PRIMARY KEY,
  POData XML NOT NULL);
GO
INSERT INTO dbo.tblBulkLoadXML (POData)
   SELECT * FROM
   OPENROWSET(BULK 'c:PO.xml', SINGLE_CLOB) AS POData;
GO
SELECT * FROM dbo.tblBulkLoadXML;


Note

You can use the technique discussed here for using the OPENROWSET function to read XML from a file. However, SQL Server 2005 does not provide any way to write the XML data from a XML column to a file. You can write a SQLCLR managed procedure to write XML into the file. Chapter 11, “SQL Server 2005 and .NET Integration,” contains the C# code for this.


Before we look at other functions to query and modify the xml data type columns and variables, let’s first take a look at a quick overview of the XQuery specification.

Introduction to XQuery

If SQL is used to query relational data, XQuery is a language that is used to query data that is either physically stored as XML or virtualized as XML. In addition, XQuery also allows general processing of XML (such as creating nodes). XQuery borrows a lot of features from XPath. XQuery 1.0 is said to be an extension to XPath 2.0, adding support for better iteration, sorting of results, and construction (that is, the ability to construct the shape of the desired XML). In summary, XQuery is an expression-based declarative query language that is used to efficiently extract data from XML documents. In SQL Server 2005, the data stored in xml data type columns and variables can be queried using XQuery. SQL Server 2005 partially implements the W3C XQuery specification (see www.w3.org/TR/xquery) and is aligned with the July 2004 working draft (see www.w3.org/TR/2004/WD-xquery-20040723/).

An XQuery query consists of two parts: the prolog and the query body. The optional prolog section is used to declare the namespaces used in the query, and the required query body consists of an expression that is evaluated by the SQL Server 2005 engine to produce the desired output. Each XQuery prolog entry must end with a semicolon (;).

These are the three most common expression types used in XQuery queries:

Path expressions—Exactly as in XPath, the path expressions are used to locate nodes within an XML tree. The path expression (for example, /purchaseOrder/items/item[5]/text()) may consists of steps (separated by / or //), filter steps, axis steps, predicates, a node test, and a name test.

FLWOR expressions—FLWOR, which stands for “for-let-where-order by-return,” is pronounced “flower,” and it is the core XQuery expression that allows looping, variable binding, sorting, filtering, and returning of results. SQL Server 2005 does not support the LET construct.

Constructors—As mentioned earlier, in addition to querying XML data, XQuery allows the creating of XML nodes. There are two approaches to creating XML nodes in XQuery. The “direct” approach involves directly writing XML text or using expressions that produce XML inside curly braces ({}). The other approach, called “computed,” involves the use of keywords such as element, attribute, document, text, processing-instruction, comment, or namespace to create the respective nodes. SQL Server 2005 supports both approaches.

SQL Server 2005 primarily provides three xml data type methods that can be used to run XQuery queries. These methods are query(), value(), and exist(). The other two xml type methods are nodes(), which is used to shred xml type instance into relational data, much like OPENXML, and the modify() method, which is used to modify the content of an xml type column or a variable.

XML Type Methods

The xml data type supports five methods that you can use to manipulate XML instances; you can call these methods by using xmltype.method() syntax:

• The query() method allows users to run an XQuery expression to obtain a list of XML nodes as an instance of untyped XML.

• The value() method is useful for extracting scalar values from XML documents.

• The exist() method is useful for evaluating XQuery expressions to determine whether the expression results in an empty or nonempty set. This method returns 1 for a nonempty result, 0 for an empty result, and NULL if the XML instance itself is NULL.

• The modify() method is useful for modifying the content of an XML document.

• The nodes() method is useful for decomposing an XML document into relational data.

Let’s begin with an example of the query() method:

USE AdventureWorks;
GO
SELECT Name, Demographics.query('
      declare namespace d=
  "http://schemas.microsoft.com/sqlserver/2004/07/adventure-works/StoreSurvey";
    d:StoreSurvey/d:AnnualSales')
    AS AnnualSales
FROM Sales.Store;
GO

This T-SQL statement illustrates the query() method. The parameter to the query() method is an XQuery query that begins with a namespace declaration in the prolog section and a path expression as the query body. This SELECT statement queries the Demographics xml data type column to find out each store’s annual sales. Note that the query() method returns the untyped XML as the result.

You can also declare namespaces by using the WITH XMLNAMESPACES clause, as shown here:

WITH XMLNAMESPACES (
  'http://schemas.microsoft.com/sqlserver/2004/07/adventure-works/StoreSurvey'
   AS "d")

SELECT Name, Demographics.query('d:StoreSurvey/d:AnnualSales') AS AnnualSales
FROM Sales.Store;
GO

You can use the value() method to get the node data as a scalar value:

SELECT Name, Demographics.value('
    declare namespace d=
   (d:StoreSurvey/d:AnnualSales)[1]', 'decimal(18,2)')
   AS AnnualSales
FROM Sales.Store;
GO

The value() method takes two parameters: an XQuery expression string and a string that indicates the type of returned scalar value (decimal(18,2), in this example). The second parameter cannot be specified as an xml data type, a CLR user-defined type, or an image, text, ntext, timestamp, or sql_variant data type. The XQuery expression must return a singleton or an empty sequence.

The exist() method takes just one parameter—the XQuery expression—and returns either 1 (indicating TRUE), 0 (indicating FALSE), or NULL (indicating that the xml data type instance was NULL). This method is generally used in the WHERE clause. Let’s say you want to get a list of stores that have T3 internet lines. One of the ways in which you can do this is by running the following query:

SELECT * FROM Sales.Store
WHERE Demographics.value('
declare namespace d=
"http://schemas.microsoft.com/sqlserver/2004/07/adventure-works/StoreSurvey";
""
(/d:StoreSurvey/d:Internet)[1]', 'varchar(20)') = 'T3';
GO

However, this is not an efficient approach because SQL Server has to retrieve the node value, do the conversion (Unicode to ANSI, in this case), and then do the comparison. An alternate and better approach is to use the exist() method as shown here:

SELECT * FROM Sales.Store
WHERE Demographics.exist('
declare namespace d=
  "http://schemas.microsoft.com/sqlserver/2004/07/adventure-works/StoreSurvey";
""
/d:StoreSurvey[d:Internet = "T3"]') = 1;
GO

The exist() method exploits the PATH and other XML indexes (discussed later in this chapter) more effectively and can yield better performance than the value() method.

Before concluding this section, here are some more examples that show the capabilities of XQuery. The first two queries illustrate the FLWOR expressions, and the next two queries illustrate constructing XML using XQuery:

--FLWOR Example 1
SELECT EmployeeID, Resume.query('
   declare namespace
    r="http://schemas.microsoft.com/sqlserver/2004/07/adventure-works/Resume";""
   for $emp in /r:Resume/r:Employment
   order by $emp/r:Emp.OrgName
   return concat(string($emp/r:Emp.OrgName), "~")
') AS PrevEmployers
FROM HumanResources.JobCandidate
WHERE EmployeeID IS NOT NULL

--FLWOR Example 2
SELECT name,
   Instructions.query('
   declare namespace i="http://schemas.microsoft.com/sqlserver/2004/07/adventure-works/ProductModelManuInstructions";
   for $step in //i:step
   where count($step/i:tool) > 0 and count($step/i:material) > 0
   return $step
  ') AS StepWithToolAndMaterialReq
FROM Production.ProductModel
WHERE Instructions IS NOT NULL

--Constructing XML – direct approach
SELECT FirstName, LastName, AdditionalContactInfo.query('
declare namespace a=
"http://schemas.microsoft.com/sqlserver/2004/07/adventure-works/ContactTypes";""

<PhoneNumbers>
{
  for $p in //a:number
  return <Number>{string($p)}</Number>
}
</PhoneNumbers>
') AS AdditionalPhoneNumbers
FROM Person.Contact
WHERE AdditionalContactInfo IS NOT NULL;

--Constructing XML – computed approach
SELECT FirstName, LastName, AdditionalContactInfo.query('
declare namespace a=
"http://schemas.microsoft.com/sqlserver/2004/07/adventure-works/ContactTypes";""

element PhoneNumbers
{
  for $p in //a:number
  return
   element Number {string($p)}
}
') AS AdditionalPhoneNumbers
FROM Person.Contact
WHERE AdditionalContactInfo IS NOT NULL;

The first query in this batch illustrates the FLWOR expression. The $emp looping variable is bound to the Employment elements under the Resume top-level elements, and for each such element, the XQuery expression sorts and returns the organization names, separated by tilde characters (~). The second FLWOR example, which illustrates the where clause, and returns the steps that have both material and tool nodes. The final two examples show constructing XML directly and by using computed approach.

XQuery in SQL Server 2005 supports various types of functions and operators, including arithmetic, comparison, and logical operators and the string manipulation, data accessors, and aggregate functions. XQuery expressions also support the if-then-else construct that you can use to perform operations based on the value of a conditional expression. The sql:column() and sql:variable() functions can be used inside XQuery expressions to access a non-XML relational column and an external variable.

SQL Profiler includes a trace event called XQuery Static Type under the TSQL event category that you can use to trace XQuery activity. The event provides a method name, including the column on which the method was executed and the static type of the XQuery expression. To use it, you run Profiler, select the XQuery Static Type event, and run the preceding queries. If you do not see the events in Profiler, you can run DBCC FREEPROCCACHE and then run the XQuery queries again.

In summary, XQuery can be used for querying and reshaping the data stored in xml data type columns and variables. Processing XML at the server by using XQuery can result in reduced network traffic, better maintainability, reduced risk, and increased performance.

Indexes on XML Type Columns

Properly designed indexes are the key to improving query performance. At the same time, you need to consider the cost of maintaining indexes in measuring the overall benefits of the indexes created. As mentioned earlier, SQL Server 2005 stores XML data as a BLOB, and every time an XML column is queried, SQL Server parses and shreds the XML BLOB at runtime to evaluate the query. This can be quite an expensive operation, especially if you have large XML data stored in the column or if you have a large number of rows. To improve the performance of queries on xml data type columns, SQL Server 2005 provides two new types of indexes: primary XML indexes and secondary XML indexes. You can create XML indexes on typed or untyped XML columns.

You can create primary XML indexes by using the CREATE PRIMARY XML INDEX DDL statement. This type of index is essentially a shredded and persisted representation of XML BLOB data. For each XML BLOB in the column, the primary index creates several rows of data in an internal table. This results in improved performance during query execution time because there is no shredding and parsing involved at runtime. The primary XML index requires a clustered index on the primary key of the base table in which the XML column is defined. If the base table is partitioned, the primary XML index is also partitioned the same way, using the same partitioning function and partitioning scheme. After a primary XML index is created on a table, you cannot change the primary key for that table unless you drop all the XML indexes on that table.

All the primary XML index does is avoid the runtime shredding. After you analyze your workload, you can create secondary XML indexes to further improve the performance of an XML query.

A secondary XML index cannot be created unless you have a primary XML index. There are three types of secondary XML indexes—PATH, VALUE, and PROPERTY—each designed for improving the response time for the respective type of query. You can create secondary XML indexes by using the CREATE XML INDEX DDL statement.

If your XML queries make use of path expressions, such as /Production/Product/Material, the PATH secondary XML index can improve the performance of such queries. In most cases, the exist() method on XML columns in a WHERE clause benefits the most from the PATH indexes. The PATH index builds a B+ tree on the path/value pair of each XML node in the document order across all XML instances in the column.

If your XML queries are based on node values and do not necessarily specify the full path or use wildcards in the path (for example, //Sales[@amount > 10000] or //Catalog[@* = “No”]), the VALUE secondary XML index can improve the performance of such queries. The VALUE index creates a B+ tree on the value/path pair of each node in the document order across all XML instances in the XML column.

If your XML queries retrieve multiple values from individual XML instances, the PROPERTY XML index might benefit from using such queries because it groups the properties for each XML together.


Tip

You can include the word Primary, Path, Property, or Value in the name of a primary or secondary XML index that you create to quickly identify the type of XML index.


The following T-SQL batch illustrates creating primary and secondary XML indexes and then using the catalog views and functions to view the XML index details:

USE AdventureWorks;
GO
IF OBJECT_ID('dbo.tblIndexDemo') IS NOT NULL
   DROP TABLE dbo.tblIndexDemo;

GO
CREATE TABLE dbo.tblIndexDemo
   (ID INT IDENTITY(1,1) PRIMARY KEY NOT NULL,
    col1 XML NOT NULL);
GO

CREATE PRIMARY XML INDEX PrimaryXMLIdx_col1
ON dbo.tblIndexDemo(col1);
GO

CREATE XML INDEX PathXMLIdx_col1
ON dbo.tblIndexDemo(col1)
USING XML INDEX PrimaryXMLIdx_col1
FOR PATH;
GO

CREATE XML INDEX PropertyXMLIdx_col1
ON dbo.tblIndexDemo(col1)
USING XML INDEX PrimaryXMLIdx_col1
FOR PROPERTY;
GO

CREATE XML INDEX ValueXMLIdx_col1
ON dbo.tblIndexDemo(col1)
USING XML INDEX PrimaryXMLIdx_col1
FOR VALUE;
GO

--Get Index Information
SELECT * FROM sys.xml_indexes
   WHERE [object_id] = OBJECT_ID('dbo.tblIndexDemo'),

--Cleanup
IF OBJECT_ID('dbo.tblIndexDemo') IS NOT NULL
   DROP TABLE dbo.tblIndexDemo;
GO

This script creates a table that has a column of xml data type. It then creates a primary XML index on this column, followed by all three types of secondary indexes on the same column. You can use the sys.xml_indexes catalog view to view the details on XML indexes. This catalog view indicates whether an XML index is a primary or secondary index; if it is secondary, the view indicates what primary index it is based on and what type (PATH, PROPERTY, or VALUE) of secondary index it is.


Tip

SQL Server 2005 allows you to create full-text indexes on xml data type columns. You can combine a full-text search with XML index usage in some scenarios to first use fulltext indexes to filter the rows and then use XML indexes on those filtered rows, in order to improve the query response time. However, note that attribute values are not full-text indexed as they are considered part of the markup, which is not full-text indexed.


Modifying XML Data

SQL Server 2005 provides the modify() method, which you can use to change parts of XML content stored in an xml data type column or variable. When you change the xml type table column, the modify() method can only be called within a SET clause in an UPDATE statement; when you change the xml type variable, the modify() method can only be called by using the SET T-SQL statement. The modify() function can be used to insert one or more nodes, to delete nodes, or to update the value of a node. This function takes an XML Data Modification Language (XML DML) expression, which is an extension to the XQuery specification. XQuery 1.0 does not support the update functionality. Hence, SQL Server 2005 introduces an extension to XQuery 1.0 by including three new casesensitive keywords—”insert”, “delete”, and “replace value of”—that you can use inside an XQuery query to change parts of the XML data.

The following T-SQL script shows an example of the modify() method and XML DML:

DECLARE @xmlVar xml
SET @xmlVar = N'
  <Person>
    <Phone type="h">111-111-1111</Phone>
    <Phone type="c">222-222-2222</Phone>
  </Person>
';
SELECT @xmlVar;

--insert
SET @xmlVar.modify(
  'insert <Phone type="w">333-333-3333</Phone>
   into (/Person)[1]'),

--delete
SET @xmlVar.modify(
  'delete /Person/Phone[@type="c"]'),

--change node value
SET @xmlVar.modify(
  'replace value of (/Person/Phone[@type="h"]/text())[1]
  with "444-444-4444"'),

SELECT @xmlVar;
GO

At the end, the @xmlVar XML variable has the following value:

<Person>
  <Phone type="h">444-444-4444</Phone>
  <Phone type="w">333-333-3333</Phone>
</Person>

FOR XML Enhancements

The native XML support in SQL Server 2000 was introduced by providing the FOR XML clause and the OPENXML function. The FOR XML clause can be used in a SELECT statement to convert a relational rowset into an XML stream. On the other hand, the OPENXML function does the reverse: It provides the rowset view over an XML document. SQL Server 2005 enhances these two constructs to fix some of the limitations from previous releases and also to add support for the new xml data type.

The following FOR XML clause enhancements have been introduced in SQL Server 2005:

• The new TYPE modifier can be used to generate an instance of an xml data type that can be assigned to a variable or can be directly used in a query. The following query uses the TYPE modifier with a FOR XML clause and saves the generated XML fragment in an xml data type variable, on which the XQuery query is executed later on:

USE AdventureWorks;
GO
DECLARE @Contacts xml
SET @Contacts =
(SELECT TOP 10 FirstName, MiddleName, LastName, EmailAddress, Phone
  FROM Person.Contact ORDER BY LastName
  FOR XML AUTO, TYPE);

SELECT @Contacts.query('
<Contacts>
{
for $c in /Person.Contact
return
   <Contact>
     <Name>{data($c/@LastName)}{data(" ")}{data($c/@FirstName)}
     </Name>
     <Phone>{data($c/@Phone)}</Phone>
     <Email>{data($c/@EmailAddress)}</Email>
   </Contact>
}

</Contacts>
'),
GO

• The FOR XML queries can now be nested. It is important to use the TYPE directive with the internal query; otherwise, the XML generated by the internal query will be entitized (that is, < will be replaced with &lt; and so on). Many complex queries that were written using FOR XML EXPLICIT can now be replaced with nested FOR XML queries, which are simple to write and manage. Here is an example of nesting the FOR XML clause:

SELECT SalesOrderNumber, PurchaseOrderNumber,
(SELECT AddressLine1, AddressLine2, City, PostalCode
  FROM Person.Address
  WHERE AddressID =
    (SELECT AddressID FROM Sales.CustomerAddress
     WHERE CustomerID = s.CustomerID AND AddressTypeID = 3)
  FOR XML AUTO, ELEMENTS, TYPE
  )
FROM Sales.SalesOrderHeader s
WHERE PurchaseOrderNumber IS NOT NULL
FOR XML AUTO

• The FOR XML clause in SQL Server 2000 could only generate XML Data Reduced (XDR) schemas. (XDR is Microsoft’s proprietary XML schema format.) The new XMLSCHEMA directive in SQL Server 2005 allows you to generate inline XSD schemas, which are based on the W3C standard. The optional input to the XMLSCHEMA directive is the target namespace URI, as in the following example:

SELECT * FROM HumanResources.Department
FOR XML AUTO, XMLSCHEMA('urn:test.com'),

The default namespace URL is auto-generated in a format such as

urn:schemas-microsoft-com:sql:SqlRowSet1.

• The new ROOT directive allows you to generate a well-formed XML document with a single root element. The optional input to the ROOT directive is the name of the topmost element. By default, the top-level element is called root when the ROOT directive is specified:

SELECT * FROM HumanResources.Department
FOR XML AUTO, ROOT('Departments'),

• The RAW mode now supports the ELEMENTS directive so that the generated XML stream contains elements instead of attributes for columns. The first SELECT statement is without the ELEMENTS clause, and the next one uses the ELEMENTS clause with FOR XML RAW. In addition, you can now pass the name of an element that is generated for each record, instead of the default element, called row. Try the following two SELECT statements to see the RAW mode enhancements:

SELECT * FROM HumanResources.Department
FOR XML RAW, ROOT('Departments'),

SELECT * FROM HumanResources.Department
FOR XML RAW('Department'), ELEMENTS, ROOT('Departments'),

• In SQL Server 2000, the EXPLICIT mode provided the most control over the structure of the XML document generated. However, it is not easy to write queries by using the EXPLICIT mode. SQL Server 2005 simplifies this by providing a new mode called PATH that allows you to use XPath and specify where and how in the hierarchy the column should appear. The PATH mode provides a simpler way to mix elements and attributes and control the hierarchy of generated XML. An optional argument with the PATH mode is the element name for each record. By default, it is called row. Here is an example of the PATH mode:

SELECT
   DepartmentID "@id",
   ModifiedDate "@updated",
   Name         "Name",
   GroupName    "Group"
FROM HumanResources.Department
FOR XML PATH ('Department'), ROOT('Departments'),

• The ELEMENTS directive now provides an XSINIL option to map NULL values to an element with an attribute of xsi:nil="true" instead of completely omitting the element. If you execute the following batch in SQL Server 2000, you notice that for first row, the col2 element is completely missing:

USE [tempdb];
GO

CREATE TABLE tblTest
(col1 INT IDENTITY(1,1) NOT NULL,
  col2 VARCHAR(20) NULL);
GO

INSERT INTO tblTest DEFAULT VALUES;
INSERT INTO tblTest (col2) VALUES ('Value2'),
GO

SELECT * FROM tblTest FOR XML AUTO, ELEMENTS;
GO

--SELECT * FROM tblTest FOR XML AUTO, ELEMENTS XSINIL;
GO

DROP TABLE tblTest;
GO

If you execute the script in SQL Server 2005, you notice the same thing. If you uncomment the statement containing the XSINIL option and run it in SQL Server 2005, you notice that for col2 having the NULL value, an element is generated with an xsi:nil="true" attribute.

Native XML Web Services Support

After SQL Server 2000 shipped, Microsoft released SQLXML web releases to update and enhance the XML support in SQL Server 2000. SQLXML release 3.0 introduced support for SOAP web services, allowing stored procedure and user-defined functions to be invoked over HTTP as web service methods. The similar XML web services support is now built into SQL Server 2005.

SQLXML provided the web services support via an Internet Services API (ISAPI) DLL running under Microsoft’s IIS web server. The fact that the ability to expose stored procedures and functions as web service methods is natively built into SQL Server 2005 eliminates the dependency on IIS and also means there is no need to install SQLXML to use the web services functionality. Shortly you’ll learn about the native HTTP SOAP support introduced in SQL Server 2005. But first, here is a quick introduction to SOAP, web services, and WSDL.


Note

Native HTTP SOAP support is not a complete replacement for SQLXML—only for the web services support. SQLXML contains a lot more functionality, including support for URL queries, templates, updategrams, and bulk loading, which is available only via SQLXML.


Introduction to SOAP, Web Services, and WSDL

XML web services is a technology based on HTTP, SOAP, and XML that allows one application on one platform to invoke a method in another application, possibly running on a totally different platform, over the Internet.

The most common application of web services is application-to-application integration over the Internet. The client application posts an XML package, called a SOAP request envelope, over HTTP to a web server where the web service is located. On the server side, the web service method is invoked, and it returns another XML package, called a SOAP response envelope, which is received and parsed by the client to see the method execution results. In case of an error, the web service method returns a SOAP fault message.

Web services are platform and language independent. This means you can develop web services by using any programming language, deploy them on any platform, and then invoke a web API from any platform, using any programming language. As long as the client can generate and post SOAP request XML text, the server can parse the posted XML and reply back with the SOAP response XML text. Finally, the client can parse and process that response XML. This is all you need in order to implement and utilize XML web services.

Web Services Building Blocks

The four primary building blocks of today’s XML web services are XML, SOAP, HTTP, and WSDL.

XML is the key to application interoperability and cross-platform integration. Its flexibility (XML defines the syntax, and it is a meta-language that can be used to create other languages), extensibility (XML does not have a fixed set of tags), and portability (XML is text) makes XML a perfect choice for sending and receiving messages over the Internet. Web services use XML as the marshaling format.

The W3C created the SOAP standard, which defines the format of messages sent to implement web services. SOAP was originally an acronym for Simple Object Access Protocol, but now this messaging format specification is no longer an acronym but is simply called SOAP. SOAP uses XML syntax to define the request and response payloads. In other words, the SOAP specification defines how the request and response XML payloads should look, how the web service should report errors, how to send additional information via SOAP headers, and so on. Standardizing on the web services request and response payloads enables you to write web services by using technology from one vendor and consume web services by using technology from a different vendor.

Here is what the SOAP request payload, including the SOAP-specific HTTP headers and the request envelope, looks like:

POST /StockQuote HTTP/1.1
Host: www.stockquoteserver.com
Content-Type: text/xml; charset="utf-8"
Content-Length: nnnn
SOAPAction: "Some-URI"

<SOAP-ENV:Envelope
  xmlns:SOAP-ENV="http://schemas.xmlsoap.org/soap/envelope/"
  SOAP-ENV:encodingStyle="http://schemas.xmlsoap.org/soap/encoding/">
   <SOAP-ENV:Body>
       <m:GetLastTradePrice xmlns:m="Some-URI">
           <symbol>DIS</symbol>
       </m:GetLastTradePrice>
   </SOAP-ENV:Body>
</SOAP-ENV:Envelope>

And here is what the SOAP response payload, including the SOAP-specific HTTP headers and the response envelope, looks like:

HTTP/1.1 200 OK
Content-Type: text/xml; charset="utf-8"
Content-Length: nnnn

<SOAP-ENV:Envelope
  xmlns:SOAP-ENV="http://schemas.xmlsoap.org/soap/envelope/"
  SOAP-ENV:encodingStyle="http://schemas.xmlsoap.org/soap/encoding/"/>
   <SOAP-ENV:Body>
       <m:GetLastTradePriceResponse xmlns:m="Some-URI">
           <Price>34.5</Price>
       </m:GetLastTradePriceResponse>
   </SOAP-ENV:Body>
</SOAP-ENV:Envelope>

The SOAP request and response can be sent over any protocol (HTTP, SMTP, and so on). As an analogy, if you have a package that you want to send somewhere, you can ship it by using FedEx, UPS, USPS, or another carrier. The most commonly used transport protocol for today’s web services, and the only one supported by SQL Server 2005, is HTTP. The native web services support in SQL Server 2005 is available on only two platforms: Windows Server 2003 and Windows XP SP2. This is because only on these two platforms is the HTTP stack part of the operating system kernel, which is used by SQL Server 2005 to provide HTTP SOAP support without requiring IIS. As a matter of fact, having the HTTP stack (http.sys) in the operating system kernel allows faster execution of HTTP requests. Hence, the native HTTP SOAP support in SQL Server 2005 performs better than ISAPI and IIS-based SQLXML. Prior to HTTP SOAP support in SQL Server 2005, the only network protocol that could be used to access SQL Server was Tabular Data Stream (TDS). The HTTP SOAP support now enables HTTP as an alternate protocol over which SQL Server can be accessed from any platform, without installation of any client or SQL Server network library components. For instance, you can now access SQL Server 2005 over HTTP from a Perl client running on a Unix machine or a C++ or .NET application running on a Pocket PC or smart phone, without requiring any client components such as MDAC.

The fourth and final primary building block for web services is Web Services Description Language (WSDL). WSDL, like SOAP and XML, is a specification defined by the W3C (see www.w3.org/TR/wsdl). WSDL is essentially an XML-based format that describes the complete set of interfaces exposed by a web service. WSDL defines the set of operations and messages that can be sent to and received from a given web service. A WSDL document serves as a contract between web services consumers and the server. A WSDL document describes what functionality a web service offers, how it communicates, and where to find it. In summary, WSDL provides the information necessary for a client to interact with a web service. SQL Server 2005 can dynamically auto-generate the WSDL, or you can write a stored procedure that spits out a custom WSDL document, or you can disable availability of a WSDL document for a web service created by using native HTTP SOAP support.


Note

Remember that the native HTTP SOAP support in SQL Server 2005 is available only on Windows Server 2003 and Windows XP SP2.


Creating HTTP Endpoints

To have SQL Server listen on HTTP for SOAP requests, the first step is to create an HTTP endpoint by using the CREATE ENDPOINT T-SQL statement. The CREATE ENDPOINT statement allows you to create a new web service and, optionally, to define the methods that the endpoint exposes. The ALTER ENDPOINT statement can be used to change the web service settings and to add or remove web methods.

When you create an HTTP endpoint, you have to provide a unique URL, which may include a port number, that SQL Server uses to listen for incoming SOAP requests. When requests are submitted to this URL, the http.sys kernel HTTP stack routes the posted message to the SQL Server endpoint associated with the URL. SQL Server then executes the stored procedure or function, depending on the web method referred in the posted SOAP request envelope, serializes the stored procedure or function results as XML, and replies with the SOAP response envelope.

Although not recommended for security reasons, SQL Server 2005 does allow you to execute ad hoc T-SQL batches over HTTP. You can use the BATCHES=ENABLED option with the CREATE ENDPOINT statement to enable execution of ad hoc T-SQL batches on the endpoint. When this option is enabled, the web service adds a method called sqlbatch in the sql namespace, which clients can invoke to execute ad hoc batches.

CREATE ENDPOINT is a generic statement that is used to create HTTP SOAP endpoints, or to create a TCP-based endpoint to accept T-SQL requests, or to create an endpoint for service broker or database mirroring functionality. With this statement, you provide the name of the endpoint, the authorization details, and a collection of other options, depending on the type of endpoint being created.


Note

The CREATE ENDPOINT statement cannot be executed within the scope of a user transaction. That is, if you have started a transaction by using the BEGIN TRANSACTION statement, you cannot execute CREATE ENDPOINT in that session unless you commit or roll back the transaction.


The following is an example of using a CREATE ENDPOINT statement to create an HTTP SOAP web service that exposes the following stored procedure as a web service method:

USE AdventureWorks;
GO

IF OBJECT_ID('dbo.GetEmployees') IS NOT NULL
   DROP PROCEDURE dbo.GetEmployees;

GO

CREATE PROCEDURE dbo.GetEmployees
AS
BEGIN
   SELECT c.FirstName, c.LastName, c.EmailAddress, c.Phone,
          e.NationalIDNumber, e.LoginID, e.Title, e.BirthDate,
          e.MaritalStatus, e.Gender, e.HireDate
   FROM HumanResources.Employee e JOIN Person.Contact c
   ON e.ContactID = c.ContactID;
END;

Let’s now use the CREATE ENDPOINT statement to expose the preceding stored procedure as a web service method that can be invoked over HTTP to get a list of employees in the AdventureWorks sample database:

IF EXISTS (SELECT endpoint_id FROM sys.endpoints WHERE name = 'AWEmployees')
   DROP ENDPOINT AWEmployees;
GO

CREATE ENDPOINT AWEmployees
   STATE = STARTED
   AS HTTP
      ( AUTHENTICATION = (INTEGRATED),
        PATH = '/SQLWebSvcs/AW',
        PORTS = (CLEAR)
      )
   FOR SOAP
      (  WEBMETHOD 'GetEmployees'
         (  NAME = 'AdventureWorks.dbo.GetEmployees'),
         WSDL = DEFAULT,
         DATABASE = 'AdventureWorks'
      );
GO

This batch first uses the sys.endpoints catalog view to check whether the endpoint named AWEmployees already exists. If it does, the batch drops the endpoint by using the DROP ENDPOINT DDL statement. Next, the CREATE ENDPOINT statement defines an HTTP endpoint and a web method that maps to the GetEmployees stored procedure created earlier.


Caution

If you are executing the preceding T-SQL statements on Windows XP SP2 and have IIS running, you might get error 0x80070020 and have CREATE ENDPOINT fail. In this case, either you can stop IIS or use a different port to create the HTTP SOAP endpoint. If you have an HTTP SOAP endpoint on port 80 (that is, with PORTS = (CLEAR)), and if IIS is Native XML Web Services Support 327 also listening on the same port, starting IIS fails. You have to either drop the HTTP SOAP endpoint or use a different port for the endpoint. Note that this problem does not occur on Windows Server 2003 because IIS 6.0 on that platform is http.sys based, whereas IIS 5.1 on Windows XP SP2 is not http.sys based.


The web service is available at http://localhost/SQLWebSvcs/AW. After you successfully create the HTTP SOAP endpoint by using the preceding statement, you can start Internet Explorer and type http://localhost/SQLWebSvcs/AW?wsdl in the address bar. Internet Explorer should return an XML document, the WSDL for the web service, and it should contain one web method called GetEmployees. To consume this web service, you can try using Visual Studio .NET to add a web reference to http://localhost/SQLWebSvcs/AW?wsdl, which creates the proxy class for the web service, which you can use to invoke web service methods. The code download for this book includes a C# client application that uses the web service created in this section.

Administering HTTP SOAP Endpoints

Following the “secure by default” principle, SQL Server 2005 does not include any HTTP SOAP endpoint out of the box. Members of the sysadmin role and the users who have CREATE ENDPOINT permissions can create HTTP SOAP endpoints. The endpoints are stopped by default, and the STATE = STARTED options can be used with CREATE ENDPOINT or ALTER ENDPOINT to start HTTP SOAP access.

SQL Server 2005 includes Secure Sockets Layer (SSL) support to accept incoming requests and send responses on a secure (https://) channel.

To control access to the endpoint, the AUTHENTICATION option can have one or more of five values: BASIC, DIGEST, INTEGRATED, NTLM, or KERBEROS. BASIC authentication mode requires SSL and consists of an authentication header containing the Base 64–encoded username and password, separated by a colon. In DIGEST authentication mode, the username and password are hashed using a one-way hashing algorithm (MD5) before the request is sent to the server. The server has access to either the raw password or a stored MD5 hash created at the time the password was set. It can then compare the stored calculated value to the one provided by the client. This way, the client can prove that it knows the password without actually giving it to the server. In INTEGRATED authentication mode, the server first tries to authenticate by using Kerberos, and if it is not supported by the client, or if negotiation is not possible, authentication falls back to NTLM. NTLM is the authentication mechanism supported by Windows 95, Windows 98, and Windows NT 4.0 (client and server). This authentication mechanism is a challenge-response protocol that offers stronger authentication than either basic or digest. NTLM is implemented in Windows 2000 and later versions by a Security Support Provider Interface (SSPI). Kerberos authentication is an Internet standard authentication mechanism. Kerberos authentication is supported in Windows 2000 and later versions by an SSPI. After a user is authorized, the user can be routed to a specific database by using the DATABASE= parameter in the CREATE ENDPOINT statement.

An endpoint is a server-level securable. The owner of an endpoint or a sysadmin role member can use GRANT, REVOKE, or DENY for the ALTER, EXECUTE, CONNECT, TAKE OWNERSHIP, and VIEW DEFINITION endpoint permissions regarding a specified principal.

Table 10.1 lists the four catalog views that you can access to view HTTP SOAP endpoint metadata.

Table 10.1. HTTP SOAP-Specific Catalog Views

images

SQL Server:General Statistics (or MSSQL$InstanceName:General Statistics, for a named SQL Server 2005 instance) contains several Performance Monitor counters that you can use to monitor things such as HTTP authenticated requests, SOAP batch SQL requests, SOAP method invocations, SOAP WSDL requests, failed SOAP requests, execution of SOAP requests, and successful SOAP requests.

Summary

This chapter presents the new XML and web services support introduced in SQL Server 2005. The new xml data type allows you to save XML documents and fragments into a relational database. This gives you a consistent way to save structured and unstructured data inside a relational database.

You can associate XSD schemas with an XML type column or variable to yield typed XML. For a typed XML column or variable, SQL Server makes sure that the XML content validates with the structure and type defined in the associated XSD document from the schema collection.

SQL Server 2005 partially implements the XQuery W3C specification to allow querying of XML data. You can create indexes based on XML type to improve the query response time. The XML DML extension to the XQuery specification allows you to update parts of XML data.

This chapter also presents an overview of improvements made to the FOR XML clause from SQL Server 2000.

The final section in this chapter introduces the concept of SOAP and web services and then shows how SQL Server 2005 natively supports mapping of stored procedures and functions to web service methods.

Chapter 11 presents details on another big developer productivity feature introduced in SQL Server 2005—.NET integration with the SQL Server 2005 engine.

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset